Sales Time Series Data Analysis Using a Combination of Supervised and Unsupervised Learning for Business Strategy Optimization at PT X
Abstract
PT X is an industrial gas company with an extensive operational
network, yet it faces challenges in analyzing sales performance
and price inconsistencies across regions. The lack of a
data-driven analytical system makes it difficult for the company to
identify sales trends, product performance, and branch
contributions objectively. This study aims to develop a data-driven
analytical system to support business decision-making.
Sales trend analysis was conducted using SARIMAX and compared
with SARIMA and XGBoost; SARIMAX achieved the highest
accuracy. Evaluation using MAPE and RMSE indicated error
rates of 18% for SARIMA, 14% for XGBoost, and 5% for
SARIMAX. Branch clustering was tested using K-Means and
GMM; GMM was selected for its faster execution and marginally
higher Silhouette Score (0.623 vs 0.622). Additionally,
Linear Regression and Random Forest were applied for price
standardization across branches, with Linear Regression
providing more consistent predictions, high R², and relatively low
errors. The results demonstrate an effective analytical system that
supports more accurate and consistent business decision-making.
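
As an illustration of the two forecast error measures named above, the following is a minimal sketch of how MAPE and RMSE are conventionally computed. It is not the study's implementation: the function names and the sample figures are hypothetical, and the MAPE function assumes no actual value is zero.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every value in `actual` is non-zero.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the data."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical monthly sales and forecasts, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(f"MAPE: {mape(actual, predicted):.2f}%")
print(f"RMSE: {rmse(actual, predicted):.2f}")
```

MAPE is scale-free, so it allows comparison across branches or products with different sales volumes, whereas RMSE is expressed in sales units and penalizes large individual misses more heavily.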