Sales Time Series Data Analysis Using a Combination of Supervised and Unsupervised Learning for Business Strategy Optimization at PT X

Authors

Abstract

PT X is an industrial gas company with an extensive operational 
network, yet it faces challenges in analyzing sales performance 
and price inconsistencies across regions. The lack of a 
data-driven analytical system makes it difficult for the company to 
identify sales trends, product performance, and branch 
contributions objectively. This study aims to develop a data-driven 
analytical system to support business decision-making. 
Sales trends were forecast with SARIMAX and benchmarked against 
SARIMA and XGBoost. Evaluated with MAPE and RMSE, SARIMAX was 
the most accurate, with error rates of 5% versus 14% for 
XGBoost and 18% for SARIMA. Branches were clustered with 
K-Means and a Gaussian Mixture Model (GMM); GMM was selected 
for its faster execution and marginally higher Silhouette Score 
(0.623 vs 0.622). Additionally, 
Linear Regression and Random Forest were applied for price 
standardization across branches, with Linear Regression 
providing more consistent predictions, high R², and relatively low 
errors. The results demonstrate an effective analytical system that 
supports more accurate and consistent business decision-making.
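
For readers unfamiliar with the two forecast error metrics named above, the following is a minimal sketch of how MAPE and RMSE are commonly computed. The function names and the sample figures are hypothetical and chosen for illustration only; they are not taken from the study, and the study's own evaluation code may differ.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no value in `actual` is zero (MAPE is undefined there).
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical monthly sales values, for illustration only (not study data).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(f"MAPE: {mape(actual, predicted):.2f}%")
print(f"RMSE: {rmse(actual, predicted):.2f}")
```

MAPE is scale-free, which makes it convenient for comparing models across products or branches with different sales volumes, while RMSE penalizes large individual errors more heavily.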

Published

2026-06-15