Analysis of Audio Features by Comparing Multiple Regression and Polynomial Regression Methods to Predict Song Popularity

Billy Faith Susanto(1*), Silvia Rostianingsih(2), Leo Willyanto Santoso(3)


(1) Program Studi Informatika
(2) Program Studi Informatika
(3) Program Studi Informatika
(*) Corresponding Author

Abstract


Songs are artistic works that express ideas and emotions through rhythm, melody, and harmony. From a commercial viewpoint, songs are a major source of income for musicians and artists. According to IFPI data, music industry revenue reached US$20.2 billion in 2019, of which 56.1% came from streaming. Spotify is one of the largest and best-known streaming services in the world today.

This research aims to predict the popularity of each song from audio feature data taken from Spotify's API. Two regression methods are used for prediction: Linear Regression and Polynomial Regression. A model is built with each method and evaluated with four metrics: R², Adjusted R², MAE, and MSE.
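The comparison described above can be illustrated with a minimal sketch. It is not the authors' program: the data are synthetic stand-ins for Spotify audio features, the helper names (`design_matrix`, `evaluate`, `fit_and_score`) are hypothetical, and the polynomial expansion here uses only per-feature powers without interaction terms, which is a simplifying assumption.

```python
import numpy as np

def design_matrix(X, degree):
    """Intercept column plus each feature raised to powers 1..degree (no cross terms)."""
    cols = [np.ones(len(X))]
    for d in range(1, degree + 1):
        cols.append(X ** d)
    return np.column_stack(cols)

def evaluate(y_true, y_pred, n_predictors):
    """Return R^2, adjusted R^2, MAE, and MSE for one set of predictions."""
    n = len(y_true)
    residual = y_true - y_pred
    ss_res = float(np.sum(residual ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R^2 penalises the number of predictors in the model.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return {
        "r2": r2,
        "adj_r2": adj_r2,
        "mae": float(np.mean(np.abs(residual))),
        "mse": ss_res / n,
    }

def fit_and_score(X_train, y_train, X_test, y_test, degree):
    """Fit by ordinary least squares on the training split, score on the test split."""
    A_train = design_matrix(X_train, degree)
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    y_pred = design_matrix(X_test, degree) @ coef
    return evaluate(y_test, y_pred, n_predictors=A_train.shape[1] - 1)

# Synthetic stand-in data (assumption): three features in [0, 1], one with a
# curved effect on the target that a straight line cannot capture.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(400, 3))
y = 40 + 30 * X[:, 0] - 100 * (X[:, 1] - 0.5) ** 2 + rng.normal(0, 5, 400)

linear_scores = fit_and_score(X[:300], y[:300], X[300:], y[300:], degree=1)
poly_scores = fit_and_score(X[:300], y[:300], X[300:], y[300:], degree=2)
print("linear:", linear_scores)
print("polynomial:", poly_scores)
```

Degree 1 corresponds to multiple linear regression and degree 2 to a simple polynomial model. Because the polynomial design matrix has more columns, its Adjusted R² carries a larger penalty than the linear one.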

Based on the analysis of the implemented program, the Linear Regression method achieved the following average results: 0.23614 for R², 0.23536 for Adjusted R², 17.38129 for MAE, and 442.31700 for MSE. The Polynomial Regression method achieved average results of 0.31496 for R², 0.25880 for Adjusted R², 16.47367 for MAE, and 409.76242 for MSE.
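To put the reported averages side by side, the short sketch below derives relative error reductions and RMSE values from the figures quoted in the abstract. Only the eight input numbers come from the source; the helper `relative_reduction` and the derived quantities are illustrative additions.

```python
import math

# Average results as reported in the abstract.
linear = {"r2": 0.23614, "adj_r2": 0.23536, "mae": 17.38129, "mse": 442.31700}
poly = {"r2": 0.31496, "adj_r2": 0.25880, "mae": 16.47367, "mse": 409.76242}

def relative_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100.0

mae_drop = relative_reduction(linear["mae"], poly["mae"])
mse_drop = relative_reduction(linear["mse"], poly["mse"])

# RMSE is in the same units as the popularity score being predicted.
rmse_linear = math.sqrt(linear["mse"])
rmse_poly = math.sqrt(poly["mse"])

# Gap between R^2 and adjusted R^2 reflects the penalty for extra terms.
r2_gap_linear = linear["r2"] - linear["adj_r2"]
r2_gap_poly = poly["r2"] - poly["adj_r2"]

print(f"MAE reduction: {mae_drop:.2f}%")
print(f"MSE reduction: {mse_drop:.2f}%")
print(f"RMSE: linear {rmse_linear:.2f}, polynomial {rmse_poly:.2f}")
print(f"R2 minus adjusted R2: linear {r2_gap_linear:.5f}, polynomial {r2_gap_poly:.5f}")
```

On these figures, Polynomial Regression lowers MAE by roughly 5% and MSE by roughly 7%. Its gap between R² and Adjusted R² is much wider than that of Linear Regression, which is consistent with the polynomial model using many more terms.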


Keywords


linear regression; polynomial regression; popularity predictor; Spotify audio features


References


Abhigyan. 2020. Understanding Polynomial Regression. Retrieved February 23, 2021. URI = https://medium.com/analytics-vidhya/understanding-polynomial-regression-5ac25b970e18

Armstrong, M. 2020. The world's most popular music streaming services. Retrieved January 6, 2021. URI = https://www.statista.com/chart/20826/music-streaming-services-with-most-subscribers-global-fipp/

Brownlee, J. 2020. How to Perform Feature Selection for Regression Data. URI = https://machinelearningmastery.com/feature-selection-for-regression-data/

da Silva, M. A. S., & Seixas, T. M. 2017. The Role of Data Range in Linear Regression. The Physics Teacher, 55(6), 371–372. DOI = https://doi.org/10.1119/1.4999736

Duke.edu. 2021. Testing the assumptions of linear regression. URI = http://people.duke.edu/~rnau/testing.htm

Grant, P. 2019. Understanding Multiple Regression. URI = https://towardsdatascience.com/understanding-multiple-regression-249b16bde83e

Herremans, D., Martens, D., & Sörensen, K. 2014. Dance Hit Song Prediction. Journal of New Music Research, 43(3), 291–302. DOI = https://doi.org/10.1080/09298215.2014.881888

Institute for Digital Research and Education (n.d.). Coding Systems for Categorical Variables in Regression Analysis. URI = https://stats.idre.ucla.edu/spss/faq/coding-systems-for-categorical-variables-in-regression-analysis-2/#SIMPLE%20EFFECT%20CODING

International Federation of the Phonographic Industry. 2019. Annual report. URI = https://www.ifpi.org/our-industry/industry-data/

JMP.com (n.d.). Regression Model Assumptions. URI = https://www.jmp.com/en_us/statistics-knowledge-portal/what-is-regression/simple-linear-regression-assumptions.html

McCarty, K. 2018. Interpreting Results from Linear Regression – Is the data appropriate? Accelebrate.com; Accelebrate. Retrieved February 23, 2021. URI = https://www.accelebrate.com/blog/interpreting-results-from-linear-regression-is-the-data-appropriate

Nijkamp, R. 2018. Prediction of product success: explaining song popularity by audio features from Spotify data. URI = https://essay.utwente.nl/75422/1/NIJKAMP_BA_IBA.pdf

Palmer, P. B., & O’Connell, D. G. 2009. Regression analysis for prediction: understanding the process. Cardiopulmonary Physical Therapy Journal, 20(3), 23–26. URI = https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2845248/

PennState: Eberly College Of Science. 2018. Detecting Multicollinearity Using Variance Inflation Factors | STAT 462. Retrieved February 21, 2021. URI = https://online.stat.psu.edu/stat462/node/180/

Sciandra, M., & Spera, I. C. 2019. A model based approach to Spotify data analysis: a Beta GLMM. Journal of Applied Statistics. DOI = https://doi.org/10.2139/ssrn.3557124

Spotify for Developers (n.d.). API documentation. URI = https://developer.spotify.com/documentation/web-api/reference/tracks/

Statistics How To. 2016. Breusch-Pagan-Godfrey Test: Definition. URI = https://www.statisticshowto.com/breusch-pagan-godfrey-test/



