Analysis of Audio Features by Comparing Multiple Regression and Polynomial Regression Methods to Predict Song Popularity

Authors

  • Billy Faith Susanto, Informatics Study Program
  • Silvia Rostianingsih, Informatics Study Program
  • Leo Willyanto Santoso, Informatics Study Program

Keywords:

Space Program, Community, Education, Public Facilities, North Surabaya

Abstract

Songs are artistic works that express ideas and emotions through rhythm, melody, and harmony. From a commercial viewpoint, songs are a major source of income for musicians and artists. According to IFPI data, music industry revenue reached US$20.2 billion in 2019, 56.1% of which came from streaming. Spotify is one of the largest and best-known streaming services in the world today.

This research aims to predict the popularity of each song from audio feature data retrieved from Spotify's API. Two regression methods are used for prediction: Linear Regression and Polynomial Regression. A model is built with each method and evaluated with four metrics: R², Adjusted R², MAE, and MSE.
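The comparison described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the data is synthetic, the three features stand in for hypothetical audio features, the degree-2 expansion and the 800/200 split are assumptions, and the helper names (`fit_least_squares`, `degree2_terms`, `metrics`) are made up for this example.

```python
import random

def fit_least_squares(rows, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    X = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * t for x, t in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]  # [intercept, b1, b2, ...]

def predict(coef, rows):
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]

def degree2_terms(row):
    """Original features plus all squares and pairwise products."""
    row = list(row)
    n = len(row)
    return row + [row[i] * row[j] for i in range(n) for j in range(i, n)]

def metrics(y_true, y_pred, n_predictors):
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot
    # Adjusted R2 penalizes R2 for the number of predictors in the model
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return {"R2": r2, "Adjusted R2": adj_r2, "MAE": mae, "MSE": ss_res / n}

# Synthetic stand-in data (assumption): three features in [0, 1] and a
# popularity-like target with a nonlinear dependence on them.
random.seed(0)
rows = [[random.random() for _ in range(3)] for _ in range(1000)]
y = [40 + 30 * a - 40 * b * b + 30 * a * c + random.gauss(0, 2)
     for a, b, c in rows]
train_X, test_X, train_y, test_y = rows[:800], rows[800:], y[:800], y[800:]

# Linear regression on the raw features
linear_coef = fit_least_squares(train_X, train_y)
linear_scores = metrics(test_y, predict(linear_coef, test_X), n_predictors=3)

# Polynomial regression: the same solver applied to degree-2 expanded features
poly_train = [degree2_terms(r) for r in train_X]
poly_test = [degree2_terms(r) for r in test_X]
poly_coef = fit_least_squares(poly_train, train_y)
poly_scores = metrics(test_y, predict(poly_coef, poly_test),
                      n_predictors=len(poly_train[0]))

for name, scores in (("Linear", linear_scores), ("Polynomial", poly_scores)):
    print(name, {k: round(v, 5) for k, v in scores.items()})
```

Polynomial regression here is still a linear least-squares fit; only the feature set grows. That growth is why Adjusted R² is reported alongside R²: the expanded model has more predictors, so it is penalized more heavily.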

In the experiments, the Linear Regression method achieved average scores of 0.23614 for R² and 0.23536 for Adjusted R², with average errors of 17.38129 for MAE and 442.31700 for MSE. The Polynomial Regression method achieved average scores of 0.31496 for R² and 0.25880 for Adjusted R², with average errors of 16.47367 for MAE and 409.76242 for MSE.
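To put the reported averages side by side, the short sketch below derives a few comparison figures from them. Only the eight input values come from the abstract; the relative reductions, the RMSE values, and the gap between R² and Adjusted R² are derived here for illustration and are not results reported by the authors.

```python
import math

# Average scores as reported in the abstract
linear = {"R2": 0.23614, "Adjusted R2": 0.23536, "MAE": 17.38129, "MSE": 442.31700}
poly = {"R2": 0.31496, "Adjusted R2": 0.25880, "MAE": 16.47367, "MSE": 409.76242}

def relative_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100

mae_drop = relative_reduction(linear["MAE"], poly["MAE"])
mse_drop = relative_reduction(linear["MSE"], poly["MSE"])

# RMSE is in the same units as the popularity score, unlike MSE
rmse_linear = math.sqrt(linear["MSE"])
rmse_poly = math.sqrt(poly["MSE"])

# Gap between R2 and Adjusted R2: larger when the model has more predictors
gap_linear = linear["R2"] - linear["Adjusted R2"]
gap_poly = poly["R2"] - poly["Adjusted R2"]

print(f"MAE reduction: {mae_drop:.2f}%  MSE reduction: {mse_drop:.2f}%")
print(f"RMSE linear: {rmse_linear:.2f}  RMSE polynomial: {rmse_poly:.2f}")
print(f"R2 - Adjusted R2: linear {gap_linear:.5f}, polynomial {gap_poly:.5f}")
```

The much wider gap between R² and Adjusted R² for the polynomial model is consistent with that model using many more terms, although the abstract does not state how many.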

References

[1] Abhigyan. 2020. Understanding Polynomial Regression. Retrieved February 23, 2021. URI = https://medium.com/analytics-vidhya/understanding-polynomial-regression-5ac25b970e18

[2] Armstrong, M. 2020. The world's most popular music streaming services. Retrieved January 6, 2021. URI = https://www.statista.com/chart/20826/music-streaming-services-with-most-subscribers-global-fipp/

[3] Brownlee, J. 2020. How to Perform Feature Selection for Regression Data. URI = https://machinelearningmastery.com/feature-selection-for-regression-data/

[4] da Silva, M. A. S., & Seixas, T. M. 2017. The Role of Data Range in Linear Regression. The Physics Teacher, 55(6), 371–372. DOI = https://doi.org/10.1119/1.4999736

[5] Duke.edu. 2021. Testing the assumptions of linear regression. URI = http://people.duke.edu/~rnau/testing.htm

[6] Grant, P. 2019. Understanding Multiple Regression. URI = https://towardsdatascience.com/understanding-multiple-regression-249b16bde83e

[7] Herremans, D., Martens, D., & Sörensen, K. 2014. Dance Hit Song Prediction. Journal of New Music Research, 43(3), 291–302. DOI = https://doi.org/10.1080/09298215.2014.881888

[8] Institute for Digital Research and Education (n.d.). Coding Systems for Categorical Variables in Regression Analysis. URI = https://stats.idre.ucla.edu/spss/faq/coding-systems-for-categorical-variables-in-regression-analysis-2/#SIMPLE%20EFFECT%20CODING

[9] International Federation of the Phonographic Industry. 2019. Annual report. URI = https://www.ifpi.org/our-industry/industry-data/

[10] JMP.com (n.d.). Regression Model Assumptions. URI = https://www.jmp.com/en_us/statistics-knowledge-portal/what-is-regression/simple-linear-regression-assumptions.html

[11] McCarty, K. 2018. Interpreting Results from Linear Regression – Is the data appropriate? Accelebrate.com; Accelebrate. Retrieved February 23, 2021. URI = https://www.accelebrate.com/blog/interpreting-results-from-linear-regression-is-the-data-appropriate

[12] Nijkamp, R. 2018. Prediction of product success: explaining song popularity by audio features from Spotify data. URI = https://essay.utwente.nl/75422/1/NIJKAMP_BA_IBA.pdf

[13] Palmer, P. B., & O’Connell, D. G. 2009. Regression analysis for prediction: understanding the process. Cardiopulmonary Physical Therapy Journal, 20(3), 23–26. URI = https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2845248/

[14] PennState: Eberly College Of Science. 2018. Detecting Multicollinearity Using Variance Inflation Factors | STAT 462. Retrieved February 21, 2021. URI = https://online.stat.psu.edu/stat462/node/180/

[15] Sciandra, M., & Spera, I. C. 2019. A model based approach to Spotify data analysis: a Beta GLMM. Journal of Applied Statistics. DOI = https://doi.org/10.2139/ssrn.3557124

[16] Spotify for Developers (n.d.). API documentation. URI = https://developer.spotify.com/documentation/web-api/reference/tracks/

[17] Statistics How To. 2016. Breusch-Pagan-Godfrey Test: Definition. URI = https://www.statisticshowto.com/breusch-pagan-godfrey-test/

Published

2021-10-13

Section

Articles