An Apache Spark-Based Big Data Analytics Platform for Beginners Composing Data Analysis Workflows

Daniel Jeremia(1*), Henry Novianus Palit(2), Andre Gunawan(3)


(1) Informatics Engineering Study Program, Petra Christian University (Universitas Kristen Petra), Surabaya
(2) Informatics Engineering Study Program, Petra Christian University (Universitas Kristen Petra), Surabaya
(3) Informatics Engineering Study Program, Petra Christian University (Universitas Kristen Petra), Surabaya
(*) Corresponding Author

Abstract


Data is a concrete foundation for decision-making. Advances in technology have increased both the volume and the complexity of data, which now requires sophisticated methods to analyze. This creates a need for big data analytics. Analyzing data quickly, simply, and robustly has become an important requirement, especially for beginners.

To address this problem, this research proposes a beginner-friendly platform for big data analytics. The platform is intended to let beginners analyze data without programming. Data is manipulated through diagrams (workflows) that are composed in a drag-and-drop fashion. Furthermore, the platform uses industry-leading technology such as Apache Spark to handle the problems of big data analytics behind the scenes, without the user needing to be aware of it.
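The idea of a workflow that stands in for hand-written code can be illustrated with a minimal sketch. It is not the platform's implementation: the node format, the operation names (`filter`, `select`, `mean`), and the sample data are all hypothetical, and plain Python lists stand in for the Spark engine so that the example runs without a cluster. It assumes a drag-and-drop diagram can be serialized as an ordered list of nodes, each naming an operation and its parameters.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Table = List[Row]


def op_filter(rows: Table, column: str, equals: Any) -> Table:
    """Keep only the rows whose `column` equals the given value."""
    return [r for r in rows if r[column] == equals]


def op_select(rows: Table, columns: List[str]) -> Table:
    """Keep only the listed columns in every row."""
    return [{c: r[c] for c in columns} for r in rows]


def op_mean(rows: Table, column: str) -> Table:
    """Reduce the table to a single row holding the mean of `column`."""
    if not rows:
        raise ValueError("cannot compute the mean of an empty table")
    values = [r[column] for r in rows]
    return [{f"mean_{column}": sum(values) / len(values)}]


# Hypothetical registry: maps a node's operation name to its implementation.
# A Spark-backed platform would presumably map these names to DataFrame
# operations instead; that mapping is an assumption, not shown here.
OPERATIONS: Dict[str, Callable[..., Table]] = {
    "filter": op_filter,
    "select": op_select,
    "mean": op_mean,
}


def run_workflow(rows: Table, workflow: List[Dict[str, Any]]) -> Table:
    """Apply each workflow node in order, feeding its output to the next."""
    for node in workflow:
        operation = OPERATIONS[node["op"]]
        rows = operation(rows, **node["params"])
    return rows


# Made-up sample data, used only for illustration.
sales = [
    {"region": "east", "item": "tea", "amount": 10.0},
    {"region": "west", "item": "tea", "amount": 30.0},
    {"region": "east", "item": "soda", "amount": 20.0},
]

# The "diagram": three nodes connected in sequence.
workflow = [
    {"op": "filter", "params": {"column": "region", "equals": "east"}},
    {"op": "select", "params": {"columns": ["amount"]}},
    {"op": "mean", "params": {"column": "amount"}},
]

result = run_workflow(sales, workflow)
print(result)  # [{'mean_amount': 15.0}]
```

In this sketch the user only arranges nodes and fills in parameters; the execution engine behind `OPERATIONS` could be replaced without changing the workflow description, which is the property the abstract attributes to the platform.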

A survey and demo was held with 12 participants from 3 different backgrounds: laypeople, beginners, and experts. The results indicate a positive experience in doing data analysis without programming. Participants gave an average score of 4.4 out of 5 for how much the platform simplifies the work of data analysis. This big data analytics platform has considerable potential for beginners and professionals alike.


Keywords


big data analytics; data science workflow; Apache Spark; beginner-friendly data analytics


