Evaluating the Scalability of Big Data Frameworks


David Sanchez
Oswaldo Solarte
Victor Bucheli
Hugo Ordonez


This paper presents a method based on isoefficiency for assessing scalability in big data environments. The word count and sort programs were implemented and compared in Hadoop and Spark. The results confirm that isoefficiency grew linearly as the size of the data sets increased. The experiments confirmed that the evaluated frameworks are scalable, and a model of the form Y(s) = β X(s), where β ≈ [0.47, 0.85] < 1, was obtained. The paper discusses how scalability in big data is governed by a constant of scalability (β).
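
As an illustration of how a constant such as β could be estimated for a model of the form Y(s) = β X(s), the following minimal sketch fits a line through the origin by least squares. It is not the authors' procedure: the function name `fit_beta` is hypothetical, the abstract does not define X(s) and Y(s), so they are treated here as generic paired measurements, and the sample values are placeholders rather than results from the paper.

```python
def fit_beta(x, y):
    """Least-squares slope of a line through the origin: y ~ beta * x."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least one non-zero value")
    # Closed-form solution: beta = sum(x*y) / sum(x*x)
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


# Hypothetical placeholder values, NOT measurements from the paper.
x = [1.0, 2.0, 4.0, 8.0]
y = [0.6, 1.2, 2.4, 4.8]

beta = fit_beta(x, y)
print(f"beta = {beta:.2f}, beta < 1: {beta < 1}")
```

Under this reading, a fitted β below 1 means Y(s) grows more slowly than X(s), which matches the abstract's condition β < 1.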


Overview paper