Scalable Computing: Practice and Experience
<p>&nbsp;<strong>Welcome to SCPE</strong></p>
<p><span style="text-decoration: underline;"><em>Topics of interest</em></span>.&nbsp;The area of scalable computing has matured and reached a point where new issues and trends require a professional forum. SCPE provides this avenue by publishing original refereed papers that address the present as well as the future of parallel and distributed computing. The journal focuses on algorithm development, implementation and execution on parallel and distributed architectures, as well as on the application of parallel and distributed computing to the solution of real-life problems.</p>
<p><span style="text-decoration: underline;"><em>Electronic journal</em></span>.&nbsp;SCPE provides immediate open access to its content, following the principle that making research freely available to the public supports a greater global exchange of knowledge. We invite you to have a look at the open content of the volumes and to consider using this publication to promote your results and achievements. No publication or access fees are charged.</p>
<p><span style="text-decoration: underline;"><em>Special calls for papers</em></span><em>&nbsp;</em>. The SCPE editorial board welcomes initiatives to publish <a title="Special Issues" href="/index.php/scpe/pages/view/CallSpecialIssue">special issues</a> on topics aligned with those of SCPE.</p>
<p><span style="text-decoration: underline;"><em>History</em></span><em>&nbsp;</em>. 
The journal has evolved with the community that it represents.&nbsp;Initiated in 1998 under the name Parallel and Distributed Computing Practices (PDCP), the journal had its first 5 volumes published by Nova Science Publishers.&nbsp;The next 4 volumes were published by the Warsaw School of Social Psychology, Poland, only as an electronic journal, as an expression of the current tendencies in digital libraries.&nbsp;The electronic journal is currently published by the West University of Timisoara, Romania, with the support of a highly devoted international editorial team. More information about the changes that took place can be found in the Editorials published in <a href="/index.php/scpe/issue/view/56" target="_self">SCPE 6(1)</a> and <a href="/index.php/scpe/issue/view/79">SCPE 9(4)</a>.</p>
<p><span style="text-decoration: underline;"><em>Indexing. </em></span>The journal is indexed by several organizations (see a complete list <a title="Journal Indexing" href="/index.php/scpe/pages/view/JournalIndexing">here</a>). Current impact factors are as follows:</p>
<ul> <li class="show">in 2016,&nbsp;<a title="Journal Metrics" href="">SNIP</a>&nbsp;(Scopus Source Normalized Impact per Paper)&nbsp;was 0.581,&nbsp;<a title="SCPE's SJR" href=";tip=sid&amp;clean=0">SJR</a> (SCImago Journal Rank) was 0.211 (Q3 of quality) and the journal was ranked no. 
<a href="">88 of 191 journals (54th percentile) </a>&nbsp;in Computer Science in Scopus;</li>
<li class="show">in 2015,&nbsp;<a href="">Global impact factor</a> was 0.876 and&nbsp;<a title="ICV" href="" target="_blank" rel="noopener">ICV</a>&nbsp;(Index Copernicus Value)&nbsp;was 93.25 points out of 100;</li>
<li class="show">the current h-index computed by Publish or Perish is 24.</li> </ul>
<p>SCPE is included in the <a title="SCPE in extended list of Thompson Reuters" href=";ISSN=1895-1767" target="_blank" rel="noopener">Clarivate Analytics (formerly Thomson Reuters) Emerging Sources Citation Index</a>, and has appeared in the Web of Science collection since 2015.</p>
<p><span style="text-decoration: underline;"><em>Publicity</em></span>. The SCPE flyer is available <a href="">here</a>.</p>
<p class="signature">Editor-in-chief, Prof. Dana Petcu</p>
en-US (Dana Petcu) (Silviu Panica) Sun, 11 Mar 2018 16:11:55 +0200 OJS 60
A Solution to Image Processing with Parallel MPI I/O and Distributed NVRAM Cache
The paper presents a new approach to parallel image processing using byte-addressable, non-volatile memory (NVRAM). We show that our custom-built MPI I/O implementation of selected functions that use a distributed cache incorporating NVRAMs located in cluster nodes can be used for efficient processing of large images. We demonstrate the performance benefits of such a solution compared to a traditional implementation without NVRAM for various sizes of the buffers used to read image parts, process them, and write them back to storage. We also show that our implementation benefits from overlapping the reading of subsequent images with the processing of already loaded ones. We present results obtained in a cluster environment for three parallel implementations of blur, multipass blur and Sobel filters, for various NVRAM parameters such as latencies and bandwidth values. 
Artur Malinowski, Pawel Czarnul Sun, 11 Mar 2018 00:00:00 +0200
SCALE-EA: A Scalability Aware Performance Tuning Framework for OpenMP Applications
HPC application developers, including OpenMP-based application developers, have stepped forward to meet the future design trends of exascale machines, such as an increased number of threads/cores, heterogeneous architectures, and multiple levels of memory; they have also initiated procedures to address application-level challenges, such as data-driven scalability issues, energy consumption requirements, and data availability needs. Despite the existence of manual performance tuning solutions, users still find tuning an intricate process. This paper proposes a scalability-aware autotuning framework (SCALE-EA) that automatically identifies an efficient number of threads for OpenMP parallel regions using a Firefly Algorithm (FA) and a newly designed Modeling Assisted Firefly Algorithm (MAFA). The MAFA of SCALE-EA was implemented using two approaches: Modeling Assisted Firefly Algorithm with Random Forest Modeling support (MAFA-RFM) and Modeling Assisted Firefly Algorithm with Linear Regression Modeling support (MAFA-LRM). The modeling and prediction algorithms of the proposed MAFA of SCALE-EA were based on the execution time and the hardware performance events of code regions of OpenMP applications. Experiments were conducted on two machines, namely, a Haswell-based machine and an AMD Opteron-based 48-core machine. The experimental results of the MAFA of SCALE-EA showed energy efficiencies of 31.21 to 77.3 percent and search time efficiencies of 5.53 to 32.56 percent for candidate OpenMP applications such as CoMD, Arraybench, Taskbench, and Syncbench. 
Shajulin Benedict Sun, 11 Mar 2018 00:00:00 +0200
GPU-based Acceleration of Methods based on Clock Matching Metric for Large Scale 3D Shape Retrieval
In this paper, we exploit the potential of the GPU in order to accelerate the process of 3D shape retrieval in large databases. Indeed, the massive parallelism of the GPU offers high performance in many high-performance computing (HPC) applications. Our solution accelerates the shape matching process of methods that use a specific similarity metric called Clock Matching (CM). This CM measure is used by view-based methods as an efficient solution to compare two 3D models even if they are not presented in the same pose and orientation, by taking into account all possible poses in the matching phase. However, the increase in the number of comparisons has a strong influence on the execution time. Our challenge is to obtain the maximum benefit from GPU computing resources while taking into account the difficulty of implementing the CM metric on the GPU. Indeed, the descriptor of a given 3D object is organized using a specific data structure (hash table), where only the information whose values are not equal to zero appears in the feature vector, which makes parallelization on the GPU non-trivial. Experimental results show a reasonable benefit from the GPU approach.
Mohammed Benjelloun, El Wardani Dadi, El Mostafa Daoudi Sun, 11 Mar 2018 00:00:00 +0200
Round Robin with Load Degree: An Algorithm for Optimal Cloudlet Discovery in Mobile Cloud Computing
Mobile devices have become essential in our daily lives, but they have limited resources such as battery life, storage, and processing capacity. Offloading resource-intensive tasks to the cloud is an efficient approach to improve battery utilization, storage capacity and processing capabilities. 
Using cloud resources efficiently to process offloaded tasks, in order to improve response time and reduce both task waiting time and latency, is one of the main goals in mobile cloud computing (MCC). In order to improve user satisfaction and the performance of mobile applications, a cloudlet framework concept has been developed to reduce latency problems, which improves response time. The cloudlet brings the cloud closer to the user to perform a computational task. This article proposes a new load balancing model among cloudlets in a mobile cloud computing environment to find the required resources and create an impact on performance. The efficient load balancing model makes mobile cloud computing more attractive and improves user satisfaction. This paper introduces a Round Robin with Load Degree algorithm for public cloudlets in mobile cloud computing, using a switch mechanism to choose different approaches for different situations. This algorithm uses a game-theory-based load balancing approach to improve application response time in public mobile cloud environments.
Ramasubbareddy Somula, Sasikala R Sun, 11 Mar 2018 00:00:00 +0200
AutoAdmin: Automatic and Dynamic Resource Reservation Admission Control in Hadoop YARN Clusters
Hadoop YARN is an Apache Software Foundation open project that provides a resource management framework for large-scale parallel data processing, such as MapReduce jobs. The Fair scheduler is a dispatcher that has been widely used in YARN to assign resources fairly and equally to applications. However, the Fair scheduler has a problem when the resources requested by applications exceed the amount that the cluster can provide. In such a case, the YARN system will be halted if all resources are occupied by ApplicationMasters, a special task of each job that negotiates resources for processing tasks and coordinates job execution. 
To solve this problem, we propose an automatic and dynamic admission control mechanism to prevent the halting situation that occurs when the requested amount of resources exceeds the cluster resource capacity, and to dynamically reserve resources for processing tasks in order to obtain good performance, e.g., reducing the makespans of MapReduce jobs. After collecting resource usage information from each worker node, our mechanism dynamically predicts the amount of reserved resources for processing tasks and automatically controls running jobs based on the prediction. We implement the new mechanism in Hadoop YARN and evaluate it with representative MapReduce benchmarks. The experimental results show the effectiveness and robustness of this mechanism under both homogeneous and heterogeneous workloads.
Zhengyu Yang, Janki Bhimani, Yi Yao, Cho-Hsien Lin, Jiayin Wang, Ningfang Mi, Bo Sheng Sun, 11 Mar 2018 00:00:00 +0200
An Optimized Density-based Algorithm for Anomaly Detection in High Dimensional Datasets
In this study, the authors propose an optimized density-based algorithm for anomaly detection with a focus on high-dimensional datasets. The optimization is achieved by tuning the input parameters of the algorithm using the firefly meta-heuristic. The performance of different similarity measures for the algorithm, including both L1 and L2 norms, is compared to identify the most efficient similarity measure for high-dimensional datasets. The algorithm is optimized further in terms of speed and scalability by using the Apache Spark big data platform. The experiments were conducted on publicly available datasets, and the results were evaluated on various performance metrics such as execution time, accuracy, sensitivity, and specificity.
Adeel Shiraz Hashmi, Mohammad Najmud Doja, Tanvir Ahmad Sun, 11 Mar 2018 00:00:00 +0200