Parallel and Distributed Computing Techniques, Selection of papers from ISPDC 2008


Marek Tudruj


Dear SCPE Reader,

We present a selection of papers that extend work presented at the 7th International Symposium on Parallel and Distributed Computing, 1–5 July 2008, in Krakow, Poland. The motivation for publishing the selection in the SCPE Journal was, on the one hand, to convey the flavour of the research reported at the conference and, on the other hand, to present some of the most relevant topics currently at the focus of research on parallel and distributed computing in general. The selection contains only 6 papers out of about 60 presented at the conference, and is thus far from covering all relevant topics represented at ISPDC 2008. This is because not all of the invited authors were patient enough to accept a fairly long publishing process. Nevertheless, we hope that the presented papers will bring you closer to the research covered by the ISPDC conferences and will encourage you to participate in future ISPDC editions.

The first paper ``The Impact of Workload Variability on Load Balancing Algorithms'' is by Marta Beltrán and Antonio Guzmán from King Juan Carlos University in Spain. It concerns an important topic of load balancing in cluster systems, namely the adaptivity of load balancing algorithms to changes of the workload in the system. Accounting adequately for additional load in the hosting system is essential for achieving the intended optimization effects. The paper presents a thorough formal analysis of workload variability metrics and their influence on the quality of load balancing algorithms. Four basic activities appearing in load balancing algorithms are identified, and based on them some algorithmic solutions are proposed to deal correctly with workload variability in system load balancing. The robustness of dynamic load balancing algorithms is also discussed. Two robustness metrics, each sensitive to the type of optimization applied (local, task-oriented, or global), make it possible to select remote task execution or task migration as the load balancing operation. The proposed approach is illustrated with experiments.

The second paper ``Model-Driven Engineering and Formal Validation of High-Performance Embedded Systems'' is by Abdoulaye Gamatié, Éric Rutten, Huafeng Yu, Pierre Boulet and Jean-Luc Dekeyser from University of Lille and INRIA in France. The paper is concerned with an advanced methodology for designing correct parallel embedded systems for intensive data-parallel computing. In their previous research, the authors designed the GASPARD embedded system design framework, which follows the hardware/software co-design approach through model-driven engineering. The framework relies on a UML-like model specification language in which hardware and software elements are modelled using a component approach with special mechanisms for repetitive structures. This paper combines the modelling framework of GASPARD with the mechanisms of synchronous languages in order to obtain the design verifiability available for such languages. The paper shows how GASPARD models can be translated into synchronous models based on data flow equations so that their correctness can be formally checked. The proposed approach is illustrated with an example of a video processing system.

The third paper ``Relations Between Several Parallel Computational Models'' is by Stephan Bruda and Yuanqiao Zhang from Bishop’s University in Canada. The paper is concerned with theoretical aspects of shared memory systems described by the parallel random access machine (PRAM) model and aims at studying performance properties of different types of PRAM systems. The attention is focussed on analysing the computational power of two more sophisticated PRAM models (Combining CRCW and Broadcast Selective Reduction), which include data reduction in the case of concurrent writes. The paper shows that these two models have equivalent computational power, which is a new result compared with the existing literature. The performance of both models applied to reconfigurable multiple bus machines was studied as a possible architectural solution for current VLSI processor implementations. It was shown that in such systems, under reasonable assumptions, concurrent write does not enhance performance compared with the exclusive-write model. Another result important for VLSI technology is that the Combining CRCW PRAM model (in which the data of concurrent writes are arithmetically or logically combined before the write) and the exclusive-write model on directed reconfigurable buses perform equivalently under strong real-time requirements.

The fourth paper ``Experiences with Mesh-Like Computations Using Prediction Binary Trees'' is by Gennaro Cordasco, Biagio Cosenza, Rosario de Chiara, Ugo Erra and Vittorio Scarano from the University ``degli Studi'' of Salerno and the University ``degli Studi della Basilicata'' of Potenza in Italy. The paper concerns optimization methods for mesh-like computations in clusters of processors. The computations are performed assuming phase-like program execution control and a tiling approach, which reduces inter-processor communication. Temporal coherence is also assumed, which means that tasks of a given size have similar execution times in consecutive phases. Temporally coherent computations are structured in a Prediction Binary Tree, in which leaves represent computing tiles to be mapped to processors. A phase-by-phase semi-static load balancing is introduced into the scheduling algorithm. The scheduling algorithm is equipped with a predictor, which estimates the computation time of the tiles in the next phase based on previous execution times and modifies the tiles to achieve balanced execution in phases. For this, two heuristics are used to exploit data locality in processors. The proposed approach is illustrated by the example of interactive rendering with a Parallel Ray Tracing algorithm.

The fifth paper ``The Influence of the IBM pSeries Servers Virtualization Mechanism on Dynamic Resource Allocation in AIX 5L'' is by Maciej Mlynski from ASpartner Limited in Poland. The paper concerns the very topical problem of system virtualization and presents the results of research carried out on IBM pSeries servers. IBM is strongly developing the virtualization technique, especially on IBM pSeries servers, enabling improved and flexible sharing of system resources between applications. The paper investigates novel facilities for dynamic resource management such as micro-partitioning and the partition load manager. They enable dynamic creation of workload logical partitions of system resources and their dynamic management. This management includes run-time resource reallocation between logical partitions, including the setting of sharing specifications, as well as adding, removing and setting the parameters of resources in the system at run time. It remains an open question how to properly tune the parameters of the operating system using the provided virtualization facilities to obtain the best efficiency for a given application program. The paper presents the results of experiments which study how tuning the disk subsystem parameters under the IBM AIX 5L operating system, with the use of the provided virtualization facilities, affects the resulting application execution performance. The results show that even a small deterioration in the resource pool status requires an immediate adaptation of the operating system parameters to maintain the required performance.

The sixth paper ``HeteroPBLAS: A Set of Parallel Basic Linear Algebra Subprograms Optimized for Heterogeneous Computational Clusters'' is by Ravi Reddy, Alexey Lastovetsky and Pedro Alonso from University College Dublin in Ireland and Polytechnic University of Valencia in Spain. The paper concerns the methodology for parallelizing linear algebra computations for execution in heterogeneous cluster environments. The design of the HeteroPBLAS library (Parallel Basic Linear Algebra Subprograms) for heterogeneous computational clusters is presented. The main contribution of the paper is the automation of the parallelization and optimization of the PBLAS, which is done by means of a special user interface and the underlying set of functions. An important element here is a performance model based on program code instrumentation, which determines the parameters of the application and of the executing heterogeneous platform that are relevant for the execution performance of parallel code. The parameter values specified for, or returned by, the execution of the performance model functions are then used for the generation and optimal mapping of the parallel code of the library subroutines. The proposed approach is illustrated by experimental results from executing optimized HeteroPBLAS programs on homogeneous and heterogeneous computing clusters.

Marek Tudruj


Introduction to the Special Issue