On Processing Extreme Data

Dana Petcu; Gabriel Iuhasz; Daniel Pop; Domenico Talia; Jesus Carretero; Radu Prodan; Thomas Fahringer; Ivan Grasso; Ramon Doallo; Maria J. Martin; Basilio B. Fraguela; Roman Trobec; Matjaz Depolli; Francisco Almeida Rodriguez; Francisco de Sande; Georges Da Costa; Jean-Marc Pierson; Stergios Anastasiadis; Aristides Bartzokas; Christos Lolis; Pedro Goncalves; Fabrice Brito; Nick Brown

doi:10.12694/scpe.v16i4.1134

Published: Jan 30, 2016

Abstract

Extreme Data is an incarnation of the Big Data concept, distinguished by massive amounts of data that must be queried, communicated and analyzed in near real time using a very large number of memory or storage elements and exascale computing systems. Immediate examples are scientific data produced at a rate of hundreds of gigabits per second that must be stored, filtered and analyzed; millions of images per day that must be analyzed in parallel; and one billion social data posts that must be queried in real time on an in-memory components database. Today's traditional disks and commercial storage cannot handle the extreme scale of such application data. Given the need to improve current concepts and technologies, this paper focuses on the needs of data-intensive applications running on systems composed of up to millions of computing elements (exascale systems). We propose a methodology to advance the state of the art. The starting point is the definition of new programming paradigms, APIs, runtime tools and methodologies for expressing data-intensive tasks on exascale systems. This will pave the way for exploiting massive parallelism over a simplified model of the system architecture, thus promoting high performance and efficiency and offering powerful operations and mechanisms for processing extreme data sources at high speed and/or in real time.

Issue

Vol. 16 No. 4 (2015)

Section

Overview Papers
