What's in a name
Abstract
Over the last few years I have been a member of the IEEE Task Force on Cluster Computing (TFCC). I am not a very active member, as many other things keep me from getting too drawn in, but I read the material posted on the TFCC mailing list and get involved in conferences devoted to cluster computing. As time passes, I ask myself why people get so passionate about cluster computing. Do not get me wrong, I have nothing against cluster computing as such. I believe that this approach to parallel computing has a lot of advantages, price/performance ratio being one of them. What bothers me are the claims that cluster computing is so different from parallel and/or distributed computing that it should be a category in itself, on a par with them. I just cannot see this being the case, and this is what led to this Editorial.
Let us start from a simple observation: we do not have an IEEE Computer Society Technical Committee (TC) on Sequential Computing, or conferences devoted to Sequential Computing. We never had them, and the emergence of parallel/distributed computing did not result in their creation either. One could claim that this is because sequential computing was always there and the research questions concentrated on what we can do with it (applications). Thus we have, for instance, TCs on Computer Languages and Real-Time Systems, as they are independent of how the computing is done. In other words, sequential computing is just a backdrop against which the research questions are asked. Obviously, there exists a TC on Computer Architecture, but its interest encompasses sequential as well as parallel computing. The conclusion that can be drawn is that the mode of computing is just the context for our research. Obviously, the emergence of parallel computing has resulted in a large number of research questions, and I will return to this point in a moment. Let us first look into the divided world of parallel and distributed computing.
In the IEEE Computer Society we have Technical Committees on Parallel Processing, Distributed Processing, Supercomputing Applications and Internet, and a Task Force on Cluster Computing. While the TC on Supercomputing Applications has a web site that was last updated in 1997 (sic!), the other committees are rather active. In addition to these, a serious research effort is devoted to Grid Computing. While there obviously seem to be some natural differences between these areas, there is also a substantial overlap. And, in the final analysis, I see all of them as just different ways of talking about the same thing: applying multiple computational units to solve problems. On the high end we may talk about the SETI@home distributed application, which runs, on average, at 20 Tflops, or about parallel computations performed at about 10 Tflops on the ASCI Pacific White supercomputer. We may also be talking about clusters of all shapes and sizes (and this includes clustering for fail-over used, for instance, in transaction processing). Finally, we may talk about 4-processor servers and dual-processor workstations. Let us look into the latter case. A few years back, when a friend of mine in the Department of Mathematics at the USM purchased a new dual-processor workstation, he paid more than $10K. Today a dual-processor Xeon computer with 1 Gbyte of memory will cost about $5K. We have reached the point where, for those of us who need some additional power, buying a machine with a second processor and twice as much memory is only slightly more expensive than buying a single-processor machine. But there is more to this story. These two processors can be effectively utilized. Software technology that, about 20 years ago, was available only on supercomputers is now incorporated in standard software. For instance, both Linux and Win2000 can utilize shared-memory dual- and quad-processor machines relatively effectively.
I am willing to extrapolate from this that in the near future SMP multiprocessors and/or connected computers will provide the infrastructure of computing. Multiprocessing will become the same context for research that single-processor computing was in the past. Thus we can say (knowing that this is an oversimplification) that a sequential machine is a multiprocessor with one node; an SMP is a multiprocessor with an extremely fast connection; a cluster is a multiprocessor with whatever characteristics the people in the cluster community want it to have; and a network/internet/grid is a multiprocessor of extremely loosely connected (single- or multi-processor) machines.
Let us now come back to the issue of research questions resulting from the emergence of parallel and distributed computing. I would like to suggest that these are really not questions about parallel and distributed computing, but rather questions about software engineering techniques, operating systems or programming languages for modern multiprocessor systems. In view of this I tend to believe that parallel and distributed computing represents a semantic mix-up, the same way that a vegetable does. As we know, there is no such thing as a vegetable: the various plants and fruits that we group under the common name vegetables belong to separate biological species. In the same way, there is no parallel and distributed computing. Rather, there are compiling techniques for SMP systems, operating systems for cluster architectures, or distributed databases. Assuming that I am right, maybe it is time to stop fighting turf wars, time to fold all parallel and distributed TCs and TFs, and do what is really important? For instance, instead of trying to define a cluster in a way that will make it different and unique, maybe we should concentrate on developing tools that will help us use the existing and future multiprocessor-based computational environments. If for no other reason, then just because this is what the users expect from us.
Marcin Paprzycki
Oklahoma State University