The power and promise of parallel computing

by I. Wladawsky-Berger

The use of parallel computing is reaching into both the technical and the commercial computing environments. The success of IBM in this broad area is based on the introduction of flexible and general-purpose families of scalable processors. This essay highlights the history of one of those families that includes the IBM scalable POWERparallel™ SP1™ and SP2™ systems. I also describe how these systems were rapidly developed and moved to market, and illustrate how some customers are using them in their businesses.

In little over a decade, we have progressed from a rather primitive understanding of how to lash processors together to a deep appreciation of today's agile architectures. We have watched the industry embrace parallel computers as the preferred architecture for high-performance computing, and at IBM we have delivered two large-system parallel families, the System/390* enterprise servers and the SP family of Scalable POWERparallel Systems*.

We are also seeing the application of parallel computing broaden beyond the supercomputing centers into the mainstream of business. And we can look into the future and anticipate that the power and affordability of parallel computing will generate revolutionary new applications.

This essay highlights the history of the SP1* and SP2* POWERparallel* machines, briefly discusses the approach used to quickly create and deliver these machines, and illustrates how some customers are using them in their businesses.

Parallel processing at IBM

IBM's program in parallel computing squarely aims at moving parallel processing into the mainstream of science and business. IBM's System/390 Division and the POWER Parallel Division have introduced two lines of powerful yet affordable parallel computers. Both are built with the kind of inexpensive CMOS (complementary metal oxide semiconductor) technology used in the microprocessors that run personal computers and workstations. And both essentially redefine the world of large systems computing.

One family is the System/390 line of parallel servers, first introduced in April 1994. The System/390 servers run enhanced versions of Multiple Virtual Storage (MVS) and our other mainframe operating systems. The CMOS microprocessors that run these machines are inscribed with a System/390 "personality" that allows users to run their existing mainframe applications (in which IBM customers have invested $1 trillion over the years) without recoding.

Aimed primarily at our traditional mainframe customers, these new machines allow users to gain the performance advantages of parallel computing and participate in open, client/server networks.

The second family of IBM parallel products is again built with CMOS chips, but with the "personality" of the chips found in the popular RISC System/6000* line of workstations. These scalable POWERparallel (SP) machines use the same AIX* operating system that powers the RISC System/6000 workstations, and they run almost all of the 10,000 or so applications written for those machines.

©Copyright 1995 by International Business Machines Corporation. Copying in printed form for private use is permitted without payment of royalty provided that (1) each reproduction is done without alteration and (2) the Journal reference and IBM copyright notice are included on the first page. The title and abstract, but no other portions, of this paper may be copied or distributed royalty free without further permission by computer-based and other information-service systems. Permission to republish any other portion of this paper must be obtained from the Editor.

Together, IBM's two families of parallel computers offer an affordable way to scale to nearly unlimited computing power. And, regardless of the customer's choice of operating system, they offer access to open, distributed computing and a wide variety of applications.

A brief yet distinguished history

With this issue of the IBM Systems Journal devoted to the details of the POWERparallel SP machines, it seems a good time to reflect on the principles guiding our march into the marketplace, and on a little bit of the history that brought us to where we are today.

We made the decision to launch what became our POWERparallel business in the fall of 1991 because, at that point, a number of factors led us to believe the time was right.

First, the RISC System/6000 workstation, which had been announced the year before, was already becoming a big success. In a very short time, the RISC System/6000 workstation established itself as a superb, attractively priced system with the ability to run many jobs that previously had required a much more expensive supercomputer.

People inside and outside IBM were connecting their RISC System/6000 workstations together with local area networks into "clusters" and using them in place of much more expensive vector supercomputers. Looking at future plans for the RISC System/6000 microprocessors, it was clear that the micros, already very powerful, would over time approach the performance of the fastest vector machines, at a fraction of the cost. We became convinced that, before long, the entire area of scientific and technical computing would move in the direction of RISC-based parallel machines. And the success of the RISC System/6000 family gave us a strong marketing base to build on.

Second, investigations into parallelism conducted by IBM's Research Division were advancing rapidly. Our scientists had begun projects in parallel computing in the early 1980s: the Yorktown Simulation Engine (YSE),1 the Engineering Verification Engine (EVE),2 RP3,3 GF11,4 and Vulcan.5 By 1991 they had developed lots of software, plus very powerful switch and interconnect mechanisms to allow microprocessors to work efficiently together. These technologies would form the cornerstone of our new products.6,7

And, third, United States federal government support of parallelism was advancing through the High Performance Computing and Communications Initiative, which aimed to advance the adoption of highly parallel supercomputers, high-bandwidth networks, and other advanced technologies. We were strongly encouraged to participate, both by IBM groups and by potential partners, including the Cornell Theory Center and Argonne National Laboratory.

In December 1991, we successfully made the case to establish our new supercomputing organization. As our first official action, we established our development facility, which we called the Highly Parallel Supercomputing Systems Laboratory. A group of experts from around IBM was quickly assembled to recommend our product and market directions. After several weeks of hard work, the task force members, representing research, development, and marketing organizations, set the direction we are still following today. As a group we made four important recommendations.

First, because technology and products change so rapidly, time-to-market was critical. We decided that whatever products we developed should be made available to the marketplace in the shortest possible time, in no more than 12 to 18 months.

Second, we recommended that our parallel products should be positioned as an integral part of the RISC System/6000 family. We should use as many common components as possible, including the microprocessors, the AIX operating system, and others.

Third, we felt our parallel computer should be based on the architecture and other technologies from our Research Division's Vulcan project. We adopted the project's distributed memory architecture, which was highly flexible and elegant in its simplicity, and its high-performance interconnect mechanisms.

Finally, we concluded that our AIX-based parallel system should be a general-purpose machine. And, although we would initially aim the system at scientific and technical applications, it should evolve over time to support a wider variety of applications.

These recommendations became the principles that guided, and still guide, our parallel development. And they form the basis of a very effective strategy for running our business.

Speeding technology to market

Time-to-market guided every decision we made. It was clear that we just did not have time to design every component. Consequently, we drew on the resources of the IBM Research Division, the product division producing the RISC System/6000, and the overall technical community inside and outside IBM.

In February 1992, we set out a significant challenge to our development team: we asked them to bring our first product to market in no more than 18 months. In February 1993, exactly one year after we put the team together, we announced our first product, the SP1. Then, by April of that year we shipped the first SP1s to our development partners, the Cornell Theory Center and Argonne National Laboratory. We began volume shipments of the SP1 in September 1993.

These first, and subsequent, development partnerships have proven invaluable. They advise us on what works and what does not, on which applications are essential, and which are not. These partnerships are with many of the nation's preeminent research centers, including the Cornell Theory Center, Argonne National Laboratory, Fermilab, the Department of Defense (DoD) Supercomputing Center at Maui, and most recently, the National Aeronautics and Space Administration (NASA) laboratories at Ames and at Langley. These partnerships also include the CERN high-energy physics laboratory in Switzerland and the National Cancer Center in Japan. They have all installed our powerful supercomputers, and they are giving back to us information that has proved invaluable in serving the scientific and technical markets.

And the breathtaking pace continued. We announced our follow-on SP2 machine in April 1994, began shipping it in July, and last October announced further enhancements to the hardware and software. By the end of March 1995, we had installed more than 350 machines in customer locations around the globe, including universities, industries, and businesses.

Part of the RISC System/6000 family

UNIX** workstations and servers had been quite distinct from parallel systems. There was no compatibility between the two, and there was a very large price gap. By positioning our SP1 machines as an integral part of the RISC System/6000 family, and by designing them to run all existing RISC System/6000 applications, we "bridged the gulf" between workstations and parallel systems.

Now customers could invest in one architecture and one set of applications for workstations, servers, distributed cluster systems, and parallel systems. And they could invest in an SP1 parallel server with just a few processors (at a very attractive entry price) and later scale up their computing operations by adding many more processors to the parallel server. Essentially, we created a new segment within the global information technology market: scalable parallel computing.

Our parallel machines are part of IBM's strategy to offer a "palmtops to teraflops" family, ranging from the smallest hand-held computer to massively parallel systems with hundreds of processors. All those machines use our RISC-based POWER Architecture* and PowerPC Architecture* microprocessors.

Using the distributed memory architecture

The distributed memory architecture we use in our machines seems almost elegant in its simplicity and flexibility, but choosing that architecture was anything but simple. In fact, selecting the architecture was the most difficult decision we made.

Parallel architecture has been a subject of great interest in the computer science community for many years now. Many different kinds of parallel architectures, switches, programming models, algorithms, and the like have been investigated in universities and research laboratories around the world. Not surprisingly, different individuals and groups had widely different, and strongly held, opinions on how to best develop a parallel computer product.

For quite a while, the big debate was whether to build a single-instruction, multiple-data (SIMD) stream or multiple-instruction, multiple-data (MIMD) stream computer, two widely different approaches. SIMD is only applicable to certain classes of applications, but for those applications, it is easier to program. The MIMD approach applies to a very large number of applications, but it is more complex to program. The debates raged for several years, until the early 1990s, when the SIMD approach faded and MIMD was accepted as the direction to go.
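To make the distinction concrete, here is a minimal sketch of my own (not from the original article) of the two programming styles, written in C. The simd_style function stands in for a machine where one instruction stream drives many data elements in lockstep; mimd_style stands in for processors that each run their own instruction stream, keyed here by a hypothetical rank identifier.

```c
#include <stdio.h>

#define N 8

/* SIMD style: a single instruction stream applies the same
   operation to every data element. On a true SIMD machine the
   loop body would run on N processing elements at once; the
   loop here merely stands in for that lockstep execution. */
static void simd_style(float *a, const float *b, const float *c)
{
    for (int i = 0; i < N; i++)
        a[i] = b[i] + c[i];          /* one operation, many data */
}

/* MIMD style: each processor executes its own instruction
   stream and may branch independently, which is what makes the
   model general but harder to program. */
static void mimd_style(int rank, float *a, const float *b, const float *c)
{
    if (rank % 2 == 0)
        a[rank] = b[rank] + c[rank]; /* even ranks add */
    else
        a[rank] = b[rank] * c[rank]; /* odd ranks multiply */
}

int main(void)
{
    float a[N] = {0}, b[N], c[N];
    for (int i = 0; i < N; i++) { b[i] = (float)i; c[i] = 2.0f; }

    simd_style(a, b, c);
    printf("SIMD-style a[3] = %g\n", a[3]);

    /* Emulate N independent MIMD processors sequentially. */
    for (int rank = 0; rank < N; rank++)
        mimd_style(rank, a, b, c);
    printf("MIMD-style a[3] = %g\n", a[3]);
    return 0;
}
```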

Another major debate, one that still rages, is whether to use a distributed memory approach, a shared-memory approach, or one of several variations in between. Shared-memory scalable architectures make the programming easier, but severely limit the scalability of the system. Distributed memory architectures make scaling up the system much easier (since nothing beyond the switch is shared among all the processors), but the programming is harder. That is because software from shared-memory systems, like uniprocessors or symmetric multiprocessors, has to be rewritten or restructured. Sometimes this restructuring is easy, but sometimes it is very difficult. Many commercial applications written for client/server configurations (including those applications written for the RISC System/6000 servers) work well in distributed memory parallel systems, since they have already been broken up to work across distributed processors.
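The following sketch, again my own rather than the article's, shows what that restructuring looks like for the simplest possible kernel, a global sum. The shared-memory version assumes every processor can read the whole array directly. The distributed-memory version, written with the standard MPI message-passing interface for illustration, gives each process its own slice and combines the partial results through explicit communication, which is exactly the code that has to be added when porting shared-memory software.

```c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N 1000000                 /* illustrative problem size */

/* Shared-memory style: one address space, any processor may
   touch any element, so a sum is just a loop over the array. */
static double sum_array(const double *a, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Distributed-memory restructuring: each process owns only
       its slice of the data; nothing but the switch is shared. */
    int local_n = N / nprocs;
    double *local = malloc(local_n * sizeof *local);
    for (int i = 0; i < local_n; i++)
        local[i] = 1.0;           /* stand-in for real data */

    double partial = sum_array(local, local_n);

    /* The step shared-memory code never needed: partial results
       are combined by explicit communication across processes. */
    double total = 0.0;
    MPI_Reduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM,
               0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum = %f (expected %d)\n", total, local_n * nprocs);

    free(local);
    MPI_Finalize();
    return 0;
}
```

Run with, for example, mpicc and mpirun -np 4; the answer matches the serial loop, but each node's data never leaves its private memory except through the reduction.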

Inside IBM we have pursued a variety of parallel computing research projects in a number of laboratories. Scientists at the Research Division's Thomas J. Watson Research Center organized a first-rate effort in parallel computing, and over the years had explored most major architectures in projects like RP3, GF11, and Vulcan. Most of the Research Division experts strongly recommended that we use the distributed memory architecture of the Vulcan project as we built our scalable parallel family of products. Because the Vulcan project scientists had designed and started development of a number of hardware and software components, adopting the Vulcan approach would help us immensely in our time-to-market objective. This was an added bonus.

In the end, we adopted the Vulcan distributed memory architecture as well as a number of its hardware and software technologies for our SP product. While other architectures offer advantages in one area or another, overall, the SP distributed memory architecture has served us very well, for two main reasons: its simplicity and flexibility.

The simple, elegant architecture of our SP system made it possible for us to bring our first product to market in record time. And it has allowed us to keep this intense focus on time-to-market ever since, by making it possible for us to quickly incorporate the breathtaking technology advances going on in the industry. For example, we are able to introduce the latest microprocessors into our products within just a few months of their first appearance. And, by being relatively simple, our architecture helps us considerably in being able to offer very attractive price and performance.

In addition, the distributed memory architecture has allowed us to build a very flexible system, easily adapted to a variety of applications and configurations. This flexibility is very important. The only certainty about the future of the computer industry is that things will keep changing at a rapid pace, and our products will be called upon to do things we had not anticipated, or even dreamed of.

A system for every purpose

From the very beginning, one of our key objectives was to gain as much market presence as possible. We wanted our systems to be used in the research environments where so much experimentation was going on, but we also wanted them to quickly move to the production environments necessary to achieve wide acceptance and business success. We wanted to support as many parallel applications as possible, including the very large "grand challenge" applications: problems of national importance so complex that they command the computing capacity of hundreds of processors. But we also wanted to be able to support the mixed workloads found in most computing centers, even in most supercomputing centers, consisting of interactive users and batch jobs, scalar and parallel applications.

We therefore decided to build a general-purpose, scalable AIX system, capable of handling a variety of workloads and applications. Even for our initial scientific and technical customers, we knew that we had to satisfy a wide variety of requirements for the SP family to become a very useful, and thus successful, supercomputer system. Therefore, while we have invested considerable effort in tools and features specifically aimed at parallel applications, we have also invested a lot in systems management, workload managers, file systems, and other features needed to build a successful, production-worthy, general-purpose computer.

Our general-purpose base served us very well when we decided to expand beyond technical applications and support a variety of data-intensive applications, as well as commercial applications of all sorts. To do this, we made sure that we significantly enhanced the system's input/output capabilities; we expanded the capability to include relational databases and transaction monitors; and we worked to enable a variety of applications of interest to commercial customers. In addition, we continued to enhance the scientific and technical capabilities of the SP machines, and also the general-interest functions like systems management.

Today our SP2 machines are working around the globe, solving technical and commercial problems. We are already seeing evidence that parallel computing is revolutionizing the way our customers use their computers, primarily because the affordability of our machines now makes it practical to apply large processing power to all kinds of problems.


For example, at Dow Chemical Co. in Midland, Michigan, chemists are using a 24-processor SP2 to determine whether a chemical compound they create in the lab can "dock" with disease agents like enzymes, viruses, and bacteria to stop their harmful action. The calculations involved are complex, partly because the interaction of these compounds is not governed by strict laws of nature.

At Western Geophysical in the United Kingdom, explorationists are using their SP machine to create three-dimensional models of underground oil reservoirs by analyzing billions of pieces of data. Those data are collected in seismic tests, a technique in which sound waves are bounced off the underground contours of the earth. Computer models are used to predict where oil is likely to lie hidden before incurring the expense of drilling a well.

Hyundai Electronics Industries Co., Ltd. in Korea uses its SP machine to perform crash simulation and analysis, and several American automakers have purchased SP machines for the same purpose. The automobile manufacturers can now predict the crashworthiness of new designs while the cars are still on the drawing board, saving time and money.

And John Alden Life Insurance Company in Miami is using its SP machine to find the hidden patterns in mounds of health care data. It then advises its clients about which health care plans offer the most reasonable rates, and which hospitals discharge patients in the most reasonable period of time.

Now we are expanding our partnerships with commercial customers, companies like Citibank (a subsidiary of Citicorp), Revlon, Inc., and Morgan Stanley Group Inc., to gain information about what works and what does not. In conducting their own businesses, these companies are using our SP systems as information servers that will help them attain a real competitive advantage.

Summary

IBM is firmly committed to parallel computing, both with our SP computers based on the POWER Architecture and the AIX operating system, and with the companion program that is reworking traditional mainframes into a new line of parallel System/390 computers that run the MVS operating system, while protecting the huge investment in System/390 applications the world has generated during the last 25 years.

As we look into the future, we see whole new classes of applications for our scalable parallel systems. Multimedia and video-on-demand will leverage the vast input/output and storage potential of the SP machines. Networking applications of all sorts will reach out to provide information and computing to vast numbers of people around the world. Object-oriented technologies will help us develop and modify complex applications with huge gains in productivity. Virtual reality and other advanced user interfaces will help us create whole new classes of applications and will help us reach far more people than ever before.

As we turn the promise of parallel computing into a reality, it will indeed be a "giant leap" that will benefit us all.

*Trademark or registered trademark of International Business Machines Corporation.

"Trademark or registered trademark of XlOpen Co. Ltd.

Cited references

1. G. F. Pfister, "The IBM Yorktown Simulation Engine," Proceedings of the IEEE 74 (1986), pp. 850-860.

2. D. K. Beece, G. P. Papp, and F. Villante, "The IBM Engineering Verification Engine," Proceedings of the 25th Design Automation Conference (June 1988).

3. G. F. Pfister et al., "An Introduction to RP3," Experimental Parallel Computing Architectures, J. J. Dongarra, Editor, North-Holland Publishers, Amsterdam, Netherlands (1987), pp. 123-140.

4. J. Beetem, M. Denneau, and D. Weingarten, "GF11 - A Supercomputer for Scientific Applications," Proceedings of the 12th International Symposium on Computer Architecture, Boston, IEEE Computer Society (1985).

5. C. B. Stunkel, D. G. Shea, B. Abali, M. G. Atkins, C. A. Bender, D. G. Grice, P. Hochschild, D. J. Joseph, B. J. Nathanson, R. A. Swetz, R. F. Stucke, M. Tsao, and P. R. Varker, "The SP2 High-Performance Switch," IBM Systems Journal 34, No. 2, 185-204 (1995, this issue).

6. T. Agerwala, J. L. Martin, J. H. Mirza, D. C. Sadler, D. M. Dias, and M. Snir, "SP2 System Architecture," IBM Systems Journal 34, No. 2, 152-184 (1995, this issue).

7. M. Snir, P. Hochschild, D. D. Frye, and K. J. Gildea, "The Communication Software and Parallel Environment of the IBM SP2," IBM Systems Journal 34, No. 2, 205-221 (1995, this issue).

8. P. F. Corbett, D. G. Feitelson, J.-P. Prost, G. S. Almasi, S. J. Baylor, A. S. Bolmarcich, Y. Hsu, J. Satran, M. Snir, R. Colao, B. D. Herr, J. Kavaky, T. R. Morgan, and A. Zlotek, "Parallel File Systems for the IBM SP Computers," IBM Systems Journal 34, No. 2, 222-248 (1995, this issue).


Accepted for publication November 8, 1994.

Irving Wladawsky-Berger IBM Corporation, Route 100, P.O. Box 100, Somers, New York 10589. Dr. Wladawsky-Berger is currently the General Manager of the IBM POWER Parallel Division. He joined IBM in 1970 as a research staff member at the Thomas J. Watson Research Center. During 15 years at the Center, he served in a variety of key assignments, including that of Vice-President of Systems with overall responsibility for computer science research at IBM. One of his main accomplishments in this position was the establishment of major technology transfer programs, in a number of system areas, between IBM's Research Division and the company's product laboratories. In 1985 Dr. Wladawsky-Berger joined IBM's Large Systems Product Division. He held a number of senior management positions and was responsible for launching IBM's System/370-based supercomputing efforts, as well as for directing the company's System/370 operating systems software business. He was deeply involved in initiating IBM's System/390 microprocessor and parallel technology efforts, and in helping to lead the transition of the company's mainframe business in this new direction. In 1992 he helped organize and launch IBM's RISC-based parallel processing business for scientific and commercial applications, and was named General Manager of IBM's POWERparallel Systems business unit. In recognition of the enterprise's rapid growth, IBM commissioned the business unit a formal division in January 1995, where he continues today. Dr. Wladawsky-Berger holds his master's and doctoral degrees from the University of Chicago. He is a member of the Fermilab Board of Overseers and has served on the Computer Sciences and Telecommunications Board of the National Research Council.

Reprint Order No. G321-5562.
