Stanford Research Communication Program

Designing Large and Efficient Computer Systems With Little Money

Håkan Zeffer
Computer Science
Uppsala University
October 2003

Combining smaller computers into one large system capable of bigger computations is very expensive. Together with the other members of my research team, the computer architecture group at Uppsala University, I am working to construct such large systems for less money. We use a cheap software-based solution that avoids building complex and expensive custom hardware. The systems we design are used when scientists and companies need to perform large computations, such as predicting the weather or simulating car crashes.

Multiprocessors and supercomputers are designed to solve large problems quickly and efficiently. However, problems arise when multiple computers are put together to form such a system. One is the enormous cost of verification, that is, checking that the hardware actually works properly. This cost is especially high if the parts that make the different computers work together are specially designed hardware, such as custom network cards, because the verification process for such hardware is extremely complex. We have taken another direction. Instead of specially designed hardware, we use software and an ordinary network to glue together the different parts of our large-scale system. This makes the system much less expensive and also more dynamic. With our software approach we can change the system even after it has been designed, something that is impossible with the classic custom-hardware design.

My part of the project is to investigate how the network that connects the different computers in the system can be used as efficiently as possible. It is sometimes best to move data from one computer to another, for example when the second computer is the only one that uses it. Optimizations such as this one reduce the amount of communication needed in the system, which is good since communication takes time and can overload the network in a way similar to a traffic jam. Other optimizations I work with involve the distribution of results. When many computers in a system need quick access to a newly computed result, it can be valuable to distribute that result to all computers in the system as fast as possible. However, distributing results introduces an enormous amount of unnecessary network traffic if the other computers in the system do not need that specific data.
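The two trade-offs above can be illustrated with a small sketch. It is a toy model, not the design our group uses: the function names, the access-list representation, and the 50% broadcast threshold are all assumptions made for the example.

```python
def choose_home(accesses, current_home):
    """Pick the computer that should hold a data item.

    accesses: list of computer ids, one entry per access to the item.
    Hypothetical rule: if exactly one computer uses the item, move it
    there; otherwise leave it where it is.
    """
    users = set(accesses)
    if len(users) == 1:
        return next(iter(users))
    return current_home


def remote_accesses(accesses, home):
    """Count accesses that must cross the network (issued away from home)."""
    return sum(1 for node in accesses if node != home)


def should_broadcast(readers, num_nodes, threshold=0.5):
    """Push a new result to every computer only if enough of them read it.

    The threshold is an assumed tuning knob, not a measured value.
    """
    return len(set(readers)) / num_nodes >= threshold


# Data lives on computer 0, but only computer 2 ever touches it.
accesses = [2, 2, 2, 2]
before = remote_accesses(accesses, home=0)
new_home = choose_home(accesses, current_home=0)
after = remote_accesses(accesses, home=new_home)
print(before, new_home, after)

# A result read by 3 of 4 computers is worth pushing to everyone;
# a result read by 1 of 8 is not.
print(should_broadcast([1, 2, 3], num_nodes=4))
print(should_broadcast([1], num_nodes=8))
```

In this model, moving the item to its only user removes every remote access, while the broadcast check avoids flooding the network with results that few computers will read.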

We design our system to replicate data, and to distribute or withhold results, according to the program it runs. We have come up with some new designs, which have shown very promising results. We believe that this software-based and less expensive way of designing large computer systems is a promising approach. Systems with this design are easier to construct, cost less money, and still show good performance.