Dr. Markus Fischer



While most of today's applications run happily on a single-core, 300 MHz CPU, a variety of complex, compute-intensive applications can benefit from hundreds of cores.


Moving an existing application to this environment generally requires changes. A new method, or even a new algorithm, must be pursued to obtain a performance gain. A multi-core system using shared memory with threads is a local solution, whereas a new distributed parallel algorithm involves additional computers connected by a high-speed network.
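As a rough illustration of the local, shared-memory approach, the sketch below splits a summation across a pool of threads that all read the same list. The function name `parallel_sum` and the chunking scheme are assumptions made for this example, not part of any particular application. In CPython the global interpreter lock prevents a real speedup for pure Python arithmetic, so the sketch shows the structure of the decomposition rather than a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_sum(values, workers=4):
    """Sum `values` by giving each thread one contiguous chunk.

    Hypothetical helper for illustration: all threads read the same
    list in shared memory, and only the partial sums are combined.
    """
    if not values:
        return 0
    # Ceiling division so that every element lands in exactly one chunk.
    chunk = (len(values) + workers - 1) // workers
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker computes a partial sum; the main thread adds them up.
        return sum(pool.map(sum, chunks))


if __name__ == "__main__":
    data = list(range(1_000_000))
    print(parallel_sum(data, workers=8) == sum(data))
```

A distributed version of the same computation would have to send each chunk to another computer and collect the partial sums over the network, which is where latency and bandwidth start to matter.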


Such networks include non-standard solutions such as InfiniBand and Myrinet, which offer very low latency and high bandwidth. The latency, which can be determined by measuring the time for a round-trip communication between two nodes, indicates how fast new information can be acquired. Still, as of today this latency is an order of magnitude higher than that of a local memory access, and thus results in reduced speedups as the number of nodes increases.
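The round-trip measurement described above is often called a ping-pong test. A minimal sketch follows, assuming a small fixed message size and using a local socket pair with an echo thread in place of two real nodes; the names `measure_round_trip`, `echo_peer`, and `recv_exact` are hypothetical. Because both ends run on one machine, the result reflects operating system overhead rather than a network; on a cluster the two ends would run on different nodes. The one-way latency is commonly taken as half the round-trip time.

```python
import socket
import threading
import time


def recv_exact(conn, size):
    """Read exactly `size` bytes from a stream socket."""
    buf = bytearray()
    while len(buf) < size:
        chunk = conn.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def echo_peer(conn, rounds, size):
    """The 'pong' side: send every received message straight back."""
    for _ in range(rounds):
        conn.sendall(recv_exact(conn, size))


def measure_round_trip(rounds=1000, size=8):
    """Return the average round-trip time in seconds per message."""
    near, far = socket.socketpair()
    peer = threading.Thread(target=echo_peer, args=(far, rounds, size))
    peer.start()
    payload = b"x" * size
    start = time.perf_counter()
    for _ in range(rounds):
        near.sendall(payload)      # ping
        recv_exact(near, size)     # wait for the pong
    elapsed = time.perf_counter() - start
    peer.join()
    near.close()
    far.close()
    return elapsed / rounds


if __name__ == "__main__":
    rtt = measure_round_trip()
    print(f"average round trip: {rtt * 1e6:.1f} us, "
          f"estimated one-way latency: {rtt * 1e6 / 2:.1f} us")
```

Averaging over many rounds smooths out timer resolution and scheduling noise, which matters when a single round trip takes only a few microseconds.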


Another feature of high-speed networks that allow user-level communication is the very high message rate at which data can be sent to other nodes. Traditional Ethernet networks cannot keep up with this; however, bandwidth has improved drastically with 10 Gbps Ethernet.
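The distinction between bandwidth and message rate can be made concrete with a simple cost model, sketched here under the assumption that sending one message costs a fixed per-message overhead plus the time to push its bytes onto the wire. The function `message_rate` and the overhead figures in the usage example are illustrative assumptions, not measurements of any real network.

```python
def message_rate(size_bytes, bandwidth_bps, overhead_s):
    """Messages per second under a simple linear cost model.

    time per message = fixed overhead + transfer time on the wire.
    All parameters are hypothetical inputs chosen by the caller.
    """
    transfer_s = size_bytes * 8 / bandwidth_bps
    return 1.0 / (overhead_s + transfer_s)


if __name__ == "__main__":
    link = 10e9  # 10 Gbps, the same for both cases
    # Assumed overheads: a kernel-based stack vs. user-level communication.
    for label, overhead in [("kernel path, 20 us", 20e-6),
                            ("user level,  1 us", 1e-6)]:
        rate = message_rate(64, link, overhead)
        print(f"{label}: {rate:,.0f} messages/s for 64-byte messages")
```

In this model the wire time for a 64-byte message at 10 Gbps is about 0.05 microseconds, so for small messages the rate is governed almost entirely by the per-message overhead. Raising the bandwidth alone therefore does little for message rate.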


The remaining difference is the 'quality' of the network. That is, high-speed networks, or more precisely system area networks, guarantee correct data delivery and are lossless. A recent effort will bring lossless network features to standard 10 Gbps Ethernet networks, allowing even user-level communication to achieve low latency similar to that of system area networks.

© 2007, Dr. Markus Fischer