7 December 2009
Dr Rio Yokota, a postdoctoral researcher working with Dr Lorena Barba in the Department of Mathematics, was part of the team who won a Gordon Bell prize at SC09, an annual international conference for high-performance computing (HPC), networking, storage and analysis.
The Gordon Bell prizes are a prestigious set of awards made every year at the conference to recognise outstanding achievement in high-performance computing, particularly advances in peak performance, cost performance, and innovative techniques.
The team won a prize in the ‘price per performance’ category, which is awarded to the entry demonstrating the best price-performance ratio, measured in Mflops (millions of floating-point operations per second) per dollar, on a genuine application (for an explanation of the FLOPS measurement of a computer’s performance, see the Wikipedia entry). This is the first time the ‘price per performance’ award has been given since 2001.
The team members and their institutions are as follows:
Tsuyoshi Hamada (Nagasaki University)
Rio Yokota (University of Bristol)
Keigo Nitadori (RIKEN)
Tetsu Narumi (University of Electro-Communications)
Kenji Yasuoka (Keio University)
Makoto Taiji (RIKEN)
Kiyoshi Oguri (Nagasaki University)
In November, the team achieved a sustained performance of 57.3 Tflops (trillions of floating-point operations per second) on a cluster of 760 GPUs (graphics processing units) that cost a total of US $428,134. This works out at 138 Mflops per dollar, which is 32 times better than that of the 2001 ‘price per performance’ winner. It is also an increase on the 42 Tflops that the team themselves recorded in their paper, ‘42 TFlops Hierarchical N-body Simulations on GPUs with Applications in both Astrophysics and Turbulence’, published in August 2009.
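As a rough check on the arithmetic, the price-performance figure can be recomputed from the numbers quoted above. The short Python sketch below is illustrative only: the function name `mflops_per_dollar` is made up for this example, and it assumes the simplest definition of the ratio, sustained performance divided by total hardware cost.

```python
def mflops_per_dollar(sustained_tflops: float, cost_usd: float) -> float:
    """Price-performance in Mflops per dollar (assumed definition: performance / cost)."""
    mflops = sustained_tflops * 1_000_000  # 1 Tflops = 10^6 Mflops
    return mflops / cost_usd


# Figures quoted in the article
sustained_tflops = 57.3
cost_usd = 428_134
gpu_count = 760

ratio = mflops_per_dollar(sustained_tflops, cost_usd)
gflops_per_gpu = sustained_tflops * 1_000 / gpu_count  # 1 Tflops = 10^3 Gflops

print(f"{ratio:.1f} Mflops per dollar")
print(f"{gflops_per_gpu:.1f} Gflops sustained per GPU")
```

Taken at face value, the quoted figures give roughly 134 Mflops per dollar, close to but not identical to the 138 stated. The article does not say how its figure was derived, so the difference may come from rounding or from a slightly different cost or performance basis.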
The calculations themselves were run on the GPU cluster at Nagasaki University, but Dr Yokota carried out the programming remotely from the University of Bristol, where he continues to develop the fast multipole method for GPUs with Dr Barba’s group.
‘This is a strong showing for the international competitiveness of the University’s HPC research,’ said Dr Yokota, who added that a large part of the work should be credited to his collaborators in Japan.
1. The performance of 57.3 Tflops was not measured on the LINPACK benchmark, but on a real application using hierarchical N-body algorithms.
2. The GPU is a general-purpose computer, but not all algorithms run efficiently on GPUs, so this level of performance is restricted to specific applications in astrophysics, molecular dynamics, fluid mechanics, elasticity, acoustics, electromagnetics and quantum mechanics.