1, 2, 4, 6, 8 and 9 processors (that is, the server plus the newest nodes). Note that the 1-processor benchmark was obtained on one of the compute nodes (pc10), not on the server. This explains the scale-up of 2.18, slightly better than linear, in the 2-processor case. Also note that the 9th processor is a node located outside the cluster room and connected to the cluster via an additional hub. As a result, the network latency for this node is higher (see Raw TCP performance).
Number of processors | Days per nanosecond of simulation | Scale-up |
1 | 17.464 | 1.00 |
2 | 8.013 | 2.18 |
4 | 4.755 | 3.67 |
6 | 3.799 | 4.60 |
8 | 2.794 | 6.25 |
9 | 2.616 | 6.67 |
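The scale-up column appears to be the 1-processor time divided by the N-processor time. The following is a minimal sketch of that arithmetic, assuming this definition; the names `days_per_ns` and `scale_up` are illustrative and not part of the original benchmark scripts. The recomputed values agree with the table to within about 0.01, presumably because the table was derived from unrounded timings.

```python
# Days per nanosecond of simulation, copied from the table above.
days_per_ns = {1: 17.464, 2: 8.013, 4: 4.755, 6: 3.799, 8: 2.794, 9: 2.616}


def scale_up(days_single, days_parallel):
    """Assumed definition: single-processor time divided by parallel time."""
    return days_single / days_parallel


baseline = days_per_ns[1]
scale_ups = {n: scale_up(baseline, d) for n, d in days_per_ns.items()}

# Print each measured scale-up next to the ideal (linear) value, which is N.
for n, s in scale_ups.items():
    print(f"{n:2d} processors: scale-up {s:.2f} (ideal {n:.2f})")
```

A value above N, as in the 2-processor case, indicates a run that did better than linear relative to the pc10 baseline.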
Graphical comparison with the ideal (linear) scale-up:
Extending the test to include the slower nodes as well gives the following table:
Number of processors | Days per nanosecond of simulation | Scale-up |
1 | 17.464 | 1.00 |
2 | 8.013 | 2.18 |
4 | 4.755 | 3.67 |
6 | 3.799 | 4.60 |
8 | 2.794 | 6.25 |
9 | 2.616 | 6.67 |
10 | 2.581 | 6.76 |
12 | 2.552 | 6.84 |
14 | 2.412 | 7.24 |
16 | 2.327 | 7.50 |
18 | 2.170 | 8.05 |
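One way to read the extended table against the ideal (linear) case is to divide each scale-up by the number of processors, which gives the fraction of ideal scaling actually achieved. The sketch below assumes that definition of parallel efficiency; `scale_up_by_procs` and `efficiency` are hypothetical names used only for illustration.

```python
# Scale-up values copied from the extended table above.
scale_up_by_procs = {
    1: 1.00, 2: 2.18, 4: 3.67, 6: 4.60, 8: 6.25, 9: 6.67,
    10: 6.76, 12: 6.84, 14: 7.24, 16: 7.50, 18: 8.05,
}


def efficiency(n_procs, scale_up):
    """Fraction of ideal (linear) scaling: measured scale-up divided by N."""
    return scale_up / n_procs


efficiencies = {n: efficiency(n, s) for n, s in scale_up_by_procs.items()}

# Report efficiency as a percentage of the ideal linear scale-up.
for n, e in efficiencies.items():
    print(f"{n:2d} processors: {100 * e:5.1f}% of ideal")
```

With these numbers, efficiency drops from roughly 74% at 9 processors to roughly 45% at 18, which quantifies how little the slower nodes add.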
Graphical comparison with the ideal (linear) scale-up: