
Qs tests on MBG cluster

The version of Qs used is dated 10 September 2004. The parallel version was compiled with
mpicc -DMPI -static -I/usr/local/include -O3 -unroll -tpp6 -xK -wp_ipo Qs_working.c -lsrfftw_intel -lsfftw_intel -limf -lm
where mpicc resolves to
icc -O2 -tpp6 -DUSE_STDARG -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_UNISTD_H=1 -DHAVE_STDARG_H=1 -DUSE_STDARG=1 -DMALLOC_RET_VOID=1 -L/usr/local/mpich-ssh/lib -lmpich

Before measuring the performance of the program (by wall-clock time), running Qs with a plain 'mpirun' command was compared with running it via SGE (Sun Grid Engine). The setting for this test was

Method        Wall-clock time
mpirun        919 seconds
Grid Engine   894 seconds

The tests: Celerons & server

The parameters tested are

  1. Number of unique reflections
  2. Number of crystallographic symmetry operators (space group)
  3. Number of processors

In all cases only one minimisation was performed, lasting 100,000 steps. The results are:

8 symmetry operators

Reflections  Space group  Processors      Wall-clock time in seconds  Scale-up
4156         P422         1 (celeron)     2032                        1.000
4156         P422         1 (Pentium IV)  1690                        1.202
4156         P422         2 (mixed)       1241                        1.637
4156         P422         4               894                         2.272
4156         P422         6               802                         2.533
4156         P422         8               753                         2.698

Reflections  Space group  Processors      Wall-clock time in seconds  Scale-up
2049         P422         1 (celeron)     1058                        1.000
2049         P422         1 (Pentium IV)  818                         1.293
2049         P422         2               664                         1.593
2049         P422         4               500                         2.116
2049         P422         6               487                         2.172
2049         P422         8               449                         2.356

Reflections  Space group  Processors      Wall-clock time in seconds  Scale-up
8224         P422         1 (celeron)     4148                        1.000
8224         P422         1 (Pentium IV)  3449                        1.202
8224         P422         2               2377                        1.745
8224         P422         4               1685                        2.501
8224         P422         6               1564                        2.652
8224         P422         8               1329                        3.121

4 symmetry operators

Reflections  Space group  Processors      Wall-clock time in seconds  Scale-up
4156         P222         1 (celeron)     1056                        1.000
4156         P222         1 (Pentium IV)  870                         1.213
4156         P222         2 (mixed)       752                         1.404
4156         P222         4               659                         1.602
4156         P222         6               661                         1.597
4156         P222         8               650                         1.624

Reflections  Space group  Processors      Wall-clock time in seconds  Scale-up
2049         P222         1 (celeron)     568                         1.000
2049         P222         1 (Pentium IV)  428                         1.327
2049         P222         2               409                         1.388
2049         P222         4               374                         1.518
2049         P222         6               401                         1.416
2049         P222         8               408                         1.392

Reflections  Space group  Processors      Wall-clock time in seconds  Scale-up
8224         P222         1 (celeron)     2180                        1.000
8224         P222         1 (Pentium IV)  1796                        1.213
8224         P222         2               1466                        1.487
8224         P222         4               1211                        1.800
8224         P222         6               1158                        1.882
8224         P222         8               1114                        1.956

2 symmetry operators

Reflections  Space group  Processors      Wall-clock time in seconds  Scale-up
10044        P2           1 (celeron)     1464                        1.000
10044        P2           1 (Pentium IV)  1208                        1.211
10044        P2           2 (mixed)       1215                        1.204
10044        P2           4               1163                        1.258
10044        P2           6               1216                        1.203

Reflections  Space group  Processors      Wall-clock time in seconds  Scale-up
19879        P2           1 (celeron)     2718                        1.000
19879        P2           1 (Pentium IV)  2271                        1.196
19879        P2           2 (mixed)       2321                        1.171
19879        P2           4               2169                        1.253
19879        P2           6               2158                        1.259
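
Judging from the numbers, the Scale-up column is the single-Celeron wall-clock time divided by the wall-clock time of the run in question. The sketch below recomputes it for the P422 / 4156-reflection table and adds a parallel efficiency figure (scale-up divided by processor count). The efficiency figure and all function names are illustrative additions, not part of the original measurements, and because the multi-processor runs mix Celeron and Pentium IV nodes the efficiency is only a rough guide.

```python
# Wall-clock times in seconds, copied from the P422 / 4156-reflection table.
# Keys are processor counts; the 1-processor entry is the Celeron baseline.
times_p422_4156 = {1: 2032, 2: 1241, 4: 894, 6: 802, 8: 753}


def scale_up(baseline_seconds, seconds):
    """Speed-up relative to the single-Celeron run."""
    return baseline_seconds / seconds


def efficiency(baseline_seconds, seconds, processors):
    """Fraction of ideal linear speed-up achieved (1.0 would be perfect)."""
    return scale_up(baseline_seconds, seconds) / processors


def summarise(times):
    """Return (processors, scale-up, efficiency) rows for one table."""
    baseline = times[1]
    return [
        (n, scale_up(baseline, t), efficiency(baseline, t, n))
        for n, t in sorted(times.items())
    ]


if __name__ == "__main__":
    for n, s, e in summarise(times_p422_4156):
        print(f"{n} processors: scale-up {s:.3f}, efficiency {e:.2f}")
```

Swapping in the times from any other table gives the corresponding figures for that space group and data set.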

Conclusions

On the newer nodes:

For high-symmetry space groups it is probably worth using the parallel version. For orthorhombic space groups you will have to measure the scale-up yourself. For the low-symmetry cases (monoclinic, triclinic), forget it: just spawn 5 jobs (each with a different seed for the random number generator) and leave them to run.
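
The advice for low-symmetry cases can be checked with a little arithmetic on the P2 / 19879-reflection timings from the Celeron and server tests. One parallel minimisation on 4 processors took 2169 seconds, whereas 4 independent serial jobs on 4 Celerons would finish 4 minimisations in 2718 seconds, roughly 680 seconds each. The sketch below generalises this. It assumes that each serial job gets a processor to itself and that the jobs do not slow each other down; the function names are illustrative.

```python
# Wall-clock times in seconds from the P2 / 19879-reflection table.
SERIAL_CELERON = 2718                   # one minimisation on one Celeron
PARALLEL = {2: 2321, 4: 2169, 6: 2158}  # one minimisation on n processors


def seconds_per_minimisation_parallel(processors):
    """One parallel job occupies all n processors for the whole run."""
    return PARALLEL[processors]


def seconds_per_minimisation_independent(processors,
                                         serial_seconds=SERIAL_CELERON):
    """n serial jobs side by side finish n minimisations in one serial run.

    Assumes one processor per job and no interference between jobs.
    The Celeron time is used, so faster nodes would only do better.
    """
    return serial_seconds / processors


def throughput_gain(processors):
    """Factor by which independent jobs out-produce the parallel version."""
    return (seconds_per_minimisation_parallel(processors)
            / seconds_per_minimisation_independent(processors))


if __name__ == "__main__":
    for n in sorted(PARALLEL):
        print(f"{n} processors: independent jobs give "
              f"{throughput_gain(n):.2f}x the throughput")
```

Under these assumptions the independent jobs deliver more than three times as many minimisations per unit of time on 4 processors, which is the reasoning behind running separate jobs with different seeds.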