NAS Parallel Benchmarks, v.2.3, 4 old nodes

The benchmarks were built with the Intel compilers, with MPICH as the MPI implementation. Process migration to the newest nodes was avoided by locking the ssh daemon (not the best solution). The nodes used were pc13, pc14, pc15 and pc16 (733 MHz Pentium IIIs). Results:

NAS Parallel Benchmarks 2.3 -- BT Benchmark

 No input file inputbt.data. Using compiled defaults
 Size:  64x 64x 64
 Iterations: 200    dt:   0.000800
 Number of active processes:     4

 Time step    1
 Time step   20
 Time step   40
 Time step   60
 Time step   80
 Time step  100
 Time step  120
 Time step  140
 Time step  160
 Time step  180
 Time step  200
 Verification being performed for class A
 accuracy setting for epsilon =  0.1000000000000E-07
 Comparison of RMS-norms of residual
           1 0.1080634671464E+03 0.1080634671464E+03 0.6969749535109E-14
           2 0.1131973090122E+02 0.1131973090122E+02 0.1255405701709E-14
           3 0.2597435451158E+02 0.2597435451158E+02 0.3282665917769E-14
           4 0.2366562254468E+02 0.2366562254468E+02 0.8406791988557E-14
           5 0.2527896321175E+03 0.2527896321175E+03 0.1394160013545E-13
 Comparison of RMS-norms of solution error
           1 0.4234841604053E+01 0.4234841604053E+01 0.1468118413674E-14
           2 0.4439028249700E+00 0.4439028249700E+00 0.6252624235385E-14
           3 0.9669248013635E+00 0.9669248013635E+00 0.3789059889763E-14
           4 0.8830206303977E+00 0.8830206303977E+00 0.0000000000000E+00
           5 0.9737990177083E+01 0.9737990177083E+01 0.2553811957023E-14
 Verification Successful

 BT Benchmark Completed.
 Class           =                        A
 Size            =               64x 64x 64
 Iterations      =                      200
 Time in seconds =                   917.25
 Total processes =                        4
 Compiled procs  =                        4
 Mop/s total     =                   183.47
 Mop/s/process   =                    45.87
 Operation type  =           floating point
 Verification    =               SUCCESSFUL
 Version         =                      2.3
 Compile date    =              30 Jun 2004

 Compile options:
    MPIF77       = mpif77
    FLINK        = mpif77
    FMPI_LIB     = -L/usr/local/lib
    FMPI_INC     = -I/usr/local/include
    FFLAGS       = -O2
    FLINKFLAGS   = -static
    RAND         = (none)

 Please send the results of this run to:

 NPB Development Team 
 Internet: npb@nas.nasa.gov
 If email is not available, send this to:

 MS T27A-1
 NASA Ames Research Center
 Moffett Field, CA  94035-1000

 Fax: 415-604-3957

NAS Parallel Benchmarks 2.3 -- CG Benchmark

 Size:      14000
 Iterations:    15
 Number of active processes:     4

   iteration           ||r||                 zeta
        1       0.14674662350377E-12    19.9997581277040
        2       0.13825296499213E-14    17.1140495745506
        3       0.13582200639364E-14    17.1296668946143
        4       0.13390743666272E-14    17.1302113581192
        5       0.13151128694994E-14    17.1302338856353
        6       0.12717775790555E-14    17.1302349879482
        7       0.12434040923854E-14    17.1302350498916
        8       0.12169541498574E-14    17.1302350537510
        9       0.11851688069578E-14    17.1302350540101
       10       0.11493047350287E-14    17.1302350540284
       11       0.11175026684273E-14    17.1302350540298
       12       0.10968717022187E-14    17.1302350540299
       13       0.10439446919306E-14    17.1302350540299
       14       0.10142396619043E-14    17.1302350540299
       15       0.98043441651967E-15    17.1302350540299
 Benchmark completed 
 Zeta is      0.171302350540E+02
 Error is     0.891731133379E-12

 CG Benchmark Completed.
 Class           =                        A
 Size            =                    14000
 Iterations      =                       15
 Time in seconds =                    20.69
 Total processes =                        4
 Compiled procs  =                        4
 Mop/s total     =                    72.33
 Mop/s/process   =                    18.08
 Operation type  =           floating point
 Verification    =               SUCCESSFUL
 Version         =                      2.3
 Compile date    =              30 Jun 2004

NAS Parallel Benchmarks 2.3 -- EP Benchmark

 Number of random numbers generated:    536870912  
 Number of active processes:                    4

EP Benchmark Results:

CPU Time =   35.0941
N = 2^   28
No. Gaussian Pairs =     210832767.
Sums =    -4.295875165634738D+03   -1.580732573678648D+04
  0      98257395.
  1      93827014.
  2      17611549.
  3       1110028.
  4         26536.
  5           245.
  6             0.
  7             0.
  8             0.
  9             0.

 EP Benchmark Completed.
 Class           =                        A
 Size            =                536870912  
 Iterations      =                        0
 Time in seconds =                    35.09
 Total processes =                        4
 Compiled procs  =                        4
 Mop/s total     =                    15.30
 Mop/s/process   =                     3.82
 Operation type  = Random numbers generated
 Verification    =               SUCCESSFUL
 Version         =                      2.3
 Compile date    =              30 Jun 2004

NAS Parallel Benchmarks 2.3 -- FT Benchmark

 No input file inputft.data. Using compiled defaults
 Size                : 256x256x128
 Iterations          :           6
 Number of processes :           4
 Processor array     :       1x  4
 Layout type         :          1D
 T =    1     Checksum =    5.046735008193D+02    5.114047905510D+02
 T =    2     Checksum =    5.059412319734D+02    5.098809666433D+02
 T =    3     Checksum =    5.069376896287D+02    5.098144042213D+02
 T =    4     Checksum =    5.077892868474D+02    5.101336130759D+02
 T =    5     Checksum =    5.085233095391D+02    5.104914655194D+02
 T =    6     Checksum =    5.091487099959D+02    5.107917842803D+02
 Result verification successful
 class = A

 FT Benchmark Completed.
 Class           =                        A
 Size            =              256x256x128
 Iterations      =                        6
 Time in seconds =                    64.84
 Total processes =                        4
 Compiled procs  =                        4
 Mop/s total     =                   110.07
 Mop/s/process   =                    27.52
 Operation type  =           floating point
 Verification    =               SUCCESSFUL
 Version         =                      2.3
 Compile date    =              30 Jun 2004

NAS Parallel Benchmarks 2.3 -- IS Benchmark

 Size:  8388608  (class A)
 Iterations:   10
 Number of processes:     4


 IS Benchmark Completed
 Class           =                        A
 Size            =                  8388608
 Iterations      =                       10
 Time in seconds =                    21.74
 Total processes =                        4
 Compiled procs  =                        4
 Mop/s total     =                     3.86
 Mop/s/process   =                     0.96
 Operation type  =              keys ranked
 Verification    =               SUCCESSFUL
 Version         =                      2.3
 Compile date    =              30 Jun 2004

NAS Parallel Benchmarks 2.2 -- LU Benchmark

 Size:  64x 64x 64
 Iterations: 250
 Number of processes:     4

 Time step    1
 Time step   20
 Time step   40
 Time step   60
 Time step   80
 Time step  100
 Time step  120
 Time step  140
 Time step  160
 Time step  180
 Time step  200
 Time step  220
 Time step  240
 Time step  250

 Verification being performed for class A
 Accuracy setting for epsilon =  0.1000000000000E-07
 Comparison of RMS-norms of residual
           1   0.7790210760669E+03 0.7790210760669E+03 0.1503135748836E-13
           2   0.6340276525969E+02 0.6340276525969E+02 0.4258587752173E-14
           3   0.1949924972729E+03 0.1949924972729E+03 0.9328509706709E-14
           4   0.1784530116042E+03 0.1784530116042E+03 0.9556031307595E-15
           5   0.1838476034946E+04 0.1838476034946E+04 0.1100708234961E-13
 Comparison of RMS-norms of solution error
           1   0.2996408568547E+02 0.2996408568547E+02 0.1067091565711E-14
           2   0.2819457636500E+01 0.2819457636500E+01 0.1323073386331E-13
           3   0.7347341269878E+01 0.7347341269877E+01 0.5802447794332E-14
           4   0.6713922568778E+01 0.6713922568778E+01 0.2645780944304E-15
           5   0.7071531568839E+02 0.7071531568839E+02 0.1044985005012E-13
 Comparison of surface integral
               0.2603092560489E+02 0.2603092560489E+02 0.2729609951429E-15
 Verification Successful

 LU Benchmark Completed.
 Class           =                        A
 Size            =               64x 64x 64
 Iterations      =                      250
 Time in seconds =                   513.00
 Total processes =                        4
 Compiled procs  =                        4
 Mop/s total     =                   232.55
 Mop/s/process   =                    58.14
 Operation type  =           floating point
 Verification    =               SUCCESSFUL
 Version         =                      2.3
 Compile date    =              30 Jun 2004

NAS Parallel Benchmarks 2.3 -- MG Benchmark

 No input file. Using compiled defaults 
 Size: 256x256x256  (class A)
 Iterations:   4
 Number of processes:     4

 Initialization time:          11.724 seconds

 Benchmark completed 
 L2 Norm is   0.243336530907E-05
 Error is     0.694855007951E-16

 MG Benchmark Completed.
 Class           =                        A
 Size            =              256x256x256
 Iterations      =                        4
 Time in seconds =                    27.39
 Total processes =                        4
 Compiled procs  =                        4
 Mop/s total     =                   142.08
 Mop/s/process   =                    35.52
 Operation type  =           floating point
 Verification    =               SUCCESSFUL
 Version         =                      2.3
 Compile date    =              30 Jun 2004

NAS Parallel Benchmarks 2.3 -- SP Benchmark

 No input file inputsp.data. Using compiled defaults
 Size:  64x 64x 64
 Iterations: 400    dt:   0.001500
 Number of active processes:     4

 Time step    1
 Time step   20
 Time step   40
 Time step   60
 Time step   80
 Time step  100
 Time step  120
 Time step  140
 Time step  160
 Time step  180
 Time step  200
 Time step  220
 Time step  240
 Time step  260
 Time step  280
 Time step  300
 Time step  320
 Time step  340
 Time step  360
 Time step  380
 Time step  400
 Verification being performed for class A
 accuracy setting for epsilon =  0.1000000000000E-07
 Comparison of RMS-norms of residual
           1 0.2479982239930E+01 0.2479982239930E+01 0.9633939753781E-13
           2 0.1127633796437E+01 0.1127633796437E+01 0.8624744776582E-13
           3 0.1502897788877E+01 0.1502897788877E+01 0.6293907835791E-13
           4 0.1421781621170E+01 0.1421781621170E+01 0.4997551840455E-13
           5 0.2129211303514E+01 0.2129211303514E+01 0.5318530308612E-13
 Comparison of RMS-norms of solution error
           1 0.1090014029782E-03 0.1090014029782E-03 0.3834445484860E-12
           2 0.3734395176929E-04 0.3734395176928E-04 0.1159500326349E-12
           3 0.5009278540654E-04 0.5009278540654E-04 0.9293340428823E-13
           4 0.4767109393954E-04 0.4767109393953E-04 0.2105184825798E-12
           5 0.1362161339921E-03 0.1362161339921E-03 0.6287946157047E-13
 Verification Successful

 SP Benchmark Completed.
 Class           =                        A
 Size            =               64x 64x 64
 Iterations      =                      400
 Time in seconds =                   704.50
 Total processes =                        4
 Compiled procs  =                        4
 Mop/s total     =                   120.67
 Mop/s/process   =                    30.17
 Operation type  =           floating point
 Verification    =               SUCCESSFUL
 Version         =                      2.3
 Compile date    =              30 Jun 2004
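
For comparing these runs side by side, the "Benchmark Completed" summary blocks above can be parsed mechanically. The following is only a minimal sketch in Python, not part of the original benchmark setup: the function names parse_summaries and per_process_consistent are hypothetical, and the script assumes the log text keeps the "key = value" layout shown in the outputs above. It also cross-checks that Mop/s/process is the Mop/s total divided by the process count, within rounding.

```python
import re

# Excerpt of one summary block, copied from the BT output above.
SAMPLE = """\
 BT Benchmark Completed.
 Class           =                        A
 Size            =               64x 64x 64
 Iterations      =                      200
 Time in seconds =                   917.25
 Total processes =                        4
 Compiled procs  =                        4
 Mop/s total     =                   183.47
 Mop/s/process   =                    45.87
"""


def parse_summaries(text):
    """Return one dict per '<NAME> Benchmark Completed' block (hypothetical helper)."""
    results = []
    current = None
    for line in text.splitlines():
        # A new benchmark banner ends the previous summary block.
        if line.startswith("NAS Parallel Benchmarks"):
            current = None
            continue
        header = re.match(r"\s*(\w+) Benchmark Completed", line)
        if header:
            current = {"Benchmark": header.group(1)}
            results.append(current)
            continue
        # Collect 'key = value' lines only while inside a summary block.
        field = re.match(r"\s*([^=]+?)\s*=\s*(.*?)\s*$", line)
        if field and current is not None:
            current[field.group(1)] = field.group(2)
    return results


def per_process_consistent(row, tol=0.01):
    """Check Mop/s/process against Mop/s total / Total processes (two-decimal rounding)."""
    total = float(row["Mop/s total"])
    procs = int(row["Total processes"])
    return abs(total / procs - float(row["Mop/s/process"])) <= tol


if __name__ == "__main__":
    for row in parse_summaries(SAMPLE):
        print(row["Benchmark"], row["Time in seconds"], row["Mop/s total"],
              per_process_consistent(row))
```

Passing the full log text instead of SAMPLE would give one row per benchmark (BT, CG, EP, FT, IS, LU, MG, SP), which could then be sorted by time or Mop/s as needed.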