Requirements


POP has been compiled successfully on the Cray T3D running UNICOS 9.0.2.4, using up to 1024 processors in parallel. The following compilers must be present on the system: cft77, cc, and cam. The input data for initializing the model are in the file initial.F, which can be modified to suit your needs. Other initialization data are in the set_*.F files, and the size of the grid can be altered in params.H.

Running the Software

  1. Uncompress and untar the package POP.tar.Z into your directory. In the POP subdirectory, edit the Makefile to set the paths to your compilers and any other compile options you want.
  2. Type make. A list of the available POP versions is displayed; select the one you want.
  3. If compilation succeeds, the executable is named "pop". Run it by typing: ./pop -npes # where # is the number of PEs desired.

Timing the Code


To estimate the speed of the program in Gflops/s, first run the program without the -Ta option (the default) and note the number of seconds reported by Timer 1. Then turn on the apprentice option -Ta in the Makefile, recompile, run the program again, and count the floating-point operations as described on our webpage. The estimated speed is the operation count divided by the Timer 1 time. Note that the current default local grid size is 64x32x20. For example, based on 10 time steps and 50 iterations (scans) with 64 processors, we obtained the following table:
Gflops   Time (sec)   Speed (Gflops/s)   Comments
14.79    42.63        0.347              Original
18.24    37.56        0.486              Mask on
18.24    28.52        0.640              Mask+BLAS
18.24    20.65        0.883              Mask+BLAS+compiler optimization
18.24    20.00        0.912              Full optimization

Explanation of the comments:

  • Original means the original code as ported from the POP program for the CM5 by Rick Smith at LANL.
  • Mask on means the code contains routines to mask off the land portion of the (idealized) topography.
  • Mask+BLAS means the code now includes BLAS routines for speeding up the array computations.
  • Mask+BLAS+compiler optimization means the code is compiled with the following options: -o aggress -o unroll -o noieeedivide, and then linked with -D rdahead=on.
  • Full optimization means some loops were manually unrolled and optimized in addition to the above.

See CRAY's Scientific Libraries Reference Manual, publication SR-2081, for more information on BLAS routines.
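
As a worked example of the ratio described above, the following shell sketch recomputes the Speed column of the table from the operation count and the Timer 1 time. The function name speed is hypothetical and is not part of the POP distribution; the sketch assumes a POSIX shell with awk available.

```shell
# speed GFLOP SECONDS
# Hypothetical helper (not shipped with POP): prints the estimated
# speed in Gflops/s, rounded to three decimals.
#   GFLOP   = operation count from the -Ta (apprentice) run, in units of 10^9
#   SECONDS = Timer 1 value from the run without -Ta
speed() {
    awk -v gflop="$1" -v seconds="$2" 'BEGIN { printf "%.3f\n", gflop / seconds }'
}

speed 14.79 42.63   # "Original" row of the table
speed 18.24 20.00   # "Full optimization" row of the table
```

The two calls print 0.347 and 0.912, matching the first and last rows of the table. Substitute your own operation count and Timer 1 time to estimate the speed of a different configuration.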

Diagnostic Requirements


POP comes with diagnostic routines to check the accuracy of the output. These can be turned on by setting -diagnostic=1 in the Makefile and then recompiling the program. Make sure to type "make clean" to remove any old object code before recompiling. A sample diagnostic output (from stdout) for 256 PEs is given in the file SAMPLE included in the distribution.

The diagnostic routines also produce a numerical output file, IO_out, containing the final values of the ocean variables. This is a Cray binary output file, which can be read by rd_2d_hpcc.f. This program can be compiled with "cf77 rd_2d_hpcc.f" and then executed with ./a.out -npes 1.

We have integrated the POP code for 10 days and analyzed the sea level at the end of the 10-day integration. Results from the optimized code matched those from the original code within error bounds.