Optimization Recommendations

See: Code optimization with the IBM XL Compilers

-O0

Most likely to produce results that match other machines. Produces the slowest code.

-O3

Produces the fastest code. May produce results that are not consistent with other levels of optimization, and that are possibly incorrect.

The compiler produces the warning:

  "The NOSTRICT option (default at OPT(3)) has the potential 
   to alter the semantics of a program. "
  

-O3 -qhot

We have found (counterintuitively) that adding -qhot to -O3 produces results that are more likely to match other compilers. However, it may still produce results that are not consistent with other levels of optimization, and that are possibly incorrect.

The compiler produces the warning:

  "The NOSTRICT option (default at OPT(3)) has the potential 
   to alter the semantics of a program. "
  

-O3 -qhot -qstrict

Not as fast as -O3 -qhot without the -qstrict option, but more likely to produce results that are consistent with other compilers.
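
As an illustration of how relaxed semantics can change numerical results (a generic sketch in Python, not a description of what the XL compiler does to any particular program): floating-point addition is not associative, so an optimizer that is allowed to regroup arithmetic can change the computed value. The values below are chosen only to make the effect obvious.

```python
# Floating-point addition is not associative: regrouping the same
# three-term sum changes the answer. The values are illustrative only.
a, b, c = 1.0e16, -1.0e16, 1.0

# Written order: the two large terms cancel first, then 1.0 is added.
left_to_right = (a + b) + c

# Regrouped: 1.0 is too small to change -1.0e16 in double precision,
# so it is lost before the large terms cancel.
regrouped = a + (b + c)

print("left to right:", left_to_right)
print("regrouped:    ", regrouped)
print("same result?  ", left_to_right == regrouped)
```

The two groupings give 1.0 and 0.0. In a real code the differences are usually confined to the low-order digits, but they can grow in long sums or iterative calculations, which is one reason to compare results across optimization levels before trusting the fastest build.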

-O2 -g

The -g option enables debugging support. At the -O2 optimization level, debugging is fully supported. At optimization levels higher than -O2, debugging capability is limited.

-O2 -g9

Provides the most information for debugging.

-g

Enables traceback information printing if a program crashes. This works at all levels of optimization.

Getting a human readable traceback

Core files are (usually) produced when a program that is built using the -g option crashes. By default, core files are text files named "core" with the MPI task number appended, for example, core.1 or core.2. The stack trace in a core file does not include line numbers or routine names. These can be recovered using the utilities bgcore_backtrace or bgq_stack, which can be brought into your path by loading the stacktools module:

module load PrgEnv/Debug/stacktools
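
For jobs with many tasks it can help to collect the core files by task number before decoding them. The following Python sketch assumes the default core.N naming described above; the helper names core_files_by_rank and bgq_stack_commands are hypothetical, and the script only builds the bgq_stack command lines without running them.

```python
import re
from pathlib import Path

# Matches the default core file naming: "core." followed by the MPI task number.
CORE_NAME = re.compile(r"^core\.(\d+)$")


def core_files_by_rank(directory="."):
    """Return {task_number: path} for every core.N file, ordered by task number."""
    found = {}
    for path in Path(directory).iterdir():
        match = CORE_NAME.match(path.name)
        if match and path.is_file():
            found[int(match.group(1))] = path
    return dict(sorted(found.items()))


def bgq_stack_commands(directory="."):
    """Build (but do not run) one 'bgq_stack <corefile>' command per core file."""
    return [["bgq_stack", str(path)] for path in core_files_by_rank(directory).values()]


if __name__ == "__main__":
    # Print the commands so they can be reviewed or pasted into a shell.
    for command in bgq_stack_commands():
        print(" ".join(command))
```

Sorting numerically rather than alphabetically keeps core.10 after core.2, which matters once a job has more than ten tasks.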

Example:

mpixlf90_r -o v8  -O3 -g  -qarch=auto -qhot pbody.f90

The program v8 is run with 3 MPI tasks and crashes, producing one core file per task:

[joeuser@mc2 pbody]$ ls -l core*
-rw-rw-r-- 1 joeuser joeuser 17542 Feb 25 12:56 core.0
-rw-rw-r-- 1 joeuser joeuser 18109 Feb 25 12:56 core.1
-rw-rw-r-- 1 joeuser joeuser 17521 Feb 25 13:17 core.2
[joeuser@mc2 pbody]$ module  load PrgEnv/Debug/stacktools

We can then use bgq_stack to find the line numbers for the stack trace.

[joeuser@mc2 pbody]$ bgq_stack core.1
------------------------------------------------------------------------
Program   : v8
------------------------------------------------------------------------
+++ID Rank: 1, TGID: 97, Core: 4, HWTID:0 TID: 97 State: RUN 

00000000010069b0
do_ave
/bins/joeuser/nbody/parallel/fort/pbody/pbody.f90:802

0000000001007f9c
lmac
/bins/joeuser/nbody/parallel/fort/pbody/pbody.f90:1032

0000000001490168
generic_start_main
/bgsys/drivers/V1R2M1/ppc64/toolchain/gnu/glibc-2.12.2/csu/../csu/libc-start.c:226

0000000001490464
__libc_start_main
/bgsys/drivers/V1R2M1/ppc64/toolchain/gnu/glibc-2.12.2/csu/../sysdeps/unix/sysv/linux/powerpc/libc-start.c:194

0000000000000000
??
??:0

The bgcore_backtrace utility takes the executable as well as the core file, and also reports where the fault occurred:



[joeuser@mc2 pbody]$ bgcore_backtrace v8 core.1
Faulted at:
do_ave
/bins/joeuser/nbody/parallel/fort/pbody/pbody.f90:806

Thread: 0
do_ave
/bins/joeuser/nbody/parallel/fort/pbody/pbody.f90:802
lmac
/bins/joeuser/nbody/parallel/fort/pbody/pbody.f90:1032
generic_start_main
/bgsys/drivers/V1R2M1/ppc64/toolchain/gnu/glibc-2.12.2/csu/../csu/libc-start.c:226
__libc_start_main
/bgsys/drivers/V1R2M1/ppc64/toolchain/gnu/glibc-2.12.2/csu/../sysdeps/unix/sysv/linux/powerpc/libc-start.c:194
??
??:0

[joeuser@mc2 pbody]$ 
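
When there are many core files, the bgq_stack output can be post-processed to pull out just the frames in your own source. The Python sketch below assumes the layout shown in the example above (an address line, a routine name, then a file:line location) and treats paths under /bgsys/ as system frames. The function names parse_bgq_stack and user_frames are hypothetical, and the sample text is abbreviated from the example output.

```python
import re

# A frame starts with a 16-digit hexadecimal address on its own line.
ADDRESS = re.compile(r"^[0-9a-fA-F]{16}$")

# Abbreviated copy of the bgq_stack output shown in this section.
SAMPLE = """\
+++ID Rank: 1, TGID: 97, Core: 4, HWTID:0 TID: 97 State: RUN

00000000010069b0
do_ave
/bins/joeuser/nbody/parallel/fort/pbody/pbody.f90:802

0000000001007f9c
lmac
/bins/joeuser/nbody/parallel/fort/pbody/pbody.f90:1032

0000000001490168
generic_start_main
/bgsys/drivers/V1R2M1/ppc64/toolchain/gnu/glibc-2.12.2/csu/../csu/libc-start.c:226

0000000000000000
??
??:0
"""


def parse_bgq_stack(text):
    """Parse address / routine / file:line triples into a list of dicts."""
    lines = [line.strip() for line in text.splitlines()]
    frames = []
    i = 0
    while i + 2 < len(lines):
        if ADDRESS.match(lines[i]):
            # Split "path:line" on the last colon only.
            source, _, lineno = lines[i + 2].rpartition(":")
            frames.append({
                "address": lines[i],
                "function": lines[i + 1],
                "source": source,
                "line": int(lineno) if lineno.isdigit() else None,
            })
            i += 3
        else:
            i += 1
    return frames


def user_frames(frames, system_prefix="/bgsys/"):
    """Drop unresolved frames and frames whose source is under system_prefix."""
    return [
        f for f in frames
        if f["source"] not in ("", "??") and not f["source"].startswith(system_prefix)
    ]


if __name__ == "__main__":
    for frame in user_frames(parse_bgq_stack(SAMPLE)):
        print(f'{frame["function"]}  {frame["source"]}:{frame["line"]}')
```

For the sample text this leaves only the do_ave and lmac frames from pbody.f90. The /bgsys/ prefix is an assumption based on the paths in the example and may need adjusting on other systems.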

See: Blue Gene/Q Tips and Techniques Overview