

Software Development on the IBM Power-8 Systems

Last modified: 2/5/2017


The Power-8 systems at the Hartree Centre are as follows:

  • Fairthorpe: login nodes, with the same specification as the Panther nodes below.
  • Panther: System: S822LC Power8 8335-GTA 32-node cluster; Processors: POWER8 (raw), altivec supported, 3.857 GHz; Nodes: 2x 8-core sockets, 128 threads, plus 4x nVidia K80 GPU accelerators (Kepler GK210).
  • Paragon: System: S822LC PowerNV 8335-GTB 32-node cluster; Processors: POWER8NVL (raw), altivec supported, 4.023 GHz; Nodes: 2x 8-core sockets, 128 threads, plus 4x nVidia P100 GPU accelerators (Pascal).

You will be using Fairthorpe for the development work described in the rest of this page, and also for job submission as described here: Jobs on Panther and Paragon.

Using environment modules to access compilers

Environment modules are installed; we use the Lmod and XALT software from TACC.

For the IBM compiler suite, use e.g.:

module load spectrum_mpi

For the GNU compilers, use e.g.:

module load spectrum_mpi
module unload ibm
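After loading or unloading modules you can confirm which compiler is active; a quick check (the version-query flags are the standard ones for each compiler, not site-specific):

```shell
# Show the modules currently loaded
module list

# Check which compiler is first on the PATH
which xlc_r    # IBM XL, after "module load spectrum_mpi"
which gcc      # GNU, after "module unload ibm"

# Print the compiler version to confirm the selection
xlc_r -qversion
gcc --version
```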

Where is the documentation?

Using XL Compilers on Power Systems.

The full XL compiler manuals are online, as noted above.

For XLC use CFLAGS="-qarch=pwr8 -qtune=pwr8:smt8 -qcache=auto".

More extensive options might include:

CFLAGS=-O -qhot=level=2 -g -qnoipa -qfloat=nans:spnans:subnormals -qmaxmem=-1 -qsmp=omp -qarch=pwr8 -qtune=pwr8:smt8 -qcache=auto -qsimd=auto -qpic=large -qlibansi

FFLAGS=-O3 -g -qfloat=nans:subnormals -qsclk=micro -qnosave -qmaxmem=-1 -qsmp=omp -qarch=pwr8 -qtune=pwr8:smt8 -qcache=auto -qsimd=auto -qthreaded -qtbtable=full -qwarn64 -qassert=contiguous:refalign -qpic=large -qlibansi

Explanations are given by typing "xlc -qhelp" or "xlf -qhelp".

It is useful to add the "-qreport -qlistopt -qsaveopt -qphsinfo" flags, which produce diagnostic information; -qlistopt applies only to Fortran.
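As a sketch, the diagnostic flags can be added to a single compilation like this (foo.c is a placeholder source file; the XL compilers write the optimisation report into a .lst listing file alongside the object):

```shell
# Compile one file with optimisation reports enabled;
# the diagnostics appear in foo.lst next to foo.o
xlc_r -O -qarch=pwr8 -qtune=pwr8:smt8 -qreport -qsaveopt -qphsinfo -c foo.c
```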

Using the GNU Compiler Suite

GCC-4.8.5 is the default with RedHat-7.4 on these systems.

For GCC use "-mcpu=power8", which implies "-mvsx -maltivec -mabi=altivec".

Specific options might be:

CFLAGS="-O3 -g -fopenmp -mcpu=power8"

FFLAGS="-O3 -g -fopenmp -mcpu=power8"
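For example, a single-file OpenMP program might be built and run as follows (hello.c is a placeholder):

```shell
# Build with the flags above; -mcpu=power8 enables VSX/AltiVec code generation
gcc -O3 -g -fopenmp -mcpu=power8 -o hello hello.c

# Run with a chosen OpenMP thread count
OMP_NUM_THREADS=8 ./hello
```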

Using nVidia Tools

Our Power-8 systems contain nVidia GPUs as follows:

  • Panther - 4x Tesla K80 (Kepler) = sm_37 (compute capability 3.7)
  • Paragon - 4x Tesla P100-SXM2-16GB (Pascal) = sm_60 (compute capability 6.0)

NVCC works with XL as follows: set "export HOST_COMPILER=/gpfs/panther/local/apps/ibm/xlC/13.1.5/bin/xlC_r", and when you build the CUDA samples you will see this expanded to "nvcc -ccbin /gpfs/panther/local/apps/ibm/xlC/13.1.5/bin/xlC_r". The -ccbin option can also be added explicitly to makefiles.
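A sketch of a direct nvcc invocation combining the host-compiler setting with the compute capabilities listed above (kernel.cu and the output name are placeholders; pick the -gencode line matching the target system):

```shell
# Panther (K80, compute capability 3.7)
nvcc -ccbin /gpfs/panther/local/apps/ibm/xlC/13.1.5/bin/xlC_r \
     -gencode arch=compute_37,code=sm_37 -O2 -o kernel kernel.cu

# Paragon (P100, compute capability 6.0)
nvcc -ccbin /gpfs/panther/local/apps/ibm/xlC/13.1.5/bin/xlC_r \
     -gencode arch=compute_60,code=sm_60 -O2 -o kernel kernel.cu
```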

An extensive list of options might include:

NVCCFLAGS=-O -DNDEBUG -DNOCHANGE -Xcompiler "-qsimd=auto" -Xcompiler "-qsmp=omp" -Xcompiler "-qarch=pwr8" -Xcompiler "-qtune=pwr8:smt8" -Xcompiler "-qcache=auto" -Xcompiler "-qmaxmem=-1" -Xcompiler "-qhot=level=2" -Xcompiler "-qnoipa" -Xcompiler "-qlibansi" -Xcompiler "-qfloat=subnormals"

Using GNU Configure

Most HPC applications use MPI, but it is not possible to run MPI tests interactively on Fairthorpe, so GNU configure will fail. We are effectively cross-compiling, because the application will run on a different host from the one used to build it. Two keywords matter: --build=ppc64le-redhat-linux-gnu and --host=ppc64le-redhat-linux (it is also possible to use ppc64le-linux-thread-multi). The former describes the Fairthorpe login node used for compilation and the latter the back-end (host) cluster. Configure will normally work out the build option itself, so it is sufficient for the host option to differ, which bypasses the run-time tests. For instance, to compile FFTW we used the following:

../configure --host=ppc64le-redhat-linux \
--prefix=$INSTALLDIR --enable-mpi --enable-openmp --disable-shared \
CC=mpixlc_r F77=mpixlf90_r MPICC=mpixlc_r MPICXX=mpixlcxx_r MPIFC=mpixlf90_r MPILIBS=" " \
CFLAGS=-O3 FCFLAGS=-O3 LDFLAGS=-qnostaticlink

make -j 8
make install

Note: "make check" will not work, for the reasons described above; you will have to run each test via the batch queue instead, see Jobs on Panther and Paragon. Some environment variables that may need setting for MPI applications when using configure and other tools are:

  • CC - C compiler command, e.g. export CC=mpcc
  • CXX - C++ compiler command, e.g. export CXX=mpCC
  • FC and F77 - the Fortran compiler commands (free form and fixed form), e.g. FC=mpfort, F77=mpfort
  • CFLAGS - C compiler flags, export CFLAGS=""
  • LDFLAGS - linker flags, e.g. export LDFLAGS=""
  • LIBS - libraries to pass to the linker, e.g. export LIBS=""
  • CPP - C preprocessor, e.g. "cpp -E"
  • CPPFLAGS - (Objective) C/C++ preprocessor flags, e.g. -I<include dir> if you have headers in a non-standard directory <include dir>
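Putting the list above together, a configure run for an MPI application might export the variables like this (a minimal sketch; the install prefix and the decision to leave LDFLAGS and LIBS empty are illustrative, not site defaults):

```shell
# Point configure at the MPI compiler wrappers
export CC=mpcc
export CXX=mpCC
export FC=mpfort
export F77=mpfort

# Keep the flags explicit so configure does not guess
export CFLAGS="-O3"
export LDFLAGS=""
export LIBS=""

# Differing host option bypasses the run-time tests, as described above
./configure --host=ppc64le-redhat-linux --prefix=$HOME/apps/mypackage
```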

We attempt to set suitable defaults in the IBM compiler environment modules ibm/13.1.3 or ibm/13.1.5, but these are for the underlying compiler, e.g. xlc and xlf and not for the MPI compiler wrappers.

Some packages, like Xorg, have an aclocal.m4 that specifies "ld -m elf64ppc". This is wrong for our little-endian systems. Either change it to "ld -m elf64lppc" and re-build using aclocal, or change the line in the generated configure script to "ld -m elf64lppc". Not doing this will prevent shared libraries from being built.
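One way to apply the fix is a sed substitution over the generated configure script; a minimal sketch (here a one-line sample file stands in for the real configure script, and since "elf64ppc" does not occur as a substring of the corrected "elf64lppc", re-running the command is harmless):

```shell
# Create a one-line stand-in for the configure script
printf 'ld -m elf64ppc\n' > configure.sample

# Replace the big-endian emulation name with the little-endian one
sed -i 's/elf64ppc/elf64lppc/g' configure.sample

cat configure.sample   # now reads: ld -m elf64lppc
```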


Using CMake

See the guide on the CMake web site. For cross-compilation using CMake you need to specify the node environment, the choice of compiler, and whether the binary should be statically or dynamically linked. Pre-configured platform toolchain configurations exist to do this.

We have found that on our Power systems CMake can be used in the standard way, providing run-time tests are suppressed.
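As an illustrative sketch (the file name and the choice of MPI wrappers are assumptions, not site defaults), a minimal toolchain file can pin the compilers; setting CMAKE_SYSTEM_NAME marks the build as cross-compiling, which suppresses CMake's try-run tests:

```shell
# Write a minimal cross-compiling toolchain file
cat > power8-xl.cmake <<'EOF'
set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_PROCESSOR ppc64le)
set(CMAKE_C_COMPILER mpixlc_r)
set(CMAKE_CXX_COMPILER mpixlcxx_r)
set(CMAKE_Fortran_COMPILER mpixlf90_r)
EOF

# Configure a static build from an out-of-source build directory
cmake -DCMAKE_TOOLCHAIN_FILE=$PWD/power8-xl.cmake -DBUILD_SHARED_LIBS=OFF ..
```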

Using ESSL and PESSL

ESSL is IBM's optimised Engineering and Scientific Subroutine Library; PESSL is the parallel version. You can use ESSL as follows; the examples show C and Fortran linkage. Note that you must use the thread-safe compiler (with the _r suffix), which is automatically invoked by the MPI wrappers.

module load ibm ibmmpi ibmessl

# C example
mpcc -o test.out call_essl.c -L$ESSL_LIB -lessl

# Fortran example, single-threaded
mpfort -o test.out call_essl.f90 -L$ESSL_LIB -lessl

# Fortran example, multi-threaded with -qsmp flag
mpfort -qsmp -o test.out call_essl.f90 -L$ESSL_LIB -lesslsmp
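To confirm which ESSL variant was actually linked, inspecting the binary with ldd is a quick check (test.out as built above):

```shell
# List the shared libraries the executable will load,
# filtering for the ESSL entries
ldd test.out | grep -i essl
```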

You can also use the MASS library in a similar way, but it is included automatically when using the -qhot option (high-order transformations).

It is possible to use ESSL with GNU Fortran on Power, but you need to link in the XL run-time libraries too; here is an example.

module load ibmessl
gfortran -DGEMM_D -fno-underscoring gemm_bench.F90 -o Dgemm_bench.exe -lesslsmp \
  -Wl,-rpath=/gpfs/panther/local/apps/ibm/lib -L/gpfs/panther/local/apps/ibm/xlsmp/4.1.5/lib \
  -lxlsmp -L/gpfs/panther/local/apps/ibm/xlf/15.1.5/lib -lxlf90_r -lxlfmath -lxl

Further Information

  • The IBM developerWorks community site for the XL compilers
  • The latest ESSL documentation
  • IBM Redbooks for OpenPOWER
  • A very useful Redbook: Performance Optimization and Tuning Techniques for IBM Power Systems Processors Including IBM POWER8
  • The IBM Knowledge Center
