This page is a guide for IBM Power 9 (POWER9) systems. It is divided into two sections:
- A guide for users at CCI wishing to run on these systems
- A guide for system administrators at other sites looking to deploy and run these systems
For CCI users
This system is experimental and may be unavailable at any time.
Users may connect to bgrs01 to build and submit jobs via Slurm.
The system consists of two nodes, each housing:
- Two IBM Power 9 processors clocked at 3 GHz. Each processor contains 20 cores with 4 hardware threads per core (160 logical processors per node).
- Four NVIDIA Tesla V100 GPUs with 16 GiB of memory each
- 512 GB RAM
Nodes are connected with FDR InfiniBand and connect to the unified File System.
Many packages for building software are available as modules. Some tools are also available without loading any modules, and a subset of those can be overridden by modules, so pay careful attention to which modules you have loaded. The following are available without loading any modules:
- ninja 1.7.2
- cmake 188.8.131.52
- autoconf 2.69
- automake 1.13.4
- gcc 4.8.5
- clang/llvm 3.4.2
Currently the following are available as modules:
- automake 1.16.1
- bazel 0.17.2, 0.18.0, 0.18.1, 0.21.0
- ccache 3.5
- cmake 3.12.2, 3.12.3
- gcc 6.4.0, 6.5.0, 7.4.0, 8.1.0, 8.2.0
- xl/xl_r (xlC and xlf) 13, 16.0
- MPICH 3.2.1 (mpich module, built with XL compiler)
- CUDA 9.1
Note: When mixing CUDA and MPI, make sure an xl module is loaded and nvcc is called with -ccbin $(CXX); otherwise linking will fail.
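The note above writes the flag in Makefile syntax. In an interactive shell, $(CXX) would be treated as command substitution, so the variable should be written as "$CXX" instead. The following is a minimal sketch of how the command line could be assembled. It assumes that CXX names the host compiler (the default of mpicxx is an assumption, not something this page specifies) and that the source and output file names are placeholders. The function only prints the command.

```shell
# Sketch: build the nvcc command line for a CUDA + MPI source file.
# Assumptions (not from this page): CXX defaults to mpicxx, and the
# file names are placeholders.
#
# On the real system, load the compilers first, for example:
#   module load xl mpich cuda   # "xl" and "cuda" module names are assumed

nvcc_mpi_cmd() {
    local src="$1"
    local out="$2"
    # -ccbin selects the host compiler nvcc uses for compiling and linking.
    printf 'nvcc -ccbin %s -o %s %s\n' "${CXX:-mpicxx}" "$out" "$src"
}

# Print the command for review; copy it or pipe it to sh to run it.
nvcc_mpi_cmd app.cu app
```

Printing rather than executing keeps the sketch safe to try on a machine where the modules are not available.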
Jobs are submitted via Slurm to one of the following partitions: debug, test.
The debug partition is limited to single node jobs, running up to 30 minutes, and may only use a maximum of 128G of memory.
The test partition makes both nodes and all their resources available for up to 6 hours. This partition will typically not be available during business hours to ensure nodes are available for compiling/debugging. Batch jobs may be submitted to this partition to be run when it is available.
Note: When submitting MPI jobs via Slurm, users must specify
Note: When submitting CUDA jobs via Slurm, users must specify --gres=gpu:# to request the number of GPUs desired per node.
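As an illustration of the partition limits and the GPU request described above, the sketch below generates a batch script for the debug partition. The application name (./my_app), the 64G memory request, and the function name are placeholders chosen for the example. Only the partition name, the single-node and 30-minute limits, the 128G ceiling, and the --gres option come from this page.

```shell
# Sketch: print a Slurm batch script for the debug partition.
# debug limits from this page: one node, 30 minutes, at most 128G memory.
# ./my_app and the 64G request are placeholders.

make_debug_job() {
    local gpus="$1"
    cat <<EOF
#!/bin/bash
#SBATCH --partition=debug
#SBATCH --nodes=1
#SBATCH --time=00:30:00
#SBATCH --mem=64G
#SBATCH --gres=gpu:${gpus}

srun ./my_app
EOF
}

# Request two GPUs on the node; redirect to a file and submit with sbatch.
make_debug_job 2
```

A typical use would be `make_debug_job 2 > debug_job.sh` followed by `sbatch debug_job.sh`.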
One method for profiling is reading the time base registers (mftb, mftbu). An example of this is found in the FFTW cycle header.
The time base frequency for the Power 9 processor is 512000000 Hz (512 MHz), so one time base tick corresponds to 1.953125 ns.
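To turn raw time base readings into wall-clock time, divide the tick count by the time base value. The sketch below does this with integer arithmetic, using the fact that 1000000000 / 512000000 reduces to 125 / 64. The function names are illustrative, and the tick values are assumed to have been captured elsewhere (for example by code that reads the time base).

```shell
# Sketch: convert Power 9 time base ticks to nanoseconds.
# 512000000 ticks per second means 1 tick = 125/64 ns.

tb_ticks_to_ns() {
    local ticks="$1"
    # Multiply before dividing to keep precision; the reduced fraction
    # 125/64 delays 64-bit overflow compared with using 10^9 / 512000000.
    echo $(( ticks * 125 / 64 ))
}

tb_elapsed_ns() {
    local start="$1"
    local end="$2"
    tb_ticks_to_ns $(( end - start ))
}

# 512000 ticks elapsed between the two readings
tb_elapsed_ns 1000 513000
```

The result is truncated to whole nanoseconds, which is below the resolution of a single tick.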
For system administrators
This section records the guides, quirks, and additional information that system administrators need to install and run an IBM Power 9 system.
Setting up the BMC
Use the IBM-provided guide to set up the BMC network address. The process uses "ipmitool" on the node, but the system does not actually use IPMI for management functions. The system will not give any indication that the network configuration has changed until the final ipmitool raw command is executed (i.e., ipmitool lan print 1 will return the original information until the raw command is sent).
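Because the reported configuration only changes after the final raw command, it can help to capture the address before and after that step. The sketch below extracts the IP address from the output of ipmitool lan print 1. It assumes the usual "IP Address : a.b.c.d" line layout of that command, and the function name is illustrative.

```shell
# Sketch: extract the IP address from "ipmitool lan print 1" output so
# the value can be compared before and after the final raw command.
# Assumes the usual "IP Address    : a.b.c.d" line layout.

bmc_ip_from_lan_print() {
    # Match "IP Address" but not "IP Address Source".
    awk -F':' '/^IP Address[ \t]*:/ {
        gsub(/[ \t]/, "", $2)
        print $2
    }'
}

# Usage on the node (requires ipmitool and appropriate privileges):
#   ipmitool lan print 1 | bmc_ip_from_lan_print
```

If the value printed after the raw command still matches the earlier one, the new configuration has not taken effect.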
Red Hat Enterprise Linux (RHEL) 7 for Power 9 (ppc64le)
- Ensure your Red Hat login has an activation code specifically for Power 9. Previous Red Hat licenses for Power systems will not grant access to the downloads necessary for Power 9.
- Obtain the DVD ISO Product Variant "Red Hat Enterprise Linux for Power 9" ppc64le "Red Hat Enterprise Linux Alternate Architectures 7.4 Binary DVD". Note: The "Red Hat Enterprise Linux for Power, little endian" product variant is not the correct variant for a Power 9 system.
Option 1: Installing via USB
The IBM guide for installing Linux on Power 9 via USB is fairly complete. You can prep the USB device as follows (this will overwrite data on the USB drive):
dd if=rhel-alt-server-7.4-ppc64le-dvd.iso of=<usb device>
Note: You must use the "Red Hat Enterprise Linux for Power 9"/"Alternate Architectures" variant of RHEL 7 for this method or the system will freeze when loading the installer.
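The dd step above can be wrapped so that the copy is flushed and then verified. This is a sketch only: the function name is illustrative, it relies on GNU dd, cmp, and stat, and the target is whatever device or file you pass in. Everything on the target is overwritten.

```shell
# Sketch: write an installer ISO to a target and verify the copy.
# On a real system the target is the USB block device, and all data
# on it is overwritten.

write_iso() {
    local iso="$1"
    local target="$2"
    # A larger block size speeds up the copy; conv=fsync flushes the
    # data to the target before dd exits (GNU dd).
    dd if="$iso" of="$target" bs=4M conv=fsync || return 1
    # Compare only the first <ISO size> bytes, since a USB device is
    # usually larger than the image.
    cmp -n "$(stat -c %s "$iso")" "$iso" "$target"
}

# Usage (as root):
#   write_iso rhel-alt-server-7.4-ppc64le-dvd.iso <usb device>
```

The function returns non-zero if either the copy or the comparison fails.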
Option 2: Installing via xCAT
This assumes you already have a working xCAT install. It may be helpful to review the xCAT OpenPower documentation. Power 9 uses OpenBMC, and the tables/keys are slightly different from those for IPMI-based systems.
- Initial 7.4 kernel: 4.11.0-44.el7a.ppc64le
- Latest booting kernel: 4.11.0-44.2.1.el7a.ppc64le
- Driver version: 387.36 BETA
- Driver release date: 2017.12.21
IBM Spectrum Scale (GPFS)
No known GA build for Power 9. Current 4.2.3 (184.108.40.206) GPL build fails:
/usr/lpp/mmfs/src/gpl-linux/tracelin.c: In function my_send_sig_info:
/usr/lpp/mmfs/src/gpl-linux/tracelin.c:571:6: error: implicit declaration of function send_sig_info [-Werror=implicit-function-declaration]
send_sig_info(mySig, sigData, tsP);
Review the installation guide, particularly the items specific to Power 9.
Firmware updates can be performed remotely with openbmctool per IBM's instructions:
openbmctool -U <username> -P <password> -H <BMC IP address or BMC host name> firmware flash <bmc or pnor> -f xxx.tar
where bmc or pnor is the type of image you wish to flash to the system.
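Since the image type must be either bmc or pnor, a small wrapper can reject anything else before the tool is invoked. The sketch below follows the command shown above. The variable names OPENBMCTOOL, BMC_USER, and BMC_PASSWORD and the function name are hypothetical and introduced only for this example.

```shell
# Sketch: validate the image type before calling openbmctool.
# OPENBMCTOOL, BMC_USER and BMC_PASSWORD are hypothetical variable names.

flash_firmware() {
    local host="$1"
    local type="$2"
    local image="$3"
    case "$type" in
        bmc|pnor) ;;
        *)
            echo "image type must be 'bmc' or 'pnor', got '$type'" >&2
            return 1
            ;;
    esac
    # OPENBMCTOOL can be overridden, for example with "echo" for a dry run.
    "${OPENBMCTOOL:-openbmctool}" -U "$BMC_USER" -P "$BMC_PASSWORD" \
        -H "$host" firmware flash "$type" -f "$image"
}

# Usage:
#   BMC_USER=... BMC_PASSWORD=... flash_firmware <BMC host> pnor xxx.tar
```

Setting OPENBMCTOOL=echo prints the arguments instead of flashing, which is a convenient way to check the command before running it against a real BMC.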