HIGH PERFORMANCE & RESEARCH COMPUTING

CCIT’s High-Performance Computing group maintains various HPC (“supercomputer”) resources and offers support for Mines faculty and students using HPC systems in their research. The goal of the service is to help scientists do their science through the application of HPC.

What we do:

The goal of the HPC group is to help researchers do research enhanced by high performance computing. The group:

  • Maintains several HPC platforms:
    • Mio – Machine managed under the condo paradigm (150+ Tflops)
    • AuN (Golden) – Standard 144 node x86 based machine (50 Tflops)
    • Mc2 (Energy) – IBM Blue Gene Q with 512 nodes (100 Tflops)
    • Wendian – Mines’ newest machine with 87 nodes, 5 with GPUs (350+ Tflops)
  • Monitors the platforms for potential issues
  • Installs user software on the HPC platforms
  • Installs common libraries and community codes
  • Maintains documentation
  • Offers consulting services to enable more effective research
  • Offers workshops covering HPC topics
  • Provides help porting and optimizing applications
  • Provides recommendations for more effective use of HPC platforms
  • Represents the Mines community at HPC related conferences

Where to find things?

This page describes how you can use Mines’ HPC resources, in particular the module system, the available compilers, and the Slurm scheduler, along with usage policies.

Additional information, including machine descriptions, can be found in the Quick Start Guide and FAQs under Further Resources.

Module System

The module system is commonly used on many HPC systems to help users set up their environment to run particular programs. The behavior of programs on Linux systems is controlled by setting environment variables; you can see the settings of all variables by running the command printenv. Arguably, the most important variables are PATH and LD_LIBRARY_PATH. PATH is a list of directories that are searched to find applications. Likewise, LD_LIBRARY_PATH is a list of directories that are searched to find libraries used by applications. If you enter a command and see “command not found”, it is possible the directory containing the application is not in PATH. If an application cannot find a library, the system will display a similar message. The module system is designed to make it easy to set collections of variables: you can set a number of variables by loading a single module. Mines uses the Lmod module system. The following description is taken from: https://lmod.readthedocs.io/en/latest/015_writing_modules.html
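
For example, you can inspect these variables yourself before and after loading a module (the exact values you see will differ from system to system):

[joeuser@mio001 ~]$ echo $PATH                # colon-separated list of directories searched for commands
[joeuser@mio001 ~]$ echo $LD_LIBRARY_PATH     # colon-separated list of directories searched for shared libraries
[joeuser@mio001 ~]$ printenv | wc -l          # how many environment variables are currently set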

A Reminder of what Lmod is doing

All Lmod is doing is changing the environment. Suppose you want to use the “ddt” debugger installed on your system, which is made available to you via a module. If you try to execute ddt without the module loaded you get:

[joeuser@mio001 ~]$  ddt
bash: ddt: command not found

[joeuser@mio001 ~]$  module load ddt
[joeuser@mio001 ~]$  ddt

After the ddt module is loaded, executing ddt now works. Let’s remind ourselves why this works. If you try checking the environment before loading the ddt modulefile:

[joeuser@mio001 ~]$ env | grep -i ddt
[joeuser@mio001 ~]$ module load ddt
[joeuser@mio001 ~]$ env | grep -i ddt

DDTPATH=/opt/apps/ddt/5.0.1/bin
LD_LIBRARY_PATH=/opt/apps/ddt/5.0.1/lib:...
PATH=/opt/apps/ddt/5.0.1/bin:...

[joeuser@mio001 ~]$ module unload ddt
[joeuser@mio001 ~]$ env | grep -i ddt

The first time we check the environment we find that there is no ddt stored there. But the second time we see that PATH and LD_LIBRARY_PATH have been modified. Note that we have shortened the path-like variables to show the important changes. Several other environment variables have also been set. Full documentation on the module system can be found at https://lmod.readthedocs.io/en/latest/. Here are some examples:

module spider
Lists all available modules
module -r spider mpi
Lists modules that are related to MPI
module keyword gromacs
Lists modules that are related to the gromacs program
module load Apps/gromacs/5.1.2
Loads the module for gromacs – this will enable you to run gromacs
module purge
Unloads all modules
module list
Lists currently loaded modules

Some modules have dependencies that need to be loaded manually. For example, the gromacs module requires that modules for the compiler and MPI be loaded first. If there is an unsatisfied dependency you will be notified; an example load sequence is shown after the list below. A list of modules is available on the following web pages:

AuN Modules
http://mindy.mines.edu/modules/aun/
Mc2 Modules
http://mindy.mines.edu/modules/mc2/
Mio Modules
http://tuyo.mines.edu/mods/
Wendian Modules
http://adelie.mines.edu/mods/
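
As an example of satisfying such dependencies, a load sequence for gromacs might look like the following sketch. The compiler and MPI modules shown are taken from the examples elsewhere on this page; the combination gromacs actually requires may differ, and module spider will list the real prerequisites:

[joeuser@mio001 ~]$ module purge                        # start from a clean environment
[joeuser@mio001 ~]$ module load PrgEnv/devtoolset-6     # compiler module
[joeuser@mio001 ~]$ module load MPI/openmpi/3.0.0/gcc   # MPI module
[joeuser@mio001 ~]$ module load Apps/gromacs/5.1.2      # application module
[joeuser@mio001 ~]$ module list                         # confirm what is loaded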

Modules for Compilers

The primary compilers on AuN, Mio, and Wendian are from the Intel and gnu (gcc) suites. For most parallel applications an MPI library is also needed. The MPI compilers actually require a backend compiler, again normally either Intel or gcc based. The gnu compilers are:

  • gcc – C
  • g++ – C++
  • gfortran – fortran

The Intel compilers are:

  • icc – C
  • icpc – C++
  • ifort – fortran

The default version of the gnu compilers is a rather old 4.x release. Newer versions of the compilers are available via a module load. For example:

[joeuser@mio001 ~]$ gcc -v 2>&1 | tail -1
gcc version 4.4.7 20120313 (Red Hat 4.4.7-18) (GCC) 

[joeuser@mio001 ~]$ module load PrgEnv/devtoolset-6

[joeuser@mio001 ~]$ gcc -v 2>&1 | tail -1
gcc version 6.2.1 20160916 (Red Hat 6.2.1-3) (GCC) 
[joeuser@mio001 ~]$ 

[joeuser@mio001 ~]$ gfortran -v 2>&1 | tail -1
gcc version 6.2.1 20160916 (Red Hat 6.2.1-3) (GCC) 
[joeuser@mio001 ~]$ 


[joeuser@mio001 ~]$ g++ -v 2>&1 | tail -1
gcc version 6.2.1 20160916 (Red Hat 6.2.1-3) (GCC) 
[joeuser@mio001 ~]$ 

The 2018 version of the Intel compilers can be loaded:

[joeuser@mio001 ~]$ module load Compiler/intel/18.0
[joeuser@mio001 ~]$ ifort -V
Intel(R) Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 18.0.1.163 Build 20171018
Copyright (C) 1985-2017 Intel Corporation.  All rights reserved.

[joeuser@mio001 ~]$ icpc -V
Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 18.0.1.163 Build 20171018
Copyright (C) 1985-2017 Intel Corporation.  All rights reserved.

[joeuser@mio001 ~]$ icc -V
Intel(R) C Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 18.0.1.163 Build 20171018
Copyright (C) 1985-2017 Intel Corporation.  All rights reserved.

[joeuser@mio001 ~]$ 

When using the Intel compilers, if a program uses some of the newer features of C++ it will need to reference libraries associated with a newer version of g++. So you may want to also load a newer version of the gnu compilers.
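
For example, to compile C++ code that uses a newer language standard with the Intel compiler while picking up the newer g++ libraries, you might load both modules first. This is only a sketch; the source file name and the -std flag below are illustrative:

[joeuser@mio001 ~]$ module load PrgEnv/devtoolset-6        # newer g++ and its C++ libraries
[joeuser@mio001 ~]$ module load Compiler/intel/18.0        # Intel compilers
[joeuser@mio001 ~]$ icpc -std=c++14 mycode.cpp -o mycode   # Intel C++ compile using the newer GNU headers/libraries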

The most popular versions of MPI available on the HPC platforms are OpenMPI and Intel MPI. The MPI compilers are actually scripts that set environment variables and then call a backend compiler. Both versions of MPI can be used with either gnu or Intel backend compilers. Examples follow.

To build with gnu compilers and Openmpi

[joeuser@mio001 ~]$ module purge 
[joeuser@mio001 ~]$ module load PrgEnv/devtoolset-6
[joeuser@mio001 ~]$ module load MPI/openmpi/3.0.0/gcc

[joeuser@mio001 ~]$ which mpicc
/sw/compilers/mpi/openmpi/3.0.0/gcc/bin/mpicc

[joeuser@mio001 ~]$ which mpic++
/sw/compilers/mpi/openmpi/3.0.0/gcc/bin/mpic++

[joeuser@mio001 ~]$ which mpif90
/sw/compilers/mpi/openmpi/3.0.0/gcc/bin/mpif90

[joeuser@mio001 ~]$ which mpif77
/sw/compilers/mpi/openmpi/3.0.0/gcc/bin/mpif77
[joeuser@mio001 ~]$  
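
Because the wrappers are just scripts around a backend compiler, you can ask OpenMPI's wrappers to print the underlying command they would run (the exact output depends on the installation):

[joeuser@mio001 ~]$ mpicc --showme      # show the gcc command line the wrapper would invoke
[joeuser@mio001 ~]$ mpif90 --showme     # the same for the Fortran wrapper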

To build with Intel compilers and Openmpi

[joeuser@mio001 ~]$ module purge
[joeuser@mio001 ~]$ module load PrgEnv/devtoolset-6
[joeuser@mio001 ~]$ module load Compiler/intel/18.0
[joeuser@mio001 ~]$ module load MPI/openmpi/3.0.0/intel

[joeuser@mio001 ~]$ which mpicc
/sw/compilers/mpi/openmpi/3.0.0/intel/bin/mpicc

[joeuser@mio001 ~]$ which mpic++
/sw/compilers/mpi/openmpi/3.0.0/intel/bin/mpic++

[joeuser@mio001 ~]$ which mpif90
/sw/compilers/mpi/openmpi/3.0.0/intel/bin/mpif90

[joeuser@mio001 ~]$ which mpif77
/sw/compilers/mpi/openmpi/3.0.0/intel/bin/mpif77
[joeuser@mio001 ~]$ 

To build with gnu compilers and Intel MPI

[joeuser@mio001 ~]$ module purge
[joeuser@mio001 ~]$ module load PrgEnv/devtoolset-6
[joeuser@mio001 ~]$ module load MPI/impi/2018.1/gcc

[joeuser@mio001 ~]$ which mpicc
/sw/compilers/intel/2018/compilers_and_libraries_2018.1.163/linux/mpi/bin64/mpicc

[joeuser@mio001 ~]$ which mpicxx
/sw/compilers/intel/2018/compilers_and_libraries_2018.1.163/linux/mpi/bin64/mpicxx

[joeuser@mio001 ~]$ which mpif77
/sw/compilers/intel/2018/compilers_and_libraries_2018.1.163/linux/mpi/bin64/mpif77

[joeuser@mio001 ~]$ which mpif90
/sw/compilers/intel/2018/compilers_and_libraries_2018.1.163/linux/mpi/bin64/mpif90
[joeuser@mio001 ~]$ 

You can verify that the gnu compilers are being used as the backend

[joeuser@mio001 ~]$ mpicc -v 2>&1 | tail -1
gcc version 6.2.1 20160916 (Red Hat 6.2.1-3) (GCC) 
[joeuser@mio001 ~]$ 

To build with Intel compilers and Intel MPI

[joeuser@mio001 ~]$ module purge
[joeuser@mio001 ~]$ module load PrgEnv/devtoolset-6
[joeuser@mio001 ~]$ module load Compiler/intel/18.0
[joeuser@mio001 ~]$ module load MPI/impi/2018.1/intel
[joeuser@mio001 ~]$ 
[joeuser@mio001 ~]$ 
[joeuser@mio001 ~]$ which mpicc
/sw/compilers/intel/2018/compilers_and_libraries_2018.1.163/linux/mpi/bin64/mpicc

[joeuser@mio001 ~]$ which mpif90
/sw/compilers/intel/2018/compilers_and_libraries_2018.1.163/linux/mpi/bin64/mpif90

[joeuser@mio001 ~]$ which mpif77
/sw/compilers/intel/2018/compilers_and_libraries_2018.1.163/linux/mpi/bin64/mpif77

[joeuser@mio001 ~]$ which mpicxx
/sw/compilers/intel/2018/compilers_and_libraries_2018.1.163/linux/mpi/bin64/mpicxx

[joeuser@mio001 ~]$ which mpiicc
/sw/compilers/intel/2018/compilers_and_libraries_2018.1.163/linux/mpi/bin64/mpiicc
[joeuser@mio001 ~]$ 

You can verify that the Intel compilers are being used as the backend

[joeuser@mio001 ~]$ mpicxx -v
mpiicpc for the Intel(R) MPI Library 2018 Update 1 for Linux*
Copyright(C) 2003-2017, Intel Corporation.  All rights reserved.
icpc version 18.0.1 (gcc version 6.2.1 compatibility)

Compiling

Compiling for various processors with the Intel compilers

Mines’ HPC resources AuN, Mio, and Wendian contain various generations of Intel processors. AuN contains 16-core nodes with Sandybridge processors. Mio contains Nehalem, Westmere, Sandybridge, Ivybridge, Haswell, Broadwell, and Skylake (oldest to newest) processors. Wendian uses Skylake processors. This is important because newer generation processors have some instructions that will not work on older processors. The newer instructions offer optimizations for the newer processors; some allow significantly faster execution of certain operations, in particular when working on arrays or vectors. When a program is built, the Intel compilers detect what generation of processor is being used to compile the code and will include the latest instructions for that processor. If the code is then run on an older processor it might return an illegal instruction error; if it is run on a newer processor it will not take advantage of the increased functionality. Applications built on Wendian may not run on Mio or AuN because Wendian’s head nodes contain Skylake processors. It is possible to:

  1. Build an application with the lowest common instruction set so it will run on all processors
  2. Build an application so that it contains multiple sets of instructions so it will use advanced features on processors when available
  3. Build an application so that it will only run on specific (or newer) processors
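
To check which of these instruction sets a particular node supports, you can look at the flags field in /proc/cpuinfo on that node (for a compute node, run this inside a job; the flags reported will vary by processor generation):

[joeuser@mio001 ~]$ grep -m1 "model name" /proc/cpuinfo                                             # processor model
[joeuser@mio001 ~]$ grep -m1 flags /proc/cpuinfo | grep -o -w -e sse4_2 -e avx -e avx2 -e avx512f   # vector extensions present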

Here are two options that control the instruction set used for a compile: -ax and -march. These options take the same sub-options, specifying which processor to target, but they work differently. The -ax sub-options are additive: you can specify multiple processors and the binary will contain instructions for each. This can create larger programs, but they will run well on all specified processors. The -march option only takes a single sub-option. This creates programs that run well on the specified generation of processor or newer, but will most likely not run on older processors. The table below shows the various generations of processors on Mines’ platforms and the important extra instructions added in each generation. With the exception of “fma” these are advanced versions of vector instructions.

Feature     Supported processors
sse         Core 2 and newer (all)
sse2        Core 2 and newer (all)
sse4_1      Nehalem and newer
sse4_2      Nehalem and newer
avx         Sandybridge and newer
fma         Haswell and newer
avx2        Haswell and newer
avx512f     Skylake
avx512cd    Skylake
-ax suboptions:

SUBOPTION             Adding instructions for processor
-ax=SSE2              anywhere
-ax=SSE4.2            Nehalem or newer
-ax=AVX               Sandybridge or newer
-ax=SANDYBRIDGE       Sandybridge or newer
-ax=IVYBRIDGE         Ivybridge or newer
-ax=HASWELL           Haswell or newer
-ax=BROADWELL         Broadwell or newer
-ax=SKYLAKE           Skylake
-ax=SKYLAKE-AVX512    Skylake
-ax=CORE-AVX-I        Sandybridge or newer
-ax=CORE-AVX-2        Ivybridge or newer
-ax=CORE-AVX512       Skylake
As noted above, -ax options are additive. For example, if you specify -ax=SSE2,CORE-AVX512 the code should run on any processor but will also use the Skylake instruction set when run on a node that supports it. In fact, because the default instruction set is always included, specifying just -ax=CORE-AVX512 produces code that will run anywhere but also contains Skylake-specific instructions that will be used when run on that processor. As you add more options the code will grow in size. For example, a common chemistry code was 44 Mbytes with -ax=SKYLAKE-AVX512 and 24 Mbytes without the option. The table below shows the options for -march.
SUBOPTION                Processor set
-march                   anywhere
-march=corei7            Nehalem or newer
-march=core-avx-i        Sandybridge or newer
-march=sandybridge       Sandybridge or newer
-march=ivybridge         Ivybridge or newer
-march=haswell           Haswell or newer
-march=core-avx2         Haswell or newer
-march=broadwell         Broadwell or newer
-march=skylake           Skylake
-march=skylake-avx512    Skylake
The -march options are not additive. You can only specify one, and you cannot use both the -ax and -march options. Note: if you build an application on Wendian and don’t specify either -ax or -march it will default to effectively -march=skylake-avx512 and the application will most likely not run on Mio or AuN. Here are some example compiles with notes on where the apps will run and the size of the application.

#Build on Mio to run anywhere
[tkaiser@mio001 hybrid]$ icc phostone.c 
[tkaiser@mio001 hybrid]$ ls -lt a.out
-rwxr-x--x 1 tkaiser tkaiser 120095 Sep  7 13:18 a.out

#Build on Mio to run anywhere but include skylake instructions
[tkaiser@mio001 hybrid]$ icc -ax=SKYLAKE-AVX512 phostone.c 
[tkaiser@mio001 hybrid]$ ls -lt a.out
-rwxr-x--x 1 tkaiser tkaiser 140969 Sep  7 13:18 a.out

#Build on Mio to run anywhere but include skylake instructions, same as above
[tkaiser@mio001 hybrid]$ icc -ax=SKYLAKE-AVX512,SSE2 phostone.c 
[tkaiser@mio001 hybrid]$ ls -lt a.out
-rwxr-x--x 1 tkaiser tkaiser 140969 Sep  7 13:19 a.out

#Build on Mio but can only run on skylake nodes
[tkaiser@mio001 hybrid]$ icc -march=skylake-avx512  phostone.c 
[tkaiser@mio001 hybrid]$ ls -lt a.out
-rwxr-x--x 1 tkaiser tkaiser 125392 Sep  7 13:19 a.out

#Trying to run a skylake code on Mio returns an error
[tkaiser@mio001 hybrid]$ srun -N 1 --tasks-per-node=8 ./a.out
srun: job 4393558 queued and waiting for resources
srun: job 4393558 has been allocated resources
srun: error: compute030: tasks 0-7: Illegal instruction (core dumped)
[tkaiser@mio001 hybrid]$ 

Compiling for various processors with the gcc compilers

The gcc compilers support the -march option as described under Intel Compilers with the following sub-options.

core2
Core 2 CPU
nehalem
Nehalem CPU
westmere
Westmere CPU
sandybridge
Sandy Bridge CPU
ivybridge
Ivy Bridge CPU
haswell
Haswell CPU
broadwell
Broadwell CPU
skylake
Skylake CPU
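
For example, assuming a C source file named mycode.c (the file name is only illustrative), a portable build targeting the oldest processors on Mio and a build tuned to the node you are compiling on might look like:

[joeuser@mio001 ~]$ module load PrgEnv/devtoolset-6                       # newer gcc
[joeuser@mio001 ~]$ gcc -O2 -march=nehalem mycode.c -o mycode_portable    # runs on Nehalem or newer
[joeuser@mio001 ~]$ gcc -O2 -march=native mycode.c -o mycode_native       # tuned to, and may only run on, processors like the one used to compile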

Using the scheduler (Slurm)

Access to compute resources on Mines’ HPC platforms is managed via a scheduler. That is, to run on a compute node a user normally creates a script. The script is submitted to the scheduler using the sbatch command, and the scheduler launches the script on compute resources when they become available. The script consists of two parts: instructions for the scheduler and the commands that the user wants to run.

Here is a simple example:

#!/bin/bash
#SBATCH --job-name="sample"
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --ntasks=8
#SBATCH --exclusive
#SBATCH --export=ALL
#SBATCH --time=01:00:00

cd $HOME
ls > myfiles
srun hostname

The lines that begin with #SBATCH are instructions to the scheduler. Here you are telling the scheduler that you want to run on two nodes with 4 tasks per node, for a total of 8 tasks; the ntasks line is redundant in this case. When the job runs you will have exclusive access to the nodes, all of your environment variable settings will be passed to the compute nodes, and the job will run for no longer than one hour.

The last three lines are normal commands. You will be put in your home directory, and a directory listing will be written to the file myfiles. Finally, the srun command will launch the program hostname in parallel; in this case 8 copies will be started simultaneously. Note that the “ls” command is not run in parallel; only a single instance will be launched.

The script is launched using the sbatch command. By default the standard output and standard error are written to a file named slurm-######.out, where ###### is the job number. For example, running this script on AuN produces:

[joeuser@aun001 ~]$ sbatch dohost
Submitted batch job 363541

After some time...

[joeuser@aun001 ~]$ ls -lt | head
total 88120
-rw-rw-r--  1 joeuser joeuser        64 Sep 24 16:28 slurm-363541.out
-rw-rw-r--  1 joeuser joeuser      2321 Sep 24 16:28 myfiles
...

[joeuser@aun001 ~]$ cat slurm-363541.out
node001
node002
node001
node001
node001
node002
node002
node002
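
If you prefer your own file names, Slurm’s --output and --error options can be added to the batch script; %j expands to the job number. The names below are only illustrative:

#SBATCH --output=sample-%j.out
#SBATCH --error=sample-%j.err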

There are a number of user-level commands available for managing jobs on the HPC resources. These are discussed briefly on the page: http://geco.mines.edu/prototype/How_do_I_manage_jobs/
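
A few of the most common job management commands are shown below; these are standard Slurm commands, with the job number taken from the example above:

[joeuser@aun001 ~]$ squeue -u $USER            # list your queued and running jobs
[joeuser@aun001 ~]$ scontrol show job 363541   # detailed information about a specific job
[joeuser@aun001 ~]$ scancel 363541             # cancel a job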

Basic scripting and job submission are discussed under the FAQ tab “How do I do a simple build and run?”

If you want to select specific nodes, such as nodes belonging to a particular group or with a particular type of processor, see the FAQ “How do I select MY nodes?”

There is a guide for creating complex scripts, including how to run parameter sweep jobs (array jobs), chained jobs, and scripts that self-document, under the FAQ “I want to run complex scripts; any advice?”

There is a nice overview of Slurm provided by the HPC Knowledge Portal.

Full documentation for the Slurm scheduler can be found at the Slurm home page.

Policies

Wendian:

  1. Nodes on Wendian are available for purchase.
  2. Node purchases are Mines-subsidized at a per-node cost of $8500, or $10427 for high-memory nodes.
  3. When the number of PI-purchased nodes reaches 75% of the total, new nodes will be added.
  4. The percentage of nodes kept from private (PI) purchase will remain at or above 25% of the total.
  5. New nodes acquired at policy-dictated intervals will be of the current generation.
  6. Queue management of this hybrid environment is outlined below, with periodic re-evaluation as HPC community experience evolves:
    • There are three queue types:
      • Group: comprises QoSes of nodes owned by PIs;
      • Normal: includes the remaining nodes in a separate QoS;
      • Full: comprises all nodes on the machine (Group + Normal).
  7. Allocations on Wendian are by proposal.
  8. Allocations are awarded in fixed core-hours; upon expiration or depletion, priority for running jobs will decrease.
  9. Allocations will not be debited when users run jobs on their own purchased nodes.
  10. Jobs run as ‘exclusive’ will be charged for 36 cores per node, regardless of the number of cores used (see the worked example at the end of this section).
  11. A default amount of memory is set per job. Users are encouraged to request only the amount of memory their job needs.
  12. For nonexclusive jobs, allocations will be charged based on the higher of two metrics: the number of cores or the amount of memory used.
  13. Wendian will have approximately 1 Pbyte (1000 Tbytes) of storage, with the majority in scratch. Additional storage is available for research groups to purchase. Files stored in owned storage will not be purged, and are NOT BACKED UP.

Mc2 and AuN:

  1. Compute time on Mc2 and AuN will be free. That is, time will not be charged against allocations.
  2. Allocations on Wendian, Mc2, and AuN will be by proposal.

Storage:

  1. Directories belonging to people who leave Mines will be deleted after 3 months. It is the PI’s responsibility to archive any desired data before that time.
  2. Directories which have not been accessed for 1 year are subject to deletion. Directories may be deleted earlier as needed.
  3. Wendian will have approximately 1 Pbyte (1000 Tbytes) of storage, with the majority in scratch. Research groups will have the opportunity to “purchase” some of the storage. Files stored in owned storage will not expire.

Mio:

  1. Students who are not supported by a researcher will be allowed to run on Mio. However, faculty will not be allowed to take authorship of papers based on research done on Mio unless they own nodes.
  2. Faculty, and students supported by faculty, will only be allowed to run on Mio if their research group has purchased nodes on Mio.
  3. Nodes on Mio that fail outside of warranty will be retired.
  4. No new research groups will be added to Mio.
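
As a hypothetical illustration of the exclusive-job charging rule under the Wendian policies (item 10): an exclusive job that uses 2 nodes for 3 hours would be charged 2 nodes × 36 cores × 3 hours = 216 core-hours against the allocation, even if it only used a few cores on each node.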