Basic Slurm Commands:

sbatch is used to submit a job script for later execution. The script will typically contain one or more srun commands to launch parallel tasks.
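For example, submitting a job script (the script name and job ID below are illustrative; the contents of a job script are sketched in the Script Examples section later in this document):

[joeuser@mio001 ~]$ sbatch myjob.sh
Submitted batch job 901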

scancel is used to cancel a pending or running job or job step. It can also be used to send an arbitrary signal to all processes associated with a running job or job step.
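For example (the job ID is illustrative):

[joeuser@mio001 ~]$ scancel 890                              # cancel job 890
[joeuser@mio001 ~]$ scancel --signal=USR1 890                # send SIGUSR1 to job 890 instead of killing it
[joeuser@mio001 ~]$ scancel --user=joeuser --state=pending   # cancel all of your pending jobs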

sinfo reports the state of partitions and nodes managed by SLURM. It has a wide variety of filtering, sorting, and formatting options.
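For example (the partition name is illustrative):

[joeuser@mio001 ~]$ sinfo --partition=mc2-part --states=idle   # show only idle nodes in one partition
[joeuser@mio001 ~]$ sinfo -o "%P %a %l %D %t"                  # custom columns: partition, avail, time limit, node count, state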

smap reports state information for jobs, partitions, and nodes managed by SLURM, but graphically displays the information to reflect network topology.

squeue reports the state of jobs or job steps. It has a wide variety of filtering, sorting, and formatting options. By default, it reports the running jobs in priority order and then the pending jobs in priority order.
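For example:

[joeuser@mio001 ~]$ squeue -u joeuser    # show only your jobs
[joeuser@mio001 ~]$ squeue -t PENDING    # show only pending jobs
[joeuser@mio001 ~]$ squeue --start       # show estimated start times for pending jobs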

srun is used to submit a job for execution or initiate job steps in real time. srun has a wide variety of options to specify resource requirements, including minimum and maximum node count, processor count, specific nodes to use or avoid, and specific node characteristics (a minimum amount of memory or disk space, certain required features, etc.). A job can contain multiple job steps executing sequentially or in parallel on independent or shared nodes within the job's node allocation.
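For example, a minimal sketch (the executable names are placeholders):

[joeuser@mio001 ~]$ srun --nodes=2 --ntasks-per-node=4 --time=00:10:00 ./my_mpi_prog

Within a batch script, several job steps can share the job's node allocation:

srun -n 4 ./step_one     # first job step
srun -n 4 ./step_two     # second job step, runs after the first finishes
# or launch two smaller steps concurrently within the allocation
srun -n 2 ./step_one &
srun -n 2 ./step_two &
wait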

Rosetta Stone of Workload Managers for PBS/Torque, Slurm, LSF, SGE and LoadLeveler:

Source: http://slurm.schedmd.com/rosetta.html

This table lists the most common commands, environment variables, and job specification options used by the major workload management systems: PBS/Torque, Slurm, LSF, SGE, and LoadLeveler. Each of these workload managers has unique features, but the most commonly used functionality is available in all of these environments, as listed in the table.

CSM Slurm Commands:

These commands are in /opt/utility on AuN, Mio001, and Mc2.

expands:
Creates a PBS_NODES_FILE type list of the nodes on which your job is running.
slurmjobs:
Shows information about jobs on the systems.
slurmnodes:
Shows information about nodes, including usage.
phostname:
Gives the node names on which an MPI program will run. Must be run with srun.

Examples:

Show jobs running…

[joeuser@mc2 ~]$ squeue -l
Wed Jan 22 15:39:52 2014
            JOBID PARTITION     NAME     USER    STATE       TIME TIMELIMIT  NODES MIDPLANELIST(REASON)
              890  mc2-part ademlt3e    tpxyi  RUNNING    2:42:50 1-00:00:00      2 bgq0000[00000x00001]
              892  mc2-part 2h2o_2co ghahuazn  RUNNING    1:01:33 1-00:00:00     64 bgq0000[10000x13330]

Show which nodes are allocated and idle…

[joeuser@mc2 ~]$ sinfo -l
Wed Jan 22 15:40:15 2014
PARTITION AVAIL  TIMELIMIT   JOB_SIZE ROOT SHARE     GROUPS  NODES       STATE MIDPLANELIST
mc2-part*    up 1-00:00:00 1-infinite   no FORCE        all     66   allocated bgq0000
mc2-part*    up 1-00:00:00 1-infinite   no FORCE        all    446        idle bgq0000



Script Examples:

For a very complete slurm script see:
mc2_script at http://hpc.mines.edu/bluem/quickfiles
BlueM run instructions at:
http://hpc.mines.edu/bluem/quick.html
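Before looking at the full mc2_script, a minimal sketch of a Slurm batch script may be helpful (the resource requests, output file name, and program name below are placeholders, not the mc2_script itself):

#!/bin/bash
#SBATCH --job-name=test              # job name shown by squeue
#SBATCH --nodes=2                    # number of nodes
#SBATCH --ntasks-per-node=8          # MPI tasks per node
#SBATCH --time=01:00:00              # wall-clock limit
#SBATCH --output=slurm-%j.out        # %j expands to the job ID

cd $SLURM_SUBMIT_DIR                 # directory from which the job was submitted
srun ./my_mpi_prog                   # launch the MPI program on the allocated nodes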

expands:

[joeuser@mio001 utility]$ ./expands
./expands
     Options:
        Without arguments
           If SLURM_NODELIST is defined
           use it to find the node list as
           described below.
      
           Note: SLURM_NODELIST is defined
           by slurm when running a parallel
           job, so this command is really only
           useful inside a batch script or when
           running interactive parallel jobs
      
        -h
           Show this help
      
     Usage:
        Takes an optional single command line argument:
        the environment variable SLURM_NODELIST
        defined within a slurm job.
      
        SLURM_NODELIST is a compressed list of nodes
        assigned to a job.  This command returns an expanded
        list similar to what is defined in the PBS_NODES_FILE
        under pbs.
      
     Example:
[joeuser@mio001 utility]$ printenv SLURM_NODELIST
compute[004-005]
[joeuser@mio001 utility]$  ./expands  $SLURM_NODELIST
compute004
compute004
compute004
compute004
compute005
compute005
compute005
compute005
      
[joeuser@mio001 utility]$
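If the expands utility is not available, Slurm's own scontrol command can produce a similar (though not identical) expansion; it prints each node once rather than once per task:

[joeuser@mio001 utility]$ scontrol show hostnames $SLURM_NODELIST
compute004
compute005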

slurmnodes:

[joeuser@mio001 utility]$ slurmnodes -h
/opt/utility/slurmnodes
     Options:
        Without arguments show full information for all nodes
       -fATTRIBUTE
           Show only the given attribute. Without ATTRIBUTE, just list the nodes
        list of nodes
           Show information for the given nodes
        -h
           Show this help
 
     Author:
        Timothy H. Kaiser, Ph.D.
        February 2014
[joeuser@mio001 utility]$ 

slurmjobs:

[joeuser@mio001 utility]$ slurmjobs -h
/opt/utility/slurmjobs
     Options:
        Without arguments show full information for all jobs
       -fATTRIBUTE
           Show only the given attribute. Without ATTRIBUTE, just list the jobs and users
       -uUSERNAME or USERNAME
           Show only jobs for USERNAME
        list of jobs to show
           Show information for the given jobs
        -h
           Show this help
 
     Author:
        Timothy H. Kaiser, Ph.D.
        February 2014
[joeuser@mio001 utility]$ 

phostname:

phostname run without options lists the node on which each task of your srun command runs, one line per task.

[joeuser@mio001 hello_bgq]$ srun --nodes=2 --ntasks-per-node=2  /opt/utility/phostname 
compute003
compute002
compute002
compute003

phostname run with the -f option also shows the node assigned to each MPI task and the OpenMP thread numbers for each MPI task.

[joeuser@mio001 ~]$ export OMP_NUM_THREADS=2
[joeuser@mio001 ~]$ srun --nodes=2 --ntasks-per-node=2  /opt/utility/phostname -f > nlist
[joeuser@mio001 ~]$ sort nlist
  compute002     0   0
  compute002     0   1
  compute002     1   0
  compute002     1   1
  compute003     2   0
  compute003     2   1
  compute003     3   0
  compute003     3   1
[joeuser@mio001 ~]$