sbatch is used to submit a job script for later execution. The script will typically contain one or more srun commands to launch parallel tasks.
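A minimal example of such a script and its submission (the file name, job name, and resource values here are only illustrative):

#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --nodes=2
#SBATCH --time=00:10:00
srun /opt/utility/phostname

[joeuser@mio001 ~]$ sbatch hello.sh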
scancel is used to cancel a pending or running job or job step. It can also be used to send an arbitrary signal to all processes associated with a running job or job step.
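For example, scancel with a job ID cancels that job, while the --signal option delivers a signal instead (the job IDs below are illustrative):

[joeuser@mio001 ~]$ scancel 890
[joeuser@mio001 ~]$ scancel --signal=USR1 892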
sinfo reports the state of partitions and nodes managed by SLURM. It has a wide variety of filtering, sorting, and formatting options.
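For example, the -o option selects which fields are printed; the format string below (partition, availability, time limit, node count, state) is just one possibility:

[joeuser@mio001 ~]$ sinfo -o "%P %a %l %D %t"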
smap reports state information for jobs, partitions, and nodes managed by SLURM, but graphically displays the information to reflect network topology.
squeue reports the state of jobs or job steps. It has a wide variety of filtering, sorting, and formatting options. By default, it reports the running jobs in priority order and then the pending jobs in priority order.
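For example, to restrict the listing to your own jobs and choose the columns yourself (the user name and format string are illustrative):

[joeuser@mio001 ~]$ squeue -u joeuser -o "%i %j %T %M %D"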
srun is used to submit a job for execution or initiate job steps in real time. srun has a wide variety of options to specify resource requirements, including: minimum and maximum node count, processor count, specific nodes to use or not use, and specific node characteristics (such as minimum memory, disk space, certain required features, etc.). A job can contain multiple job steps executing sequentially or in parallel on independent or shared nodes within the job's node allocation.
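As a sketch, the job script below requests four nodes and then launches two job steps in parallel, each on half of the allocation (the executable names and task counts are placeholders):

#!/bin/bash
#SBATCH --nodes=4
#SBATCH --time=01:00:00
# each srun launches a separate job step on two of the four allocated nodes
srun --nodes=2 --ntasks=16 ./task_a &
srun --nodes=2 --ntasks=16 ./task_b &
wait   # return only after both job steps have finished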
This table lists the most common commands, environment variables, and job specification options used by the major workload management systems: PBS/Torque, Slurm, LSF, SGE, and LoadLeveler. Each of these workload managers has unique features, but the most commonly used functionality is available in all of these environments as listed in the table.
These commands are in /opt/utility on AuN, Mio001, and Mc2.
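If /opt/utility is not already on your search path, one way to add it for the current session is:

[joeuser@mio001 ~]$ export PATH=/opt/utility:$PATH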
[joeuser@mc2 ~]$ squeue -l
Wed Jan 22 15:39:52 2014
JOBID PARTITION     NAME     USER    STATE    TIME   TIMELIMIT  NODES MIDPLANELIST(REASON)
  890  mc2-part ademlt3e    tpxyi  RUNNING 2:42:50  1-00:00:00      2 bgq0000[00000x00001]
  892  mc2-part 2h2o_2co ghahuazn  RUNNING 1:01:33  1-00:00:00     64 bgq0000[10000x13330]
[joeuser@mc2 ~]$ sinfo -l
Wed Jan 22 15:40:15 2014
PARTITION AVAIL  TIMELIMIT   JOB_SIZE ROOT SHARE GROUPS NODES     STATE MIDPLANELIST
mc2-part*    up 1-00:00:00 1-infinite   no FORCE    all    66 allocated bgq0000
mc2-part*    up 1-00:00:00 1-infinite   no FORCE    all   446      idle bgq0000
[joeuser@mio001 utility]$ ./expands
./expands
Options:
    Without arguments
        If SLURM_NODELIST is defined, use it to find the node list as described below.
        Note: SLURM_NODELIST is defined by slurm when running a parallel job, so this
        command is really only useful inside a batch script or when running interactive
        parallel jobs.
    -h  Show this help
Usage:
    Takes an optional single command line argument, the environmental variable
    SLURM_NODELIST defined within a slurm job. SLURM_NODELIST is a compressed list
    of nodes assigned to a job. This command returns an expanded list similar to
    what is defined in the PBS_NODES_FILE under pbs.
Example:
    [joeuser@mio001 utility]$ printenv SLURM_NODELIST
    compute[004-005]
    [joeuser@mio001 utility]$ ./expands $SLURM_NODELIST
    compute004
    compute004
    compute004
    compute004
    compute005
    compute005
    compute005
    compute005
[joeuser@mio001 utility]$
[joeuser@mio001 utility]$ slurmnodes -h
/opt/utility/slurmnodes
Options:
    Without arguments   Show full information for all nodes
    -fATTRIBUTE         Show only the given attribute; without ATTRIBUTE, just list the nodes
    list of nodes       Show information for the given nodes
    -h                  Show this help
Author: Timothy H. Kaiser, Ph.D.  February 2014
[joeuser@mio001 utility]$
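Similar per-node details are also available from the standard scontrol command on any Slurm system, for example:

[joeuser@mio001 utility]$ scontrol show node compute004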
[joeuser@mio001 utility]$ slurmjobs -h
/opt/utility/slurmjobs
Options:
    Without arguments       Show full information for all jobs
    -fATTRIBUTE             Show only the given attribute; without ATTRIBUTE, just list the jobs and users
    -uUSERNAME or USERNAME  Show only jobs for USERNAME
    list of jobs to show    Show information for the given jobs
    -h                      Show this help
Author: Timothy H. Kaiser, Ph.D.  February 2014
[joeuser@mio001 utility]$
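Likewise, full details for a single job can be obtained with the standard scontrol command (the job ID here is illustrative):

[joeuser@mio001 utility]$ scontrol show job 890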
phostname, run without options, simply lists the nodes assigned to your srun command.
[joeuser@mio001 hello_bgq]$ srun --nodes=2 --ntasks-per-node=2 /opt/utility/phostname
compute003
compute002
compute002
compute003
phostname run with the -f option also shows the node assigned to each MPI task and the OpenMP threads for each MPI task.
[joeuser@mio001 hello_bgq]$ export OMP_NUM_THREADS=2
[joeuser@mio001 hello_bgq]$ cd ~
[joeuser@mio001 ~]$ srun --nodes=2 --ntasks-per-node=2 /opt/utility/phostname -f > nlist
[joeuser@mio001 ~]$ sort nlist
compute002         0    0
compute002         0    1
compute002         1    0
compute002         1    1
compute003         2    0
compute003         2    1
compute003         3    0
compute003         3    1
[joeuser@mio001 ~]$
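The same task and thread mapping can also be captured from a batch job; a minimal sketch, with the resource values and output file chosen only for illustration:

#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --time=00:05:00
export OMP_NUM_THREADS=2
srun /opt/utility/phostname -f > nlist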