Say you dispatched thousands of jobs with Slurm, but goofed something up and want to cancel some of them.

  • If you want to cancel all of your jobs, use scancel -u username, where username is your system username (e.g., mine is jharri62).

  • Often you may want to be selective and keep some jobs running while cancelling others. This situation can be handled with the script in the step-by-step guide below (source: cas on Stack Exchange).

For more see: https://www.rc.fas.harvard.edu/resources/documentation/convenient-slurm-commands/
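
For simple cases, scancel's own filters may already be selective enough. The sketch below assumes you want to cancel only pending jobs, or only jobs with a particular name. Here myjob is a placeholder job name, and run is a hypothetical helper (not part of Slurm) that only prints each command unless DRY_RUN=0 is set, so nothing gets cancelled by accident.

```shell
#!/bin/bash
# Sketch: selective cancellation using scancel's own filters.
# run() is a hypothetical helper: it prints the command unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    echo "would run: $*"
  fi
}

me="$(id -u -n)"   # your own username

# Cancel only the jobs that are still waiting in the queue.
run scancel --user="$me" --state=PENDING

# Cancel only the jobs submitted under a given name ("myjob" is a placeholder).
run scancel --user="$me" --name=myjob
```

Once the printed commands look right, set DRY_RUN=0 in your shell before running the snippet again to actually cancel the jobs.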

Step-by-step guide

  1. Make a file to house the program. Put it somewhere convenient, like in your home directory, or in a directory of common scripts in your home directory. One way to make a file is nano yourfilename.sh

  2. Open the file with nano or another editor and paste the following into it:

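    #!/bin/bash
    # Cancel all of your own Slurm jobs whose ID is greater than the given number.
    
    declare -a jobs=()
    if [ -z "$1" ] ; then
      echo "Minimum Job Number argument is required. Run as '$0 jobnum'" >&2
      exit 1
    fi
    minjobnum="$1"
    myself="$(id -u -n)"   # your own username, so only your jobs are listed
    # Collect every job ID strictly greater than the threshold.
    for j in $(squeue --user="$myself" --noheader --format=%i) ; do
      if [ "$j" -gt "$minjobnum" ] ; then
        jobs+=("$j")
      fi
    done
    # Call scancel only if something matched; it complains when given no job IDs.
    if [ "${#jobs[@]}" -eq 0 ] ; then
      echo "No jobs with an ID greater than $minjobnum were found."
      exit 0
    fi
    scancel "${jobs[@]}"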
  3. Make the file executable: chmod u+x yourfilename.sh

  4. Usage is: ./yourfilename.sh 300000 (run from the directory that holds the script), where 300000 is the job ID that marks where the jobs you want to keep stop. In other words, any of your jobs with an ID less than or equal to this number will be retained, and jobs with larger IDs will be cancelled. This will not mess with anybody else's jobs, because the script only lists jobs that belong to your own username. If the script's directory is in your PATH, you can drop the ./ prefix. Alternatively, run it as: bash yourfilename.sh 300000
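
If you would rather see which jobs would be hit before cancelling anything, the ID filter can be pulled out into a small function and tried on its own. This is only a sketch: ids_above is a hypothetical name, and it assumes that job-array tasks show up in squeue as base_index (for example 300005_2) and should be compared by their base ID.

```shell
#!/bin/bash
# Sketch: print the job IDs from stdin that are greater than a threshold.
# ids_above is a hypothetical helper, not part of Slurm.
ids_above() {
  local min="$1" id base
  while read -r id; do
    base="${id%%_*}"            # "300005_2" -> "300005" (assumed job-array form)
    case "$base" in
      ''|*[!0-9]*) continue ;;  # skip anything that is not a plain number
    esac
    if [ "$base" -gt "$min" ]; then
      echo "$id"
    fi
  done
}
```

For a dry run, pipe the same squeue call used in the script into it: squeue --user="$(id -u -n)" --noheader --format=%i | ids_above 300000. The IDs it prints are the ones the threshold would select.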



