Job failing

Why does my job fail after only a couple seconds?

It's likely that there is an error in your job submission script. Check to make sure it is trying to cd to the correct directory, load any necessary modules, etc. Make sure your submission script has a blank line at the end. Staff can assist if requested; please email help@msi.umn.edu and let us know your username, which machine you are using, and the path to your submission script.
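A few of these problems can be caught before you submit. The following is a minimal sketch, not an MSI-provided tool: the function name check_job_script is hypothetical, and it only checks that the file exists, ends with a newline, and parses as valid bash.

```shell
# Hypothetical pre-submission check; the function name is illustrative only.
check_job_script() {
    local script="$1"
    if [ ! -f "$script" ]; then
        echo "not found: $script"
        return 1
    fi
    # Command substitution strips a trailing newline, so a non-empty
    # result means the last byte of the file is not a newline.
    if [ -n "$(tail -c 1 "$script")" ]; then
        echo "missing newline at end of file: $script"
        return 1
    fi
    # bash -n parses the script without running any of it.
    if ! bash -n "$script"; then
        echo "syntax error in: $script"
        return 1
    fi
    echo "ok: $script"
}
```

Running check_job_script myjob.sh (with myjob.sh standing in for your own script) prints a one-line result. A check like this cannot tell whether the script changes to the correct directory or loads the right modules, so those still need to be reviewed by hand.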

Putting the following lines in your submission script will cause the system to send you emails about the status of your jobs to your UMN email address (replace YourInternetID with your UMN username):

#SBATCH --mail-type=END,FAIL     # Mail events (can use any combination of the following: ALL, NONE, BEGIN, END, FAIL)
#SBATCH --mail-user=YourInternetID@umn.edu

This does not need to be your UMN email address; you can also have these emails sent to an alternative address, such as a Gmail account, with lines such as:

#SBATCH --mail-type=END,FAIL     # Mail events (can use any combination of the following: ALL, NONE, BEGIN, END, FAIL)
#SBATCH --mail-user=YourEmail@gmail.com
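
For context, here is a minimal sketch of where these lines sit in a complete submission script. Slurm reads #SBATCH lines only up to the first executable command, so they belong at the top of the file. The time and task values below are placeholder assumptions, not MSI recommendations.

```shell
#!/bin/bash -l
#SBATCH --time=00:10:00          # placeholder walltime (assumption)
#SBATCH --ntasks=1               # placeholder task count (assumption)
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=YourInternetID@umn.edu

# Move to the directory the job was submitted from; stop if that fails.
# SLURM_SUBMIT_DIR is set by Slurm; outside a job this falls back to $PWD.
cd "${SLURM_SUBMIT_DIR:-$PWD}" || exit 1

# SLURM_JOB_ID is set by Slurm inside a running job.
msg="Job ${SLURM_JOB_ID:-unknown} started in $(pwd)"
echo "$msg"
```

Stopping as soon as the cd fails keeps the rest of the script from running in the wrong directory, which is one of the common causes of a job that ends within seconds.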

The emails that the system sends you will sometimes contain information regarding the nature of an error your job is experiencing. The emails will be sent to the email address you list in your job script.
 
If your job experiences an error, its output files may contain information about the cause. By default, Slurm writes both standard output and standard error to a single file named slurm- followed by the job ID number and ending in .out, created in the directory from which you submitted the job. If you name separate error and output files as described below, check the error file first; the standard output file may also be examined to determine whether a job has executed properly.
 
You can use the following lines in your job script to send standard error and standard output to separate files with names of your choosing. Slurm replaces %j in a file name with the job ID number, which makes the files easier to find; error.%j and output.%j can be replaced with any file names you like.
 
Error file:    #SBATCH -e error.%j
Output file:    #SBATCH -o output.%j
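
Once a job has finished, looking at the end of its error file is often the quickest way to see what went wrong. The following is a minimal sketch that assumes the error file was named with #SBATCH -e error.%j (Slurm replaces %j with the job ID) and that you run it from the directory containing that file. The function name show_job_errors is hypothetical.

```shell
# Hypothetical helper: show the tail of a job's error file, if it has content.
# Assumes the error file is named error.<jobID> in the current directory.
show_job_errors() {
    local jobid="$1"
    local errfile="error.${jobid}"
    if [ ! -e "$errfile" ]; then
        echo "No error file found: $errfile"
        return 1
    fi
    # -s is true when the file exists and is not empty.
    if [ -s "$errfile" ]; then
        echo "Last lines of $errfile:"
        tail -n 20 "$errfile"
    else
        echo "$errfile is empty; check output.${jobid} instead"
    fi
}
```

For example, show_job_errors 1234567 would print the last 20 lines of error.1234567, where 1234567 stands in for your own job ID.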
 
The messages in your job output files and job status emails may help you or MSI staff determine the source of a job error, if one exists.

If it is determined that an error exists in code which you have written, MSI has debugging software resources available which may help you locate and fix the error.
