Turing-Innovation-Catalyst-Collab/RONIN-Jobscript

RONIN-Jobscript

When running a Python script on a remote cloud machine (e.g. AWS), you may wish to shut down the machine when the script finishes to limit costs. This repo provides a shell script that shuts the machine down automatically once your Python job has finished. The result is a bit like running a job on an HPC system, but without the queue.

Overview

This repo has a shell script run_job.sh which implements this workflow:

  1. Define maximum time before shutdown

  2. Run script.py

  3. If script.py finishes before the maximum time, shut down the instance

  4. Otherwise, wait until the maximum shutdown time. Once that is reached, kill script.py and shut down the instance.
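The four steps above can be sketched in Python for illustration. This is not the contents of run_job.sh (which is a shell script and is not reproduced here); the function name `run_with_time_limit`, the 30-second grace period, and the `sudo shutdown -h now` command are all assumptions made for this example.

```python
import subprocess
import sys


def run_with_time_limit(cmd, max_seconds, shutdown=False, grace_seconds=30):
    """Run cmd, stop it if it exceeds max_seconds, then optionally power off.

    Hypothetical helper: returns the child's exit code, which on POSIX is
    negative when the child was ended by a signal.
    """
    proc = subprocess.Popen(cmd)
    try:
        # Steps 2-3: wait for the job to finish within the time limit.
        proc.wait(timeout=max_seconds)
    except subprocess.TimeoutExpired:
        # Step 4: time limit reached, so ask the job to stop (SIGTERM on POSIX).
        proc.terminate()
        try:
            # Give any cleanup handler in the job a moment to run.
            proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            # The job ignored the request, so force it to stop.
            proc.kill()
            proc.wait()
    if shutdown:
        # Assumption: sudo access and a standard `shutdown` binary are available.
        subprocess.run(["sudo", "shutdown", "-h", "now"], check=False)
    return proc.returncode
```

For example, `run_with_time_limit([sys.executable, "script.py"], max_seconds=3600)` would run the job for at most one hour and, because `shutdown` defaults to `False` in this sketch, leave the machine running afterwards.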

Requirements

This script will only work on an Ubuntu-based system. You must run from a terminal with sudo access.

Setup instructions

  1. Create your python script, environment etc. as normal

  2. Copy run_job.sh into the root folder of your python project/repo on your remote machine

  3. Modify run_job.sh with the path of your python script by setting the PYTHON_SCRIPT variable at the top

  4. You may wish to implement some code that will run if your job reaches the time limit (see "Graceful time limit shutdown" below).

Usage instructions

Once you have finished the setup instructions, run your script using run_job.sh. You can add the following command line flags:

| Flag | Argument | Default | Description |
| --- | --- | --- | --- |
| --no-shutdown | None | (Shutdown enabled) | Disables automatic system shutdown |
| --timeout | <duration> | 1h | Sets maximum runtime before timeout |
| --script | <path> | script.py | Path to Python script to execute |
| --job-name | <name> | my_python_job | Custom name for the job and log file |

Examples:

Running normally:

./run_job.sh

Testing without shutdown:

./run_job.sh --no-shutdown

Setting time limit to 45min:

./run_job.sh --timeout 45m

Running a different python script with a time limit of 12 hours:

./run_job.sh --timeout 12h --script anotherscript.py --job-name another_job

Testing

This script has been designed for and tested on RONIN using Ubuntu instances. It successfully shuts down the instance, and the RONIN interface shows the instance as shut down once you click refresh.

Graceful time limit shutdown

If your Python script reaches the time limit that you set, it will be killed. When this happens, the function handle_sigterm(signum, frame) in the demo script.py runs before the machine shuts down. Put anything you would like to output before a time limit shutdown in this function, such as saving data or writing debugging info. To implement this in your code:

  1. Add in the relevant imports sys and signal

  2. Copy the handle_sigterm function from script.py into your project and add any code you would like to run in the event your script reaches the time limit

  3. Add this line to the start of your script where your main code is (e.g. in your main() function): signal.signal(signal.SIGTERM, handle_sigterm)
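A minimal sketch of what the three steps above might look like once combined. The handler body here is a placeholder rather than a copy of the one in the demo script.py, and the exit code of 1 and the `steps` parameter are assumptions made for illustration.

```python
import signal
import sys
import time


def handle_sigterm(signum, frame):
    """Called when the process receives SIGTERM, e.g. at the time limit."""
    # Placeholder cleanup: replace with saving data, writing debug info, etc.
    print("Time limit reached: saving state before exit", flush=True)
    # Assumption: exit with a non-zero code to mark the run as incomplete.
    sys.exit(1)


def main(steps=10):
    # Register the handler before the long-running work starts.
    signal.signal(signal.SIGTERM, handle_sigterm)
    for _ in range(steps):
        time.sleep(1)  # stand-in for the real work of your job
```

Calling `main()` from your script's entry point registers the handler and starts the work. Registering it at the top of `main()` means a SIGTERM that arrives at any later point in the job triggers the cleanup code.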

About

Jobscript to submit jobs on RONIN and shut down afterwards
