We will have rolling reboots of all HPC clusters (Ascend, Cardinal, Owens, and Pitzer cluster), including login and compute nodes, starti

Slurm to be Upgraded to Version 23.11.4

Resolution: Resolved

Updates on 04/08/2024:

The rolling reboots are complete.

Updates:

We will perform rolling reboots for this upgrade on March 27, starting at 9 AM.

OSC is preparing to update Slurm on its production systems to version 23.11.4 ahead of the deployment of the new Cardinal cluster. This version includes a number of improvements, but it also has a known regression: if a job requests both a total number of tasks (--ntasks=N) and a number of tasks per node (--ntasks-per-node=n), the number of tasks per node takes precedence. This behavior is contrary to what is stated in the Slurm documentation.
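
One way to spot affected job scripts is to scan their #SBATCH lines for both options. The following is a minimal Python sketch, not an OSC-provided tool: the function name requests_both and the sample script are hypothetical, and the check assumes options are given as --ntasks=N, --ntasks N, or the short form -n N.

```python
import re

# Matches --ntasks=N, --ntasks N, -n N, or -nN, but not --ntasks-per-node.
NTASKS = re.compile(r"(?:^|\s)(?:--ntasks(?:=|\s+)|-n\s*)\d+")
NTASKS_PER_NODE = re.compile(r"(?:^|\s)--ntasks-per-node(?:=|\s+)\d+")


def requests_both(script_text: str) -> bool:
    """Return True if the #SBATCH lines set both --ntasks and --ntasks-per-node."""
    has_ntasks = False
    has_per_node = False
    for line in script_text.splitlines():
        stripped = line.strip()
        # Only batch directives are inspected; other lines are skipped.
        if not stripped.startswith("#SBATCH"):
            continue
        options = stripped[len("#SBATCH"):]
        has_ntasks = has_ntasks or bool(NTASKS.search(options))
        has_per_node = has_per_node or bool(NTASKS_PER_NODE.search(options))
    return has_ntasks and has_per_node


# Hypothetical job script that requests both options.
example = """#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks=8
#SBATCH --ntasks-per-node=4
srun ./my_program
"""

if __name__ == "__main__":
    print(requests_both(example))
```

This sketch only looks at #SBATCH directives inside the script; options passed on the sbatch command line would need to be checked separately.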

OSC users are strongly encouraged to review their job scripts for jobs that request both --ntasks and --ntasks-per-node. Jobs should request either --ntasks or --ntasks-per-node, not both. To learn more about how to properly prepare your job script, please refer to the following documentation: