Owens

POSSIBLE OWENS OUTAGE

We are experiencing an issue with the Ethernet switches in the Owens cluster that could kill running jobs on Owens. We are monitoring the issue closely and reserving nodes with switch errors for emergency maintenance. Based on our monitoring, no running job has been killed by this issue so far. The Oakley and Ruby clusters are not affected. An Owens outage may be required to apply the permanent fix. We will provide updates as we learn more from the vendor. We apologize for any inconvenience this may cause.

ROLLING REBOOT OF ALL THREE CLUSTERS STARTING SEPT. 4

The Owens, Ruby, and Oakley clusters will receive OS kernel and BIOS updates to address the latest Intel processor security vulnerabilities. As part of these updates, the OS distribution on Ruby and Oakley must be updated from Red Hat version 6.9 to 6.10. These changes will be applied via rolling reboots of compute nodes. Owens and Ruby will also have a software refresh at this time. For the latest information on the software refresh, please see https://bit.ly/2PANrtZ
