System Downtime

Date: 
Tuesday, September 15, 2015 - 7:00am to 5:00pm

DOWNTIME UPDATES:

Tues, Sept 15th 5PM: We are experiencing issues with the downtime.  Systems will not be up at 5PM.  We plan to be operational later today.

Tues, Sept 15th 8PM: Oakley is back online and has resumed running jobs.  Ruby has resumed running jobs but remains unavailable for login.  Glenn and web services (OnDemand, AweSim) are experiencing networking issues and will not be brought back online tonight.  Work on resolving these issues will resume tomorrow.  We appreciate your patience.

Wed, Sept 16th, 9AM: Oakley and Ruby are both online and fully operational.  The issues with Glenn and the web services (OnDemand, AweSim) are still being worked on.

Wed, Sept 16th, 12PM: All systems are up and running jobs.  Minor issues are still being sorted out.  OnDemand and AweSim are experiencing issues with Oakley-related services.  Please report additional issues to oschelp@osc.edu.  Thank you for your patience.

 
The Ohio Supercomputer Center has scheduled a downtime for all HPC systems on Tuesday, September 15th, 2015, beginning at 7 a.m. and scheduled to finish by 5 p.m. The downtime will affect the Glenn, Oakley, and Ruby clusters, web portals (including OnDemand and AweSim), and HPC file servers. Login services and access to storage will not be available during this time.
 
In order to quiesce the system for an orderly shutdown, the batch scheduler will begin holding jobs that cannot complete before 6 a.m. on September 15th.  Jobs that have not started will be held until after the downtime and then started once the system is returned to production status.
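
As a rough illustration of the cutoff described above, the following Python sketch checks whether a job's requested walltime would carry it past 6 a.m. on September 15th. It is not the scheduler's actual logic: the function names, the HH:MM:SS walltime format, and the treatment of a job ending exactly at the cutoff as held are all assumptions made for this example.

```python
from datetime import datetime, timedelta

# Cutoff stated in this notice: 6 a.m. on September 15th, 2015 (local time).
DOWNTIME_CUTOFF = datetime(2015, 9, 15, 6, 0)


def parse_walltime(walltime: str) -> timedelta:
    """Convert a walltime request in HH:MM:SS form into a timedelta.

    The HH:MM:SS format is an assumption made for this sketch.
    """
    hours, minutes, seconds = (int(part) for part in walltime.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def will_be_held(start: datetime, walltime: str) -> bool:
    """Return True if a job starting at `start` cannot finish before the cutoff.

    Hypothetical helper: a job whose requested walltime ends exactly at the
    cutoff is treated as held, since it would not complete *before* 6 a.m.
    """
    return start + parse_walltime(walltime) >= DOWNTIME_CUTOFF


if __name__ == "__main__":
    submit_time = datetime(2015, 9, 14, 20, 0)  # 8 p.m. the evening before
    for request in ("08:00:00", "12:00:00"):
        status = "held" if will_be_held(submit_time, request) else "can run"
        print(request, status)
```

Under these assumptions, a job that would start at 8 p.m. on September 14th could still run with an 8-hour walltime request, while a 12-hour request would be held until after the downtime. Requesting a shorter walltime, where the work allows it, is one way to fit a job in before the cutoff.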
 
Departmental clusters that we are administering will not be affected by this outage.
 
Important updates will be made to the software stack on both Oakley and Ruby, including changes to the default modules.  Users who have built their own software may need to rebuild it or explicitly load the compiler and MPI modules that were used to build it.
 
 
Other highlights of the downtime activities:
 
  • Routine OS patches, updates, firmware updates 
  • Maintenance on Parallel File System (Lustre) 
  • Batch server software upgrades and configuration changes
  • Network upgrades for data transfer nodes (scp.osc.edu, sftp.osc.edu, rsync.osc.edu)
 
To stay up to date on system notices, please visit http://osc.edu/n
 
Jobs that cannot complete before 6 a.m. on September 15th will be held until after the downtime.  This is a change from the previously listed 7 a.m. cutoff time.