System Status

Follow @HPCNotices on X for real-time updates about the status of OSC's clusters and storage.

For information about the status of licenses for the software packages installed at OSC, visit the License Server Status Updates page.

View unresolved known issues and current system-related status messages below.

If you have encountered a problem not already listed below, please contact our client support team.

Scheduled OSC System Downtimes

OSC strives to minimize the number of system downtimes and to give clients significant advance notice, but reserves the right to change the date(s) of these downtimes and/or add downtimes to improve the performance of our services.

Tuesday, March 11, 2025

Active System Messages

Early User Program of Next Gen Ascend Starts 03/03/2025

The early user program for the next-generation resources now available on the Ascend cluster begins today, March 3, and will run through Monday, March 31. Check this page for more information: https://www.osc.edu/content/next_gen_ascend_1/next_gen_ascend_early_user_program

System Downtime March 11, 2025

A downtime for OSC HPC systems is scheduled from 7 a.m. to 9 p.m. on Tuesday, March 11, 2025. The downtime will affect the Pitzer, Cardinal and Ascend clusters, web portals, statewide licenses and HPC file servers. MyOSC (the client portal) will be available during the downtime. In preparation for the downtime, the batch scheduler will not start jobs that cannot be completed before 7 a.m. on March 11. Jobs that are not started will be held until after the downtime and then started once the systems are returned to production status.
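
To estimate whether a job is likely to start before the downtime, the rule described above can be sketched in a few lines of Python. This is only an illustration, not the batch scheduler's actual logic: the function names are hypothetical, times are treated as local wall-clock times, and a job ending exactly at 7 a.m. is assumed to count as completing in time.

```python
from datetime import datetime, timedelta

# Downtime start as stated in the notice: 7 a.m., March 11, 2025 (local time).
DOWNTIME_START = datetime(2025, 3, 11, 7, 0)


def fits_before_downtime(start: datetime, walltime: timedelta,
                         downtime_start: datetime = DOWNTIME_START) -> bool:
    """Return True if a job starting at `start` with the requested
    walltime would end no later than the downtime start.

    Assumption: ending exactly at the downtime start still counts.
    """
    return start + walltime <= downtime_start


def max_walltime(start: datetime,
                 downtime_start: datetime = DOWNTIME_START) -> timedelta:
    """Longest walltime that still fits before the downtime
    (zero if the downtime has already begun)."""
    return max(downtime_start - start, timedelta(0))


if __name__ == "__main__":
    submit = datetime(2025, 3, 10, 19, 0)  # example start time
    print(fits_before_downtime(submit, timedelta(hours=24)))
    print(max_walltime(submit))
```

Under these assumptions, requesting a walltime no longer than the value returned by `max_walltime` gives a job a chance to run before the downtime; a longer request would be held until the systems return to production.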

Change is Coming!

Later this fall, OSC will introduce a major change to the process clients use to get help. We will roll out a new web-based system for clients to manage and review open cases with the service desk. More information will follow over the next few months, so watch for messages on OSC communication channels, as well as at https://www.osc.edu/service_portal_deployment .

Known Issues (unresolved)

A list of all known issues, including those that have been resolved, can be found here.

Title: LS-DYNA mpp-dyna Cardinal: Remote access error on mlx5_0:1, RDMA_READ
Category: Cardinal, Software
Description: You may encounter the following error while running mpp-dyna jobs with multiple nodes:

[c0054:22206:0:22206] ib_mlx5_log.c:179  Remote access error on mlx5_0:1/IB (synd 0x13 vend 0x88...

Posted: 1 month 1 week ago
Updated: 1 month 1 week ago