Due to a critical security vulnerability, we need to reboot public-facing systems to deploy a mitigation against the vulnerability.

Known issues

Unresolved known issues

A known issue with an Unresolved Resolution state is an active problem under investigation; a temporary workaround may be available.

Resolved known issues

A known issue with a Resolved (workaround) Resolution state is an ongoing problem; a permanent workaround is available which may include using different software or hardware.

A known issue with a Resolved Resolution state has been corrected.

Known Issues

Title: Apptainer container builds fail on compute nodes due to /tmp namespace behavior
Category: Software
Resolution: Resolved (workaround)
Description: Building Apptainer containers on compute nodes may fail during apt update or other package operations...
Posted: 3 months 3 weeks ago
Updated: 2 months 3 weeks ago

Title: GPU/VIS nodes for various OOD apps are broken
Category: Cardinal, OnDemand, Outage, Software
Resolution: Resolved
Description: After the most recent downtime we discovered that various OOD apps relying on the "virtualgl" module on Cardinal were broken. We have since updated and pinned the latest virtualgl version...
Posted: 3 months 2 days ago
Updated: 2 months 3 weeks ago

Title: Handling full-node MPI warnings with MVAPICH 3.0
Category: Ascend, Cardinal, Pitzer, Software
Resolution: Resolved (workaround)
Description: When running a full-node MPI job with MVAPICH 3.0, you may encounter the following warning message: [][mvp_generate_implicit_cpu_mapping] WARNING: You appear to be...
Posted: 1 year 6 months ago
Updated: 2 months 3 weeks ago

Title: MVAPICH 3.0 hang due to PMI mismatch with Slurm
Category: Software
Resolution: Resolved (workaround)
Description: Applications such as Quantum ESPRESSO, LAMMPS, and NWChem experienced hangs with MVAPICH 3.0 due to a PMI mismatch. MVAPICH 3.0 was built with PMI-1, while newer Slurm versions on RHEL 9...
Posted: 6 months 4 weeks ago
Updated: 4 months 2 weeks ago

Title: Python version mismatch in Jupyter + Spark instance
Category: Software
Resolution: Resolved (workaround)
Description: You may encounter the following error message when running a Spark instance using a custom kernel in the Jupyter + Spark app: 25/04/25 10:49:01 WARN TaskSetManager:...
Posted: 11 months 3 weeks ago
Updated: 4 months 3 weeks ago

Title: AlphaFold 3 Warning: NVIDIA Driver Version Mismatch
Category: Software
Resolution: Resolved
Description: While running AlphaFold 3, you may encounter the following warning message: 025-08-27 12:44:10.888652: W external/xla/xla/service/gpu/nvptx_compiler.cc...
Posted: 8 months 1 day ago
Updated: 5 months 1 day ago

Title: AlphaFold 3 GPU Out-of-Memory Error During Inference
Category: Software
Resolution: Resolved (workaround)
Description: When you run AlphaFold 3, you may encounter GPU out-of-memory (OOM) failures during model execution. The job terminates with errors similar to: Can't...
Posted: 5 months 1 week ago
Updated: 5 months 1 week ago

Title: Testing Issue for Quantum Espresso 7.4.1 on Ascend
Category: Ascend, Software
Resolution: Resolved
Description: Benchmark AUSURF112 for quantum-espresso/7.4.1 on Ascend aborts. We suspect that this is a lurking bug in Quantum Espresso and are reporting it as a convenience. Concerned users can use...
Posted: 7 months 1 week ago
Updated: 5 months 1 week ago

Title: Performance issues with MVAPICH2 on Cardinal
Category: Cardinal, Software
Resolution: Resolved (workaround)
Description: We have observed that several applications built with MVAPICH2, including Quantum ESPRESSO 7.4.1, HDF5, and OpenFOAM...
Posted: 7 months 3 weeks ago
Updated: 5 months 1 week ago

Title: CP package hang with MVAPICH3 in Quantum-Espresso
Category: Software
Resolution: Resolved
Description: While using MVAPICH3 builds of Quantum ESPRESSO (QE), users may encounter hangs when running the CP package, which can lead to job timeouts. We recommend switching to the OpenMPI build of...
Posted: 10 months 1 week ago
Updated: 5 months 1 week ago
