AlphaFold 3 GPU Out-of-Memory Error During Inference
When you run AlphaFold 3, you may encounter GPU out-of-memory (OOM) failures during model execution. The job terminates with errors similar to:
Trying to download a directory through code server results in the error "ondemand.osc.edu cannot open this folder because it contains system files"
To work around this, tar the directory first, download the resulting .tar file, and untar it on your local machine.
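The workaround can be sketched in Python using the standard-library tarfile module, for cases where a script is more convenient than the tar command. This is a minimal sketch, not an official procedure: the function names pack_directory and unpack_archive and the example paths are hypothetical and chosen only for illustration.

```python
import tarfile
from pathlib import Path


def pack_directory(src_dir: str, archive_path: str) -> str:
    """Pack src_dir into an uncompressed .tar archive and return the archive path."""
    src = Path(src_dir)
    if not src.is_dir():
        raise NotADirectoryError(f"{src} is not a directory")
    with tarfile.open(archive_path, "w") as tar:
        # arcname stores only the directory's own name, not its full path
        tar.add(src, arcname=src.name)
    return archive_path


def unpack_archive(archive_path: str, dest_dir: str) -> None:
    """Extract a .tar archive into dest_dir (run this on your local machine)."""
    with tarfile.open(archive_path, "r") as tar:
        tar.extractall(dest_dir)


# Hypothetical usage: on the cluster, create the archive, then download the
# single .tar file through code server and unpack it locally.
# pack_directory("my_results", "my_results.tar")
# unpack_archive("my_results.tar", ".")
```

Downloading one archive file avoids the folder-level download that triggers the error.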
Applications such as Quantum ESPRESSO, LAMMPS, and NWChem experienced hangs with MVAPICH 3.0 due to a PMI mismatch. MVAPICH 3.0 was built with PMI-1, while newer Slurm versions on RHEL 9 use PMI-2. Although the development team states that using the PMI-1 interface with Slurm’s PMI-2 implementation should work, there may be a bug in MVAPICH 3.0.
We are currently testing MVAPICH 4.1 and plan to migrate the software stack associated with MVAPICH 3.0 to MVAPICH 4.1 in the coming weeks.
The AUSURF112 benchmark for quantum-espresso/7.4.1 aborts on Ascend. We suspect that this is a latent bug in Quantum ESPRESSO and are reporting it as a convenience. Concerned users can use Cardinal or Pitzer as a workaround.
Update on November 24, 2025: The issue is resolved with MVAPICH 4 variants.
When you first go to the ouollama page, it may display a broken image and hang. If this happens, please refresh your browser using the refresh button. Re-entering the URL may not work.
We have observed that several applications built with MVAPICH2, including Quantum ESPRESSO 7.4.1, HDF5, and OpenFOAM, may experience poor performance on Cardinal. We suspect this issue could be related to the newer network devices or drivers. Since MVAPICH2 is no longer supported, we recommend switching to MVAPICH 3.0 or another MPI implementation to ensure continued performance and stability in your work.
Mathematica was not launching due to firewall issues. Our Mathematica instances point to an external Mathematica server, which had stopped working. The issue was resolved as of Sep 24, 2025.
While running AlphaFold 3, you may encounter the following warning message:
After the downtime on August 19, 2025, users may encounter UCX errors such as:
UCX ERROR no active messages transport to <no debug data>: self/memory - Destination is unreachable
when running a multi-node job with intel-oneapi-mpi/2021.10.0, mvapich/3.0 or openmpi/5.0.2.
You may encounter missing functionality in MATLAB (server), such as Simulink not being available. This occurs when using MATLAB version r2024a. Please use MATLAB r2024b instead.
If issues persist with MATLAB r2024b, please open a ticket.