OSC provides hardware and software services to support the AI, ML, and data analytics needs of our clients. Data-intensive workloads are supported by AI services and software, interactive web-based access through OnDemand, and GPU and parallel computing resources.
Cores and GPUs
OSC currently operates three HPC systems: the Cardinal, Ascend, and Pitzer clusters. These systems provide a mix of x86 CPU cores and NVIDIA GPU devices in a range of configurations. You can find details on available compute resources on the Cluster Computing page.
CPU configurations are available with up to 3 TB of memory to support large-scale data analyses. A summary of available GPU configurations is provided below:
- NVIDIA H100s with 96 GB device memory and NVLink (four per node)
- NVIDIA A100s with 80 GB device memory and NVLink (four per node)
- NVIDIA A100s with 40 GB device memory (two per node)
- NVIDIA Volta V100s with 32 GB device memory and NVLink (four per node)
- NVIDIA Volta V100s with 16 or 32 GB device memory (two per node)
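As a rough illustration of what these configurations mean for model sizing, the sketch below tabulates the aggregate device memory per node from the list above and checks which configurations could hold a model's weights on a single node. The names `GPU_CONFIGS`, `node_memory_gb`, `weights_size_gb`, and `configs_that_fit` are hypothetical. The weights-only estimate (parameter count times bytes per parameter) is a simplifying assumption that ignores activations, optimizer state, and framework overhead, so real requirements will be higher.

```python
# GPU configurations as listed above: (label, GB per device, devices per node).
# Assumption: the two-per-node V100 entry uses the smaller 16 GB option.
GPU_CONFIGS = [
    ("H100 NVLink", 96, 4),
    ("A100 NVLink", 80, 4),
    ("A100", 40, 2),
    ("V100 NVLink", 32, 4),
    ("V100", 16, 2),
]


def node_memory_gb(gb_per_device, devices_per_node):
    """Aggregate GPU device memory on one node, in GB."""
    return gb_per_device * devices_per_node


def weights_size_gb(n_params, bytes_per_param=2):
    """Weights-only size in GB (10^9 bytes).

    The default of 2 bytes per parameter assumes 16-bit weights (FP16/BF16).
    """
    return n_params * bytes_per_param / 1e9


def configs_that_fit(n_params, bytes_per_param=2):
    """Labels of configurations whose per-node memory covers the weights."""
    needed = weights_size_gb(n_params, bytes_per_param)
    return [
        label
        for label, gb, count in GPU_CONFIGS
        if node_memory_gb(gb, count) >= needed
    ]


if __name__ == "__main__":
    for label, gb, count in GPU_CONFIGS:
        print(f"{label}: {node_memory_gb(gb, count)} GB per node")
    # Hypothetical 70-billion-parameter model stored in 16-bit precision.
    print(configs_that_fit(70_000_000_000))
```

Treat the result as a lower bound for planning only; profiling an actual job is the reliable way to size a request.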
Data Transfer and Storage
Ohio researchers have access to many research data storage options at OSC. OSC has over 14 petabytes (PB) of disk storage capacity distributed over several file systems, plus more than 14 PB of available backup tape storage. Additional storage capacity is expected to be available in November 2025.
There are multiple options for transferring data to OSC resources. Using our web platform, OnDemand, users can transfer smaller files (<10 GB) with simple drag and drop. For a more comprehensive solution, Globus can be used to connect securely to a wide range of data sources, including AWS S3, MS OneDrive, Google Drive, and Dropbox. See our guidance on the use of Globus at OSC. OSC also supports the SSH File Transfer Protocol (SFTP) and corresponding tools.
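The size guideline above can be expressed as a small helper. This is a minimal sketch, not an OSC tool: the names `ONDEMAND_LIMIT_BYTES`, `total_size_bytes`, and `suggest_transfer_method` are hypothetical, and interpreting "10 GB" as 10 × 10^9 bytes is an assumption.

```python
from pathlib import Path

# "<10 GB" comes from the text above; treating GB as 10^9 bytes is an assumption.
ONDEMAND_LIMIT_BYTES = 10 * 10**9


def total_size_bytes(path):
    """Size of a single file, or of all regular files under a directory."""
    p = Path(path)
    if p.is_file():
        return p.stat().st_size
    return sum(f.stat().st_size for f in p.rglob("*") if f.is_file())


def suggest_transfer_method(size_bytes):
    """Map a transfer size to one of the options described above.

    Illustrative only: SFTP tools are also supported and are not
    modeled as a separate branch here.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes < ONDEMAND_LIMIT_BYTES:
        return "OnDemand drag and drop"
    return "Globus"


if __name__ == "__main__":
    # Hypothetical usage: size up the current directory before transferring it.
    size = total_size_bytes(".")
    print(size, suggest_transfer_method(size))
```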
Software Support
Here is a list of software that we offer related to data analytics and machine learning.
- Python:
- A popular general-purpose, high-level programming language with numerous mathematical and scientific packages available for data analytics and machine learning. The Python programming environment can also be accessed through the Jupyter app on OnDemand.
- R:
- A programming language for statistical and machine learning applications with very strong graphical capabilities.
- RStudio:
- RStudio is a free and open-source integrated graphical environment for R. RStudio is available as an OnDemand app with various versions of R.
- MATLAB:
- A full-featured data analysis toolkit with many advanced algorithms readily available. MATLAB is available as an OnDemand App as well.
- Spark:
- A big data framework for in-memory processing with distributed storage. Spark is also available as an OnDemand app.
- Hadoop:
- A big data framework for disk-based processing with distributed storage.
- TensorFlow:
- TensorFlow is a free and open-source software library for dataflow and differentiable programming across a range of tasks. It is a symbolic math library and is also used for machine learning applications such as neural networks.
- PyTorch:
- PyTorch is a Python machine learning library based on the Torch library, used for applications such as deep learning and natural language processing.
- Horovod:
- Horovod is a distributed training framework for TensorFlow, PyTorch, and other deep learning frameworks.
- Intel Compilers:
- Compilers for generating optimized code for Intel CPUs.
- Intel MKL:
- The Math Kernel Library provides optimized subroutines for common computing tasks such as matrix-matrix calculations.
- Other statistical software:
- Octave, Stata, FFTW, ScaLAPACK, MINPACK, sprng2
Get a complete list of software available at OSC.
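To see which of the Python-facing packages listed above are importable in a given environment, a check along these lines may help. It is a minimal sketch using only the standard library. The helper name `available_packages` and the `PACKAGES` mapping are hypothetical, and which packages are actually present depends on the modules or environments loaded in your session.

```python
import importlib.util

# Display name -> top-level import name for packages from the list above.
PACKAGES = {
    "PyTorch": "torch",
    "TensorFlow": "tensorflow",
    "Horovod": "horovod",
    "Spark (PySpark)": "pyspark",
}


def available_packages(packages=PACKAGES):
    """Return {display name: True/False} without importing the packages.

    find_spec only locates a top-level module, so this stays fast even
    for heavy libraries.
    """
    return {
        label: importlib.util.find_spec(name) is not None
        for label, name in packages.items()
    }


if __name__ == "__main__":
    for label, found in available_packages().items():
        print(f"{label}: {'available' if found else 'not found'}")
```

Running this inside a Jupyter session on OnDemand and again in a batch job is a quick way to confirm that both use the environment you expect.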
Public Dataset
Learn more about public dataset availability at OSC.
Containers at OSC
OSC now supports containers for several applications. More information is provided here.
Getting Started
If you are new to supercomputing, new to OSC, or simply interested in getting an account (if you don't already have one), please see here for further information.
Getting Help
Several relevant HOWTO pages are available to provide guidance on data analytics and AI/ML at OSC.
Python Guidance
- HOWTO: Use Jupyter on OnDemand
- HOWTO: Create and Manage Python Environments
- HOWTO: Use GPU in Python
AI/ML Guidance
- HOWTO: Estimating and Profiling GPU Memory Usage for Generative AI
- HOWTO: Reduce GPU Memory Usage During ANN Training and Inference
- HOWTO: PyTorch Distributed Data Parallel (DDP)
- HOWTO: PyTorch Fully Sharded Data Parallel (FSDP)
If these resources do not address your concern, please see our technical support page.