OSC has entered into an agreement with the OSU College of Medicine (CoM) to provide dedicated compute services. Researchers from CoM will have access to the Pitzer cluster, whose hardware is described below.
Hardware
Pitzer node specifications for CoM dedicated compute are:
- 16 GPU "P18" nodes (deployed in 2018)
  - 40 cores per node
  - 384 GB of memory per node
  - 1 TB of local disk space per node
  - Dual Intel Xeon 6148 (Skylake) CPUs
  - Dual NVIDIA Volta V100 GPUs with 16 GB GPU memory
  - 100 Gb/s InfiniBand
- 21 GPU "P20" nodes (deployed in 2020)
  - 48 cores per node
  - 384 GB of memory per node
  - 1 TB of local disk space per node
  - Dual Intel Xeon 8268 (Cascade Lake) CPUs
  - Dual NVIDIA Volta V100 GPUs with 32 GB GPU memory
  - 100 Gb/s InfiniBand
- 2 AI/GPU nodes (deployed in 2020)
  - 48 cores per node
  - 768 GB of memory per node
  - Quad NVIDIA Volta V100 GPUs with 32 GB GPU memory and NVLink
  - Dual Intel Xeon 8260 (Cascade Lake) CPUs
  - 4 TB of local disk space per node
- 25 CPU nodes (deployed in 2018)
  - 40 cores per node
  - 192 GB of memory per node
  - Dual Intel Xeon 6148 (Skylake) CPUs
  - 1 TB of local disk space per node
In addition, high-performance storage is available on the Pitzer Project file system.
Governance
CoM dedicated compute service is available to approved CoM users. To obtain access to the dedicated compute nodes and to project storage, please contact the OSU CoM Research Computing and Infrastructure subcommittee for RISST. The committee will also consider requests for an increase in project storage quota for existing projects.
Connecting
CoM dedicated compute is only available to approved users. Users are guaranteed access to their nodes; idle nodes may be made available to general users for jobs of less than 4 hours in duration.
To access compute resources, you need to log in to Pitzer at OSC by connecting to the following hostname:
pitzer.osc.edu
You can either use an ssh client application or execute ssh on the command line in a terminal window as follows:
ssh <username>@pitzer.osc.edu
From there, you can run programs interactively (only for small and test jobs) or through batch requests.
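If you connect frequently, an alias in your SSH client configuration can save typing. A minimal sketch (the alias name and username are placeholders):
# ~/.ssh/config (hypothetical entry; substitute your OSC username)
Host pitzer
    HostName pitzer.osc.edu
    User your_osc_username
With this entry in place, ssh pitzer is equivalent to the full command above.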
Running Jobs
OSC clusters use Slurm for job scheduling and resource management. Slurm, which stands for Simple Linux Utility for Resource Management, is a widely used open-source HPC resource management and scheduling system that originated at Lawrence Livermore National Laboratory. Please refer to this page for instructions on how to prepare and submit Slurm job scripts. A compatibility layer that allows users to submit PBS batch job scripts is also available; however, we encourage users to convert their Torque/Moab PBS batch scripts to Slurm.
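For users migrating from Torque/Moab, the standard one-to-one command mappings are (a non-exhaustive sketch):
qsub job.sh      ->  sbatch job.sh       # submit a batch script
qstat -u $USER   ->  squeue -u $USER     # list your queued and running jobs
qdel <jobid>     ->  scancel <jobid>     # cancel a job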
Remember to specify your project code in Slurm batch jobs:
#SBATCH --account=PCON0000
where PCON0000 specifies your individual project code for Pitzer. Below are some examples of requesting different resources on Pitzer.
To request one whole 40-core node for non-GPU computation (on P18):
#SBATCH --nodes=1 --ntasks-per-node=40
To request four whole 48-core nodes for non-GPU computation (on P20):
#SBATCH --nodes=4 --ntasks-per-node=48
To request two whole quad-GPU AI nodes (48 cores and 4 GPUs per node):
#SBATCH --nodes=2 --ntasks-per-node=48 --gpus-per-node=4
You also have the option of running GPU jobs on Pitzer by requesting partial nodes. To request 24 cores and 2 GPUs on one node:
#SBATCH --nodes=1 --ntasks-per-node=24 --gpus-per-node=2
It is important to note that available memory on the cluster scales with the number of cores in the job. For P18 dual-GPU nodes, usable memory is 9292 MB/core (363 GB/node), while P20 dual-GPU nodes have usable memory of 7744 MB/core (363 GB/node). You can request memory for a job with the --mem argument, which is interpreted in megabytes by default. For instance, to request 5 cores with access to only 14283 MB of RAM, charged for 5 cores' worth of usage:
#SBATCH --nodes=1 --ntasks-per-node=5 --mem=14283
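Putting these directives together, a complete batch script might look like the following minimal sketch (my_program is a hypothetical executable, and the module shown is illustrative):
#!/bin/bash
#SBATCH --account=PCON0000              # your Pitzer project code
#SBATCH --job-name=example
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24            # partial node: 24 of the node's cores
#SBATCH --gpus-per-node=2               # two V100 GPUs
#SBATCH --time=01:00:00                 # one hour of walltime

module load cuda                        # load software needed by the job (illustrative)
srun ./my_program                       # hypothetical executable
Submit the script with sbatch script.sh.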
Scheduling Policy
As part of the service level agreement between OSC and the OSU CoM, jobs submitted by the general client community (non-CoM jobs) can be scheduled to run on the CoM dedicated nodes when they are idle (backfill). However, general community jobs are eligible for backfill on CoM nodes only if they request a walltime of no more than 4 hours.
CPU-only jobs will be scheduled to run on the available dedicated CPU nodes. If these CPU-only nodes are not available, CPU-only jobs will be scheduled on the CPUs of GPU nodes, based on availability. Please note that when a CPU job requests a large number of CPUs, jobs requesting fewer CPUs might be queued behind it and are not given priority.
Cluster Heterogeneity
Notice that since Pitzer is a heterogeneous cluster (with P18 and P20 nodes), a job may land on different types of CPUs when submitted repeatedly. To consistently use the same type of CPU, users can explicitly specify which core or node type to use with the --constraint option.
To run a job on the P18 40-core nodes (Dual Intel Xeon 6148 Skylake @ 2.4 GHz CPUs) with usable memory of 9292 MB/core, specify the following SBATCH directive:
#SBATCH --constraint=40core
For the P20 48-core nodes (Dual Intel Xeon 8268 Cascade Lake @ 2.9 GHz CPUs) with usable memory of 7744 MB/core, use:
#SBATCH --constraint=48core
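If you want to confirm which feature tags the nodes advertise before submitting, standard Slurm tooling can list them (40core and 48core are the constraint tags used above):
sinfo -N -o "%N %f"        # print each node name alongside its feature tags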
Jobs submitted without a core-type constraint (40core or 48core) will run on any available Pitzer core type, and will therefore tend to have shorter queue wait times.
Parallel Jobs and Shared Partitions
By default, parallel jobs on OSC clusters are assigned whole nodes. However, there are cases where a user may want to run a parallel job that requires a number of CPUs that is a function of the number of requested GPUs, resulting in a multi-node job utilizing partial nodes, i.e., using fewer than the total number of CPUs on each requested node. Simply stated, parallel jobs that don't require whole nodes allow available resources to be shared with other jobs. Such jobs will also tend to have shorter wait times, since they are not predicated on the availability of whole nodes in order to run.
To run parallel jobs that make use of partial nodes, users may use the shared partitions, as shown below:
#SBATCH --partition=condo-osumed-gpu-40core-parallel-shared,condo-osumed-gpu-48core-parallel-shared
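As a minimal sketch of a partial-node, multi-node GPU job using these shared partitions (the task and GPU counts are illustrative, and my_mpi_app is a hypothetical binary):
#!/bin/bash
#SBATCH --account=PCON0000
#SBATCH --partition=condo-osumed-gpu-40core-parallel-shared,condo-osumed-gpu-48core-parallel-shared
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=12            # fewer tasks than cores per node (partial nodes)
#SBATCH --gpus-per-node=2
#SBATCH --time=04:00:00

srun ./my_mpi_app                       # hypothetical MPI application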
File Systems
CoM dedicated compute uses the same OSC mass storage environment as our other clusters. Large amounts of project storage are available on our Project storage service. Note that requests for project storage should be submitted to the OSU CoM Research Computing and Infrastructure subcommittee for RISST. Full details of the storage environment are available in our storage environment guide.
Software Environment
Users on the CoM nodes have access to all software packages installed on the Pitzer cluster. By default, you will have the batch scheduling software modules, the Intel compiler, and an appropriate version of mvapich2 loaded. Use module load <package> to add a software package to your environment, module list to see what modules are currently loaded, and module avail to see the modules that are available to load. To search for modules that may not be visible due to dependencies or conflicts, use module spider.
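For example, an illustrative session (mvapich2 is named above; exact version strings will vary):
module list               # show the default modules (Intel compiler, mvapich2, etc.)
module spider mvapich2    # search for versions hidden behind compiler dependencies
module load mvapich2      # load the default version of the package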
You can see what packages are available on Pitzer by viewing the Software by System page and selecting the Pitzer system.
Software
The list of installed bioinformatics software is available on the bioinformatics & biology software page.
Training and Education Resources
The following are resource guides and select training materials available to OSC users:
- Users new to OSC are encouraged to refer to our New User Resource Guide page and an Introduction to OSC training video.
- A guide to the OSC Client Portal: MyOSC. The MyOSC portal is primarily used for managing users on a project code, such as adding and/or removing users.
- Documentation on using OnDemand web portal can be found here.
- Training materials and a tutorial on Unix basics are here.
- Documentation on the use of the XDMoD tool for viewing job performance can be found here.
- The HOWTO pages, highlighting common activities users perform on our systems, are here.
- A guide on batch processing at OSC is here.
- For specific information about modules and file storage, please see the Batch Execution Environment page.
- Information on Pitzer programming environment can be found here.
Job/Core Limits
| | Max running jobs (all types) | Max running GPU jobs | Max regular debug jobs | Max GPU debug jobs | Max cores/processors (all types) |
|---|---|---|---|---|---|
| Individual user | 384 | 140 | 4 | 4 | 3240 |
| Project/group | 576 | 140 | n/a | n/a | 3240 |
An individual user can have up to the maximum number of concurrently running jobs and/or up to the maximum number of processors/cores in use shown above. The same caps apply collectively to all users within a particular group/project.
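To see where you stand against these limits, standard Slurm queries work; for example (PCON0000 is the placeholder project code from above):
squeue -u $USER           # your pending and running jobs
squeue -A PCON0000        # all jobs charged to a project code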
Select PCON project codes have priority access to the dedicated hardware. Please contact the OSU CoM Research Computing and Infrastructure subcommittee for RISST to request priority access to CoM hardware during job submission.
Walltime limit per job
All CoM jobs can run for a maximum of 168 hours (7 days).
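For example, to request the full 7-day maximum in a batch script:
#SBATCH --time=168:00:00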
For additional batch limit rules, select Pitzer limits or Owens limits.
Getting Support
Contact OSC Help if you have any other questions, or need other assistance.