Supercomputers

We currently operate two major systems:

  • Oakley Cluster, our newest, is an 8,300+ core HP Intel Xeon machine
    • One in every 10 nodes has 2 Nvidia Tesla GPU accelerators
    • One node has 1 TB of RAM and 32 cores, for large SMP style jobs
  • Glenn Cluster, a 5,300+ core IBM AMD Opteron machine

Our clusters share a common environment, and we have several guides available.

OSC also provides more than 2 PB of storage, and another 2 PB of tape backup.

  • Learn how that space is made available to users, and how to best utilize the resources, in our storage environment guide.

System Notices are available online.

Finally, you can keep up to date with any known issues on our systems (and the available workarounds). An archive of resolved issues can be found here.

Glenn

Photo: The Glenn supercomputer

The Ohio Supercomputer Center's IBM Cluster 1350, named "Glenn", features AMD Opteron multi-core technologies. The system offers a peak performance of more than 54 trillion floating point operations per second and a variety of memory and processor configurations. The current Glenn Phase II components were installed and deployed in 2009, while the earlier phase of Glenn – now decommissioned – had been installed and deployed in 2007.

2014/01/22: The eight Glenn large memory nodes have been removed, to be reused as upgraded login nodes. All compute nodes on Oakley can match the 4 GB of RAM per core that was available in these nodes; if you need more than 48 GB of RAM in a single node, you can access one of the 8 nodes on Oakley with 192 GB of RAM, or the single node with 1 TB of RAM. For information about how to request those resources, please see http://www.osc.edu/supercomputing/computing/oakley, or contact OSC Help for assistance.

Hardware

The current hardware configuration consists of the following:

  • 658 System x3455 compute nodes
    • Dual socket, quad core 2.5 GHz Opterons
    • 24 GB RAM
    • 393 GB local disk space in /tmp
  • 4 System x3755 login nodes
    • Quad socket, dual core 2.6 GHz Opterons
    • 8 GB RAM
  • Voltaire 20 Gbps PCI Express adapters

There are 36 GPU-capable nodes on Glenn, connected to 18 Quadro Plex S4's for a total of 72 CUDA-enabled graphics devices. Each node has access to two Quadro FX 5800-level graphics cards.

  • Each Quadro Plex S4 contains:
    • 4 Quadro FX 5800 GPUs
    • 240 cores per GPU
    • 4 GB of memory per GPU
  • The 36 GPU-capable compute nodes in Glenn contain:
    • Dual socket, quad core 2.5 GHz Opterons
    • 24 GB RAM
    • 393 GB local disk space in /tmp
    • 20 Gb/s InfiniBand ConnectX host channel adapter (HCA)

How to Connect

To connect to Glenn, ssh to glenn.osc.edu.
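
For example, from a command line on your own machine (the username shown is only a placeholder for your OSC username):

    # Connect to a Glenn login node over SSH
    ssh username@glenn.osc.edu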

Batch Specifics

Refer to the documentation for our batch environment to understand how to use PBS on OSC hardware. Some specifics you will need to know to create well-formed batch scripts (a sample script follows this list):

  • All compute nodes on Glenn are 8 cores/processors per node (ppn). Parallel jobs must use ppn=8.
  • If you need more than 24 GB of RAM per node, you will need to run your job on Oakley.
  • GPU jobs must request whole nodes (ppn=8) and are allocated two GPUs each.
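
For reference, here is a minimal sketch of a Glenn batch script that follows these rules; the job name, node count, walltime, and executable are placeholders, and the MPI launch command may vary with the modules you load:

    #PBS -N example_job            # job name (placeholder)
    #PBS -l nodes=2:ppn=8          # parallel jobs on Glenn must use ppn=8
    #PBS -l walltime=04:00:00      # well within the 96-hour parallel queue limit
    #PBS -j oe                     # combine stdout and stderr in one output file

    cd $PBS_O_WORKDIR              # run from the directory the job was submitted from
    mpiexec ./a.out                # launch the MPI executable (placeholder)

Submit the script with qsub. A GPU job is written the same way, but requests whole nodes (ppn=8) and is allocated two GPUs per node.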

Using OSC Resources

For more information about how to use OSC resources, please see our guide on batch processing at OSC. For specific information about modules and file storage, please see the Batch Execution Environment page.

Queues and Reservations

Here are the queues available on Glenn. Please note that you will be routed to the appropriate queue based on your walltime and job size request.

Name        Nodes available               Max walltime  Max job size  Notes
Serial      Available minus reservations  168 hours     1 node
Longserial  Available minus reservations  336 hours     1 node        Restricted access
Parallel    Available minus reservations  96 hours      256 nodes
Dedicated   Entire cluster                48 hours      965 nodes     Restricted access

"Available minus reservations" means all nodes in the cluster currently operational (this will fluctuate slightly), less the reservations listed below. To access one of the restricted queues, please contact OSC Help. Generally, access will only be granted to these queues if performance of the job cannot be improved, and job size cannot be reduced by splitting or checkpointing the job.

In addition, there are a few standing reservations.

Name   Times             Nodes available  Max walltime  Max job size  Notes
Debug  8AM-6PM weekdays  16               1 hour        16 nodes      For small interactive and test jobs.
GPU    All times         32               336 hours     32 nodes      Small jobs not requiring GPUs from the serial and parallel queues will backfill on this reservation.

Occasionally, reservations will be created for specific projects that will not be reflected in these tables.

Oakley

OSC’s newest system, an HP-built, Intel® Xeon® processor-based supercomputer dubbed the Oakley Cluster, features more cores (8,328) on half as many nodes (694) as the center’s former flagship system, the IBM Opteron 1350 Glenn Cluster. The Oakley Cluster can achieve 88 teraflops, tech-speak for performing 88 trillion floating point operations per second, or, with acceleration from 128 NVIDIA® Tesla graphics processing units (GPUs), a total peak performance of just over 154 teraflops.

 

Hardware

Photo: The OSC Oakley HP Intel Xeon Cluster

Detailed system specifications:

  • 8,328 total cores
    • 12 cores and 48 gigabytes of memory per node
  • Intel Xeon x5650 CPUs
  • HP SL390 G7 Nodes
  • 128 NVIDIA Tesla M2070 GPUs
  • 873 GB of local disk space in '/tmp'
  • QDR IB Interconnect
    • Low latency
    • High throughput
    • High quality-of-service.
  • Theoretical system peak performance
    • 88.6 teraflops
  • GPU acceleration
    • Additional 65.5 teraflops
  • Total peak performance
    • 154.1 teraflops
  • Memory Increase
    • Increases memory from 2.5 gigabytes per core to 4.0 gigabytes per core.
  • Storage Expansion
    • Adds 600 terabytes of DataDirect Networks Lustre storage for a total of nearly two petabytes of available disk storage.
  • System Efficiency
    • 1.5x the performance of the former system at just 60 percent of that system's power consumption.

How to Connect

To connect to Oakley, ssh to oakley.osc.edu.

Batch Specifics

Refer to the documentation for our batch environment to understand how to use PBS on OSC hardware. Some specifics you will need to know to create well-formed batch scripts (example resource requests follow this list):

  • Compute nodes on Oakley are 12 cores/processors per node (ppn). Parallel jobs must use ppn=12.
  • If you need more than 48 GB of RAM per node, you may run on one of the 8 large memory (192 GB) nodes on Oakley ("bigmem"). You can request a large memory node by adding the following directive to your batch script: #PBS -l mem=192GB
  • We have a single huge memory node ("hugemem") with 1 TB of RAM and 32 cores. You can schedule this node by adding the following directive to your batch script: #PBS -l nodes=1:ppn=32. This node is only for serial jobs and can only run one job at a time, so you must request the entire node to be scheduled on it. In addition, there is a walltime limit of 48 hours for jobs on this node. Note that requesting fewer than 32 cores with a memory requirement greater than 192 GB will not place you on the 1 TB node; simply request nodes=1:ppn=32 with a walltime of 48 hours or less, and the scheduler will put you on the 1 TB node.
  • GPU jobs may request any number of cores and either 1 or 2 GPUs.
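
As a point of reference, the resource requests below sketch how these rules translate into PBS directives; the node counts and walltimes are illustrative placeholders:

    # Standard parallel job: Oakley nodes have 12 cores each
    #PBS -l nodes=4:ppn=12
    #PBS -l walltime=12:00:00

    # Large memory ("bigmem") job: the mem directive selects one of the eight 192 GB nodes
    #PBS -l nodes=1:ppn=12
    #PBS -l mem=192GB

    # Huge memory ("hugemem") job: the single 1 TB, 32 core node (serial only, 48-hour limit)
    #PBS -l nodes=1:ppn=32
    #PBS -l walltime=48:00:00

The directive for attaching GPUs to a job is not spelled out on this page; see the batch environment documentation referenced above.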

Using OSC Resources

For more information about how to use OSC resources, please see our guide on batch processing at OSC. For specific information about modules and file storage, please see the Batch Execution Environment page.

Queues and Reservations

Here are the queues available on Oakley. Please note that you will be routed to the appropriate queue based on your walltime and job size request.

Name          Nodes available               Max walltime  Max job size  Notes
Serial        Available minus reservations  168 hours     1 node
Longserial    Available minus reservations  336 hours     1 node        Restricted access
Parallel      Available minus reservations  96 hours      170 nodes
Longparallel  Available minus reservations  250 hours     230 nodes     Restricted access
Hugemem       1                             48 hours      1 node

"Available minus reservations" means all nodes in the cluster currently operational (this will fluctuate slightly), less the reservations listed below. To access one of the restricted queues, please contact OSC Help. Generally, access will only be granted to these queues if performance of the job cannot be improved, and job size cannot be reduced by splitting or checkpointing the job.

In addition, there are a few standing reservations.

Name   Times             Nodes available  Max walltime  Max job size  Notes
Debug  8AM-6PM weekdays  12               1 hour        12 nodes      For small interactive and test jobs.
GPU    All times         62               336 hours     62 nodes      Small jobs not requiring GPUs from the serial and parallel queues will backfill on this reservation.
OneTB  All times         1                48 hours      1 node        Holds the 32 core, 1 TB RAM node aside for the hugemem queue.

 

Occasionally, reservations will be created for specific projects that will not be reflected in these tables.
