System Technical Specifications

Hardware Details for Midway3

System Overview

Midway3 is the latest high-performance computing (HPC) cluster built, deployed, and maintained by the Research Computing Center. Midway3 features a heterogeneous collection of the latest HPC technologies, including Intel Cascade Lake and AMD EPYC Rome processors, HDR InfiniBand interconnect, NVIDIA V100 and RTX 6000 GPUs, and SSD storage controllers for improved small-file I/O performance.
At this time, only the Intel-specific resources have been made available.

The key features of the Midway3 Intel hardware are as follows, with a quick tally of the totals sketched after the list:

  • 220 nodes total (10,560 cores)
    • 210 standard compute nodes
    • 1 big memory node
    • 9 GPU nodes
  • All nodes have HDR InfiniBand (100 Gbps) network cards.
  • Each node has a 960 GB local SSD.
  • The operating system throughout the cluster is CentOS 8.
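
As a quick sanity check, the totals above can be reproduced from the per-category node counts. The Python sketch below simply tallies the figures quoted on this page; nothing is queried from the cluster itself.

    # Tally of the Midway3 Intel partition figures listed above.
    nodes = {"standard compute": 210, "big memory": 1, "GPU": 9}
    cores_per_node = 48

    total_nodes = sum(nodes.values())
    total_cores = total_nodes * cores_per_node

    print(f"total nodes: {total_nodes}")    # 220
    print(f"total cores: {total_cores:,}")  # 10,560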

Compute Nodes

There are a total of 210 standard Intel Cascade Lake CPU-only nodes available to all users.

Each node has the following base components:

  • 2x Intel Xeon Gold 6248R CPUs (24 cores each, 48 cores per node)
  • 192 GB of memory (see the per-core sizing sketch after this list)
  • HDR InfiniBand (100 Gbps) network card
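
When sizing jobs it can be useful to know how much memory is available per core. The minimal Python sketch below derives that figure from the specifications above; the figures are hard-coded from this list, and only the 24-cores-per-socket count for the Xeon Gold 6248R comes from Intel's published specifications.

    # Per-core resources on a standard Midway3 compute node.
    sockets = 2
    cores_per_socket = 24      # Intel Xeon Gold 6248R
    memory_gb = 192            # memory per node

    cores_per_node = sockets * cores_per_socket        # 48
    memory_per_core_gb = memory_gb / cores_per_node    # 4.0

    print(f"cores per node:  {cores_per_node}")
    print(f"memory per core: {memory_per_core_gb:.1f} GB")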

Storage

Distributed file system storage: 2.2 PB of high-performance GPFS storage.

Important News

Midway/Midway2 storage is NOT mounted to the Midway3 cluster.

Post-Production Work

The RCC intends to mount Midway2 storage on Midway3 and vice versa. This integration work will at times affect Midway3 production service for short periods. The RCC will provide advance notice of the anticipated maintenance windows so users can plan accordingly.

Storage Options

Until the post-production cross-cluster storage work is completed, there are several storage options available, including:

  • Purchase capacity storage on Midway3.
  • A PI who has procured CPP compute nodes on Midway3 can be provided with project space on Midway3, as long as their aggregate storage (Midway2 + Midway3) does not exceed their CPP storage on Midway2. Once the integration of the two systems is complete, the RCC will help the PI consolidate their data back to Midway2 CPP storage. For assistance, please contact us at help@rcc.uchicago.edu.
  • A PI can be allocated temporary storage space on Midway3, up to the unused portion of their CPP storage on Midway2, with the understanding that they will consolidate their data once the two systems are combined. For example, if a PI has 250 TB of CPP storage on Midway2 and 50 TB of it is unused, they can use up to 50 TB on Midway3 (see the sketch after this list). When Midway2 storage is mounted to Midway3, the RCC will move that data back to Midway2. For assistance, please contact us at help@rcc.uchicago.edu.
  • Midway2 CPP storage should not be swapped with Midway3 CPP storage.
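
The temporary-allocation rule above boils down to a single calculation: temporary Midway3 space may not exceed the unused portion of a PI's Midway2 CPP storage. The Python sketch below only illustrates that rule with the figures from the example; the helper function is purely illustrative, not an RCC tool, and actual allocations are arranged through help@rcc.uchicago.edu.

    # Illustrative helper (not an RCC API): the temporary Midway3 allocation rule.
    def max_temporary_midway3_tb(midway2_cpp_total_tb, midway2_cpp_used_tb):
        """Largest temporary Midway3 allocation allowed under the rule (in TB)."""
        unused_tb = midway2_cpp_total_tb - midway2_cpp_used_tb
        return max(unused_tb, 0)

    # The example from the list: 250 TB of CPP storage with 200 TB in use.
    print(max_temporary_midway3_tb(250, 200))  # 50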

Network

An HDR (100 Gbps) InfiniBand fabric connects the nodes and storage in a fat-tree network topology.

GPU Card Specs

There are 9 GPU nodes that are part of the Midway3 communal resources. Each GPU node has the standard Intel compute node specifications described above, plus one of the following GPU configurations (a sketch for checking a node's GPU model follows the list):

  • 4 GPU nodes with 4x NVIDIA V100 GPUs per node
  • 5 GPU nodes with 4x NVIDIA Quadro RTX 6000 GPUs per node
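
To confirm which of the two GPU models a given node carries, the devices can be listed from a shell on that node. A minimal Python sketch, assuming the standard nvidia-smi utility (installed alongside the NVIDIA driver) is available on the GPU nodes:

    # List the GPUs on the current node; each GPU node should report four devices,
    # named either Tesla V100 or Quadro RTX 6000 per the configurations above.
    import subprocess

    result = subprocess.run(["nvidia-smi", "-L"],
                            capture_output=True, text=True, check=True)
    print(result.stdout)
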
See the User Guide for more detailed information on various aspects of the cluster and how to get access.