Datacenter at Bigelow Laboratory

High Performance Computing

Overview

Charlie, the High Performance Computer Cluster (HPCC) at Bigelow Laboratory for Ocean Sciences, is designed to handle a diverse range of scientific data processing needs. Charlie consists of three high-performance nodes with a combined 468 processor cores, about 4 TB of memory (RAM), and access to 250 TB of onsite storage.

Features

468 cores
4 TB memory
250 TB storage
160 TFLOPS

Nodes

Charlie consists of three systems (nodes), C1, C2, and C3, that work together to process data. C1 and C2 are shared-memory systems, where a single process or job can use any number of the node's CPUs and any amount of its memory. C3 is a distributed system made up of fifteen separate machines, each with 12 CPU cores and between 64 and 128 GB of memory. Memory on C3 cannot be shared between machines, so a single process there is limited to the cores and memory of one machine.

C1: SGI UV2000 - 160 cores, 1.2 TB memory
C2: SGI UV300 - 128 cores, 2 TB memory
C3: Dell PowerEdge R610s - 180 cores, 1 TB memory
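
The node figures above can be tallied and used for a quick "where could this job fit" check. The following is only a minimal Python sketch, not part of Charlie's tooling: the `Node` class and the `totals` and `candidates` helpers are hypothetical names, memory is treated in decimal units (1 TB = 1000 GB), and the 128 GB per-process ceiling on C3 is the best case, since individual C3 machines have between 64 and 128 GB. The per-node figures sum to 468 cores and roughly 4.2 TB, which the overview rounds to 4 TB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    cores: int               # total cores on the node
    memory_gb: int           # total memory, assuming 1 TB = 1000 GB
    process_cores: int       # most cores one process can use
    process_memory_gb: int   # most memory one process can use


# Figures taken from the node list above.
NODES = [
    Node("C1", 160, 1200, 160, 1200),  # shared memory: whole node
    Node("C2", 128, 2000, 128, 2000),  # shared memory: whole node
    # C3 is fifteen 12-core machines; 128 GB is the largest machine
    # (best-case assumption, some machines have only 64 GB).
    Node("C3", 180, 1000, 12, 128),
]


def totals(nodes):
    """Return (total cores, total memory in GB) across all nodes."""
    return sum(n.cores for n in nodes), sum(n.memory_gb for n in nodes)


def candidates(cores, memory_gb, nodes=NODES):
    """Names of nodes where a single process of this size could fit."""
    return [
        n.name
        for n in nodes
        if cores <= n.process_cores and memory_gb <= n.process_memory_gb
    ]


print(totals(NODES))         # (468, 4200)
print(candidates(64, 500))   # ['C1', 'C2']
print(candidates(8, 100))    # ['C1', 'C2', 'C3']
```

A job that needs more than 12 cores or more than 128 GB in a single process is ruled out on C3 in this sketch and has to run on one of the shared-memory nodes.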

Network

Charlie's nodes are interconnected by a 10 Gbps network and connect to the internet over a 1 Gbps fiber link. Bigelow is also connected to the national Internet2 network and the local MaineREN network, which link research and educational institutions directly to reduce latency and increase performance when sharing resources.
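
To give a rough sense of what these link speeds mean for moving data, the sketch below computes an idealized transfer time. It is an illustration under stated assumptions, not a measurement: it assumes the link runs at its full rated speed with no protocol overhead or contention, uses decimal units (1 TB = 10^12 bytes), and `transfer_seconds` is a hypothetical helper name.

```python
def transfer_seconds(size_tb, link_gbps):
    """Ideal time in seconds to move size_tb terabytes over a link_gbps link."""
    bits = size_tb * 8 * 10**12       # terabytes -> bits (decimal units)
    bits_per_second = link_gbps * 10**9
    return bits / bits_per_second


# 1 TB over the 1 Gbps internet link versus the 10 Gbps internal network.
print(transfer_seconds(1, 1))    # 8000.0 seconds, a little over 2.2 hours
print(transfer_seconds(1, 10))   # 800.0 seconds, a little over 13 minutes
```

Real transfers will be slower than these lower bounds.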

Software

At the heart of Charlie's software environment is PBS Pro, a scheduling system that allocates resources to HPCC jobs based on factors such as availability, priority, and permissions. Charlie uses environment modules to provide access to many bioinformatics tools, including BLAST, VirSorter, and dada2, as well as general applications such as R, Python, and OpenMPI.
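
As an illustration of how a scheduler request and environment modules typically fit together, the sketch below builds a small PBS Pro job script as text. It is a hedged example rather than Charlie's documented workflow: `pbs_script` is a hypothetical helper, the module name `blast`, the job name, and the input and output file names are assumptions, and the resource values are arbitrary. The `#PBS -N`, `#PBS -l select=...`, and `#PBS -l walltime=...` directives and the `PBS_O_WORKDIR` variable are standard PBS Pro features.

```python
def pbs_script(job_name, ncpus, mem_gb, walltime, modules, command):
    """Return the text of a single-chunk PBS Pro job script."""
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",
        # One chunk with the requested cores and memory.
        f"#PBS -l select=1:ncpus={ncpus}:mem={mem_gb}gb",
        f"#PBS -l walltime={walltime}",
        "",
        # Run from the directory the job was submitted from.
        'cd "$PBS_O_WORKDIR"',
    ]
    # Load each tool through environment modules (names are assumptions).
    lines += [f"module load {name}" for name in modules]
    lines.append(command)
    return "\n".join(lines) + "\n"


script = pbs_script(
    job_name="blast_example",          # hypothetical job name
    ncpus=8,
    mem_gb=32,
    walltime="02:00:00",
    modules=["blast"],                 # assumed module name
    command="blastn -query reads.fasta -db nt -out hits.tsv -num_threads 8",
)
print(script)
```

The resulting text could be saved to a file and submitted with `qsub`; the actual module names, queues, and resource limits on Charlie would need to be confirmed with the cluster's administrators.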

Funding

Charlie has been supported by multiple grants, including awards from the National Science Foundation's Division of Biological Infrastructure and the Simons Foundation.