The compute server shamu at CSB is a Beowulf-type, PC-based parallel computer.

It is set up to suit the needs of a research group doing computational studies of biomolecules in solution, mainly molecular dynamics.

Configuration (September 1999)

Hardware

44x 450 MHz Pentium II (512 KB cache) processors on 22 dual-processor MS6120 motherboards (100 MHz bus), each with:

  • 128 MB RAM (three boards have 256 MB and one board has 512 MB)
  • One 4.3 GB Seagate EIDE disk
  • FastEthernet NIC, NetGear FA-310TX
  • Power supply
  • Chassis fan

The nodes are connected through one FastEthernet switch, a D-Link with 24 ports. A single keyboard, mouse, monitor, floppy disk drive and graphics card serve the whole system.

Most of the system (18 boards) is mounted in a standard 19" rack, on simple aluminum shelves with holes to attach the motherboard, disk, power supply and fan.

Software

OS: Red Hat Linux 6.0, with kernel version 2.2.9 (SMP), using NFS, NIS and autofs to integrate the system into the user and file spaces at CSB.
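
A rough sketch of what that integration looks like on each node (the NIS domain, server names and map contents below are made-up placeholders, not the actual CSB setup):

    # /etc/yp.conf -- point NIS at a department server (names are examples)
    domain csb server nis.example.csb

    # /etc/nsswitch.conf -- look up users in NIS after the local files
    passwd:  files nis
    group:   files nis

    # /etc/auto.master -- let autofs mount home directories on demand
    /home   /etc/auto.home   --timeout=300

    # /etc/auto.home -- wildcard map: mount fileserver:/export/home/<user>
    *   -rw,hard,intr   fileserver.example.csb:/export/home/&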

Scheduling: NQS 3.50.5 (a VERY IMPORTANT tool!). Note that there are two patches which have to be applied.

Compiler: Absoft ProFortran 6.0

Parallel library: MPICH

Main application: CHARMM

Current NQS configuration

6 queues, 4 with access to 8 CPUs and 2 with access to 4 CPUs, with a time limit of 24 hours (48 for the 4-CPU queues) on the CPU running the NQS scheduler. (In addition there are 8 DEC Alpha machines for single-CPU jobs, also under NQS control.) We run CHARMM using its own socket-based parallel communication library. The two remaining Linux boxes are used for compilations, testing and as on-line spare machines (we also have a cardboard box with spare parts, from fans to memory and CPUs, right next to the rack).
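
Submitting work to one of these queues looks roughly like this (a minimal sketch; the queue name "para8", the user and the file names are invented for illustration, and the CHARMM command line is just the conventional input-redirection style):

    #!/bin/sh
    # run_md.sh -- trivial NQS job script (hypothetical paths)
    cd /home/alice/myoglobin
    charmm < md.inp > md.out

    # submit the script to a hypothetical 8-CPU queue
    qsub -q para8 run_md.sh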

    Installation & configuration procedures

First node:

Install Linux in a minimal configuration.
Make sure the amount of memory is correctly set up in /etc/lilo.conf (see the sketch after this list).
Configure NIS and autofs.
Install NQS.
Install MPICH.
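
About the memory step: on some machines the BIOS does not report the full amount of RAM to a 2.2-era kernel, so it has to be passed on the kernel command line. A lilo.conf stanza along these lines does it (the kernel image and root device names are just examples); remember to rerun /sbin/lilo afterwards:

    image=/boot/vmlinuz-2.2.9
        label=linux
        root=/dev/hda1
        read-only
        append="mem=128M"    # 256M or 512M on the larger boards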

For the following nodes we copied the newly created disk to a new disk and made a handful of changes to turn the copy into a bootable disk for a system with a new identity (name and IP number). The whole procedure is run by a script, so it is very easy to repeat for all the system disks of your machine, both at the initial setup and later on in case of disk problems.
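
A minimal sketch of the idea (not our actual script; the device names, mount point and the name/IP arguments are placeholders, while the identity files are the standard Red Hat 6.0 locations):

    #!/bin/sh
    # clone-disk.sh <new-hostname> <new-ip>
    NAME=$1
    IP=$2
    # Raw-copy the master disk (hda) onto the blank disk on hdb,
    # boot sector and all, so the copy boots as-is.
    dd if=/dev/hda of=/dev/hdb bs=1024k
    # Mount the copy and give it its own identity.
    mount /dev/hdb1 /mnt/clone
    perl -pi -e "s/^HOSTNAME=.*/HOSTNAME=$NAME/" \
        /mnt/clone/etc/sysconfig/network
    perl -pi -e "s/^IPADDR=.*/IPADDR=$IP/" \
        /mnt/clone/etc/sysconfig/network-scripts/ifcfg-eth0
    umount /mnt/clone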

System administration and reliability

The system has been very stable, and does not require any extra administration.

Performance

The standard CHARMM benchmark, 1000 steps of MD on myoglobin in water (14000 atoms), runs in about 10 minutes on 8 nodes (approximately the same as on 16 nodes of a Cray T3E).

As long as the users can fill the queues with jobs we get 100% utilization of the machine. The parallel overhead in each job of course reduces the actual yield, but on 8 CPUs our typical MD simulation jobs reach around 65-70% parallel efficiency (meaning that they run about 5 times faster than on a single CPU).
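
(To spell out the arithmetic: parallel efficiency is speedup divided by the number of CPUs, so 70% efficiency on 8 CPUs corresponds to a speedup of 0.70 x 8 = 5.6.)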