The last few weeks have been quite hectic with the build of our new HPC system from ClusterVision. We've had 14 pallets of computer equipment arrive, all of which needs to be assembled and racked so that ClusterVision can begin to configure the cluster.
Before any equipment arrived on site, the Data Centre where Balena would be housed needed additional power, and its cooling capacity needed to be increased to cope with the extra load. The additional chillers and coolers had to be craned into place.
The new rack cabinets were positioned and plumbed into the cooling system.
On Monday 15th September, our Data Centre and Operations Team took delivery of 8 pallets of Dell compute nodes and chassis. Then on Tuesday another 4 pallets arrived with the remainder of the equipment: management nodes, storage units, switches, an assortment of cables and even more compute nodes. The team from ClusterVision began unboxing all the equipment, ready to start racking the systems.
Below are some photos of the Intel Xeon Phi 5110P cards and the NVIDIA Tesla K20X cards.
Racking all of the equipment was the quick part. The time-consuming tasks were installing the InfiniBand PCI cards in all of the nodes and the cabling, lots of cabling, which ClusterVision have kept nice and neat. The black cables running between the racks connect up the core InfiniBand switches. From the front of the racks the cluster started to take shape, and we can see the Dell C8220 nodes with the blue Ethernet and black InfiniBand cables linking these nodes to the rest of the cluster.
In the second week ClusterVision started to connect the power supplies to all the Dell C8000 chassis, ready for the nodes to be powered on for the first time. We experienced only a few issues with the nodes once they had power, and ClusterVision were on hand to resolve them quickly. By the end of this week ClusterVision had completed the physical build of the Balena HPC and had begun configuring the networks, management tools and storage arrays.
We're now at the end of the third week. The majority of the cluster configuration is complete and burn-in tests have completed on the nodes. ClusterVision have just started to run some Linpack (HPL) benchmark jobs, which will put a heavy load on Balena to help test the overall stability of the cluster. There is plenty more work to be done before Balena is ready for general use; more updates will appear here.
To find out more about what the Balena HPC cluster will consist of, visit: http://www.bath.ac.uk/bucs/services/hpc/facilities.