In 2004 I was asked by my supervisor at Imperial to design an HPC cluster. For a long time this process resembled choosing components for an enthusiast microcomputer in the 1970s and 1980s: picking the right parts (processor, storage, memory bandwidth, interconnects) that, together, would provide the best platform for your cluster. Even in 2004, five years ago, this wasn’t as straightforward as it is in the server or home computer market today: Apple had pretty good performance/price ratios in the XServe G5 (remember Virginia Tech?) and its performance per watt wasn’t bad either, at least in the small window of time I had to come up with the spec. Intel had a titanic failure in the Itanium family, and the Xeons and Opterons were having a field day.
But choosing the interconnects, that is, the technology that networks the computers together, was the most interesting part. The two proprietary technologies most prominent among supercomputing vendors at the time were Myricom’s Myrinet and Mellanox’s Infiniband (other options included Quadrics, SP Switches and various proprietary solutions). The former was much cheaper, but its strong presence in the top tier of supercomputing clusters was dwindling, a bad sign; the latter was an up-and-coming competitor, faster but also much more expensive. Meanwhile, Gigabit Ethernet was rapidly becoming the ‘Open Standard King’.
In my proposal I went with Infiniband, not only because it had the best performance, but also because it seemed future-proof enough for the needs of the group. While the design was approved, the cluster was never funded, partly because of the group’s marginal needs (I was probably one of the very few people around who would make use of it, and I left a year later) and partly because a much larger network-simulation cluster had been installed less than a year earlier, so many thought yet another cluster was pointless (even though the two systems were completely different in architecture and scope).
Nevertheless, the experience of designing the cluster was great, and five years later I’m reading that Infiniband, the technology I had chosen in 2004 for a supercomputing cluster, is now more ‘readily’ available on two boards by MSI and Asus. With 10GE slowly entering the mass market, technologies like Infiniband seem increasingly uninteresting, but it’s great when good technology trickles down to commodity hardware at much lower prices, making the acquisition of HPC cluster hardware easier than ever before.