Sometimes the combination of networking at a trade show and catching an insightful presentation provides valuable insight into market dynamics. Earlier this year I attended the HPC and Quantum Computing summit in London and, following this, watched Addison Snell's (CEO of Intersect360 Research) "The New HPC" presentation from the last Stanford HPC Conference. Both confirmed the HPC suspicions I have garnered over the last year.
"HPC is changing" may not be fake news but it's certainly not news. Those of us in the supercomputing industry know it's continuing to evolve rapidly, and a new form of HPC is starting to emerge. When I started writing this blog I struggled to find an appropriate term for the original, former HPC. Could it be described as 'old'? 'Traditional'? I've finally settled on 'classic'. Classic HPC is still a much larger market than the newer HPC, but to my mind it's not as vibrant or diverse.
So what am I defining as 'Classic HPC'? Well, I take this as the compute that is typically the domain of government, education and research centers, and which is very well catalogued by the Top-500 list of the world's fastest supercomputers. All things classic HPC are methodically well planned. The computer centers often have three-to-five-year technology refresh budgets with a lengthy, government-like procurement process involving RFPs, preferred vendors, and likely local politics and opinion formers playing a not-so-trivial role.
In classic HPC it often takes two or more years to refresh the technology in one of these supercomputer centers. There will be a review of available technology, followed by a trial or proof-of-concept phase leading up to the RFP requirements, which will be widely published. However, the vendors helping define the requirements are likely in the best position to generate the most compatible RFP response. Most of these supercomputing centers are filled with heavy metal: originally mainframes, and now high-end servers from traditional computer vendors like Cray, IBM, HPE and NEC.
Notwithstanding the 'classic' name, classic HPC is changing: about 50% of the market is now commercial projects, and within that segment chemical engineering is growing the fastest.
About five to seven years ago classic HPC started to encounter an alternative. The new hyperscale companies like Google, Amazon and Microsoft were looking to build out their cloud compute infrastructure very economically. My first insight into this was sitting in a bar at the North American Network Operators Group (NANOG) conference one Super Bowl Sunday, listening to Dave Crowley, Microsoft's Chief Technical Advisor, trying to buy a /8 block of IP addresses (16,777,214 usable hosts) so that each of their Azure cloud compute servers could have a unique, outside-addressable IP address. Definitely not your father's classic HPC!
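For the curious, the arithmetic behind that number is straightforward; a quick sketch (the figure in the anecdote is the usable-host count, after the network and broadcast addresses are set aside):

```python
# An IPv4 /8 block fixes the first 8 of 32 bits, leaving 24 host bits.
prefix_len = 8
total_addresses = 2 ** (32 - prefix_len)   # 16,777,216 addresses in the block
usable_hosts = total_addresses - 2         # minus network & broadcast addresses
print(total_addresses, usable_hosts)       # 16777216 16777214
```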
Nowadays it's very common for the hyperscalers to build their own servers in volume using hardware slightly behind the bleeding edge, compensating for the lower per-unit performance by deploying at scale. In parallel with scaling their servers, the hyperscalers and several desperate researchers started augmenting their server CPUs with commercial GPUs for large-scale HPC applications.
Today almost every classic HPC supercomputer center upgrade involves huge volumes of high-end servers with GPU augmentation. The number one in the Top-500 supercomputer list, the US Oak Ridge National Laboratory Summit computer cluster, has 9,216 IBM POWER9 22-core CPUs and 27,648 Nvidia Tesla V100 GPUs.
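Those CPU and GPU totals fall straight out of Summit's publicly reported node layout, as a couple of lines of arithmetic confirm:

```python
# Summit's publicly reported layout: 4,608 nodes,
# each with two POWER9 CPUs and six Tesla V100 GPUs.
nodes = 4_608
cpus = nodes * 2   # 9,216 CPUs in total
gpus = nodes * 6   # 27,648 GPUs in total
print(cpus, gpus)  # 9216 27648
```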
However, the untold story is the number of commercial supercomputers qualifying for 'Top-500 fame' whose owners are too busy getting a new product to market to complete the necessary paperwork. DeepL, a deep neural network (DNN) based natural-language translation portal, employs only a small number of staff and data scientists but would likely make the top 40 of the Top-500 supercomputer list. They are engaged in a linguistic arms race with Alexa, Siri, Cortana, Google Now and others to have the best business translation capability. Their cluster retrains their DNNs continuously, and when they need more computers or hosting to achieve this, they prioritise getting it yesterday or risk being left behind.
The 'new HPC' is volume-server and GPU-centric and fast moving, never stopping for the academic holidays and always looking for the best deal possible. The users never sleep, rarely boast and are always scaling to keep ahead of the pack.
Increasingly, the research labs of the F-10,000 are following the example of the DNN start-ups and finding ways to cost-effectively exploit GPUs. This new HPC chases fast-moving technology and is often powered by open-source tools like TensorFlow, which change almost as fast as your Android OS.
Surprisingly, the financial services community are leading the commercial new HPC pack. They are investing heavily in AI technology to drive a compelling competitive advantage in risk management, trading and pricing. This investment is largely in their own hardware rather than the compute cloud, which they reserve for less competitive uses like general office and cloud-supported enterprise applications.
Modest HPC applications are possible on the generic compute clouds, but they are almost always more expensive there than on bare-metal clouds or dedicated hardware. The generic clouds are structured to be cheapest for bursty loads like email, word processing and CRM, and that very structure makes them a poor fit for HPC. The hot new HPC community are containerizing their applications and replicating their data so they aren't beholden to any vendor or service provider.
Until recently I thought that quantum-compute-based HPC was about 10 years out into the future, but I'm not so sure anymore. Quantum compute uses new hardware and programming techniques to approach some curve-fitting-style puzzles very quickly. Imagine that you had to find the low spot in 100 Scottish valleys. Traditional computers would map each point in each valley and compare them to adjacent and other points looking for the lowest one, perhaps in parallel using GPUs but still using sequential search algorithms. Quantum compute techniques allow the analysis of all the valleys simultaneously, resulting in a much quicker solution for this type of puzzle.
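To make the classical side of that picture concrete, here is a minimal sketch (toy random data, not a real terrain model) of the exhaustive search a conventional machine performs; its work grows with every point it must visit:

```python
import random

# Toy model: 100 "valleys", each sampled at 1,000 elevation points.
random.seed(0)
valleys = [[random.uniform(0.0, 100.0) for _ in range(1_000)]
           for _ in range(100)]

# A classical exhaustive search must compare every sampled point to be
# certain of the global low spot: O(valleys x points) work in total.
lowest = min(min(valley) for valley in valleys)
print(lowest)
```

GPUs parallelise those comparisons across many cores, but every point still has to be examined; that is the bottleneck the quantum approach sidesteps.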
There are very few commercially available quantum computers and those are generally destined for the classic HPC research labs where folks experiment to see their full potential.
What I've now started to see is people using GPUs to simulate quantum computers, at perhaps a thousandth of the speed of the quantum potential, to test the technology's application to various computational challenges.
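A flavour of what such a simulation does under the hood: it tracks all 2^n complex amplitudes of an n-qubit state explicitly, which is exactly why GPUs (fast dense linear algebra) suit the job and why the costs balloon as qubit counts rise. A minimal NumPy sketch, applying a single Hadamard gate to a 3-qubit register:

```python
import numpy as np

# An n-qubit state vector needs 2**n complex amplitudes; memory and
# compute double with every extra qubit simulated classically.
n = 3
state = np.zeros(2 ** n, dtype=complex)
state[0] = 1.0  # start in |000>

# Hadamard gate on qubit 0 (least-significant bit); identity on the rest.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
U = np.kron(np.eye(2 ** (n - 1)), H)
state = U @ state

# Measurement probabilities: equal weight on |000> and |001>.
print(np.round(np.abs(state) ** 2, 3))
```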
Having verified the value of quantum compute, it's then time to order the early production quantum hardware. My understanding is that quantum compute will initially be used for a small subset of a total project, possibly something which could not be performed at all using the new HPC, and ultimately expanded as the ROI becomes more manageable. The D-Wave quantum compute cloud costs $2,000/hour, which will most certainly get the folks in finance animated!
So, with quantum compute coming over the horizon, it looks like the new HPC's days are already numbered. Such is the rapid pace of change in the global technology sector.
For now, whatever your flavour of HPC, bring it to Verne Global in Iceland and we'll make it well worth your trouble. Save up to 70% versus major metro colocation, and if cloud is your thing, with hpcDIRECT you can go 30% faster than hyperscale at up to 50% of the cost. Classic or new, those savings will never go out of fashion...