You can do real HPC in the cloud

The latest Top 500 list came out earlier this week. I’m generally indifferent to the Top 500 these days (in part because China has two exaflop systems that it didn’t submit benchmarks for). But for better or worse, it’s still an important measure for many HPC practitioners. And that’s why the fact that Microsoft Azure cracked the top 10 is such a big deal.

For years, I heard that the public cloud can’t be used for “real” HPC. Sure, you can do throughput workloads, or small MPI jobs as a code test, but once it’s time to do the production workload, it has to be bare metal. This has never not been wrong. With a public cloud cluster as the 10th most powerful supercomputer* in the world, there’s no question that it can be done.

So the question becomes: should you do “real” HPC in the cloud? For whatever “real” means. There are cases where buying hardware and running it makes sense. There are cases where the flexibility of infrastructure-as-a-service wins. The answer has always been—and always will be—run the workload on the infrastructure that best fits the needs. To dismiss cloud for all use cases is petty gatekeeping.

I congratulate my friends at Azure for their work in making this happen. I couldn’t be happier for them. Most of the world’s HPC happens in small datacenters, not the large HPC centers that tend to dominate the Top 500. The better public cloud providers can serve the majority of the market, the better it is for us all.

Leave a Reply

Your email address will not be published. Required fields are marked *