Microsoft has been developing a new artificial intelligence (AI) chip, internally code-named Athena, since as early as 2019, according to reporting from The Information today. The company could make Athena widely available for use within the company itself and OpenAI as early as next year.
Experts say Nvidia won’t be threatened by these moves, but the effort does signal the need for hyperscalers to develop their own custom silicon.
AI chip development in response to a GPU crisis
Like the chips developed in-house by Google (TPU) and Amazon (Trainium and Inferentia processor architectures), Athena is designed to handle large language model (LLM) training. That is essential because the scale of advanced generative AI models is growing faster than the compute capabilities needed to train them, Gartner analyst Chirag Dekate told VentureBeat by email.
Nvidia is the market leader by a mile when it comes to supplying AI chips, with about 88% market share, according to Jon Peddie Research. Companies are vying just to reserve access to the high-end A100 and H100 GPUs, which cost tens of thousands of dollars each, causing what could be described as a GPU crisis.
“Leading-edge generative AI models are now using hundreds of billions of parameters requiring Exascale computational capabilities,” Dekate explained. “With next-generation models ranging in trillions of parameters, it is no surprise that leading technology innovators are exploring diverse computational accelerators to accelerate training while reducing the time and cost of training involved.”
As Microsoft seeks to accelerate its generative AI strategy while cutting costs, it makes sense for the company to develop a differentiated custom AI accelerator strategy, he added, which “could help them deliver disruptive economies of scale beyond what is possible using traditional commoditized technology approaches.”
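To give a rough sense of the scale Dekate describes, the sketch below applies a widely used rule of thumb that training a dense model takes about 6 floating-point operations per parameter per training token. The rule, the 175-billion-parameter model size, the 300-billion-token dataset and the sustained exaFLOP-per-second throughput are all illustrative assumptions, not figures from Dekate or Microsoft.

```python
def training_flops(parameters, tokens):
    """Rule-of-thumb estimate: about 6 floating-point operations per parameter per token."""
    return 6 * parameters * tokens


def training_days(parameters, tokens, sustained_flops_per_second):
    """Wall-clock days if the hardware sustained the given throughput the whole time."""
    seconds = training_flops(parameters, tokens) / sustained_flops_per_second
    return seconds / 86_400  # seconds per day


if __name__ == "__main__":
    # Illustrative assumptions, not figures from the article.
    params = 175e9    # hundreds of billions of parameters
    tokens = 300e9    # assumed size of the training dataset
    exaflop = 1e18    # one exaFLOP per second, assumed to be fully sustained

    print(f"{training_flops(params, tokens):.2e} FLOPs")
    print(f"{training_days(params, tokens, exaflop):.1f} days at a sustained exaFLOP/s")
```

Under these assumptions the run takes on the order of days even at exascale throughput, and the estimate grows linearly with parameter count, so a trillion-parameter model trained on the same data would need several times as much compute. Real systems sustain only a fraction of their peak throughput, so actual training times would be longer.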
Custom AI chips address the need for inference speed
The need for acceleration also applies, importantly, to AI chips that support machine learning inference: the stage at which a trained model, boiled down to a fixed set of weights, is applied to live data to produce actionable results. Compute infrastructure is used for inference every time ChatGPT generates responses to natural language inputs, for example.
Nvidia produces very powerful, general-purpose AI chips and offers its parallel computing platform CUDA (and its derivatives) as a way to do ML training specifically, said analyst Jack Gold, of J Gold Associates, in an email to VentureBeat. But inference generally requires less performance, he explained, and the hyperscalers see a way to also impact the inference needs of their customers with customized silicon.
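The idea of inference can be shown with a toy sketch. The weights, bias and input values below are hypothetical stand-ins for the output of a finished training run; a production LLM has billions of weights, but the principle is the same: the weights stay fixed, and each new input passes through them once.

```python
import math

# Hypothetical frozen parameters, standing in for the result of a completed training run.
WEIGHTS = [0.8, -0.4, 0.2]
BIAS = 0.1


def infer(features):
    """Forward pass only: apply the fixed weights to new input; nothing is updated."""
    if len(features) != len(WEIGHTS):
        raise ValueError(f"expected {len(WEIGHTS)} features, got {len(features)}")
    score = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-score))  # logistic function maps the score into (0, 1)


if __name__ == "__main__":
    live_input = [1.0, 2.0, 3.0]  # assumed "live data" arriving after training
    print(infer(live_input))
```

Because there is no gradient computation or weight update, each call does far less work than a training step, which is consistent with Gold's point that inference generally requires less performance per request.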
“Inference ultimately will be a much larger market than ML, so it’s important for all of the vendors to offer products here,” he said.
Microsoft’s Athena not much of a threat to Nvidia
Gold said he doesn’t see Microsoft’s Athena as much of a threat to Nvidia’s place in AI/ML, which has been dominant since the company helped power the deep learning “revolution” of a decade ago; built a powerhouse platform strategy and software-focused approach; and has seen its stock rise in an era of GPU-heavy generative AI.
“As needs expand and diversity of use expands as well, it’s important for Microsoft and the other hyperscalers to pursue their own optimized versions of AI chips for their own architectures and optimized algorithms (not CUDA specific),” he said.
It’s about cloud operating costs, he explained, but also about providing lower-cost options for diverse customers who may not need or want the high-cost Nvidia option. “I expect all of the hyperscalers to continue to develop their own silicon, not just to compete with Nvidia, but also with Intel in general purpose cloud compute.”
Dekate also maintained that Nvidia shows no signs of slowing down. “Nvidia continues to be the primary GPU technology driving extreme scale generative AI development and engineering,” he said. “Enterprises should expect NVIDIA to continue building on its leadership-class innovation and drive competitive differentiation as custom AI ASICs emerge.”
But he pointed out that “innovation in the last phase of Moore’s law will be driven by heterogenous acceleration comprising GPUs, and application-specific custom chips.” This has implications for the broader semiconductor industry, he explained, especially “technology providers that have yet to meaningfully engage in addressing the needs of the rapidly evolving AI market.”