“These accelerators are becoming increasingly important in cloud infrastructure due to their superior price-performance and price-efficiency ratios, which lead to better return on investments,” he said.
Microsoft was a relative latecomer to the custom chip push. Its rivals had introduced custom silicon for AI workloads years earlier: AWS with Trainium and Inferentia, and Google with its Tensor Processing Units (TPUs). It wasn’t until last year’s Ignite conference that Microsoft unveiled its first custom chips, Maia and Cobalt, designed to handle internal AI workloads and make its data centers more energy efficient.
Speeding AI dataflows with DPUs, not GPUs
At this year’s event, Microsoft introduced two more chips: the Azure Boost DPU, which accelerates data processing, and the Azure Integrated HSM, a hardware security module intended to strengthen security.
The Azure Boost DPU is a hardware-software co-design built specifically for Azure infrastructure. It runs a custom, lightweight data-flow operating system that, Microsoft claims, delivers higher performance, lower power consumption, and greater efficiency than traditional implementations.
Microsoft is also introducing a new version of its liquid-cooling sidekick rack to support servers running AI workloads, as well as a new disaggregated power rack, co-designed with Meta, that it claims will allow 35% more AI accelerators to fit in each server rack.
“We expect future DPU-equipped servers to run cloud storage workloads at three times less power and four times the performance compared to existing servers,” the company said in a blog post.
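To put that claim in rough perspective (an interpretation of the quoted figures, not numbers from Microsoft’s post): if “three times less power” is read as one-third the power draw, then four times the performance at one-third the power works out to roughly a 12x improvement in performance per watt for those storage workloads compared with existing servers.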