Cooling is likely the real bottleneck. Air → liquid → immersion → maybe cryogenic. As GPT-like models scale and inference workloads run longer, the cooling load only compounds.
The funny thing is, it's not actually the technology, it's the business angle: specifically, speed to market and how easy it is to get more partners on board. So it'll be direct-to-chip for a long time yet.
It seems like Nvidia plans to stick with direct-to-chip cooling. I just spoke to a data centre operator building Nvidia's cloud, and the real challenge is the facility itself: it must be rebuilt (again) to support 600 kW racks.
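For a sense of scale, here's a rough sketch of the coolant flow a 600 kW rack implies. It assumes water as the coolant and a 10 K inlet-to-outlet temperature rise, which is a typical design value I'm assuming, not a figure from the thread:

```python
# Back-of-envelope: coolant flow needed to remove 600 kW per rack
# with direct-to-chip water cooling.

P_rack_w = 600_000   # rack heat load in watts (600 kW, from the post above)
cp_water = 4186      # specific heat of water, J/(kg*K)
delta_t_k = 10       # ASSUMED coolant temperature rise across the rack, K

mass_flow = P_rack_w / (cp_water * delta_t_k)  # kg/s
litres_per_min = mass_flow * 60                # water is ~1 kg per litre

print(f"{mass_flow:.1f} kg/s ≈ {litres_per_min:.0f} L/min per rack")
# -> ~14.3 kg/s ≈ ~860 L/min per rack
```

Moving water at that rate to every rack means new piping, pumps, and coolant distribution throughout the building, which is why the data centre has to be rebuilt rather than just the racks swapped.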