4 Comments
Tectonyx

Cooling is likely the real bottleneck. Air → liquid → immersion → maybe cryogenic. As GPT-likes scale and inference workloads run longer, the cooling load only compounds.

Paul Mah

The funny thing is, it's not actually the technology. It's the business angle — specifically, speed to market and how easy it is to get more partners on board. So it'll be direct-to-chip for a long time yet.

Paul Mah

It seems like Nvidia plans to stick with direct-to-chip cooling. I just spoke to a data centre building Nvidia's cloud — the real challenge is the data centre itself, which must be rebuilt (again) to support 600 kW racks.

AI Networking & Infrastructure

NVIDIA Rubin Platform is an AI Supercomputer with Six New Chips: https://www.naddod.com/ai-insights/nvidia-rubin-platform-ai-supercomputer-with-six-new-chips