AI is reversing the cloud’s move toward modular infrastructure: the scarce product is now a pre-integrated rack with guaranteed memory, network, and latency that delivers tokens predictably. The live working set spans model weights, KV cache, tool outputs, routing state, and multimodal preprocessing across HBM, DRAM, flash, and interconnects, so **topology becomes...