TorchTPU: Running PyTorch Natively on TPUs at Google Scale
Summary
TorchTPU brings native PyTorch execution to Google's TPUs, emphasizing usability, portability, and performance. The post details an eager-first execution model, a static-graph path via TorchDynamo and XLA/StableHLO, and support for distributed training, along with a 2026 roadmap that includes a public release, Helion integration, and improved dynamic-shape handling.
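The static-graph path mentioned above relies on TorchDynamo's standard custom-backend hook: Dynamo captures Python code into an FX graph and hands it to a backend for compilation. The sketch below uses only public PyTorch APIs; the `tpu_backend` name and the lowering comment are illustrative assumptions, since the post does not show TorchTPU's actual backend code.

```python
import torch

def tpu_backend(gm: torch.fx.GraphModule, example_inputs):
    # TorchDynamo passes the captured static graph here. A real TorchTPU
    # backend (hypothetical in this sketch) would lower `gm` to
    # StableHLO/XLA for TPU execution; we simply run it eagerly.
    return gm.forward

@torch.compile(backend=tpu_backend)
def f(x):
    # Traced by TorchDynamo into a single FX graph.
    return torch.relu(x) + 1
```

Calling `f` the first time triggers graph capture and hands the graph to the backend; subsequent calls with compatible inputs reuse the compiled artifact.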