DigiNews

Tech Watch by Johan Denoyer


Occupancy Math on the AMD MI355X: A From-First-Principles Guide

Quality: 9/10 Relevance: 9/10

Summary

This post is a from-first-principles walkthrough of occupancy on the AMD MI355X (CDNA4). It covers the four resource limiters (VGPRs, SGPRs, LDS, and workgroup/barrier slots), shows how to compute the occupancy ceiling by hand, and explains how allocation granularity and per-SIMD versus per-CU budgeting affect the result. Using MXFP8 GEMM examples, the author shows that maximizing occupancy is not always optimal and argues for ILP-based strategies to keep the matrix core fed, closing with practical measurement guidance.
