DigiNews

Tech Watch by Johan Denoyer


Getting peak TOPS on a Ryzen AI 7 350 NPU

Quality: 9/10 Relevance: 9/10

Summary

The article analyzes the NPU in the Ryzen AI 7 350 (XDNA2, AIE-MLv2) and investigates how close an 8x8x8x8 int8 matrix multiply can get to peak TOPS on a 32-tile compute array clocked at ~1.8 GHz. It explains the architecture, the SIMD/MAC operations, and the data paths, then walks through a peak-TOPS implementation built with MLIR/IRON tooling (mlir-aie, Peano) and a C++ kernel. The reported result is ~56 TOPS at ~95% efficiency, which offers practical insight into benchmarking AI hardware and the development workflow around it.
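As a rough sanity check on those figures, the peak-throughput arithmetic can be sketched in a few lines of Python. The tile count, clock, and measured ~56 TOPS come from the summary above. The figure of 512 int8 MACs per tile per cycle is an assumption that is not stated in the summary (it corresponds to one 8x8x8 block multiply per cycle), and the function names are hypothetical.

```python
def peak_tops(tiles: int, macs_per_cycle: int, clock_hz: float) -> float:
    """Theoretical peak in TOPS, counting each MAC as 2 ops (multiply + add)."""
    return tiles * macs_per_cycle * 2 * clock_hz / 1e12


def efficiency(measured_tops: float, peak: float) -> float:
    """Fraction of the theoretical peak that was actually achieved."""
    return measured_tops / peak


TILES = 32             # compute tiles, from the summary
CLOCK_HZ = 1.8e9       # ~1.8 GHz, from the summary
MACS_PER_CYCLE = 512   # ASSUMPTION: int8 MACs per tile per cycle (8 * 8 * 8)
MEASURED_TOPS = 56.0   # ~56 TOPS, from the summary

peak = peak_tops(TILES, MACS_PER_CYCLE, CLOCK_HZ)
eff = efficiency(MEASURED_TOPS, peak)

print(f"theoretical peak: {peak:.2f} TOPS")
print(f"efficiency: {eff:.1%}")
```

Under that assumption the theoretical peak comes out just under 59 TOPS, so ~56 TOPS measured is consistent with the ~95% efficiency the summary quotes.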

🚀 Service built by Johan Denoyer