AVX2 is slower than SSE2-4.x under Windows ARM emulation
Summary
A RemObjects blog post benchmarks AVX2 against SSE2-4.x under Prism, the x86 emulator in Windows on ARM, and finds that emulated AVX2 code is significantly slower: by geometric mean it reaches only about two-thirds of the performance of the SSE2-4.x code. The article describes the benchmarking approach (21 math operations, each normalised against a baseline) and discusses possible causes, including the mismatch between AVX2's 256-bit vectors and NEON's 128-bit registers and differences in how well the emulator optimizes each instruction set. It concludes that performance-critical apps targeting Windows on ARM should ship a native ARM build rather than rely on AVX2 emulation.
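To make the headline figure concrete, here is a minimal sketch of how a geometric mean over normalised per-operation results can be computed. The timings, the function names, and the choice of normalising as baseline time divided by candidate time are all assumptions for illustration; the summary does not state the article's exact normalisation or its raw numbers.

```python
import math


def normalised_ratios(baseline_times, candidate_times):
    """Relative performance per operation: baseline time / candidate time.

    A value below 1.0 means the candidate is slower than the baseline.
    """
    if len(baseline_times) != len(candidate_times):
        raise ValueError("both runs must cover the same operations")
    return [b / c for b, c in zip(baseline_times, candidate_times)]


def geometric_mean(ratios):
    """Geometric mean of positive ratios, computed in log space."""
    if not ratios:
        raise ValueError("need at least one ratio")
    if any(r <= 0 for r in ratios):
        raise ValueError("ratios must be positive")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical timings in seconds for three operations (not data from the article).
sse_times = [1.0, 2.0, 4.0]   # assumed SSE2-4.x baseline
avx2_times = [2.0, 2.0, 8.0]  # assumed AVX2 results under emulation

ratios = normalised_ratios(sse_times, avx2_times)
score = geometric_mean(ratios)
print(f"AVX2 relative to SSE2-4.x: {score:.2f}x")
```

The geometric mean is the usual choice for averaging ratios like these because it treats a 2x slowdown and a 2x speedup as cancelling out, which an arithmetic mean does not.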