We hid backdoors in ~40MB binaries and asked AI + Ghidra to find them
Summary
The article describes the BinaryAudit benchmark, which evaluates how well AI agents detect backdoors in compiled binaries using open-source tools such as Ghidra and Radare2. It reports mixed results: moderate success on small binaries, but a significant number of false positives. It also discusses limitations, tool gaps, and future directions for making AI-assisted binary analysis practical for security work.