Our evaluation of Claude Mythos Preview’s cyber capabilities
Summary
AISI evaluates Claude Mythos Preview’s cybersecurity capabilities and reports improved performance on capture-the-flag tasks and multi-step attack simulations, including autonomous completion of portions of a 32-step corporate network attack. The evaluation notes the limitations of its test environments, emphasizes that basic security practices remain important for organizations, and highlights the dual-use nature of AI in security.