GPT-5.5 matches heavily hyped Mythos Preview in new cybersecurity tests
Summary
An independent evaluation by the UK AI Security Institute finds that GPT-5.5 performs at a similar level to Mythos Preview on cybersecurity benchmarks, suggesting that the hype around a single model may reflect broader advances in long-horizon reasoning rather than a model-specific breakthrough. In the tests, GPT-5.5 achieved a 71.4% Expert score (versus 68.6% for Mythos Preview) and solved a Rust disassembler task in 10:22 at about $1.73 per 1k API calls, but it still failed the Cooling Tower scenario, which shows that its capabilities remain limited. The piece also discusses marketing rhetoric, limited releases, and the OpenAI Trusted Access for Cyber program for defensive research.