Show HN: Cua-Bench – a benchmark for AI agents in GUI environments

January 26, 2026 at 17:46

Quality: 7/10 Relevance: 9/10

Summary

Cua-Bench is a benchmark suite within the open-source Cua platform for evaluating AI agents that interact with graphical user interfaces. It provides RL environments (OSWorld, ScreenSpot, Windows Arena) and supports exporting agent trajectories for training. The project ships with setup and run instructions and architecture visuals, and is released under the permissive MIT license.
