Can LLMs model real-world systems in TLA+?
Summary
This SIGOPS blog post analyzes whether LLMs can model real-world distributed systems using TLA+, via the SysMoBench benchmark. It highlights that LLMs tend to produce syntactically correct TLA+ specs but struggle with conformance and invariants, often reciting textbook protocols rather than accurately reflecting actual system behavior, as shown with examples like Raft, ZooKeeper, and Etcd. The piece describes the Transition Validation approach and discusses future directions, including Specula and broader benchmarking efforts.