DigiNews

Tech Watch by Johan Denoyer


Human Judgment as a Specification

Quality: 8/10 Relevance: 9/10

Summary

The Brown PLT blog post argues that as generative AI enters programming, formal methods are needed to check that AI-generated solutions match what the user intended. The hard part is turning informal prose into a formal specification, and the post argues for keeping humans in the loop to guard against misinterpretation and automation bias. It introduces PICK, a tool that takes a prompt (e.g., a request for a regex that matches dates), returns several plausible candidates, and shows concrete strings on which those candidates disagree, asking users to upvote or downvote them. PICK is demonstrated in three domains (regular expressions, linear temporal logic, and attribute-based access control) using the same algorithm: generate candidates, sample the differences between them, present those differences as scenarios, update scores from the user's votes, and either converge on a candidate or admit defeat. The workflow requires that the domain be closed under negation and intersection, so that the difference between two candidates (what one accepts and the other rejects) is itself expressible, and that this difference can be sampled. The result is a spec-elucidation process in which human judgments reveal implicit intent. The authors argue that this is a meaningful and moderate human-in-the-loop workflow: the judgments serve as an independent witness to user intent, help catch mismatches between prompts and outcomes, and remain valuable even as models improve. For readers who want to explore further, they point to an ECOOP 2026 paper and a Pick-regex VS Code extension.
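To make the loop concrete, here is a minimal Python sketch of the generate / sample / judge / score cycle for the regex domain. It is an illustration under stated assumptions, not PICK's implementation: the names `accepts`, `find_scenario`, and `pick` are hypothetical, the candidate regexes are hard-coded rather than produced by a model, and differences are found by brute force over a small fixed pool of strings instead of by sampling the formal difference of two languages.

```python
import re
from itertools import combinations


def accepts(pattern, text):
    """True if the regex matches the whole string."""
    return re.fullmatch(pattern, text) is not None


def find_scenario(alive, pool, asked):
    """Return an unasked pool string on which two surviving candidates disagree.

    Assumption: a brute-force stand-in for sampling the difference
    (A and not B) of two candidates.
    """
    for a, b in combinations(alive, 2):
        for s in pool:
            if s not in asked and accepts(a, s) != accepts(b, s):
                return s
    return None


def pick(candidates, pool, judge):
    """Narrow candidates using human votes; return the winner or None (defeat)."""
    scores = {c: 0 for c in candidates}
    alive = list(candidates)
    asked = set()
    while len(alive) > 1:
        scenario = find_scenario(alive, pool, asked)
        if scenario is None:
            return None  # survivors cannot be told apart on this pool
        asked.add(scenario)
        wanted = judge(scenario)  # human vote: should this string match?
        for c in alive:
            scores[c] += 1 if accepts(c, scenario) == wanted else -1
        # Keep only candidates that agreed with every judgment so far.
        alive = [c for c in alive if scores[c] == len(asked)]
    return alive[0] if alive else None


# Hypothetical candidates for the prompt "a regex for dates".
CANDIDATES = [
    r"\d{4}-\d{2}-\d{2}",
    r"\d{2}/\d{2}/\d{4}",
    r"\d{4}-\d{1,2}-\d{1,2}",
]
POOL = ["2024-01-05", "2024-1-5", "01/05/2024", "hello"]


def user_votes(s):
    """Simulated user whose intent is year-first dates, leading zeros optional."""
    return s in {"2024-01-05", "2024-1-5"}


winner = pick(CANDIDATES, POOL, user_votes)
```

In this run the simulated user is asked about only two strings, `2024-01-05` and `2024-1-5`, before a single candidate remains. When the surviving candidates agree on every string in the pool, the sketch returns `None`, which corresponds to the "admit defeat" outcome in the summary.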
