The famous o3 "GeoGuessr" prompt did not work

May 21, 2026 at 08:52

Quality: 8/10 Relevance: 9/10

Summary

This post evaluates the geolocation abilities of OpenAI's o3 using a GeoGuessr-style prompt. It compares a default prompt with the specialized prompt across 200 images and finds that the default prompt often performs as well as or better than the specialized one. The post argues that benchmarks are needed to separate hype from actual capability, and it discusses how prompt engineering can mislead without rigorous evaluation.

Tags: LLM & Prompting, AI News
