Chess engines do weird stuff
Summary
The article surveys how modern chess engines use distillation, runtime adaptation, and gradient-free optimization (SPSA) to improve playing strength with less training. It contrasts search-driven strength gains with model-based improvements, highlights runtime distillation as a practical technique, and notes architectural tweaks such as smolgen. It closes with transferable lessons for AI and automation: data efficiency, adaptive evaluation, and parameter tuning without full gradient-based retraining.
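To make the gradient-free tuning idea concrete, here is a minimal, generic SPSA (simultaneous perturbation stochastic approximation) sketch. It is not the article's implementation; the function names, gain schedule, and loss are illustrative. The key property is that each iteration estimates a descent direction for all parameters from just two loss evaluations, which is why engine developers use it to tune search parameters where no gradient is available.

```python
import random

def spsa_minimize(loss, theta, iters=200, a=0.1, c=0.1):
    """Minimal SPSA sketch: perturb all parameters at once with a
    random +/-1 vector, estimate a directional derivative from two
    loss evaluations, and step against it. Gain-decay exponents
    (0.602, 0.101) follow the commonly cited defaults."""
    theta = list(theta)
    for k in range(1, iters + 1):
        ak = a / k ** 0.602          # step-size gain, decays over time
        ck = c / k ** 0.101          # perturbation size, decays slowly
        delta = [random.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + ck * d for t, d in zip(theta, delta)]
        minus = [t - ck * d for t, d in zip(theta, delta)]
        # Two evaluations approximate the gradient along delta.
        g = (loss(plus) - loss(minus)) / (2.0 * ck)
        theta = [t - ak * g / d for t, d in zip(theta, delta)]
    return theta

if __name__ == "__main__":
    random.seed(0)
    # Toy stand-in for "engine strength as a function of parameters":
    # a quadratic bowl with its optimum at (1, 1).
    quad = lambda x: sum((xi - 1.0) ** 2 for xi in x)
    tuned = spsa_minimize(quad, [0.0, 0.0])
    print(quad(tuned))
```

In real engine tuning the loss would be a noisy match result (e.g. score over a batch of games) rather than a closed-form function; SPSA tolerates that noise because the two evaluations are averaged over many iterations.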