Hamilton-Jacobi-Bellman Equation: Reinforcement Learning and Diffusion Models
Summary
The article explains why the Hamilton-Jacobi-Bellman (HJB) equation is the continuous-time counterpart of Bellman's equation, extends it to Itô diffusion processes, and shows how to solve the resulting control problem with neural policy iteration. It also connects diffusion models to stochastic optimal control, using two concrete benchmarks (stochastic LQR and the Merton portfolio problem) and code snippets that illustrate generator computations and policy updates.
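To give a flavor of what a generator computation involves, here is a minimal sketch, not the article's own code. It assumes a one-dimensional Itô diffusion dX = b(X) dt + sigma(X) dW, whose infinitesimal generator acts on a smooth function f as (L f)(x) = b(x) f'(x) + 0.5 sigma(x)^2 f''(x). The helper name `generator`, the Ornstein-Uhlenbeck drift, and the values of `theta` and `sigma` are all hypothetical choices made for illustration; derivatives are approximated with central finite differences rather than automatic differentiation.

```python
def generator(f, drift, diffusion, x, h=1e-3):
    """Approximate (L f)(x) = b(x) f'(x) + 0.5 * sigma(x)^2 * f''(x).

    Hypothetical helper: uses central finite differences with step h.
    """
    df = (f(x + h) - f(x - h)) / (2.0 * h)               # first derivative f'(x)
    d2f = (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)   # second derivative f''(x)
    return drift(x) * df + 0.5 * diffusion(x) ** 2 * d2f


# Illustrative (assumed) Ornstein-Uhlenbeck process: dX = -theta * X dt + sigma dW
theta, sigma = 1.5, 0.8


def f(x):
    return x * x          # test function f(x) = x^2


def drift(x):
    return -theta * x     # b(x)


def diffusion(x):
    return sigma          # constant sigma(x)


x0 = 0.7
approx = generator(f, drift, diffusion, x0)

# Closed form for this choice of f: (L f)(x) = -2 * theta * x^2 + sigma^2
exact = -2.0 * theta * x0 ** 2 + sigma ** 2

print(f"finite-difference generator: {approx:.6f}")
print(f"closed-form generator:       {exact:.6f}")
```

For a quadratic test function, central differences are exact up to floating-point rounding, so the two printed values should agree closely. In a neural setting one would typically replace the finite differences with automatic differentiation of a value network, but that choice is outside what this sketch assumes.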