Glossary Entry

Reinforcement Learning

A family of methods where an agent learns by taking actions, receiving rewards, and improving behavior over time.

RL Decision Making

Reinforcement learning is different from standard supervised learning because the system learns through interaction rather than from a fixed table of correct labels. The agent explores, receives rewards, and updates its behavior to improve long-run outcomes.

That framing underpins the bandits, policy-gradient, actor-critic, and RLHF-adjacent material across the blog. Once you can identify the agent, environment, action, and reward loop, those posts line up much more clearly.