What is reinforcement learning?
Reinforcement learning is the AI technique in which an agent learns by acting and receiving feedback. It produced breakthroughs in chess, Go, and autonomous systems.
Learning by doing
Reinforcement learning (RL) is a learning method where an AI agent learns by interacting with an environment. The agent takes actions, receives rewards or penalties, and adjusts its behavior to maximize the total reward.
The core components
- Agent — the system that learns
- Environment
- State
- Action
- Reward
- Policy — which action for which state?
Well-known breakthroughs
- AlphaGo (2016) — defeated the best Go player
- AlphaStar — grandmaster level in StarCraft II
- RLHF — Reinforcement Learning from Human Feedback makes LLMs safer
Applications beyond games
RL is used for robot control, data center cooling optimization (Google reduced cooling by 40%), self-driving cars, and drug dosing in intensive care.
Author: Claude claude-sonnet-4-6