
SARSA Algorithm

[ˈsɑːsə ˈælɡərɪðəm]
Machine Learning
Last updated: December 9, 2024

Definition

An on-policy reinforcement learning algorithm that updates Q-values from (state, action, reward, next state, next action) transitions.

Detailed Explanation

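SARSA (State-Action-Reward-State-Action) is an on-policy temporal difference learning algorithm that learns Q-values, which are estimates of the expected return from taking a given action in a given state. Unlike Q-learning, it bootstraps from the next action actually chosen by the current policy rather than from the maximum Q-value in the next state. Because its estimates include the cost of the policy's own exploratory actions, SARSA tends to learn more conservative behavior than Q-learning when exploration can lead to large penalties, for example by keeping a wider margin from hazardous states.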
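
The update described above can be written as Q(s, a) ← Q(s, a) + α [r + γ Q(s', a') − Q(s, a)], where a' is the action the policy actually takes in s'. The following is a minimal Python sketch of that rule, not a reference implementation. The five-state corridor environment, the epsilon-greedy policy, the hyperparameter values, and all function names are assumptions chosen for illustration. A Q-learning target is included only to show where the two algorithms differ.

```python
import random
from collections import defaultdict

# Hypothetical toy environment: a corridor of states 0..4.
ACTIONS = (-1, +1)   # step left or step right
GOAL = 4             # reaching state 4 ends the episode with reward 1


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily (random tie-break)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])


def sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma, done):
    """On-policy update: bootstrap from the action actually chosen in s_next."""
    target = r if done else r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (target - q[(s, a)])


def q_learning_target(q, r, s_next, gamma):
    """For contrast only: Q-learning bootstraps from the best next action."""
    return r + gamma * max(q[(s_next, a)] for a in ACTIONS)


def step(state, action):
    """Move within the corridor; return (next_state, reward, done)."""
    nxt = min(max(state + action, 0), GOAL)
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, max_steps=200, seed=0):
    """Run tabular SARSA on the corridor and return the learned Q-table."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        s = 0
        a = epsilon_greedy(q, s, epsilon, rng)
        for _ in range(max_steps):
            s_next, r, done = step(s, a)
            # Choose the next action first, then use it in the update.
            a_next = epsilon_greedy(q, s_next, epsilon, rng)
            sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma, done)
            if done:
                break
            s, a = s_next, a_next
    return q
```

Calling `train()` returns a dictionary keyed by (state, action) pairs. The key design point is that `a_next` is sampled before the update and then actually executed on the following step, which is what makes the method on-policy.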

Use Cases

Robot navigation, game AI, process control, resource management

Related Terms