Vocab
1092 entries

John D. Williams

(2 articles)
Temporal Difference Learning
1988

A reinforcement learning method that updates predictions using the difference between successive predictions, rather than relying solely on the error against the final outcome.

Generality: 775
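
As a rough illustration of the definition above, here is a minimal sketch of the tabular TD(0) update, assuming a hypothetical three-state chain with a reward of 1 on the final step. The function name `td0_update`, the step size `alpha`, the discount `gamma`, and the toy episode are all assumptions made for this example, not part of the entry.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=1.0, terminal=False):
    """Apply one TD(0) update in place and return the TD error."""
    # Bootstrap from the next state's current estimate; a terminal step has no successor.
    bootstrap = 0.0 if terminal else values[next_state]
    # TD error: the gap between the one-step target and the current prediction.
    td_error = reward + gamma * bootstrap - values[state]
    values[state] += alpha * td_error
    return td_error


# Hypothetical chain 0 -> 1 -> 2 -> end, with reward 1.0 only on the last transition.
values = [0.0, 0.0, 0.0]
episode = [(0, 0.0, 1, False), (1, 0.0, 2, False), (2, 1.0, None, True)]

for _ in range(200):
    for state, reward, next_state, terminal in episode:
        td0_update(values, state, reward, next_state, terminal=terminal)

print(values)  # each estimate moves toward 1.0, the undiscounted return
```

Each state is corrected from its successor's estimate as soon as a transition is seen, so the reward at the end of the chain propagates backward over repeated episodes without any update waiting for the final outcome.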

Policy Gradient Algorithm
1992

A type of reinforcement learning (RL) algorithm that optimizes the policy directly by computing the gradient of expected reward with respect to the policy parameters.

Generality: 805
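
To make the definition above concrete, here is a minimal sketch of one policy gradient method, a REINFORCE-style update, assuming a hypothetical two-armed bandit with fixed rewards and a softmax policy. The function names, reward values, learning rate, and seed are assumptions chosen for illustration, and the sketch omits refinements such as a baseline.

```python
import math
import random


def softmax(logits):
    """Convert policy parameters (logits) into action probabilities."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(rewards, steps=2000, learning_rate=0.1, seed=0):
    """Train a softmax policy on a bandit with a sampled policy gradient."""
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)  # one policy parameter per action
    for _ in range(steps):
        probs = softmax(theta)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]
        for k in range(len(theta)):
            # Gradient of log pi(action) with respect to theta[k] for a softmax policy.
            grad_log_prob = (1.0 if k == action else 0.0) - probs[k]
            # Step the parameters along reward * grad log pi (gradient ascent).
            theta[k] += learning_rate * reward * grad_log_prob
    return softmax(theta)


# Hypothetical bandit: arm 0 always pays 0.0, arm 1 always pays 1.0.
final_probs = reinforce_bandit(rewards=[0.0, 1.0])
print(final_probs)  # probability mass shifts toward the better-paying arm
```

No value function is learned here: the policy parameters are adjusted directly, using sampled actions and rewards to estimate the gradient of expected reward.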