Vocab
1092 entries

Sergey Levine

(9 articles)
Policy Gradient Algorithm
1992

Type of RL algorithm that optimizes the policy directly by computing gradients of expected rewards with respect to policy parameters.

Generality: 805
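
As a rough illustration of the definition above, here is a minimal sketch of a Monte Carlo (REINFORCE-style) policy-gradient estimate for a softmax policy on a multi-armed bandit. The function names, the bandit setup, the unit-variance Gaussian rewards, and the hyperparameters are all assumptions chosen for illustration; they do not come from the entry itself.

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def reinforce_gradient(theta, actions, returns):
    """Score-function estimate of the gradient of expected reward.

    theta:   (n_actions,) logits of a softmax policy (hypothetical parameterization)
    actions: sampled action indices
    returns: observed reward for each sampled action
    """
    probs = softmax(theta)
    grad = np.zeros_like(theta, dtype=float)
    for a, g in zip(actions, returns):
        # For softmax logits, grad of log pi(a) is onehot(a) - probs.
        score = -probs.copy()
        score[a] += 1.0
        grad += g * score
    return grad / len(actions)

def train_bandit(mean_rewards, steps=500, lr=0.1, batch=16, seed=0):
    """Gradient ascent on expected reward for a toy bandit (assumed setup)."""
    rng = np.random.default_rng(seed)
    mean_rewards = np.asarray(mean_rewards, dtype=float)
    theta = np.zeros(len(mean_rewards))
    for _ in range(steps):
        probs = softmax(theta)
        actions = rng.choice(len(theta), size=batch, p=probs)
        rewards = rng.normal(mean_rewards[actions], 1.0)
        theta += lr * reinforce_gradient(theta, actions, rewards)
    return softmax(theta)
```

Calling `train_bandit([0.0, 1.0, 0.2])` should shift probability mass toward the arm with the highest mean reward. Practical implementations usually subtract a baseline from the returns to reduce variance, which this sketch omits.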

Policy Gradient
1992

Class of algorithms in RL that optimizes the parameters of a policy directly through gradient ascent on expected future rewards.

Generality: 675

Data Efficient Learning
2012

ML approach that requires less data to train a functional model.

Generality: 791

TRPO (Trust Region Policy Optimization)
2015

RL algorithm that keeps policy updates stable and reliable by optimizing within a trust region, which prevents drastic policy changes.

Generality: 635
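
For reference, the trust-region idea is commonly written as a constrained optimization problem. The notation below is an assumption, since the entry defines no symbols: $\pi_\theta$ is the policy, $\theta_{\text{old}}$ the parameters before the update, $A$ an advantage estimate, and $\delta$ the trust-region size.

```latex
\begin{aligned}
\max_{\theta}\quad
  & \mathbb{E}_{s,a \sim \pi_{\theta_{\text{old}}}}
    \left[ \frac{\pi_\theta(a \mid s)}{\pi_{\theta_{\text{old}}}(a \mid s)}\, A(s,a) \right] \\
\text{subject to}\quad
  & \mathbb{E}_{s}\!\left[ D_{\mathrm{KL}}\!\left(
      \pi_{\theta_{\text{old}}}(\cdot \mid s) \,\|\, \pi_\theta(\cdot \mid s)
    \right) \right] \le \delta
\end{aligned}
```

The KL constraint is what bounds how far a single update can move the policy.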

Imitation Learning
2016

AI technique where models learn to perform tasks by mimicking human behavior or strategies demonstrated in training data.

Generality: 850

Sample Efficiency
2016

Ability of an ML model to achieve high performance with a relatively small number of training samples.

Generality: 815

FSL (Few-Shot Learning)
2016

ML approach that enables models to learn and make accurate predictions from a very small dataset.

Generality: 575

Few Shot
2016

ML technique designed to recognize patterns and make predictions based on a very limited amount of training data.

Generality: 675

PPO (Proximal Policy Optimization)
2017

RL algorithm that aims to balance ease of implementation, sample efficiency, and reliable performance by using a simpler but effective update method for policy optimization.

Generality: 670
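
The "simpler but effective update" in PPO is usually the clipped surrogate objective. The sketch below assumes log-probabilities and advantage estimates are already available as arrays; the function name and the default `clip_eps=0.2` are illustrative choices, not something stated in the entry.

```python
import numpy as np

def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped surrogate objective (to be maximized), averaged over samples.

    logp_new:   log pi_theta(a|s) under the policy being optimized
    logp_old:   log pi_theta_old(a|s) under the policy that collected the data
    advantages: advantage estimates for the same (s, a) pairs
    """
    ratio = np.exp(np.asarray(logp_new, dtype=float) - np.asarray(logp_old, dtype=float))
    adv = np.asarray(advantages, dtype=float)
    unclipped = ratio * adv
    # Clipping the ratio removes the incentive to move it outside [1 - eps, 1 + eps].
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    # The elementwise minimum makes the objective a pessimistic bound.
    return float(np.mean(np.minimum(unclipped, clipped)))
```

With a positive advantage, the objective stops rewarding probability ratios above `1 + clip_eps`; with a negative advantage, it stops rewarding ratios below `1 - clip_eps`. A full implementation would typically add value-function and entropy terms, which are left out here.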