Vocab

Philipp Moritz

(2 articles)
TRPO (Trust Region Policy Optimization)
2015

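Policy-gradient RL algorithm that keeps policy updates stable by maximizing a surrogate objective only within a trust region, a bound on the KL divergence between the old and new policies, which prevents any single update from changing the policy drastically.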

Generality: 635
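
The trust-region idea in the TRPO entry can be illustrated with a minimal sketch for a single state with a discrete action set. It covers only the acceptance test (KL divergence within a bound, surrogate objective improved) applied through a backtracking line search. The function names, the `max_kl` and `shrink` parameters, and the toy numbers are assumptions made for illustration. The full algorithm also computes the proposed step with a natural-gradient direction, which is not shown here.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def surrogate_advantage(old_probs, new_probs, advantages):
    """Expected advantage under the old policy, reweighted by the ratio new/old."""
    return sum(po * (pn / po) * a
               for po, pn, a in zip(old_probs, new_probs, advantages))


def backtracking_update(old_probs, proposed_probs, advantages,
                        max_kl=0.01, shrink=0.5, max_steps=10):
    """Shrink the proposed change until it fits inside the trust region.

    Hypothetical helper: interpolates between the old and the proposed
    action probabilities and accepts the first candidate whose KL divergence
    from the old policy is at most max_kl and whose surrogate objective
    improves on the old policy's. Falls back to the old policy otherwise.
    """
    baseline = surrogate_advantage(old_probs, old_probs, advantages)
    step = 1.0
    for _ in range(max_steps):
        candidate = [po + step * (pp - po)
                     for po, pp in zip(old_probs, proposed_probs)]
        within_region = kl_divergence(old_probs, candidate) <= max_kl
        improves = surrogate_advantage(old_probs, candidate, advantages) > baseline
        if within_region and improves:
            return candidate
        step *= shrink
    return list(old_probs)


# Toy example: the proposed policy moves too far, so the step is shrunk.
old = [0.5, 0.5]
proposed = [0.9, 0.1]
advantages = [1.0, -1.0]
new = backtracking_update(old, proposed, advantages)
```

In this toy case the proposed jump to `[0.9, 0.1]` violates the KL bound, so the update is repeatedly halved until the candidate policy stays within the trust region while still improving the surrogate objective.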

PPO (Proximal Policy Optimization)
2017

RL algorithm that aims to balance ease of implementation, sample efficiency, and reliable performance by replacing TRPO's constrained update with a clipped surrogate objective that can be optimized with ordinary first-order gradient methods.

Generality: 670
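
The simpler update in the PPO entry is usually expressed as a clipped surrogate objective, sketched below in plain Python. The function name `ppo_clip_objective`, the `clip_eps` default of 0.2, and the sample numbers are assumptions chosen for illustration. A practical implementation would compute this over batches of log-probabilities in an autodiff framework and maximize it by gradient ascent.

```python
def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch of samples.

    ratios:     new_policy_prob / old_policy_prob for each sampled action
    advantages: advantage estimate for each sampled action
    clip_eps:   how far the ratio may move from 1 before it is clipped
    """
    terms = []
    for ratio, advantage in zip(ratios, advantages):
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Taking the minimum removes any incentive to push the ratio
        # beyond the clipping range in the direction that looks beneficial.
        terms.append(min(ratio * advantage, clipped_ratio * advantage))
    return sum(terms) / len(terms)


# Toy batch: the first two ratios fall outside [0.8, 1.2] and are clipped.
objective = ppo_clip_objective([1.5, 0.5, 1.0], [1.0, -1.0, 2.0])
```

Clipping plays the role that the explicit KL constraint plays in TRPO: once the probability ratio leaves the range `[1 - clip_eps, 1 + clip_eps]` in the favorable direction, the objective stops rewarding further movement.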