Adaptive Representation for Policy Gradient
Author | : Ujjwal Das Gupta |
Total Pages | : 40 |
Release | : 2015 |
OCLC | : 918929732 |
Adaptive Representation for Policy Gradient was written by Ujjwal Das Gupta and released in 2015, with a total of 40 pages. Book excerpt: Much of the focus on finding good representations in reinforcement learning has been on learning complex non-linear predictors of value. Methods like policy gradient, which do not learn a value function and instead represent the policy directly, often need fewer parameters to learn good policies. However, they typically employ a fixed parametric representation that may not be sufficient for complex domains. This thesis introduces two algorithms that can learn an adaptive representation of policy: the Policy Tree algorithm, which learns a decision tree over different instantiations of a base policy, and the Policy Conjunction algorithm, which adds conjunctive features to any base policy that uses a linear feature representation. In both algorithms, policy gradient is used to grow the representation in a way that enables the maximum local increase in the expected return of the policy. Experiments show that these algorithms can choose genuinely helpful splits or features, and that they significantly improve upon the commonly used linear Gibbs softmax policy, which is chosen as the base policy.
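To make the base policy mentioned in the excerpt concrete, here is a minimal sketch of a linear Gibbs softmax policy and the gradient of its log-probability, which is the quantity policy gradient methods build on. It is an illustration under stated assumptions, not the thesis's implementation: the function names, the one-weight-vector-per-action layout, and the use of a product of two binary features as a "conjunction" are all hypothetical choices made here. The thesis's criterion for choosing which feature or split to add is not shown.

```python
import math


def gibbs_softmax(theta, features):
    """Action probabilities for a linear Gibbs (softmax) policy.

    theta    -- assumed layout: one weight vector per action (list of lists)
    features -- feature vector of the current state (list of floats)
    """
    scores = [sum(w * f for w, f in zip(row, features)) for row in theta]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def log_policy_gradient(theta, features, action):
    """Gradient of log pi(action | state) with respect to theta.

    For per-action weights, the row for action b is
    (1[b == action] - pi(b | state)) * features.
    """
    probs = gibbs_softmax(theta, features)
    return [
        [((1.0 if b == action else 0.0) - probs[b]) * f for f in features]
        for b in range(len(theta))
    ]


def add_conjunction(features, i, j):
    """Append the product of features i and j as a new feature.

    Assumption: features are binary, so the product acts as a logical AND.
    """
    return features + [features[i] * features[j]]


# Usage: two actions, two binary features, all weights initially zero.
theta = [[0.0, 0.0], [0.0, 0.0]]
state = [1.0, 0.0]
probs = gibbs_softmax(theta, state)
grad = log_policy_gradient(theta, state, action=0)
extended = add_conjunction([1.0, 1.0], 0, 1)
```

With all weights at zero the policy is uniform, so the gradient only moves the weights of features that are active in the current state. A grown representation would also need a new weight (for example, initialized to zero) for each appended feature.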