A Distributional Analogue to the Successor Representation
arXiv (Cornell University), 2024
Abstract
This paper contributes a new approach for distributional reinforcement learning which elucidates a clean separation of transition structure and reward in the learning process. Analogous to how the successor representation (SR) describes the expected consequences of behaving according to a given policy, our distributional successor measure (SM) describes the distributional consequences of this behaviour. We formulate the distributional SM as a distribution over distributions and provide theory connecting it with distributional and model-based reinforcement learning. Moreover, we propose an algorithm that learns the distributional SM from data by minimizing a two-level maximum mean discrepancy. Key to our method are a number of algorithmic techniques that are independently valuable for learning generative models of state. As an illustration of the usefulness of the distributional SM, we show that it enables zero-shot risk-sensitive policy evaluation in a way that was not previously possible.
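To make the phrase "two-level maximum mean discrepancy" concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify kernels or estimators, so this assumes Gaussian kernels at both levels and the biased (V-statistic) estimator; the function names `gaussian_kernel`, `mmd2`, and `two_level_mmd2` are hypothetical. The inner level compares two sets of state samples; the outer level compares two collections of such sets, treating each set as one sample from a distribution over distributions.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel matrix between rows of x (n, d) and rows of y (m, d)."""
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def mmd2(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between sample sets x and y."""
    k_xx = gaussian_kernel(x, x, bandwidth).mean()
    k_yy = gaussian_kernel(y, y, bandwidth).mean()
    k_xy = gaussian_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy

def two_level_mmd2(p_sets, q_sets, inner_bw=1.0, outer_bw=1.0):
    """Squared MMD between two collections of sample sets.

    p_sets has shape (a, n, d) and q_sets has shape (b, m, d). Each entry
    along the first axis is a set of samples standing in for one distribution.
    Assumption: the outer kernel is a Gaussian of the inner squared MMD.
    """
    def outer_kernel(u_sets, v_sets):
        d2 = np.array([[mmd2(u, v, inner_bw) for v in v_sets] for u in u_sets])
        return np.exp(-d2 / (2.0 * outer_bw ** 2))

    return (outer_kernel(p_sets, p_sets).mean()
            + outer_kernel(q_sets, q_sets).mean()
            - 2.0 * outer_kernel(p_sets, q_sets).mean())

# Usage: two collections of 4 sample sets each, drawn around different means.
rng = np.random.default_rng(0)
p_sets = rng.normal(loc=0.0, size=(4, 20, 2))
q_sets = rng.normal(loc=3.0, size=(4, 20, 2))
print(two_level_mmd2(p_sets, q_sets))
```

The discrepancy is zero when a collection is compared with itself and grows as the two collections separate, which is what makes it usable as a training loss for a generative model of such collections.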
Keywords
Capital Allocation