A Distributional Analogue to the Successor Representation
ICLR 2024
Authors affiliated with the Gatsby Unit and Google DeepMind
Abstract
This paper contributes a new approach for distributional reinforcement learning which elucidates a clean separation of transition structure and reward in the learning process. Analogous to how the successor representation (SR) describes the expected consequences of behaving according to a given policy, our distributional successor measure (SM) describes the distributional consequences of this behaviour. We formulate the distributional SM as a distribution over distributions and provide theory connecting it with distributional and model-based reinforcement learning. Moreover, we propose an algorithm that learns the distributional SM from data by minimizing a two-level maximum mean discrepancy. Key to our method are a number of algorithmic techniques that are independently valuable for learning generative models of state. As an illustration of the usefulness of the distributional SM, we show that it enables zero-shot risk-sensitive policy evaluation in a way that was not previously possible.
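To make the zero-shot claim concrete, here is a minimal sketch of how risk-sensitive evaluation could work once a distributional SM has been learned. This is an illustration under assumed shapes, not the paper's implementation: we stand in random Dirichlet draws for samples of the discounted state-occupancy vector that a learned distributional SM would produce, and the names `return_distribution` and `cvar` are hypothetical helpers. Given such occupancy samples, any reward vector supplied after training yields a full return distribution via dot products, to which a risk measure such as CVaR can be applied.

```python
import numpy as np

# Assumed setup: m samples of the random discounted state-occupancy vector
# over |S| states. Here they are synthetic Dirichlet draws scaled to total
# mass 1 / (1 - gamma); a learned distributional SM would supply real ones.
rng = np.random.default_rng(0)
n_states, n_samples, gamma = 5, 1000, 0.9
occupancies = rng.dirichlet(np.ones(n_states), size=n_samples) / (1 - gamma)

def return_distribution(occupancies, reward):
    """Zero-shot evaluation: each occupancy sample gives one return sample
    via a dot product with the reward vector (no further learning needed)."""
    return occupancies @ reward

def cvar(samples, alpha=0.1):
    """Conditional value-at-risk: mean of the worst alpha-fraction of returns."""
    k = max(1, int(alpha * len(samples)))
    return np.sort(samples)[:k].mean()

# A reward chosen only at evaluation time, after the SM has been learned.
reward = rng.standard_normal(n_states)
returns = return_distribution(occupancies, reward)
print(returns.mean(), cvar(returns, alpha=0.1))
```

The key point the sketch mirrors is the separation of transition structure and reward: the occupancy samples are learned once per policy, and swapping in a new reward or a new risk criterion requires only the cheap dot-product step above.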
Key words
reinforcement learning, distributional reinforcement learning, successor representation, successor measure, geometric horizon models, gamma models, risk-aware