Distributional Bellman Operators over Mean Embeddings

ICML (2024)

Abstract
We propose a novel algorithmic framework for distributional reinforcement learning, based on learning finite-dimensional mean embeddings of return distributions. We derive several new algorithms for dynamic programming and temporal-difference learning based on this framework, provide asymptotic convergence theory, and examine the empirical performance of the algorithms on a suite of tabular tasks. Further, we show that this approach can be straightforwardly combined with deep reinforcement learning, and obtain a new deep RL agent that improves over baseline distributional approaches on the Arcade Learning Environment.
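As a rough illustration of what a finite-dimensional mean embedding of a return distribution looks like, the sketch below averages a fixed feature map over sampled returns. The sigmoid feature map, the `anchors` and `slope` parameters, and the Gaussian sample of returns are all assumptions chosen for illustration; they are not taken from the paper, and this is not the paper's algorithm.

```python
import numpy as np


def feature_map(z, anchors, slope=1.0):
    """Hypothetical feature map: one sigmoid feature per anchor point."""
    z = np.asarray(z, dtype=float)
    # Broadcast returns against anchors: result has shape (..., len(anchors)).
    return 1.0 / (1.0 + np.exp(-slope * (z[..., None] - anchors)))


def mean_embedding(returns, anchors, slope=1.0):
    """Empirical mean embedding: average the features over sampled returns."""
    return feature_map(returns, anchors, slope).mean(axis=0)


# Illustrative data only: returns drawn from an assumed Gaussian.
rng = np.random.default_rng(0)
anchors = np.linspace(-2.0, 2.0, 5)
returns = rng.normal(loc=0.5, scale=1.0, size=10_000)

# A 5-dimensional summary of the whole return distribution.
embedding = mean_embedding(returns, anchors)
```

The design point is that a whole distribution is summarized by a fixed-length vector of expectations, so a learning algorithm can update that vector instead of the distribution itself.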
Keywords
Reinforcement Learning, Adaptive Dynamic Programming, Deep Learning