Offline Reinforcement Learning (A Survey on Offline Reinforcement Learning)


Author: 凱魯嘎吉 (cnblogs) http://www.cnblogs.com/kailugaji/

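    These notes are based on two survey papers on offline reinforcement learning: "A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems" and "Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems". They give a first look at offline RL: its concepts, its challenges, the related methods (outlined only briefly, not covered in detail), and possible future research directions. For more reinforcement learning content, see the post category "Reinforcement Learning".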

1. Introduction

1.1 Supervised Machine Learning, RL, and Off-policy RL

1.2 The Power of Offline RL

1.3 On-policy vs. Off-policy

1.4 On-policy, Off-policy, and Offline (Batch) RL

1.5 Imitation Learning, RL, and Offline RL

2. Challenges

3. Taxonomy

 

Illustration of the general structure of an offline RL algorithm

3.1 Policy Constraints

3.2 Importance Sampling

3.3 Regularization

3.4 Uncertainty Estimation

3.5 Model-based Methods

3.6 One-step Methods

3.7 Imitation Learning

    Imitation learning resources:

    Tian Xu (許天), Ziniu Li (李子牛), Yang Yu (俞揚), A Concise Tutorial on Imitation Learning (模仿學習簡潔教程), 2021. http://www.lamda.nju.edu.cn/xut/Imitation_Learning.pdf

    [RLChina 2021] Lecture 10: Frontiers of Reinforcement Learning (II), Yang Yu (俞揚): https://www.bilibili.com/video/BV1qM4y1L7w9?spm_id_from=333.999.0.0

3.8 Trajectory Optimization

4. Open Problems

5. References

[1] Rafael Figueiredo Prudencio, Marcos R. O. A. Maximo and Esther Luna Colombini. "A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems" (2022).

[2] Sergey Levine, Aviral Kumar, George Tucker and Justin Fu. "Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems" (2020).

[3] CS 285 Deep Reinforcement Learning https://rail.eecs.berkeley.edu/deeprlcourse/

[4] CS330 Fall 2021 Deep Multi-Task and Meta Learning https://cs330.stanford.edu/

[5] Offline (Batch) Reinforcement Learning: A Review of Literature and Applications https://danieltakeshi.github.io/2020/06/28/offline-rl/

[6] RL-Paper-notes https://github.com/2019ChenGong/RL-Paper-notes

[7] An Optimistic Perspective on Offline Reinforcement Learning https://offline-rl.github.io/

[8] Offline reinforcement learning benchmark (D4RL): https://github.com/rail-berkeley/d4rl

[9] [RLChina 2021] Lecture 9: Frontiers of Reinforcement Learning (I), Zongqing Lu (盧宗青): https://www.bilibili.com/video/BV1cQ4y1m7Nn?spm_id_from=333.999.0.0

[10] Offline Reinforcement Learning Resources, https://offlinerl.ai/

