mdp
Reinforcement Learning Chapter 02) Markov Decision Process
AI/Reinforcement Learning | 2023. 4. 12. 22:49
Table of Contents
Markov Property: State Transition Matrix, Markov Process (= Chain)
Markov Reward Process: (Discounted) Return, Value Function - State Value Function, Bellman Equation for MRPs
Markov Decision Process: Bellman Expectation Equation (Policies, Value Function), Bellman Optimality Equation (Optimal Policies, Optimal Value Function)

Content
1. Markov Property ⇒ "A state S is Mar..