(RL: Reinforcement Learning) Reinforcement Learning Fundamentals: Basic Components, Back propagation, Inverse RL, Policy Gradient
Contents: Basic Components (Actor; Critic; Network Training; Q-learning; Actor + Critic; A2C: Advantage Actor-Critic; A3C: Asynchronous Advantage Actor-Critic), Back propagation, Inverse RL, Policy Gradient (Policy; Example; Gradient)

Basic Components
Taking a video game as an example:
Actor: the joystick
Env: the game screen
Reward Funct