【MATLAB Reinforcement Learning Toolbox】Learning Notes -- Training an Agent in a Simulink Environment (Create Simulink Environment and Train Agent): creating the interface, building the DDPG agent, and training the network

Overview

Simulink makes it easy to build all kinds of dynamics and control models. By replacing the existing controller with an AI controller, you can reuse an existing model and improve it incrementally.

The focus of this section is how to bring in a Simulink model as the environment (env); the other steps were covered in previous posts.

Take the built-in water tank model, watertank, as the example, shown in the figure below:

With a PI controller, the control performance is as shown below:

After replacing the PI controller with a neural-network controller, the system architecture looks like this:

The replacement procedure is as follows:

(1) Delete the PID controller;

(2) Add an RL Agent block;

(3) Create the observation block:

The observation vector is [∫e dt, e, h]ᵀ, where h is the measured tank water level, e = r − h is the tracking error, and r is the reference (set-point) water level.

(4) Define the reward function;

(5) Define the termination condition. (A hedged sketch of the reward and termination logic follows this list.)
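The reward and termination details are given as figures in the original post. As a hedged sketch, the shipped watertank example rewards the agent with 10 when the tracking error is small and −1 otherwise, subtracts 100 when the episode is stopped early, and terminates the episode when the water level leaves the range [0, 20]. The logic in the Simulink reward/stop subsystems is roughly equivalent to the following MATLAB function (the function name is hypothetical; the thresholds follow the MathWorks example):

% Hedged sketch of the reward and termination logic in the watertank example.
% Function name is hypothetical; thresholds follow the MathWorks example.
function [reward,isDone] = watertankRewardAndStop(e,h)
% e: tracking error r - h; h: measured water level
isDone = (h <= 0) || (h >= 20);        % stop if the tank under- or overflows
reward = 10*(abs(e) < 0.1) ...         % bonus while tracking within 0.1
         - 1*(abs(e) >= 0.1) ...       % small penalty otherwise
         - 100*isDone;                 % large penalty on early termination
end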

Create the Interface

Observation specification

obsInfo = rlNumericSpec([3 1],...
    'LowerLimit',[-inf -inf 0]',...
    'UpperLimit',[inf inf inf]');
obsInfo.Name = 'observations';
obsInfo.Description = 'integrated error, error, and measured height';
numObservations = obsInfo.Dimension(1);

Action specification

actInfo = rlNumericSpec([1 1]);
actInfo.Name = 'flow';
numActions = actInfo.Dimension(1);

Build the environment interface [the key step]

env = rlSimulinkEnv('rlwatertank','rlwatertank/RL Agent',...
obsInfo,actInfo);

Define the reset function

env.ResetFcn = @(in)localResetFcn(in);
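The body of localResetFcn is not listed in these notes. A minimal sketch, assuming the block paths of the shipped rlwatertank model (the 'Desired Water Level' constant block and the integrator H inside the Water-Tank System subsystem are assumptions here), randomizes the set point and the initial water level at the start of each episode:

function in = localResetFcn(in)
% Randomize the reference (set-point) water level, kept inside (0,20).
blk = sprintf('rlwatertank/Desired \nWater Level');   % block path is an assumption
h = 3*randn + 10;
while h <= 0 || h >= 20
    h = 3*randn + 10;
end
in = setBlockParameter(in,blk,'Value',num2str(h));

% Randomize the initial water level in the same range.
h = 3*randn + 10;
while h <= 0 || h >= 20
    h = 3*randn + 10;
end
blk = 'rlwatertank/Water-Tank System/H';              % block path is an assumption
in = setBlockParameter(in,blk,'InitialCondition',num2str(h));
end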

Set the total simulation time and the sample time

Ts = 1.0;
Tf = 200;

Reset the random number seed (for reproducibility)

rng(0)

Create the DDPG Agent

See the previous post for details.

State path (critic)

statePath = [
featureInputLayer(numObservations,'Normalization','none','Name','State')
fullyConnectedLayer(50,'Name','CriticStateFC1')
reluLayer('Name','CriticRelu1')
fullyConnectedLayer(25,'Name','CriticStateFC2')];

Action path (critic)

actionPath = [
featureInputLayer(numActions,'Normalization','none','Name','Action')
fullyConnectedLayer(25,'Name','CriticActionFC1')];

Common path (merges the state and action paths)

commonPath = [
additionLayer(2,'Name','add')
reluLayer('Name','CriticCommonRelu')
fullyConnectedLayer(1,'Name','CriticOutput')];

Assemble the critic network

criticNetwork = layerGraph();
criticNetwork = addLayers(criticNetwork,statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
criticNetwork = connectLayers(criticNetwork,'CriticStateFC2','add/in1');
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');
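To double-check that the two input paths really meet at the addition layer, you can plot the assembled layer graph (an optional check, not in the original notes):

% Visualize the critic layer graph to verify the add/in1 and add/in2 connections.
figure
plot(criticNetwork)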

Critic representation options

criticOpts = rlRepresentationOptions('LearnRate',1e-03,'GradientThreshold',1);
critic = rlQValueRepresentation(criticNetwork,obsInfo,actInfo,'Observation',{'State'},'Action',{'Action'},criticOpts);
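As an optional sanity check (not in the original notes), the untrained critic can be evaluated on a random observation/action pair; getValue returns the scalar Q-value estimate:

% Evaluate the untrained critic on random inputs (values are meaningless before training).
obs = {rand(obsInfo.Dimension)};   % 3x1 observation sample
act = {rand(actInfo.Dimension)};   % 1x1 action sample (flow)
q0  = getValue(critic,obs,act)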

Actor network

actorNetwork = [
featureInputLayer(numObservations,'Normalization','none','Name','State')
fullyConnectedLayer(3, 'Name','actorFC')
tanhLayer('Name','actorTanh')
fullyConnectedLayer(numActions,'Name','Action')
];

Actor representation options

actorOptions = rlRepresentationOptions('LearnRate',1e-04,'GradientThreshold',1);
actor = rlDeterministicActorRepresentation(actorNetwork,obsInfo,actInfo,'Observation',{'State'},'Action',{'Action'},actorOptions);

Construct the final DDPG agent

agentOpts = rlDDPGAgentOptions(...
'SampleTime',Ts,...
'TargetSmoothFactor',1e-3,...
'DiscountFactor',1.0, ...
'MiniBatchSize',64, ...
'ExperienceBufferLength',1e6);
agentOpts.NoiseOptions.StandardDeviation = 0.3;
agentOpts.NoiseOptions.StandardDeviationDecayRate = 1e-5;
agent = rlDDPGAgent(actor,critic,agentOpts);
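Before training, it is worth confirming that the assembled agent returns an action of the expected size from a random observation; this check also appears in the MathWorks example:

% The untrained agent should return a single scalar flow command.
getAction(agent,{rand(obsInfo.Dimension)})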

Train the Network

maxepisodes = 5000;
maxsteps = ceil(Tf/Ts);
trainOpts = rlTrainingOptions(...
'MaxEpisodes',maxepisodes, ...
'MaxStepsPerEpisode',maxsteps, ...
'ScoreAveragingWindowLength',20, ...
'Verbose',false, ...
'Plots','training-progress',...
'StopTrainingCriteria','AverageReward',...
'StopTrainingValue',800);

Start training


trainingStats = train(agent,env,trainOpts);

During training, you can see that the main program keeps calling the Simulink model.
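Training can take a long time. As in the shipped example, the call to train can be guarded by a flag so that a pretrained agent is loaded instead; the MAT-file name below follows that example and is otherwise an assumption:

doTraining = false;   % set to true to retrain from scratch
if doTraining
    trainingStats = train(agent,env,trainOpts);
else
    load('WaterTankDDPG.mat','agent')   % pretrained agent shipped with the example
end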

 

Finally, validate the trained agent

simOpts = rlSimulationOptions('MaxSteps',maxsteps,'StopOnError','on');
experiences = sim(env,agent,simOpts);
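The experiences structure returned by sim holds time series for the observations, actions, and rewards. A hedged sketch for plotting the measured water level (assuming the channel name follows obsInfo.Name = 'observations' and the third observation channel is the height, as defined above):

% Extract the logged observation time series and plot the water level.
obsTS  = experiences.Observation.observations;   % timeseries with 3 channels
height = squeeze(obsTS.Data(3,1,:));             % third channel: measured height h
plot(obsTS.Time,height)
xlabel('Time (s)')
ylabel('Water level')
title('Closed-loop response of the trained DDPG agent')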

This yields the closed-loop response of the trained agent (the original result screenshots are not reproduced here).
