RLMessageHandler.communicate()
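Receiving is handled in one place. The logic in DSSim guarantees that the environment is received before the actions are received, so communicate() only needs to count the total number of action/env signals: once the number of received signals reaches that total, everything has been received.

    def rl_run(early_stop_event, episodes):
        rl_message_handler = RLMessageHandler(early_stop_event)
        # Register the names of the signals exchanged with DSSim.
        rl_message_handler.set_action_names(['action_a', 'action_b'])
        rl_message_handler.set_environment_names(['environment_a', 'environment_b'])
        for episode in range(episodes):
            print("\nRL starting episode ", episode)
            # Initial actions for this episode.
            action_a = 1
            action_b = 1
            while True:
                rl_message_handler.set_action('action_a', action_a)
                rl_message_handler.set_action('action_b', action_b)
                # Send the actions and wait for the environment from DSSim.
                dssim_terminate, environment = rl_message_handler.communicate()
                if dssim_terminate:
                    break
                # Echo the received environment back as the next actions.
                action_a = environment['environment_a']
                action_b = environment['environment_b']
                # Demonstrate an early stop during the second episode.
                if environment['environment_a'] == [[5]] and episode == 1:
                    print("\nstop!!!!!\n")
                    rl_message_handler.stop()
            # Reset the handler before the next episode.
            rl_message_handler.reset()
            print("RL finished episode ", episode, "\n")
        print("rl_run has finished")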
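The counting rule described for communicate() can be illustrated with a small sketch. This is not the actual RLMessageHandler implementation: the class `CountingReceiver`, its method `receive_all`, the `(name, value)` message format, and the use of `queue.Queue` as the transport are all assumptions made for illustration. It only shows the idea that, when the arrival order is guaranteed elsewhere, the receiver can stop as soon as the number of received signals equals the number of registered action and environment names.

```python
import queue


class CountingReceiver:
    """Hypothetical sketch of count-based receiving (not the real RLMessageHandler)."""

    def __init__(self, action_names, environment_names):
        self.action_names = list(action_names)
        self.environment_names = list(environment_names)
        # Total number of signals expected in one exchange.
        self.expected = len(self.action_names) + len(self.environment_names)

    def receive_all(self, inbox, timeout=1.0):
        """Read (name, value) pairs from inbox until the expected count is reached."""
        received = {}
        count = 0
        while count < self.expected:
            # Raises queue.Empty if nothing arrives within the timeout.
            name, value = inbox.get(timeout=timeout)
            received[name] = value
            count += 1
        return received


# Usage: environment signals arrive first, then the action signals (assumed order).
inbox = queue.Queue()
for message in [
    ('environment_a', [[1]]),
    ('environment_b', [[2]]),
    ('action_a', 'ack'),
    ('action_b', 'ack'),
]:
    inbox.put(message)

receiver = CountingReceiver(['action_a', 'action_b'],
                            ['environment_a', 'environment_b'])
result = receiver.receive_all(inbox)
```

Because only the count is checked, the sketch does not verify which names arrived; it relies on the sender to deliver each registered signal exactly once per exchange, which is the guarantee the text attributes to DSSim.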