ac_ltl_wsn

Temporal Motion and Communication Planning in Wirelessly Connected Environments via Actor-critic Reinforcement Learning

@INPROCEEDINGS{8264271,
  author={M. {Guo} and M. M. {Zavlanos}},
  booktitle={2017 IEEE 56th Annual Conference on Decision and Control (CDC)}, 
  title={Temporal task planning in wirelessly connected environments with unknown channel quality}, 
  year={2017},
  volume={},
  number={},
  pages={4161-4168},
  doi={10.1109/CDC.2017.8264271}}

Description

This package implements motion and communication control for a mobile robot tasked with gathering data in an environment of interest and transmitting these data to a data center. The task is specified as a high-level Linear Temporal Logic (LTL) formula that captures the data to be gathered at various regions of the workspace. The robot has a limited buffer to store the data, which needs to be transmitted to the data center before the buffer overflows. Communication between the robot and the data center takes place over a dedicated wireless network, to which the robot can upload data at rates that are uncertain and unknown.
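As a purely illustrative example of such a specification, the snippet below writes one possible data-gathering task as an LTL formula string: repeatedly gather data at two regions and upload the buffered data to the data center infinitely often. The proposition names and the formula itself are assumptions for illustration, not the labels used in this package.

# Hypothetical LTL task over atomic propositions 'gather_r1', 'gather_r2'
# and 'upload': repeatedly gather data at regions r1 and r2, and upload to
# the data center infinitely often so the buffer never overflows.
# ('[]' = always, '<>' = eventually)
task_formula = '([]<> gather_r1) && ([]<> gather_r2) && ([]<> upload)'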


Content

  • Wireless routing based on Linear Programming (LP), see the [wsn_routing] folder (a minimal routing LP sketch is given at the end of this section)

  • Policy synthesis given a fully known system model in the form of a Markov Decision Process (MDP)

    • Product automaton between the MDP and a Deterministic Rabin Automaton (DRA), based on [MDP_TG] (a sketch of this product construction is given at the end of this section).

    • Policy generated via LP, see [lp_policy.p]

    • Policy generated via actor-critic RL in the product, see [ac_policy.p]

    • Example log, see [log.txt]

  • Implementation of the least-squares temporal difference (LSTD) actor-critic method [ref1] [ref2] (a minimal sketch of one LSTD actor-critic iteration is given at the end of this section).

    • Task execution, workspace exploration, and parameterized-policy learning are all performed online and simultaneously, see [ac.py]
    • For a single critical segment, see [one_cri_seg_ac_learn.py]
    • For a given high-level discrete plan, see [ltl_ac_learn.py]
    • Indirect learning via simulated experience and direct learning via real experience, as in the snippet below:
from crm import build_crm
from ac import actor_critic

# load the roadmap annotated with wsn rate info, as a combined roadmap (crm)
crm = build_crm()

# set up the actor-critic learner
actor_critic_learner = actor_critic(crm, data_bound, quant_size,
                                    Ts, uncertainty_prob, clambda,
                                    Gamma, Beta, D)
actor_critic_learner.set_init_goal(new_init, new_goal)
actor_critic_learner.set_theta(theta)

# indirect learning via simulated experience
print '|||||||Indirect learning for %d episodes|||||||' % static_learn_episodes
indirect_learn_log = actor_critic_learner.complete_learn(static_learn_episodes,
                                                         mode='model')

# direct learning via real experience (robot moving)
print '|||||||Direct learning for 1 episode|||||||'
direct_learn_log = actor_critic_learner.one_episode_learn(gamma, beta,
                                                          mode='experiment')
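To illustrate the LP-based routing in the [wsn_routing] folder, the sketch below solves a minimal routing LP with scipy.optimize.linprog: maximize the robot's end-to-end upload rate subject to flow conservation at the relays and per-link rate limits. The network topology, capacities, and variable names are illustrative assumptions, not the exact formulation used in the package.

import numpy as np
from scipy.optimize import linprog

# Illustrative network: robot 'r', two relays 'a' and 'b', data center 'c'.
# Edge capacities stand in for the (estimated) link rates.
edges = [('r', 'a'), ('r', 'b'), ('a', 'c'), ('b', 'c'), ('a', 'b')]
capacity = np.array([3.0, 2.0, 2.5, 2.0, 1.0])   # per-edge rate limits

# Decision variables: flow x_e >= 0 on each edge.
# Objective: maximize total flow delivered to 'c' (linprog minimizes, so negate).
c_obj = np.array([0.0, 0.0, -1.0, -1.0, 0.0])

# Flow conservation at the relay nodes 'a' and 'b' (inflow = outflow).
A_eq = np.array([
    [1.0, 0.0, -1.0, 0.0, -1.0],   # node 'a'
    [0.0, 1.0, 0.0, -1.0, 1.0],    # node 'b'
])
b_eq = np.zeros(2)

res = linprog(c_obj, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0.0, cap) for cap in capacity])
print('optimal upload rate: %.2f' % -res.fun)
for (u, v), f in zip(edges, res.x):
    print('flow %s -> %s : %.2f' % (u, v, f))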

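The product automaton used for policy synthesis pairs each MDP state with a DRA state and lifts the Rabin accepting pairs to the product. The sketch below is a minimal, dictionary-based version of this construction under assumed data structures; the actual implementation in [MDP_TG] uses its own graph classes.

# Minimal sketch of the product MDP x DRA construction, assuming
# dictionary-based models (illustrative, not the classes used in [MDP_TG]):
#   mdp_trans[s][u] = {s_next: prob}          transition probabilities
#   mdp_label[s]    = set of atomic propositions true at state s
#   dra_trans[q]    = {frozenset(label): q_next}
#   dra_acc         = list of Rabin pairs (L_i, K_i) over DRA states

def build_product(mdp_trans, mdp_label, mdp_init, dra_trans, dra_init, dra_acc):
    # the DRA first reads the label of the initial MDP state
    prod_init = (mdp_init, dra_trans[dra_init][frozenset(mdp_label[mdp_init])])
    prod_trans = {}
    for s in mdp_trans:
        for q in dra_trans:
            for u, post in mdp_trans[s].items():
                for s_next, prob in post.items():
                    # the DRA moves on the label of the successor MDP state
                    q_next = dra_trans[q][frozenset(mdp_label[s_next])]
                    prod_trans.setdefault((s, q), {}) \
                              .setdefault(u, {})[(s_next, q_next)] = prob
    # Rabin accepting pairs lifted to product states (those with outgoing moves)
    prod_acc = [([p for p in prod_trans if p[1] in L],
                 [p for p in prod_trans if p[1] in K])
                for (L, K) in dra_acc]
    return prod_init, prod_trans, prod_acc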

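Finally, a generic sketch of one actor-critic iteration with an LSTD-style critic on a toy chain MDP is given below: the critic solves a least-squares system for a linear value estimate, and the actor takes a policy-gradient step on a Boltzmann policy weighted by the TD advantage. The toy MDP, feature map, and step size are placeholder assumptions, not the parameterization implemented in [ac.py].

import numpy as np

np.random.seed(0)

# Toy chain MDP: states 0..4, actions 0 = left / 1 = right, reward 1 on state 4.
n_s, n_a, gamma, alpha = 5, 2, 0.95, 0.1

def step(s, a):
    s_next = min(s + 1, n_s - 1) if a == 1 else max(s - 1, 0)
    return s_next, float(s_next == n_s - 1)

def phi(s):
    f = np.zeros(n_s)          # one-hot state features for the critic
    f[s] = 1.0
    return f

theta = np.zeros((n_s, n_a))   # actor parameters of a Boltzmann (softmax) policy

def policy(s):
    p = np.exp(theta[s] - theta[s].max())
    return p / p.sum()

for episode in range(200):
    # collect one episode under the current policy
    traj, s = [], 0
    for _ in range(30):
        a = np.random.choice(n_a, p=policy(s))
        s_next, r = step(s, a)
        traj.append((s, a, r, s_next))
        s = s_next
    # LSTD critic: solve A w = b for the linear value estimate V(s) = w . phi(s)
    A, b = 1e-3 * np.eye(n_s), np.zeros(n_s)
    for (s, a, r, s_next) in traj:
        A += np.outer(phi(s), phi(s) - gamma * phi(s_next))
        b += r * phi(s)
    w = np.linalg.solve(A, b)
    # actor: policy-gradient step weighted by the one-step TD advantage
    for (s, a, r, s_next) in traj:
        adv = r + gamma * np.dot(w, phi(s_next)) - np.dot(w, phi(s))
        grad_log = -policy(s)
        grad_log[a] += 1.0     # gradient of log pi(a|s) w.r.t. theta[s]
        theta[s] += alpha * adv * grad_log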
Dependencies
