Approximating Discontinuous Nash Equilibrial Values of Two-Player General-Sum Differential Games


Lei Zhang, Mukesh Ghimire, Wenlong Zhang, Zhe Xu, Yi Ren
Arizona State University

This is the implementation of the paper "Approximating Discontinuous Nash Equilibrial Values of Two-Player General-Sum Differential Games".

Get started

There are two different conda environments. You can set up each environment with all dependencies as follows:

For Toy Case, Collision_avoidance, Uncontrolled_intersection_complete_information_game, and Uncontrolled_intersection_incomplete_information_game:

conda env create -f environment.yml
conda activate siren

For BVP_generation:

conda env create -f environment.yml
conda activate hji
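
As a minimal sketch of the full setup, assuming each environment.yml sits inside the corresponding folder (an assumption based on the code structure below; adjust paths if the files live elsewhere):

cd Collision_avoidance        # or Toy Case, Uncontrolled_intersection_*_game
conda env create -f environment.yml
conda activate siren

cd ../BVP_generation
conda env create -f environment.yml
conda activate hji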

Code structure

There are five folders with different functions:

BVP_generation: use the standard BVP solver to collect the Nash equilibrial (NE) values for the uncontrolled intersection (case 1) and collision avoidance (case 2)

The code is organized as follows:

  • generate_intersection.py: generate 5D NE values under four player type configurations (a, a), (na, a), (a, na), and (na, na) for case 1.
  • generate_collision_avoidance.py: generate 9D NE values for case 2.
  • ./utilities/BVP_solver.py: BVP solver.
  • ./example/vehicle/problem_def_intersection.py: dynamics and PMP equation settings for case 1.
  • ./example/vehicle/problem_def_collision_avoidance.py: dynamics and PMP equation settings for case 2.

Run generate_intersection.py or generate_collision_avoidance.py to collect NE values. Note that there are four player-type configurations in case 1; the desired configuration must be set in generate_intersection.py. The data size can be set in ./example/vehicle/problem_def_intersection.py or ./example/vehicle/problem_def_collision_avoidance.py. A typical run is sketched below.
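
A minimal sketch of a typical run, assuming the scripts are launched from the BVP_generation folder with the hji environment active:

conda activate hji
cd BVP_generation
# set the player-type configuration, e.g. (a, a) or (na, na), inside generate_intersection.py first
python generate_intersection.py
python generate_collision_avoidance.py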

Collision_avoidance: train supervised (SL), self-supervised (SSL), hybrid (HL), and value-hardening (VH) models to run the generalization and safety performance tests for case 2

The code is organized as follows:

  • dataio.py: load training data for SL, SSL, HL and VH.
  • training_supervised.py: contains SL training routine.
  • training_selfsupervised.py: contains SSL training routine.
  • training_hybrid.py: contains HL training routine.
  • training_valuehardening.py: contains VH training routine.
  • loss_functions.py: contains loss functions for SL, SSL, HL and VH.
  • modules.py: contains layers and full neural network modules.
  • utils.py: contains utility functions.
  • diff_operators.py: contains implementations of differential operators.
  • sim_draw_HD_lane_orientation.py: animation of one case for SL, SSL, HL, and VH from the paper; the reader can run it directly and watch. The reader needs to create the folder image_recodring manually.
  • sim_draw_transparent_lane_orientation.py: visualization of one case for SL, SSL, HL, and VH from the paper; the reader can run it directly and watch. The reader needs to create the folder image_recodring manually.
  • ./experiment_scripts/train_collision_HJI.py: contains the script to train the models and reproduce the experiments in the paper (see the usage sketch after this list).
  • ./validation_scripts/closedloop_traj_generation_HD_tanh.py: use the value network (with tanh as the activation function) as a closed-loop controller to generate data for the generalization and safety performance tests.
  • ./validation_scripts/closedloop_traj_generation_HD_relu.py: use the value network (with relu as the activation function) as a closed-loop controller to generate data for the generalization and safety performance tests.
  • ./validation_scripts/closedloop_traj_generation_HD_sine.py: use the value network (with sine as the activation function) as a closed-loop controller to generate data for the generalization and safety performance tests.
  • ./validation_scripts/trajectory_with_value_HD_tanh.py: visualize generalization and safety performance for the value network (with tanh as the activation function); the data used for the paper is provided.
  • ./validation_scripts/trajectory_with_value_HD_relu.py: visualize generalization and safety performance for the value network (with relu as the activation function).
  • ./validation_scripts/trajectory_with_value_HD_sine.py: visualize generalization and safety performance for the value network (with sine as the activation function).
  • ./validation_scripts/model: trained models used for the experiments in the paper.
  • ./validation_scripts/train_data: training data used in the paper.
  • ./validation_scripts/test_data: test data used in the paper.
  • ./validation_scripts/closed_loop: stores the data generated by using the value networks as closed-loop controllers; the data used for the paper is provided.
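
A minimal usage sketch for this folder, assuming the scripts are run from Collision_avoidance with the siren environment active and that all options are set inside the scripts themselves (an assumption; adjust paths as needed):

conda activate siren
cd Collision_avoidance
python experiment_scripts/train_collision_HJI.py                  # train the SL/SSL/HL/VH value networks
python validation_scripts/closedloop_traj_generation_HD_tanh.py   # closed-loop rollouts with the tanh value network
python validation_scripts/trajectory_with_value_HD_tanh.py        # visualize generalization and safety performance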

Toy Case: train supervised (SL), self-supervised (SSL), hybrid (HL), and value-hardening (VH) models to visualize the toy case shown in the paper

The code is organized as follows:

  • dataio.py: load training data for SL, SSL, HL and VH.
  • training_supervised.py: contains SL training routine.
  • training_selfsupervised.py: contains SSL training routine.
  • training_hybrid.py: contains HL training routine.
  • training_valuehardening.py: contains VH training routine.
  • loss_functions.py: contains loss functions for SL, SSL, HL and VH.
  • modules.py: contains layers and full neural network modules.
  • utils.py: contains utility functions.
  • diff_operators.py: contains implementations of differential operators.
  • ./experiment_scripts/toy_case.py: contains the script to train the models and reproduce the experiments in the paper (see the usage sketch after this list).
  • ./validation_scripts/toy_case_value_plot.py: use this script to plot Fig 1(a) in the paper.
  • ./validation_scripts/toy_case_value_hardening_plots.py: use this script to plot Fig 1(b) in the paper.
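
A minimal usage sketch, assuming the folder is named Toy Case as written above and the siren environment is active:

conda activate siren
cd "Toy Case"
python experiment_scripts/toy_case.py                         # train the SL/SSL/HL/VH models for the toy case
python validation_scripts/toy_case_value_plot.py              # reproduce Fig. 1(a)
python validation_scripts/toy_case_value_hardening_plots.py   # reproduce Fig. 1(b)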

Uncontrolled_intersection_complete_information_game: train supervised (SL), self-supervised (SSL), hybrid (HL), and value-hardening (VH) models to run the generalization and safety performance tests for case 1 with complete information

The code is organized as follows:

  • dataio.py: load training data for SL, SSL, HL and VH.
  • training_supervised.py: contains SL training routine.
  • training_selfsupervised.py: contains SSL training routine.
  • training_hybrid.py: contains HL training routine.
  • training_valuehardening.py: contains VH training routine.
  • loss_functions.py: contains loss functions for SL, SSL, HL and VH.
  • modules.py: contains layers and full neural network modules.
  • utils.py: contains utility functions.
  • diff_operators.py: contains implementations of differential operators.
  • sim_draw_HD_lane_orientation.py: animation of one case for SL, SSL, HL, and VH from the paper; the reader can run it directly and watch.
  • sim_draw_transparent_lane_orientation.py: visualization of one case for SL, SSL, HL, and VH from the paper; the reader can run it directly and watch.
  • ./experiment_scripts/train_intersection_HJI.py: contains the script to train the models and reproduce the experiments in the paper (see the usage sketch after this list).
  • ./validation_scripts/closedloop_traj_generation_HD_tanh.py: use the value network (with tanh as the activation function) as a closed-loop controller to generate data for the safety performance test.
  • ./validation_scripts/closedloop_traj_generation_HD_relu.py: use the value network (with relu as the activation function) as a closed-loop controller to generate data for the safety performance test.
  • ./validation_scripts/closedloop_traj_generation_HD_sine.py: use the value network (with sine as the activation function) as a closed-loop controller to generate data for the safety performance test.
  • ./validation_scripts/closedloop_traj_generation_HD_gelu.py: use the value network (with gelu as the activation function) as a closed-loop controller to generate data for the safety performance test.
  • ./validation_scripts/trajectory_with_value_HD_tanh.py: visualize generalization and safety performance for the value network (with tanh as the activation function); the data used for the paper is provided.
  • ./validation_scripts/trajectory_with_value_HD_relu.py: visualize generalization and safety performance for the value network (with relu as the activation function); the data used for the paper is provided.
  • ./validation_scripts/trajectory_with_value_HD_sine.py: visualize generalization and safety performance for the value network (with sine as the activation function); the data used for the paper is provided.
  • ./validation_scripts/trajectory_with_value_HD_gelu.py: visualize generalization and safety performance for the value network (with gelu as the activation function).
  • ./validation_scripts/action_compute_tanh_initial state.py: measure the MAEs of the control input prediction over the initial state space; the data used for the paper is provided.
  • ./validation_scripts/action_compute_tanh_expanded state.py: measure the MAEs of the control input prediction over the expanded state space; the data used for the paper is provided.
  • ./validation_scripts/value_compute_tanh_initial state.py: measure the MAEs of the value prediction over the initial state space; the data used for the paper is provided.
  • ./validation_scripts/value_compute_tanh_expanded state.py: measure the MAEs of the value prediction over the expanded state space; the data used for the paper is provided.
  • ./validation_scripts/model: trained models used for the experiments in the paper.
  • ./validation_scripts/train_data: training data used in the paper.
  • ./validation_scripts/test_data: test data used in the paper.
  • ./validation_scripts/closed_loop: stores the data generated by using the value networks as closed-loop controllers; the data used for the paper is provided.
  • ./validation_scripts/value: stores the data used to measure the MAEs of the value and control input predictions; the data used for the paper is provided.
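
A minimal usage sketch, assuming the scripts are run from Uncontrolled_intersection_complete_information_game with the siren environment active (file names containing spaces are quoted):

conda activate siren
cd Uncontrolled_intersection_complete_information_game
python experiment_scripts/train_intersection_HJI.py                # train the SL/SSL/HL/VH value networks
python validation_scripts/closedloop_traj_generation_HD_tanh.py    # closed-loop rollouts with the tanh value network
python validation_scripts/trajectory_with_value_HD_tanh.py         # visualize generalization and safety performance
python "validation_scripts/value_compute_tanh_initial state.py"    # MAE of value prediction over the initial state space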

Uncontrolled_intersection_incomplete_information_game: use supervised (SL) and hybrid (HL) models to play the incomplete-information game for case 1

The code is organized as follows:

  • main.py: run the simulation with the initial settings, including each agent's belief and whether it is empathetic or non-empathetic (see the usage sketch after this list).
  • enviroment.py: simulation environment is generated here, using the parameters from main.py.
  • savi_simulation.py: initial conditions for the simulation are processed here, such as the agent parameters (beta) and the action set. The belief table is also initialized here through the function get_initial_belief().
  • inference_model.py: inference is done after observing the state. There are several models implemented here: bvp, baseline, etc. The inference algorithm updates the belief table at each time step using the selected model defined in main.py.
  • decision.py: decision model returns an action for each agent, depending on the type of agent defined in main.py. Models include bvp_empathetic, bvp_non_empathetic, baseline, etc.
  • plot_loss_traj.py: convert the .csv file into a .mat file to generate two-player trajectories projected onto the d1-d2 frame.
  • modules.py: contains layers and full neural network modules.
  • utils.py: contains utility functions.
  • diff_operators.py: contains implementations of differential operators.
  • ./experiment/: stores the simulation data.
  • ./validation_scripts/model: experimental model in the paper.
  • ./validation_scripts/Hamilton_generation.py: use the value network to predict the state in the simulation.
  • ./validation_scripts/trajectory_policy_consistent.py: plot two-player trajectories projected onto the d1-d2 frame when the players' initial beliefs are consistent with their true parameters; the data used for the paper is provided.
  • ./validation_scripts/trajectory_policy_consistent.py: plot two-player trajectories projected onto the d1-d2 frame when the players' initial beliefs are not consistent with their true parameters; the data used for the paper is provided.
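
A minimal usage sketch, assuming the folder is Uncontrolled_intersection_incomplete_information_game and that agent types, beliefs, and (non-)empathy are configured inside main.py as described above:

conda activate siren
cd Uncontrolled_intersection_incomplete_information_game
python main.py                                               # run the incomplete-information simulation
python plot_loss_traj.py                                     # convert the logged .csv into .mat and plot d1-d2 trajectories
python validation_scripts/trajectory_policy_consistent.py    # plot trajectories for the belief-consistent case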

Contact

If you have any questions, please feel free to email the authors.
