# Official Implementation for NESHT (Hard-Thresholding Meets Evolution Strategies in Reinforcement Learning)

This project implements an evolutionary algorithm that integrates the hard-thresholding (HT) operator into the well-known natural evolution strategies (NES) algorithm. For the full description, please see the paper on arXiv.
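
To illustrate the idea, here is a minimal sketch, assuming the hard-thresholding operator keeps the `k` largest-magnitude parameters and is applied after a standard NES gradient step. The function names (`hard_threshold`, `nes_ht_step`, `fitness_fn`) and the hyperparameter defaults are illustrative, not the repository's actual code or the paper's settings.

```python
import numpy as np

def hard_threshold(theta, k):
    """Keep the k largest-magnitude entries of theta and zero out the rest."""
    out = np.zeros_like(theta)
    idx = np.argsort(np.abs(theta))[-k:]   # indices of the k largest |theta_i|
    out[idx] = theta[idx]
    return out

def nes_ht_step(theta, fitness_fn, k, pop_size=50, sigma=0.1, lr=0.01, rng=None):
    """One NES gradient estimate followed by hard-thresholding (illustrative only)."""
    rng = rng if rng is not None else np.random.default_rng()
    eps = rng.standard_normal((pop_size, theta.size))              # Gaussian perturbations
    rewards = np.array([fitness_fn(theta + sigma * e) for e in eps])
    rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-8)  # fitness shaping
    grad = eps.T @ rewards / (pop_size * sigma)                    # NES gradient estimate
    return hard_threshold(theta + lr * grad, k)                    # enforce sparsity
```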

## Details

- **Parallelism.** We use the simple joblib package, which we found to be much faster than Ray on our single-node machine (512 cores); a minimal parallel-rollout sketch is given after this list.
- **Policy Network.** We apply the NESHT algorithm to an agent with a single linear layer, since it is well established that a single linear layer has sufficient capacity for MuJoCo tasks (see the rollout sketch below).
- **Gaussian Noise.** Our comparisons are run on noisy MuJoCo environments, where each observation is the environment-provided state corrupted by additive Gaussian noise; a wrapper sketch is also given after this list.
- For any questions about the implementation, please don't hesitate to contact me.
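
As a rough illustration of the parallelism and policy-network points above, the sketch below rolls out Gaussian-perturbed single-linear-layer policies in parallel with joblib. It assumes the Gymnasium API; the environment name, episode length, and helper names (`rollout`, `evaluate_population`) are placeholders rather than the repository's code.

```python
import gymnasium as gym
import numpy as np
from joblib import Parallel, delayed

def rollout(theta, env_name="HalfCheetah-v4", episode_len=1000):
    """Run one episode with a single-linear-layer policy: action = W @ obs."""
    env = gym.make(env_name)
    obs_dim = env.observation_space.shape[0]
    act_dim = env.action_space.shape[0]
    W = theta.reshape(act_dim, obs_dim)    # flat parameter vector -> weight matrix
    obs, _ = env.reset()
    total_reward = 0.0
    for _ in range(episode_len):
        action = np.clip(W @ obs, env.action_space.low, env.action_space.high)
        obs, reward, terminated, truncated, _ = env.step(action)
        total_reward += reward
        if terminated or truncated:
            break
    env.close()
    return total_reward

def evaluate_population(theta, eps, sigma=0.1, n_jobs=-1):
    """Evaluate all perturbed parameter vectors in parallel worker processes."""
    return Parallel(n_jobs=n_jobs)(
        delayed(rollout)(theta + sigma * e) for e in eps
    )
```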
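
The noisy-environment setup can be approximated with a simple observation wrapper that adds zero-mean Gaussian noise to every observation; the class name and the `noise_std` value below are illustrative and not the paper's actual configuration.

```python
import gymnasium as gym
import numpy as np

class GaussianNoiseObservation(gym.ObservationWrapper):
    """Add zero-mean Gaussian noise to every observation returned by the env."""
    def __init__(self, env, noise_std=0.1):
        super().__init__(env)
        self.noise_std = noise_std

    def observation(self, obs):
        return obs + self.noise_std * np.random.standard_normal(obs.shape)

# Usage: wrap a MuJoCo task before rolling out the linear policy.
# env = GaussianNoiseObservation(gym.make("HalfCheetah-v4"), noise_std=0.1)
```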