This repository contains projects experimenting with the Farama Foundation's Gymnasium (the maintained successor to OpenAI Gym). The goal is to explore the application of various reinforcement learning algorithms to autonomous driving tasks and to understand how games could use RL.
In this repository, we experiment with different reinforcement learning algorithms on the intersection and racetrack environments from the highway-env package. Each notebook demonstrates specific algorithms and provides insights into their performance and behavior.
Demo video: `download.1.mp4`
- Description: This notebook explores the intersection environment using Deep Q-Network (DQN) and Proximal Policy Optimization (PPO).
- Deep Q-Network (DQN): Uses Q-learning with a neural network to approximate the state-action value function (Q).
- Proximal Policy Optimization (PPO): A policy gradient method that optimizes a clipped surrogate objective to keep policy updates small.
- Inference: PPO converged faster and more reliably than DQN, whose performance fluctuated at first but improved over time (a minimal training sketch follows below).
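As a rough illustration (not the notebook's exact code), both agents might be trained with stable-baselines3 on highway-env's `intersection-v0`; the hyperparameters below are placeholders:

```python
import gymnasium as gym
import highway_env  # noqa: F401 -- importing registers the highway-env IDs
                    # (older versions may need highway_env.register_highway_envs())
from stable_baselines3 import DQN, PPO

env = gym.make("intersection-v0")  # discrete meta-actions by default

# DQN: value-based; approximates Q(s, a) with a replay buffer and target network
dqn = DQN("MlpPolicy", env, learning_rate=5e-4, buffer_size=15_000, verbose=1)
dqn.learn(total_timesteps=20_000)

# PPO: on-policy; optimizes the clipped surrogate objective
ppo = PPO("MlpPolicy", env, n_steps=512, batch_size=64, verbose=1)
ppo.learn(total_timesteps=20_000)
```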
Demo video: `rl-video-episode-0.1.mp4`
- Description: This notebook demonstrates the use of Soft Actor-Critic (SAC) in the racetrack environment.
- Soft Actor-Critic (SAC): An off-policy actor-critic algorithm that balances exploration and exploitation by augmenting the reward with a policy-entropy bonus.
- Inference: SAC showed robust performance, effectively handling the continuous action space of the racetrack environment (see the sketch below).
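A comparable sketch for SAC on `racetrack-v0`, under the same tooling assumption (stable-baselines3 + highway-env; hyperparameters are illustrative):

```python
import gymnasium as gym
import highway_env  # noqa: F401
from stable_baselines3 import SAC

env = gym.make("racetrack-v0")  # continuous steering actions, which SAC requires

model = SAC("MlpPolicy", env, learning_rate=3e-4, verbose=1)
model.learn(total_timesteps=50_000)

# Greedy rollout with the trained policy
obs, info = env.reset()
terminated = truncated = False
while not (terminated or truncated):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
```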
- Description: This notebook applies Deep Deterministic Policy Gradient (DDPG) to the racetrack environment.
- Deep Deterministic Policy Gradient (DDPG): Combines DQN-style value learning with a deterministic policy gradient, extending Q-learning to continuous control.
- Inference: DDPG demonstrated high precision and stability, efficiently learning to navigate the racetrack (see the sketch below).
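And a sketch for DDPG on the same environment; the Gaussian action noise is a common exploration choice for DDPG, not necessarily what the notebook uses:

```python
import gymnasium as gym
import highway_env  # noqa: F401
import numpy as np
from stable_baselines3 import DDPG
from stable_baselines3.common.noise import NormalActionNoise

env = gym.make("racetrack-v0")

# DDPG's actor is deterministic, so exploration noise is added to its output
n_actions = env.action_space.shape[-1]
noise = NormalActionNoise(mean=np.zeros(n_actions), sigma=0.1 * np.ones(n_actions))

model = DDPG("MlpPolicy", env, action_noise=noise, verbose=1)
model.learn(total_timesteps=50_000)
model.save("ddpg_racetrack")  # hypothetical output path
```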
Feel free to fork this repository, make improvements, and submit a pull request. Your contributions are welcome!
This project is licensed under the MIT License - see the LICENSE file for details.