Add a small explication of the Q-Table
VOCdevShy committed Mar 20, 2024
1 parent 352cbfb commit 2fe21d1
Showing 1 changed file with 14 additions and 0 deletions.
14 changes: 14 additions & 0 deletions README.md
Expand Up @@ -8,3 +8,17 @@ The list of predefined maps are in the map files in the tools folder. Here you c
If you want more information about Q-Learning and the Frozen Lake game, please read this Medium article, which helped me a lot to understand what to do in the code: https://medium.com/towards-data-science/q-learning-for-beginners-2837b777741

Feel free to run your own tests by changing the values!

For those who are interested in how the Q-Table is calculated, here is an explanation:

`qtable[state, action] = qtable[state, action] + alpha * (reward + gamma * np.max(qtable[next_state, :]) - qtable[state, action])`

- `qtable[state, action]`: This refers to the current value of action 'action' in state 'state' of the Q-table. This is the value we will update.

- `alpha`: This is the learning rate. It controls the extent to which new information will be integrated into the old values of the Q-table. A high value means that new information will have a greater impact on existing values, while a low value means they will have a lesser impact.

- `reward`: This is the immediate reward obtained after taking action 'action' in state 'state'. This reward can be positive, negative, or zero.

- `gamma`: This is the discount factor. It represents the importance of future rewards compared to immediate rewards. A gamma close to 1 gives great importance to future rewards, while a gamma close to 0 makes the agent focus almost entirely on the immediate reward and ignore future ones.

- `np.max(qtable[next_state, :])`: This is the maximum value among all possible actions in the next state (next_state). This represents the best estimate of the future value that the agent can obtain from the next state.
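
To make the formula concrete, here is a minimal sketch of the update applied twice by hand. The helper name `q_update`, the table size (16 states, 4 actions, as on a 4x4 map), the state and action indices, and the values `alpha = 0.5` and `gamma = 0.9` are assumptions chosen for illustration; they are not necessarily the ones used in this repository.

```python
import numpy as np


def q_update(qtable, state, action, reward, next_state, alpha, gamma):
    """Apply one Q-learning update in place and return the new value."""
    # Best value the agent can currently expect from the next state.
    best_next = np.max(qtable[next_state, :])
    # Target = immediate reward + discounted estimate of future value.
    target = reward + gamma * best_next
    # Move the old value a fraction alpha of the way toward the target.
    qtable[state, action] += alpha * (target - qtable[state, action])
    return qtable[state, action]


# Illustrative setup (assumed): 16 states, 4 actions, all values start at 0.
qtable = np.zeros((16, 4))
alpha, gamma = 0.5, 0.9

# Step 1: from state 14, action 2 reaches the goal (state 15) with reward 1.
# New value: 0 + 0.5 * (1 + 0.9 * 0 - 0) = 0.5
q_update(qtable, state=14, action=2, reward=1, next_state=15,
         alpha=alpha, gamma=gamma)

# Step 2: from state 13, action 2 reaches state 14 with reward 0.
# New value: 0 + 0.5 * (0 + 0.9 * 0.5 - 0) = 0.225
q_update(qtable, state=13, action=2, reward=0, next_state=14,
         alpha=alpha, gamma=gamma)

print(qtable[14, 2], qtable[13, 2])
```

The second step shows how the reward spreads backwards: state 13 received no reward itself, but its value increases because the best value of the next state (14) is now positive.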
