From 8692616b7a4bed299c129d6ea0dc3fb9b2cb028b Mon Sep 17 00:00:00 2001
From: "DESKTOP-AENDA0E\\Fardin"
Date: Mon, 11 Dec 2023 11:08:54 +0100
Subject: [PATCH] unit3 | deep-q-algorithm | catastrophic forgetting

---
 units/en/unit3/deep-q-algorithm.mdx | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/units/en/unit3/deep-q-algorithm.mdx b/units/en/unit3/deep-q-algorithm.mdx
index adbe44a6..28e7fd50 100644
--- a/units/en/unit3/deep-q-algorithm.mdx
+++ b/units/en/unit3/deep-q-algorithm.mdx
@@ -40,8 +40,8 @@ Experience replay helps by **using the experiences of the training more efficien
 
 ⇒ This allows the agent to **learn from the same experiences multiple times**.
 
-2. **Avoid forgetting previous experiences and reduce the correlation between experiences**.
-- The problem we get if we give sequential samples of experiences to our neural network is that it tends to forget **the previous experiences as it gets new experiences.** For instance, if the agent is in the first level and then in the second, which is different, it can forget how to behave and play in the first level.
+2. **Avoid forgetting previous experiences (aka catastrophic interference, or catastrophic forgetting) and reduce the correlation between experiences**.
+- **[catastrophic forgetting](https://en.wikipedia.org/wiki/Catastrophic_interference)**: The problem we get if we give sequential samples of experiences to our neural network is that it tends to forget **the previous experiences as it gets new experiences.** For instance, if the agent is in the first level and then in the second, which is different, it can forget how to behave and play in the first level.
 
 The solution is to create a Replay Buffer that stores experience tuples while interacting with the environment and then sample a small batch of tuples. This prevents **the network from only learning about what it has done immediately before.**
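
The replay buffer described in the changed passage can be illustrated with a minimal Python sketch. This is only an illustration of the general technique, not the course's actual implementation: the class and method names (`ReplayBuffer`, `push`, `sample`) and the experience tuple layout (state, action, reward, next_state, done) are assumptions made for this example.

```python
# Minimal replay buffer sketch (illustrative; names and tuple layout are assumptions).
import random
from collections import deque


class ReplayBuffer:
    def __init__(self, capacity=10_000):
        # A bounded deque: once full, the oldest experiences are silently dropped.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Store one experience tuple collected while interacting with the environment.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Draw a random mini-batch (without replacement). Sampling randomly
        # breaks the correlation between consecutive experiences and lets the
        # agent learn from the same experience multiple times.
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```

Because the mini-batch is drawn uniformly from the whole buffer rather than from the most recent transitions, the network keeps revisiting older experiences instead of only learning about what it has done immediately before, which is the behavior the patched paragraph refers to as avoiding catastrophic forgetting.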