feat: added more clarification to backpropagation.
Panadestein committed Nov 28, 2024
1 parent b755e9a commit 0a2d512
Showing 1 changed file with 8 additions and 1 deletion.
9 changes: 8 additions & 1 deletion src/nn.org
@@ -79,12 +79,19 @@ the total derivative and the chain rule come to the rescue once again to express the
\delta^{(l)} = \left({W^{(l+1)}}^\top \delta^{(l+1)}\right) \odot \sigma'\left( z^{(l)} \right)
\end{equation*}

-where we have introduced the matrix form of the weights \(W^{(l)}\). Finally, the gradient of the cost function is:
+where we have introduced the matrix form of the weights \(W^{(l)}\). The gradient of the cost function is:

\begin{equation*}
\nabla C = \left\{ \frac{\partial C}{\partial W^{(l)}} = \delta^{(l)} \left( a^{(l-1)} \right)^\top, \quad \frac{\partial C}{\partial b^{(l)}} = \delta^{(l)} \right\}_{l=1}^{L}
\end{equation*}
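
These two relations are all the backward pass needs: propagate \(\delta^{(l)}\) from the output layer back to the first and read off the gradients on the way. Below is a minimal NumPy sketch of that pass, assuming sigmoid units and that the pre-activations \(z^{(l)}\) and activations \(a^{(l)}\) were cached during the forward pass; the names (=weights=, =zs=, =activations=, =delta_L=) are illustrative and not taken from the rest of this file:

#+begin_src python
import numpy as np

def sigmoid_prime(z):
    """sigma'(z) = sigma(z) * (1 - sigma(z)) for the logistic sigmoid."""
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 - s)

def backward(weights, zs, activations, delta_L):
    """Backward pass for an L-layer network of sigmoid units.

    weights[l]     : W^{(l+1)}  (0-based list index)
    zs[l]          : z^{(l+1)}  cached pre-activations from the forward pass
    activations[l] : a^{(l)}    with activations[0] the input
    delta_L        : output-layer error delta^{(L)}, a column vector
    """
    n_layers = len(weights)
    grads_W = [None] * n_layers
    grads_b = [None] * n_layers
    delta = delta_L
    grads_W[-1] = delta @ activations[-2].T    # dC/dW^{(L)} = delta^{(L)} (a^{(L-1)})^T
    grads_b[-1] = delta                        # dC/db^{(L)} = delta^{(L)}
    for l in range(n_layers - 2, -1, -1):      # layers L-1, ..., 1
        # delta^{(l)} = (W^{(l+1)})^T delta^{(l+1)} * sigma'(z^{(l)})
        delta = (weights[l + 1].T @ delta) * sigmoid_prime(zs[l])
        grads_W[l] = delta @ activations[l].T  # a^{(l-1)} sits at activations[l]
        grads_b[l] = delta
    return grads_W, grads_b
#+end_src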

Finally, we can take a gradient descent step with a learning rate \(\eta\), which may optionally be annealed:

\begin{equation*}
\left\{W^{(l)}, b^{(l)}\right\}_{l=1}^{L} = \left\{W^{(l)}, b^{(l)}\right\}_{l=1}^{L} -\eta\nabla C
\end{equation*}
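
As a sketch, under the same illustrative naming as above, with an exponential decay of \(\eta\) chosen purely as one example of an annealing schedule:

#+begin_src python
def gradient_descent_step(weights, biases, grads_W, grads_b, eta):
    """Apply W^{(l)} <- W^{(l)} - eta dC/dW^{(l)}, and likewise for b^{(l)}."""
    new_weights = [W - eta * gW for W, gW in zip(weights, grads_W)]
    new_biases = [b - eta * gb for b, gb in zip(biases, grads_b)]
    return new_weights, new_biases

# One possible annealing schedule (illustrative values): exponential decay
eta_0, decay = 0.1, 0.99
eta_at = lambda step: eta_0 * decay ** step
#+end_src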

For a full, no-nonsense derivation, see the dedicated section in Nielsen's [[http://neuralnetworksanddeeplearning.com/chap2.html#proof_of_the_four_fundamental_equations_(optional)][book]].

#+begin_export html
</details>
