
Commit 220e860

Updated optimization-2
Updated the first back-propagation example to reflect the multiplication order used in the preceding paragraph: dfdq * dqdx instead of the other way around, 1.0 * dfdq. Also got rid of the magic number "1.0".
1 parent bf3acc3 commit 220e860

File tree

1 file changed, +4 -2 lines changed


optimization-2.md

Lines changed: 4 additions & 2 deletions
@@ -82,9 +82,11 @@ f = q * z # f becomes -12
 # first backprop through f = q * z
 dfdz = q # df/dz = q, so gradient on z becomes 3
 dfdq = z # df/dq = z, so gradient on q becomes -4
+dqdx = 1.0
+dqdy = 1.0
 # now backprop through q = x + y
-dfdx = 1.0 * dfdq # dq/dx = 1. And the multiplication here is the chain rule!
-dfdy = 1.0 * dfdq # dq/dy = 1
+dfdx = dfdq * dqdx # The multiplication here is the chain rule!
+dfdy = dfdq * dqdy
 ```
 
 At the end we are left with the gradient in the variables `[dfdx,dfdy,dfdz]`, which tell us the sensitivity of the variables `x,y,z` on `f`!. This is the simplest example of backpropagation. Going forward, we will want to use a more concise notation so that we don't have to keep writing the `df` part. That is, for example instead of `dfdq` we would simply write `dq`, and always assume that the gradient is with respect to the final output.
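
For context, here is a minimal runnable sketch of the full forward and backward pass with the patched lines in place. The inputs x = -2 and y = 5 are assumptions taken from the surrounding notes rather than from this diff; the hunk itself only pins down q = 3, z = -4, and f = -12, and any x, y with x + y = 3 would reproduce the commented gradient values.

```python
# forward pass (x and y are assumed values; only q = 3 and z = -4 are fixed by the diff)
x = -2.0
y = 5.0
z = -4.0
q = x + y      # q becomes 3
f = q * z      # f becomes -12

# backward pass, in the updated multiplication order dfdq * dqdx
dfdz = q       # df/dz = q, so gradient on z becomes 3
dfdq = z       # df/dq = z, so gradient on q becomes -4
dqdx = 1.0     # local gradient of q = x + y with respect to x
dqdy = 1.0     # local gradient of q = x + y with respect to y
dfdx = dfdq * dqdx   # chain rule: -4 * 1 = -4
dfdy = dfdq * dqdy   # chain rule: -4 * 1 = -4

print(dfdx, dfdy, dfdz)   # -4.0 -4.0 3.0
```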
