Commit f8c0c9b

Update Attention Maths.md
1 parent a0efc45

File tree

1 file changed (+1, −0 lines)
  • Chapter-wise code/Code - PyTorch/7. Attention Models/2. Neural Text Summarization/1. Transformer Models

Chapter-wise code/Code - PyTorch/7. Attention Models/2. Neural Text Summarization/1. Transformer Models/Attention Maths.md

Lines changed: 1 addition & 0 deletions
@@ -30,6 +30,7 @@ product of vectors. So Q and K are similar iff `Q dot K` is large. <br>

 5. To make attention more focused on the best-matching keys, apply a softmax: `softmax(Q.KTranspose)`. Hence, we now calculate a matrix of Q-K probabilities,
 often called *attention weights*. The shape of this matrix is `[Lq, Lk]`.<br>
+<img src="../images/17. step - 3 - 1.png" width="50%"></img> <br><br>

 6. In the final step, we take the values and compute a weighted sum of the values, weighting each value Vi by the probability that the key Ki matches the query.<br>
 <img src="../images/17. step - 3.png" width="50%"></img> <br><br>
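The two steps touched by this diff — softmax over the Q·Kᵀ similarity scores, then a probability-weighted sum of the values — can be sketched in a few lines. This is a minimal NumPy sketch, not code from the repo; the function name `dot_product_attention` and the example shapes are illustrative, and it follows the text in leaving out the 1/sqrt(d) scaling that full scaled dot-product attention would add:

```python
import numpy as np

def dot_product_attention(Q, K, V):
    # Q: [Lq, d], K: [Lk, d], V: [Lk, d_v]
    scores = Q @ K.T                             # [Lq, Lk] similarity matrix (Q dot K)
    # softmax over the key axis -> attention weights; each row sums to 1
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = e / e.sum(axis=-1, keepdims=True)  # [Lq, Lk] Q-K probabilities
    # weighted sum of values: value Vi is weighted by P(key Ki matches the query)
    return weights @ V, weights                  # output: [Lq, d_v]

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))   # 3 queries
K = rng.normal(size=(5, 4))   # 5 keys
V = rng.normal(size=(5, 6))   # 5 values
out, w = dot_product_attention(Q, K, V)
print(out.shape, w.shape)     # (3, 6) (3, 5)
```

Note that the attention-weight matrix has shape `[Lq, Lk]` as the text states, and the output has one row per query, each a convex combination of the value rows.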
