
Commit d2515bd

master: dot-product attention.
1 parent 6c63f2a commit d2515bd

File tree

1 file changed: 2 additions, 2 deletions
  • Chapter-wise code/Code - PyTorch/7. Attention Models/2. Neural Text Summarization/1. Transformer Models


Chapter-wise code/Code - PyTorch/7. Attention Models/2. Neural Text Summarization/1. Transformer Models/Dot Product Attention.md

Lines changed: 2 additions & 2 deletions
@@ -6,7 +6,7 @@ Below steps describe in detail as to how a *dot-product attention* works:

1. Let's consider the phrase in English, *"I am happy"*.
First, the word *I* is embedded, to obtain a vector representation that holds continuous values which is unique for every single word.<br><br>
-<img src="../images/7.step - 1.png" width="50%"></img><br>
+<img src="../images/7. step - 1.png" width="50%"></img><br>

2. By feeding three distinct linear layers, you get three different vectors for queries, keys and values.<br><br>
<img src="../images/8. step - 2.png" width="50%"></img><br>

@@ -17,7 +17,7 @@ First, the word *I* is embedded, to obtain a vector representation that holds co
4. Finally the word *happy* to get a third vector and form the queries, keys and values matrix.<br><br>
<img src="../images/10. step - 4.png" width="50%"></img><br>

-5. From both the Q matrix and the K matrix, the attention model calculates weights or scores representing the relative importance of the keys for a specific query.
+5. From both the Q matrix and the K matrix, the attention model calculates weights or scores representing the relative importance of the keys for a specific query.<br><br>
<img src="../images/11. step - 5.png" width="50%"></img><br>

6. These attention weights can be understood as alignment scores as they come from a dot product. <br><br>
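
For context, here is a minimal, self-contained PyTorch sketch of the dot-product attention steps the edited page walks through. The dimensions, layer names and random seed are illustrative assumptions and are not taken from this commit or the repository's notebooks.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sizes for the three-word phrase "I am happy".
seq_len, d_model, d_k = 3, 8, 8

torch.manual_seed(0)
embeddings = torch.randn(seq_len, d_model)   # step 1: one embedding per word

# Steps 2-4: three distinct linear layers produce the queries, keys and values matrices.
w_q = nn.Linear(d_model, d_k)
w_k = nn.Linear(d_model, d_k)
w_v = nn.Linear(d_model, d_k)
Q, K, V = w_q(embeddings), w_k(embeddings), w_v(embeddings)

# Step 5: dot products of queries with keys give alignment scores;
# scaling by sqrt(d_k) and a softmax turn them into attention weights.
scores = Q @ K.transpose(-2, -1) / math.sqrt(d_k)
weights = F.softmax(scores, dim=-1)

# Step 6: the weights combine the value vectors into the attention output.
output = weights @ V
print(weights.shape, output.shape)           # torch.Size([3, 3]) torch.Size([3, 8])
```

The division by sqrt(d_k) is the standard scaled variant of dot-product attention; it keeps the scores from growing with the key dimension and saturating the softmax.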
