Commit b0ac517

Update Multi Head Attention.md
1 parent c216efc commit b0ac517

File tree

1 file changed: +2 −2 lines changed
  • Chapter-wise code/Code - PyTorch/7. Attention Models/2. Neural Text Summarization/1. Transformer Models


Chapter-wise code/Code - PyTorch/7. Attention Models/2. Neural Text Summarization/1. Transformer Models/Multi Head Attention.md

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
 # Multi-Head Attention
 
 1. Input to multi-head attention is a set of 3 values: Queries, Keys and Values.<br><br>
-<img src="../images/26. step - 1.png" width="50%"></img><br>
+<img src="../images/26. step -1 .png" width="50%"></img><br>
 2. To achieve the multiple lookups, you first use a fully-connected, dense linear layer on each query, key, and value. This layer will create the representations for parallel attention heads. <br><br>
-<img src="../images/27. step - 2" width="50%"></img><br>
+<img src="../images/27. step - 2.png" width="50%"></img><br>
 3. Here, you split these vectors into number of heads and perform attention on them as each head was different.<br><br>
 4. Then the result of the attention will be concatenated back together.<br><br>
 <img src="../images/28. step - 3.png" width="50%"></img><br>
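
For readers of the diff above, a minimal sketch of the four steps the file walks through (take Queries, Keys and Values as input; project them with dense linear layers; split into heads and attend per head; concatenate the heads back together) might look like the following. It assumes PyTorch, to match the repo's "Code - PyTorch" folder; the class name, the `split_heads` helper, and the toy dimensions are illustrative assumptions, not the repository's actual implementation.

```python
# Minimal multi-head attention sketch (illustrative, not the repo's code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiHeadAttentionSketch(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0, "d_model must be divisible by n_heads"
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        # Step 2: dense linear layers that create the per-head
        # representations of the queries, keys, and values.
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, query, key, value):
        # Step 1: the inputs are three sets of vectors -- queries, keys,
        # and values -- each of shape (batch, seq_len, d_model).
        batch, q_len, d_model = query.shape

        def split_heads(x):
            # Step 3: split each projected vector across the heads so
            # attention runs on every head as if it were independent.
            return x.view(batch, -1, self.n_heads, self.d_head).transpose(1, 2)

        q = split_heads(self.w_q(query))   # (batch, n_heads, q_len, d_head)
        k = split_heads(self.w_k(key))     # (batch, n_heads, k_len, d_head)
        v = split_heads(self.w_v(value))   # (batch, n_heads, k_len, d_head)

        # Scaled dot-product attention, applied per head.
        scores = q @ k.transpose(-2, -1) / (self.d_head ** 0.5)
        weights = F.softmax(scores, dim=-1)
        per_head = weights @ v             # (batch, n_heads, q_len, d_head)

        # Step 4: concatenate the heads back together and mix them with a
        # final linear layer.
        concat = per_head.transpose(1, 2).reshape(batch, q_len, d_model)
        return self.w_o(concat)


# Example usage with toy dimensions (self-attention: Q = K = V = x).
mha = MultiHeadAttentionSketch(d_model=8, n_heads=2)
x = torch.randn(1, 5, 8)                  # (batch, seq_len, d_model)
out = mha(x, x, x)
print(out.shape)                          # torch.Size([1, 5, 8])
```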
