Commit b0ac517

Update Multi Head Attention.md
1 parent c216efc commit b0ac517

File tree

1 file changed: +2 −2 lines changed
  • Chapter-wise code/Code - PyTorch/7. Attention Models/2. Neural Text Summarization/1. Transformer Models


Chapter-wise code/Code - PyTorch/7. Attention Models/2. Neural Text Summarization/1. Transformer Models/Multi Head Attention.md

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
 # Multi-Head Attention
 
 1. Input to multi-head attention is a set of 3 values: Queries, Keys and Values.<br><br>
-<img src="../images/26. step - 1.png" width="50%"></img><br>
+<img src="../images/26. step -1 .png" width="50%"></img><br>
 2. To achieve the multiple lookups, you first use a fully-connected, dense linear layer on each query, key, and value. This layer will create the representations for parallel attention heads. <br><br>
-<img src="../images/27. step - 2" width="50%"></img><br>
+<img src="../images/27. step - 2.png" width="50%"></img><br>
 3. Here, you split these vectors into number of heads and perform attention on them as each head was different.<br><br>
 4. Then the result of the attention will be concatenated back together.<br><br>
 <img src="../images/28. step - 3.png" width="50%"></img><br>
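
For readers of the diff above, a minimal sketch of the four steps the file walks through (take Queries, Keys and Values as input; project them with dense linear layers; split into heads and attend per head; concatenate the heads back together) might look like the following. It assumes PyTorch, to match the repo's "Code - PyTorch" folder; the class name, the `split_heads` helper, and the toy dimensions are illustrative assumptions, not the repository's actual implementation.

```python
# Minimal multi-head attention sketch (illustrative, not the repo's code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiHeadAttentionSketch(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0, "d_model must be divisible by n_heads"
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        # Step 2: dense linear layers that create the per-head
        # representations of the queries, keys, and values.
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, query, key, value):
        # Step 1: the inputs are three sets of vectors -- queries, keys,
        # and values -- each of shape (batch, seq_len, d_model).
        batch, q_len, d_model = query.shape

        def split_heads(x):
            # Step 3: split each projected vector across the heads so
            # attention runs on every head as if it were independent.
            return x.view(batch, -1, self.n_heads, self.d_head).transpose(1, 2)

        q = split_heads(self.w_q(query))   # (batch, n_heads, q_len, d_head)
        k = split_heads(self.w_k(key))     # (batch, n_heads, k_len, d_head)
        v = split_heads(self.w_v(value))   # (batch, n_heads, k_len, d_head)

        # Scaled dot-product attention, applied per head.
        scores = q @ k.transpose(-2, -1) / (self.d_head ** 0.5)
        weights = F.softmax(scores, dim=-1)
        per_head = weights @ v             # (batch, n_heads, q_len, d_head)

        # Step 4: concatenate the heads back together and mix them with a
        # final linear layer.
        concat = per_head.transpose(1, 2).reshape(batch, q_len, d_model)
        return self.w_o(concat)


# Example usage with toy dimensions (self-attention: Q = K = V = x).
mha = MultiHeadAttentionSketch(d_model=8, n_heads=2)
x = torch.randn(1, 5, 8)                  # (batch, seq_len, d_model)
out = mha(x, x, x)
print(out.shape)                          # torch.Size([1, 5, 8])
```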
