Commit fc66bb6

master: updating image's size.
1 parent 056c1e9 commit fc66bb6

File tree

1 file changed: +7 -7 lines changed

  • Chapter-wise code/Code - PyTorch/7. Attention Models/1. NMT

Chapter-wise code/Code - PyTorch/7. Attention Models/1. NMT/Readme.md

Lines changed: 7 additions & 7 deletions
@@ -6,12 +6,12 @@ In NMT, we use an encoder and a decoder to translate from one language to another
 <br><br>
 It takes in a hidden state and a string of words, such as a single sentence. The encoder takes the inputs one step at a time, collects information for that piece of input, then moves it forward. The orange rectangle represents the encoder's final hidden state, which tries to capture all the information collected from each input step before feeding it to the decoder. This final hidden state provides the initial state for the decoder to begin predicting the sequence.
 
-<img src="./images/1. basic seq-to-seq model.png"><img> <br><br>
+<img src="./images/1. basic seq-to-seq model.png" width="50%"><img> <br><br>
 
 ### Limitation of a basic Seq-to-Seq Model
 
 One major limitation of a basic seq-to-seq model is the *information bottleneck*, represented by the figure below:
-<img src="./images/2.NMT basic model.png"><img> <br><br>
+<img src="./images/2.NMT basic model.png" width="60%"><img> <br><br>
 
 In case of long input sequences, where the user stacks up many words, the words entered at a later stage are given more importance than the words that were entered first.<br><br>
 Because the encoder hidden state is of a fixed size, longer inputs become *bottlenecked* on their way to the decoder.
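The fixed-size final hidden state described in this hunk is exactly where the bottleneck appears. Below is a minimal sketch (illustrative only, not code from this repository; class names and dimensions are made up) of a basic seq-to-seq model in PyTorch, in which the entire source sentence must pass through that single hidden state:

```python
import torch
import torch.nn as nn

class BasicSeq2Seq(nn.Module):
    """Basic encoder-decoder: the whole source sentence is squeezed into
    one fixed-size hidden state (the information bottleneck)."""
    def __init__(self, src_vocab, tgt_vocab, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # The encoder reads the source one step at a time; only its final
        # hidden state moves forward, no matter how long the source is.
        _, final_hidden = self.encoder(self.src_emb(src_ids))
        # That single state initialises the decoder, which then predicts
        # the target sequence step by step.
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), final_hidden)
        return self.out(dec_out)  # (batch, tgt_len, tgt_vocab) logits

model = BasicSeq2Seq(src_vocab=1000, tgt_vocab=1000)
logits = model(torch.randint(0, 1000, (2, 7)), torch.randint(0, 1000, (2, 5)))
print(logits.shape)  # torch.Size([2, 5, 1000])
```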
@@ -20,12 +20,12 @@ Hence, inputs that contain short sentences will work for NMT but long sentences
 
 ## Word Alignment
 
-Word Alignment is the task of finding the correspondence between source and target words in a pair of sentences that are translations of each other.
-<img src="./images/3. word alignment.png"><img> <br><br>
+Word Alignment is the task of finding the correspondence between source and target words in a pair of sentences that are translations of each other.<br>
+<img src="./images/3. word alignment.png" width="50%"><img> <br><br>
 When performing word alignment, your model needs to be able to identify relationships among the words in order to make accurate predictions in case the words are out of order or not exact translations.
 
-In a model that has a vector for each input, there needs to be a way to focus more attention in the right places. Many languages don't translate exactly into another language. To be able to align the words correctly, you need to add a layer to help the decoder understand which inputs are more important for each prediction.
-<img src="./images/4. alignment and attention.png"><img> <br><br>
+In a model that has a vector for each input, there needs to be a way to focus more attention in the right places. Many languages don't translate exactly into another language. To be able to align the words correctly, you need to add a layer to help the decoder understand which inputs are more important for each prediction.<br>
+<img src="./images/4. alignment and attention.png" width="70%"><img> <br><br>
 
 ### Attention and Alignment
 Below is a step-by-step algorithm for NMT:
@@ -36,4 +36,4 @@ Below is a step-by-step algorithm for NMT:
 4. *Take each encoder hidden state and multiply it by its softmax score, which is a number between 0 and 1; this results in the alignments vector.*
 5. *Now just add up everything in the alignments vector to arrive at what's called the context vector, which is then fed to the decoder.*
 
-<img src="./images/5. Calculating alignment for NMT model.png"><img> <br><br>
+<img src="./images/5. Calculating alignment for NMT model.png"><img> <br><br>
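Steps 4 and 5 above reduce to a few tensor operations. Below is a minimal sketch (illustrative shapes and a simple dot-product score; not taken from the repository's code) of how the softmax scores, the alignments vector, and the context vector can be computed in PyTorch:

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: one decoder step attending over 6 encoder steps.
encoder_states = torch.randn(6, 128)   # one hidden state per input word
decoder_state = torch.randn(128)       # current decoder hidden state

# Score each encoder state against the decoder state (dot product is one
# common choice), then softmax so every score lies between 0 and 1.
scores = encoder_states @ decoder_state        # shape (6,)
weights = F.softmax(scores, dim=0)             # sums to 1

# Step 4: multiply each encoder hidden state by its softmax score.
alignments = weights.unsqueeze(1) * encoder_states   # shape (6, 128)

# Step 5: add up the alignments vector to get the context vector,
# which is then fed to the decoder.
context = alignments.sum(dim=0)                # shape (128,)
print(context.shape)  # torch.Size([128])
```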
