
Commit ca734bb

master: teacher forcing and training NMT.
1 parent cf5baa7 commit ca734bb

File tree

2 files changed: 6 additions, 3 deletions

Chapter-wise code/Code - PyTorch/7. Attention Models/1. NMT/NMT SetUp.md

Lines changed: 6 additions & 3 deletions
5. Keep track of index mappings with word2index and index2word mappings.
6. Use start-of-sentence `<SOS>` and end-of-sentence `<EOS>` tokens to mark the start and end of each sentence.

### Teacher Forcing
Let us assume we want to train an image captioning model, and the ground truth caption for an image is “Two people reading a book”. Our model makes a mistake on the 2nd word, so its 1st and 2nd predictions are “Two” and “birds”, respectively.
1. *Without Teacher Forcing*, we would feed “birds” back to our RNN to predict the 3rd word. Let’s say the 3rd prediction is “flying”. Even though it makes sense for our model to predict “flying” given the input “birds”, it is different from the ground truth.
<br><img src="./images/13. No teacher forcing.png"></img><br>
2. *With Teacher Forcing*, we would feed “people” to our RNN for the 3rd prediction, after computing and recording the loss for the 2nd prediction.

<br><img src="./images/14. with teacher forcing.png"></img><br>
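The difference can be sketched with a toy decoder loop in plain Python. The `decode_step` lookup below is a hypothetical stand-in for a trained RNN step, not the course's model; it is wired to reproduce the “birds” mistake from the example above:

```python
def decode_step(prev_token):
    # Hypothetical stand-in for one RNN decoding step: maps the token
    # fed in at step t to the model's prediction for step t+1.
    next_token = {"<SOS>": "Two", "Two": "birds", "people": "reading",
                  "reading": "a", "a": "book", "birds": "flying"}
    return next_token.get(prev_token, "<EOS>")

def generate(ground_truth, teacher_forcing):
    token, outputs = "<SOS>", []
    for target in ground_truth:
        pred = decode_step(token)
        outputs.append(pred)
        # With teacher forcing, feed the ground-truth token back in,
        # regardless of what the model just predicted; without it, the
        # (possibly wrong) prediction is fed back instead.
        token = target if teacher_forcing else pred
    return outputs

truth = ["Two", "people", "reading", "a", "book"]
print(generate(truth, teacher_forcing=False))  # drifts after the 2nd-word mistake
print(generate(truth, teacher_forcing=True))   # back on track from the 3rd word on
```

In a real training loop, the loss at each step is computed against the ground-truth token before deciding what to feed back, exactly as described in point 2 above.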
## Training NMT
1. The initial `select` makes two copies of each stream: the input tokens (English words), referenced by index 0, and the target tokens (German words), referenced by index 1.
<img src="./images/15. step - 1.png" width="50%"></img><br><br>
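A minimal sketch of what such a select step does, in plain Python. The token ids and the `[0, 1, 0, 1]` copy pattern are illustrative assumptions, not the course's exact code:

```python
def select(streams, indices):
    # Pick (and possibly duplicate) data streams by index.
    return [streams[i] for i in indices]

english_tokens = [12, 47, 5, 1]  # hypothetical token ids, stream 0
german_tokens = [33, 8, 19, 1]   # hypothetical token ids, stream 1

# [0, 1, 0, 1] yields two copies of each stream: one (input, target)
# pair to run through the model, and one pair kept aside, e.g. for
# computing the loss later.
branches = select([english_tokens, german_tokens], [0, 1, 0, 1])
```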