Commit cf5baa7

master: teacher forcing analogy.
1 parent 00f49ea commit cf5baa7

5 files changed: +767 -1 lines changed

Chapter-wise code/Code - PyTorch/7. Attention Models/1. NMT/NMT SetUp.md

Lines changed: 10 additions & 0 deletions
@@ -18,3 +18,13 @@ We are going to have a great many of these inputs. One thing to note here is tha
 <img src="./images/12. NMT setup - german.png" width="50%"></img><br><br>
 5. Keep track of index mappings with word2index and index2word mappings.
 5. Use start-of-sentence `<SOS>` and end-of-sentence `<EOS>` tokens to represent the same.
+
+## Training NMT
+
+### Teacher Forcing
+
+Let us assume we want to train an image captioning model, and the ground truth caption for an image is “Two people reading a book”. Our model makes a mistake in predicting the 2nd word, so we have “Two” and “birds” as the 1st and 2nd predictions, respectively.
+1. *Without Teacher Forcing*, we would feed “birds” back to our RNN to predict the 3rd word. Let’s say the 3rd prediction is “flying”. Even though it makes sense for our model to predict “flying” given that the input is “birds”, it differs from the ground truth (“reading”).
+<br><img src="./images/13. No teacher forcing.png"></img><br>
+2. *With Teacher Forcing*, we would feed “people” to our RNN for the 3rd prediction, after computing and recording the loss for the 2nd prediction.
+<br><img src="./images/14. with teacher forcing.png"></img><br>
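
The two regimes described in this hunk can be sketched with a toy decoder. This is a minimal illustration and not code from the commit: `TOY_DECODER` and `decode` are hypothetical names, the lookup table stands in for an RNN step, and a simple mismatch count stands in for the per-step loss.

```python
# Hypothetical toy "decoder": maps the previous token to its next-word guess.
# It stands in for one RNN step and deliberately errs right after "Two".
TOY_DECODER = {
    "<SOS>": "Two",
    "Two": "birds",      # mistake: the ground truth is "people"
    "birds": "flying",
    "flying": "away",
    "away": "<EOS>",
    "people": "reading",
    "reading": "a",
    "a": "book",
    "book": "<EOS>",
}


def decode(target, teacher_forcing):
    """Return (predictions, number of wrong predictions) for one caption."""
    prev = "<SOS>"
    predictions = []
    errors = 0
    for truth in target:
        pred = TOY_DECODER[prev]
        predictions.append(pred)
        errors += pred != truth  # stand-in for recording the per-step loss
        # Teacher forcing feeds the ground-truth token to the next step;
        # otherwise the model's own (possibly wrong) prediction is fed back.
        prev = truth if teacher_forcing else pred
    return predictions, errors


target = ["Two", "people", "reading", "a", "book"]
free_run = decode(target, teacher_forcing=False)
forced = decode(target, teacher_forcing=True)
print(free_run)
print(forced)
```

In this toy setup the free-running decoder never recovers after “birds”, while the teacher-forced decoder is penalized only for the 2nd word and then continues from “people”.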

Chapter-wise code/Code - PyTorch/7. Attention Models/1. NMT/Readme.md

Lines changed: 1 addition & 1 deletion
@@ -83,7 +83,7 @@ The first four tokens, the agreements on the, are pretty straightforward, but th
 
 ## Next Up
 
-Next, we will learn about the set-up required to build a NMT model and what kind of dataset is used to build a NMT model. The readme file for the same is here.
+Next, we will learn about the set-up required to build a NMT model and what kind of dataset is used to build a NMT model. The readme file for the same is [here](./NMT%20SetUp.md).
 