Commit af5b0b6

RNN update
1 parent 163c727 commit af5b0b6

File tree

2 files changed: +26 -0 lines changed

assets/rnn/multilayer_rnn.png (355 KB)

rnn.md

Lines changed: 26 additions & 0 deletions
@@ -8,6 +8,7 @@ Table of Contents:
- [Intro to RNN](#intro)
- [RNN example as Character-level language model](#char)
- [Multilayer RNNs](#multi)
- [Long-Short Term Memory (LSTM)](#lstm)

@@ -167,6 +168,31 @@ how to scale up the training of the model over larger training dataset.
<a name='multi'></a>

## Multilayer RNNs

So far we have only shown RNNs with a single layer. However, we are not limited to single-layer architectures. One of the ways RNNs are used today is in a more complex manner: RNNs can be stacked together in multiple layers, which gives more depth, and empirically deeper architectures tend to work better (Figure 4).

<div class="fig figcenter fighighlight">
<img src="/assets/rnn/multilayer_rnn.png" width="40%">
<div class="figcaption">Figure 4. Multilayer RNN example.</div>
</div>

For example, in Figure 4 there are three separate RNNs, each with its own set of weights. The three RNNs are stacked on top of each other, so the input of the second RNN (the second RNN layer in Figure 4) is the hidden state vector of the first RNN (the first RNN layer in Figure 4). All stacked RNNs are trained jointly, and the diagram in Figure 4 represents one computational graph.
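To make the stacking concrete, here is a minimal NumPy sketch of one time step of a stacked vanilla RNN (an illustrative sketch, not code from these notes; names such as `StackedRNN`, `Wxh`, `Whh` are made up for this example). Each layer keeps its own weights and reads the hidden state of the layer below as its input, as described above for Figure 4.

```python
import numpy as np

# Minimal sketch of one time step of a stacked (multilayer) vanilla RNN.
# Each layer has its own weights; the input to layer l is the hidden state
# of layer l-1 (the raw input x for the first layer).

class StackedRNN:
    def __init__(self, input_size, hidden_size, num_layers=3, seed=0):
        rng = np.random.default_rng(seed)
        self.num_layers = num_layers
        self.Wxh, self.Whh, self.bh = [], [], []
        for l in range(num_layers):
            in_size = input_size if l == 0 else hidden_size
            self.Wxh.append(rng.standard_normal((hidden_size, in_size)) * 0.01)
            self.Whh.append(rng.standard_normal((hidden_size, hidden_size)) * 0.01)
            self.bh.append(np.zeros(hidden_size))

    def step(self, x, h_prev):
        """Advance all layers by one time step; h_prev is a list of per-layer states."""
        h_new, inp = [], x
        for l in range(self.num_layers):
            h = np.tanh(self.Wxh[l] @ inp + self.Whh[l] @ h_prev[l] + self.bh[l])
            h_new.append(h)
            inp = h  # the hidden state of this layer feeds the layer above
        return h_new

# Usage: three stacked layers (as in Figure 4) processing one input vector.
rnn = StackedRNN(input_size=10, hidden_size=20, num_layers=3)
h = [np.zeros(20) for _ in range(3)]
h = rnn.step(np.random.randn(10), h)
```

Since everything lives in one computational graph, backpropagation sends gradients both across time within each layer and down through the stack of layers.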
<a name='lstm'></a>

## Long-Short Term Memory (LSTM)
So far we have seen only a simple recurrence formula for the Vanilla RNN. In practice, we will rarely ever use the Vanilla RNN formula. Instead, we will use what we call a Long-Short Term Memory (LSTM) RNN.
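As a reminder, the simple recurrence referred to here is the single tanh update (assuming the $W_{hh}$, $W_{xh}$ notation used in the earlier sections of these notes):

$$
h_t = \tanh(W_{hh} h_{t-1} + W_{xh} x_t)
$$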
