Changes to `rnn.md` (+71 lines, −2 lines)
Table of Contents:

- [Intro to RNN](#intro)
- [RNN example as Character-level language model](#char)
- [Long Short-Term Memory (LSTM)](#lstm)
[…]

… a sequence of words of a sentence in French, for example (fourth model in Figure 1). We can have a video classification RNN where we might imagine classifying every single frame of the video with some number of classes, and most importantly we don't want the prediction to be only a function of the current timestep (the current frame of the video), but also of all the timesteps (frames) that have come before it in the video (rightmost model in Figure 1). In general, Recurrent Neural Networks allow us to wire up an architecture where the prediction at every single timestep is a function of all the timesteps that have come up to that point.
<div class="fig figcenter fighighlight">
<img src="/assets/rnn/types.png" width="100%">
<div class="figcaption">Figure 1. Different (non-exhaustive) types of Recurrent Neural Network architectures. Red boxes are input vectors. Green boxes are hidden layers. Blue boxes are output vectors.</div>
</div>
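To make the claim above concrete, here is a toy sketch (not from the notes; the trivial `step` update is a hypothetical stand-in for a learned one) of why a recurrent architecture gives every prediction a dependence on the whole history: the state at timestep `t` is computed from the state at `t-1`, so unrolling the recurrence makes the output at `t` a function of all inputs up to `t`.

```python
# Toy recurrence illustrating the dependency structure described above.
# `step` is a hypothetical stand-in for a learned RNN update; the point
# is only that the state at time t folds in every input seen so far.

def step(h, x):
    # trivial stand-in update: accumulate the input into the state
    return h + x

def unroll(xs):
    h = 0.0                  # initial state
    ys = []
    for x in xs:             # one input (here a scalar) per timestep
        h = step(h, x)       # new state is a function of old state and input
        ys.append(h)         # so y_t depends on x_1, ..., x_t
    return ys

print(unroll([1.0, 2.0, 3.0]))  # [1.0, 3.0, 6.0] -- each output reflects all past inputs
```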
A Recurrent Neural Network is basically a blackbox (Figure 2): it has an internal state, and it receives an input vector at every timestep. At every single timestep we feed an input vector into the RNN, and it can modify that state as a function of what it receives. There are weights inside the RNN, and when we tune those weights, the RNN will have a different behavior in terms of how its state evolves as it receives these inputs. Usually we are also interested in producing an output based on the RNN state, so we can produce output vectors on top of the RNN (as depicted in Figure 2).
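As a concrete illustration of this blackbox view, here is a minimal vanilla-RNN sketch in numpy. It is a sketch under stated assumptions, not the notes' own code: the weight names (`W_hh`, `W_xh`, `W_hy`), the `tanh` nonlinearity, and the random initialization are choices made here for illustration.

```python
import numpy as np

class VanillaRNN:
    """Minimal sketch of the blackbox: a state vector plus three weight matrices."""
    def __init__(self, input_size, hidden_size, output_size):
        # small random weights; the exact initialization is incidental to the sketch
        self.W_hh = np.random.randn(hidden_size, hidden_size) * 0.01  # state -> state
        self.W_xh = np.random.randn(hidden_size, input_size) * 0.01   # input -> state
        self.W_hy = np.random.randn(output_size, hidden_size) * 0.01  # state -> output
        self.h = np.zeros(hidden_size)                                # the internal state

    def step(self, x):
        # the state is modified as a function of the previous state and the input
        self.h = np.tanh(self.W_hh @ self.h + self.W_xh @ x)
        # an output vector is produced on top of the state
        return self.W_hy @ self.h

rnn = VanillaRNN(input_size=4, hidden_size=8, output_size=3)
for t in range(5):                    # feed in one input vector per timestep
    y = rnn.step(np.random.randn(4))  # y depends on every input seen up to timestep t
```

Tuning the three weight matrices changes how the state evolves in response to inputs, which is exactly the behavior the paragraph above describes.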