Skip to content

Commit 7d74a3d

Browse files
authored
Merge pull request #42 from danielforsyth/typo-1
Fixed Typo
2 parents 7945fc9 + 302bd21 commit 7d74a3d

File tree

1 file changed

+1
-1
lines changed

1 file changed

+1
-1
lines changed

neural-networks-3.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -345,7 +345,7 @@ As we've seen, training Neural Networks can involve many hyperparameter settings
345345
- learning rate decay schedule (such as the decay constant)
346346
- regularization strength (L2 penalty, dropout strength)
347347

348-
But as saw, there are many more relatively less sensitive hyperparameters, for example in per-parameter adaptive learning methods, the setting of momentum and its schedule, etc. In this section we describe some additional tips and tricks for performing the hyperparameter search:
348+
But as we saw, there are many more relatively less sensitive hyperparameters, for example in per-parameter adaptive learning methods, the setting of momentum and its schedule, etc. In this section we describe some additional tips and tricks for performing the hyperparameter search:
349349

350350
**Implementation**. Larger Neural Networks typically require a long time to train, so performing hyperparameter search can take many days/weeks. It is important to keep this in mind since it influences the design of your code base. One particular design is to have a **worker** that continuously samples random hyperparameters and performs the optimization. During the training, the worker will keep track of the validation performance after every epoch, and writes a model checkpoint (together with miscellaneous training statistics such as the loss over time) to a file, preferably on a shared file system. It is useful to include the validation performance directly in the filename, so that it is simple to inspect and sort the progress. Then there is a second program which we will call a **master**, which launches or kills workers across a computing cluster, and may additionally inspect the checkpoints written by workers and plot their training statistics, etc.
351351

0 commit comments

Comments
 (0)