Commit acb3da6

Author: ercbk
Message: readme update
1 parent 0439f97 commit acb3da6

3 files changed (+18, -8 lines changed)

README.Rmd

Lines changed: 2 additions & 2 deletions
@@ -4,9 +4,9 @@ output: github_document
 
 # Nested Cross-Validation: Comparing Methods and Implementations
 
-Nested cross-validation has become a recommended technique for situations in which the size of our dataset is insufficient to handle both hyperparameter tuning and algorithm comparison. Using standard k-fold cross-validation in such situations results in significant optimization bias. Nested cross-validation has been shown to provide an unbiased estimation of out-of-sample error using datasets with only a few hundred rows.
+Nested cross-validation has become a recommended technique for situations in which the size of our dataset is insufficient to handle both hyperparameter tuning and algorithm comparison. Using standard methods such as k-fold cross-validation in such situations results in significant increases in optimization bias. Nested cross-validation has been shown to reduce the bias in out-of-sample error estimates even using datasets with only a few hundred rows.
 
-The primary issue with this technique is that it is computationally very expensive with potentially tens of 1000s of models being trained in the process. This experiment seeks to answer two questions:
+The primary issue with this technique is that it is computationally very expensive with potentially tens of 1000s of models being trained during the process. This experiment seeks to answer two questions:
 1. Which implementation is fastest?
 2. How many *repeats*, given the size of the training set, should we expect to need to obtain a reasonably accurate out-of-sample error estimate?
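
Reviewer note: the README's "tens of 1000s of models" claim can be sanity-checked with a rough count. The sketch below assumes a plain grid search in the inner loop and one refit of the winning configuration on each outer training fold. The function name and every number plugged in are hypothetical illustrations, not settings taken from this repository.

```python
def nested_cv_model_fits(outer_folds, inner_folds, grid_size,
                         repeats=1, n_algorithms=1):
    """Rough count of model fits in repeated nested cross-validation.

    Assumes (not confirmed by the README): exhaustive grid search in the
    inner loop, plus one refit of the best configuration per outer fold.
    """
    # Inner loop: every grid point is fit once per inner fold,
    # then the best configuration is refit once on the outer training fold.
    fits_per_outer_fold = inner_folds * grid_size + 1
    return n_algorithms * repeats * outer_folds * fits_per_outer_fold


# Hypothetical settings: 2 algorithms, 5 repeats, 5 outer folds,
# 2 inner folds, 100 hyperparameter combinations.
total = nested_cv_model_fits(outer_folds=5, inner_folds=2, grid_size=100,
                             repeats=5, n_algorithms=2)
print(total)  # 10050
```

Even with these modest illustrative settings the count passes ten thousand fits, which is consistent with the order of magnitude the README describes.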

README.md

Lines changed: 6 additions & 6 deletions
@@ -4,14 +4,14 @@
 Nested cross-validation has become a recommended technique for
 situations in which the size of our dataset is insufficient to handle
 both hyperparameter tuning and algorithm comparison. Using standard
-k-fold cross-validation in such situations results in significant
-optimization bias. Nested cross-validation has been shown to provide an
-unbiased estimation of out-of-sample error using datasets with only a
-few hundred rows.
+methods such as k-fold cross-validation in such situations results in
+significant increases in optimization bias. Nested cross-validation has
+been shown to reduce the bias in out-of-sample error estimates even
+using datasets with only a few hundred rows.
 
 The primary issue with this technique is that it is computationally very
-expensive with potentially tens of 1000s of models being trained in the
-process. This experiment seeks to answer two questions:
+expensive with potentially tens of 1000s of models being trained during
+the process. This experiment seeks to answer two questions:
 1\. Which implementation is fastest?
 2\. How many *repeats*, given the size of the training set, should we
 expect to need to obtain a reasonably accurate out-of-sample error
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+Run ID,Name,Source Type,Source Name,User,Status,duration
+9380632a9dca4717816f26d39b428483,,LOCAL,C:\Users\tbats\Documents\R\Projects\nested-cv-comp-temp\duration-experiment\Raschka\mlflow\nested-cv-retic-raschka.R,tbats,FINISHED,1990.83
+ad7f1849211b42a6b6bb0aa10921e846,,LOCAL,C:\Users\tbats\Documents\R\Projects\nested-cv-comp-temp\duration-experiment\Raschka\mlflow\nested-cv-py-raschka2.py,tbats,FINISHED,1979.49
+180345e9ddad49d393cf0482087176c5,,LOCAL,C:\Users\tbats\Documents\R\Projects\nested-cv-comp-temp\duration-experiment\Raschka\mlflow\nested-cv-kj-raschka.R,tbats,FINISHED,317.68
+b54322c7bd1a4993a100dc6b44b78cb8,,LOCAL,C:\Users\tbats\Documents\R\Projects\nested-cv-comp-temp\duration-experiment\Raschka\mlflow\nested-cv-mlr3-raschka.R,tbats,FINISHED,307.45
+2c66543390bc4183a288602c572d3514,,LOCAL,nested-cv-tune-kj.R,tbats,FINISHED,7034.82
+1e519c7e647845f79a027f2aba6ab89e,,LOCAL,nested-cv-h2o-kj.R,tbats,FINISHED,12374.44
+56d45204fb45490b8fe02c57377d70ef,,LOCAL,nested-cv-sklearn-kj.R,tbats,FINISHED,7405.58
+f89e830aa4744acfadcdcc30dbdb7f31,,LOCAL,nested-cv-parsnip-kj.R,tbats,FINISHED,4622.9
+f4a72f4bedf94af9a4c3748247bfa189,,LOCAL,nested-cv-ranger-kj.R,tbats,FINISHED,2593.17
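
Reviewer note: since the README asks which implementation is fastest, the recorded runs can be ranked directly. The sketch below copies the script names and `duration` values from the added CSV. It rests on two assumptions the file does not confirm: that `duration` is in seconds, and that scripts with `raschka` in the name belong to a different experiment from the remaining `-kj` scripts. The helper names are hypothetical.

```python
# (script name, duration) pairs copied from the added CSV; paths shortened
# to the script file name for readability.
runs = [
    ("nested-cv-retic-raschka.R", 1990.83),
    ("nested-cv-py-raschka2.py", 1979.49),
    ("nested-cv-kj-raschka.R", 317.68),
    ("nested-cv-mlr3-raschka.R", 307.45),
    ("nested-cv-tune-kj.R", 7034.82),
    ("nested-cv-h2o-kj.R", 12374.44),
    ("nested-cv-sklearn-kj.R", 7405.58),
    ("nested-cv-parsnip-kj.R", 4622.9),
    ("nested-cv-ranger-kj.R", 2593.17),
]


def group_of(script):
    # Assumed grouping: anything mentioning "raschka" is one experiment,
    # everything else is treated as the "kj" experiment.
    return "raschka" if "raschka" in script else "kj"


def rank_by_group(rows):
    """Return {group: [(script, duration), ...]} sorted fastest first."""
    grouped = {}
    for script, duration in rows:
        grouped.setdefault(group_of(script), []).append((script, duration))
    return {g: sorted(items, key=lambda r: r[1]) for g, items in grouped.items()}


ranked = rank_by_group(runs)
for group, items in ranked.items():
    print(group)
    for script, duration in items:
        # Minutes are derived under the assumption that duration is in seconds.
        print(f"  {script:28s} {duration:10.2f}  ({duration / 60:6.1f} min)")
```

Under that grouping, the mlr3 script has the shortest recorded duration among the `raschka` runs and the ranger script among the `kj` runs. A single run per script says little about run-to-run variance, so the ordering should be read as indicative only.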
