Commit 56c43ca

Author: ercbk
Message: readme changes and gt pkg added to renv
Parent: 2335466

3 files changed: +92 -49 lines changed

README.Rmd

Lines changed: 4 additions & 3 deletions
@@ -8,7 +8,7 @@ Nested cross-validation has become a recommended technique for situations in whi
 
 The primary issue with this technique is that it is computationally very expensive with potentially tens of 1000s of models being trained during the process. This experiment seeks to answer two questions:
 
-1. Which implementation is fastest?
+1. What's the fastest implementation of each method?
 2. How many *repeats*, given the size of the training set, should we expect to need to obtain a reasonably accurate out-of-sample error estimate?
 
 While researching this technique, I found two *methods* of performing nested cross-validation — one authored by [Sabastian Raschka](https://github.com/rasbt/stat479-machine-learning-fs19/blob/master/11_eval4-algo/code/11-eval4-algo__nested-cv_verbose1.ipynb) and the other by [Max Kuhn and Kjell Johnson](https://tidymodels.github.io/rsample/articles/Applications/Nested_Resampling.html).
@@ -28,7 +28,8 @@ Duration experiment details:
 + outer loop: 5 folds
 + inner loop: 2 folds
 
-(Size of the data sets are the same as those in the original scripts by the authors)
+(Size of the data sets are the same as those in the original scripts by the authors)
+
 
 Various elements of the technique can be altered to improve performance. These include:
 
@@ -37,7 +38,7 @@ Various elements of the technique can be altered to improve performance. These i
 3. Inner-Loop CV strategy
 4. Grid search strategy
 
-For the performance experiment (question 2), I'll be varying the repeats of the outer-loop cv strategy for each method. The fastest implementation of each method will be tuned with different sizes of data ranging from 100 to 5000 observations. The mean absolute error will be calculated for each combination of repeat, data size, and method.
+For the performance experiment (question 2), the fastest implementation of each method will be used in running a nested cross-validation with different sizes of data ranging from 100 to 5000 observations and different numbers of repeats of the outer-loop cv strategy. The chosen algorithm and hyperparameters will predict on a 100K row simulated dataset and the mean absolute error will be calculated for each combination of repeat, data size, and method.
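
The nested cross-validation described in this README change can be illustrated with a minimal sketch. It is not the repository's code: the simulated data, the k-nearest-neighbours model, the grid values, and every function name (`make_data`, `k_folds`, `nested_cv`, and so on) are assumptions chosen to keep the example standard-library only. It uses the 5 outer folds and 2 inner folds mentioned in the README, scores each outer fold with mean absolute error, and counts model evaluations to show how quickly the cost grows with repeats and grid size.

```python
import random
from statistics import mean


def make_data(n, seed=0):
    """Simulate a noisy 1-D regression problem (hypothetical data)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0, 1)) for x in xs]


def k_folds(rows, k, rng):
    """Shuffle a copy of rows and yield (train, test) pairs for k-fold CV."""
    rows = rows[:]
    rng.shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, folds[i]


def knn_predict(train, x, k):
    """Predict y at x as the mean y of the k nearest training points."""
    nearest = sorted(train, key=lambda r: abs(r[0] - x))[:k]
    return mean(y for _, y in nearest)


def mae(train, test, k):
    """Mean absolute error of k-NN predictions on the test rows."""
    return mean(abs(y - knn_predict(train, x, k)) for x, y in test)


def nested_cv(rows, grid, repeats=1, outer_k=5, inner_k=2, seed=0):
    """Return (mean outer-fold MAE, number of model evaluations)."""
    rng = random.Random(seed)
    outer_scores, fits = [], 0
    for _ in range(repeats):
        for outer_train, outer_test in k_folds(rows, outer_k, rng):
            # Inner loop: tune the hyperparameter on the outer training set only.
            inner_err = {}
            for k in grid:
                errs = [mae(tr, te, k)
                        for tr, te in k_folds(outer_train, inner_k, rng)]
                fits += inner_k
                inner_err[k] = mean(errs)
            best_k = min(inner_err, key=inner_err.get)
            # Outer loop: score the tuned model on the held-out outer fold.
            outer_scores.append(mae(outer_train, outer_test, best_k))
            fits += 1
    return mean(outer_scores), fits


score, fits = nested_cv(make_data(200), grid=[1, 5, 15], repeats=2)
print(f"MAE estimate: {score:.3f}, model evaluations: {fits}")
```

In this sketch the evaluation count is `repeats * outer_k * (len(grid) * inner_k + 1)`, which is 70 for the call above. Larger grids, more repeats, or several candidate algorithms multiply that figure, which is the cost the README refers to.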

README.md

Lines changed: 8 additions & 6 deletions
@@ -13,7 +13,7 @@ The primary issue with this technique is that it is computationally very
 expensive with potentially tens of 1000s of models being trained during
 the process. This experiment seeks to answer two questions:
 
-1. Which implementation is fastest?
+1. What’s the fastest implementation of each method?
 2. How many *repeats*, given the size of the training set, should we
 expect to need to obtain a reasonably accurate out-of-sample error
 estimate?
@@ -52,11 +52,13 @@ These include:
 3. Inner-Loop CV strategy
 4. Grid search strategy
 
-For the performance experiment (question 2), I’ll be varying the repeats
-of the outer-loop cv strategy for each method. The fastest
-implementation of each method will be tuned with different sizes of data
-ranging from 100 to 5000 observations. The mean absolute error will be
-calculated for each combination of repeat, data size, and method.
+For the performance experiment (question 2), the fastest implementation
+of each method will be used in running a nested cross-validation with
+different sizes of data ranging from 100 to 5000 observations and
+different numbers of repeats of the outer-loop cv strategy. The chosen
+algorithm and hyperparameters will predict on a 100K row simulated
+dataset and the mean absolute error will be calculated for each
+combination of repeat, data size, and method.
 
 Progress (duration in seconds)

renv.lock

Lines changed: 80 additions & 40 deletions
@@ -119,6 +119,13 @@
       "Repository": "CRAN",
       "Hash": "c6faf038ba4346b1de19ad7c99b8f94a"
     },
+    "Rttf2pt1": {
+      "Package": "Rttf2pt1",
+      "Version": "1.3.8",
+      "Source": "Repository",
+      "Repository": "CRAN",
+      "Hash": "8c4137a9ab70de4787d57758f8190617"
+    },
     "SQUAREM": {
       "Package": "SQUAREM",
       "Version": "2020.1",
@@ -224,6 +231,13 @@
       "Repository": "CRAN",
       "Hash": "5173d8ab28680cf263636b110f4f3220"
     },
+    "clipr": {
+      "Package": "clipr",
+      "Version": "0.7.0",
+      "Source": "Repository",
+      "Repository": "CRAN",
+      "Hash": "08cf4045c149a0f0eaf405324c7495bd"
+    },
     "codetools": {
       "Package": "codetools",
       "Version": "0.2-16",
@@ -245,19 +259,12 @@
       "Repository": "CRAN",
       "Hash": "98ca919385a634e5d558e6938755e0bf"
     },
-    "corrplot": {
-      "Package": "corrplot",
-      "Version": "0.84",
+    "commonmark": {
+      "Package": "commonmark",
+      "Version": "1.7",
       "Source": "Repository",
       "Repository": "CRAN",
-      "Hash": "b55c32ae818a84109a51f172290c95f2"
-    },
-    "countrycode": {
-      "Package": "countrycode",
-      "Version": "1.1.1",
-      "Source": "Repository",
-      "Repository": "CRAN",
-      "Hash": "947b61a2a21b5a50af567b591b845f72"
+      "Hash": "0f22be39ec1d141fd03683c06f3a6e67"
     },
     "crayon": {
       "Package": "crayon",
@@ -280,27 +287,13 @@
       "Repository": "CRAN",
       "Hash": "2b7d10581cc730804e9ed178c8374bd6"
     },
-    "d3r": {
-      "Package": "d3r",
-      "Version": "0.8.7",
-      "Source": "Repository",
-      "Repository": "CRAN",
-      "Hash": "4c1677c45eb1dff74f3863e773a8b26a"
-    },
     "data.table": {
       "Package": "data.table",
       "Version": "1.12.8",
       "Source": "Repository",
       "Repository": "CRAN",
       "Hash": "cd711af60c47207a776213a368626369"
     },
-    "data.tree": {
-      "Package": "data.tree",
-      "Version": "0.7.11",
-      "Source": "Repository",
-      "Repository": "CRAN",
-      "Hash": "9087f2826e50c659ba54ade20d4c8676"
-    },
     "desc": {
       "Package": "desc",
       "Version": "1.2.0",
@@ -317,10 +310,10 @@
     },
     "digest": {
       "Package": "digest",
-      "Version": "0.6.23",
+      "Version": "0.6.25",
       "Source": "Repository",
       "Repository": "CRAN",
-      "Hash": "931fd68809dab4609b4d4b5702206066"
+      "Hash": "f697db7d92b7028c4b3436e9603fb636"
     },
     "doFuture": {
       "Package": "doFuture",
@@ -357,13 +350,6 @@
       "Repository": "CRAN",
       "Hash": "716869fffc16e282c118f8894e082a7d"
     },
-    "echarts4r": {
-      "Package": "echarts4r",
-      "Version": "0.2.3",
-      "Source": "Repository",
-      "Repository": "CRAN",
-      "Hash": "2604014e6b28deb9dc2be4062c96a58a"
-    },
     "ellipsis": {
       "Package": "ellipsis",
       "Version": "0.3.0",
@@ -378,6 +364,20 @@
       "Repository": "CRAN",
       "Hash": "ec8ca05cffcc70569eaaad8469d2a3a7"
     },
+    "extrafont": {
+      "Package": "extrafont",
+      "Version": "0.17",
+      "Source": "Repository",
+      "Repository": "CRAN",
+      "Hash": "7f2f50e8f998a4bea4b04650fc4f2ca8"
+    },
+    "extrafontdb": {
+      "Package": "extrafontdb",
+      "Version": "1.0",
+      "Source": "Repository",
+      "Repository": "CRAN",
+      "Hash": "a861555ddec7451c653b40e713166c6f"
+    },
     "fansi": {
       "Package": "fansi",
       "Version": "0.4.1",
@@ -490,6 +490,18 @@
       "Repository": "CRAN",
       "Hash": "7d7f283939f563670a697165b2cf5560"
     },
+    "gt": {
+      "Package": "gt",
+      "Version": "0.1.0",
+      "Source": "GitHub",
+      "RemoteType": "github",
+      "RemoteHost": "api.github.com",
+      "RemoteRepo": "gt",
+      "RemoteUsername": "rstudio",
+      "RemoteRef": "master",
+      "RemoteSha": "9782e790daed8a903cb94451aabff54400f0ec1b",
+      "Hash": "5cadddcef4aaf49e1f7e6092f5b180b9"
+    },
     "gtable": {
       "Package": "gtable",
       "Version": "0.3.0",
@@ -525,6 +537,13 @@
       "Repository": "CRAN",
       "Hash": "4dc5bb88961e347a0f4d8aad597cbfac"
     },
+    "hms": {
+      "Package": "hms",
+      "Version": "0.5.3",
+      "Source": "Repository",
+      "Repository": "CRAN",
+      "Hash": "726671f634529d470545f9fd1a9d1869"
+    },
     "htmltools": {
       "Package": "htmltools",
       "Version": "0.4.0",
@@ -840,6 +859,13 @@
       "Repository": "CRAN",
       "Hash": "ececc6518695f3390f5dd7b45558c0e7"
     },
+    "patchwork": {
+      "Package": "patchwork",
+      "Version": "1.0.0",
+      "Source": "Repository",
+      "Repository": "CRAN",
+      "Hash": "16eee5b5edc41eec5af1149ccdc6b2c9"
+    },
     "pillar": {
       "Package": "pillar",
       "Version": "1.4.3",
@@ -931,6 +957,13 @@
       "Repository": "CRAN",
       "Hash": "8c8298583adbbe76f3c2220eef71bebc"
     },
+    "readr": {
+      "Package": "readr",
+      "Version": "1.3.1",
+      "Source": "Repository",
+      "Repository": "CRAN",
+      "Hash": "af8ab99cd936773a148963905736907b"
+    },
     "recipes": {
       "Package": "recipes",
       "Version": "0.1.9",
@@ -940,10 +973,10 @@
     },
     "remotes": {
       "Package": "remotes",
-      "Version": "2.1.0",
+      "Version": "2.1.1",
       "Source": "Repository",
       "Repository": "CRAN",
-      "Hash": "824a9fab6c4b3f3afd78e9e285d9c365"
+      "Hash": "57c3009534f805f0f6476ffee68483cc"
     },
     "renv": {
       "Package": "renv",
@@ -1041,6 +1074,13 @@
       "Repository": "CRAN",
       "Hash": "33a5b27a03da82ac4b1d43268f80088a"
     },
+    "sass": {
+      "Package": "sass",
+      "Version": "0.1.2.1",
+      "Source": "Repository",
+      "Repository": "CRAN",
+      "Hash": "bd7168e8f7710ee96b2d5bf94d9c1a38"
+    },
     "scales": {
       "Package": "scales",
       "Version": "1.1.0",
@@ -1099,10 +1139,10 @@
     },
     "stringi": {
       "Package": "stringi",
-      "Version": "1.4.5",
+      "Version": "1.4.6",
       "Source": "Repository",
       "Repository": "CRAN",
-      "Hash": "ced3b63472796155f74abc4eb5266c78"
+      "Hash": "e99d8d656980d2dd416a962ae55aec90"
     },
     "stringr": {
       "Package": "stringr",
@@ -1232,10 +1272,10 @@
     },
     "vctrs": {
       "Package": "vctrs",
-      "Version": "0.2.2",
+      "Version": "0.2.3",
       "Source": "Repository",
       "Repository": "CRAN",
-      "Hash": "a1de558a76d2843a10f766209b9a545f"
+      "Hash": "2c0f41d87be7a186139a6d3d5215848e"
     },
     "viridisLite": {
       "Package": "viridisLite",

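The renv.lock hunks above are easier to audit when summarised as added, removed, and version-bumped packages. The following is a rough sketch of such a summary, not a tool from this repository. The function name `lock_changes` is hypothetical, the top-level `"Packages"` key is assumed from the usual renv.lock layout (it is not visible in the hunks), and the two inline lock fragments hold only a handful of entries copied from this diff.

```python
import json

# Abbreviated "before" and "after" lock fragments; versions are taken from the diff.
OLD_LOCK = json.loads("""
{"Packages": {
  "corrplot": {"Package": "corrplot", "Version": "0.84"},
  "digest":   {"Package": "digest",   "Version": "0.6.23"},
  "vctrs":    {"Package": "vctrs",    "Version": "0.2.2"}
}}
""")

NEW_LOCK = json.loads("""
{"Packages": {
  "digest": {"Package": "digest", "Version": "0.6.25"},
  "gt":     {"Package": "gt",     "Version": "0.1.0"},
  "vctrs":  {"Package": "vctrs",  "Version": "0.2.3"}
}}
""")


def lock_changes(old, new):
    """Return (added, removed, changed) package summaries between two lock dicts."""
    old_pk, new_pk = old["Packages"], new["Packages"]
    added = sorted(set(new_pk) - set(old_pk))
    removed = sorted(set(old_pk) - set(new_pk))
    # Packages present in both files whose recorded version differs.
    changed = {
        name: (old_pk[name]["Version"], new_pk[name]["Version"])
        for name in sorted(set(old_pk) & set(new_pk))
        if old_pk[name]["Version"] != new_pk[name]["Version"]
    }
    return added, removed, changed


added, removed, changed = lock_changes(OLD_LOCK, NEW_LOCK)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

To run it against real files, load each revision of renv.lock with `json.load` in place of the inline fragments.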