readme edits

ercbk · ercbk · commit 90436a4c318b · 2020-02-27T09:14:23.000-05:00
diff --git a/README.Rmd b/README.Rmd
@@ -6,13 +6,12 @@ output: github_document
 
 Nested cross-validation has become a recommended technique for situations in which the size of our dataset is insufficient to handle both hyperparameter tuning and algorithm comparison. Using standard methods such as k-fold cross-validation in such situations results in  significant increases in optimization bias. Nested cross-validation has been shown to produce low bias in out-of-sample error estimates even using datasets with only a few hundred rows.  
 
-The primary issue with this technique is that it is computationally very expensive with potentially tens of 1000s of models being trained during the process. This experiment seeks to answer two questions:  
+The primary issue with this technique is that it is computationally very expensive with potentially tens of 1000s of models being trained during the process. While researching this technique, I found two methods of performing nested cross-validation — one authored by [Sabastian Raschka](https://github.com/rasbt/stat479-machine-learning-fs19/blob/master/11_eval4-algo/code/11-eval4-algo__nested-cv_verbose1.ipynb) and the other by [Max Kuhn and Kjell Johnson](https://tidymodels.github.io/rsample/articles/Applications/Nested_Resampling.html).  
+This experiment seeks to answer two questions:  
 
 1. What's the fastest implementation of each method?  
 2. How many *repeats*, given the size of the training set, should we expect to need to obtain a reasonably accurate out-of-sample error estimate?  
 
-While researching this technique, I found two *methods* of performing nested cross-validation — one authored by [Sabastian Raschka](https://github.com/rasbt/stat479-machine-learning-fs19/blob/master/11_eval4-algo/code/11-eval4-algo__nested-cv_verbose1.ipynb) and the other by [Max Kuhn and Kjell Johnson](https://tidymodels.github.io/rsample/articles/Applications/Nested_Resampling.html).  
-
 With regard to the question of speed, I'll be testing implementations of both methods from various packages which include {tune}, {mlr3}, {h2o}, and {sklearn}.  
 
 Duration experiment details:  
diff --git a/README.md b/README.md
@@ -11,19 +11,18 @@ using datasets with only a few hundred rows.
 
 The primary issue with this technique is that it is computationally very
 expensive with potentially tens of 1000s of models being trained during
-the process. This experiment seeks to answer two questions:
+the process. While researching this technique, I found two methods of
+performing nested cross-validation — one authored by [Sabastian
+Raschka](https://github.com/rasbt/stat479-machine-learning-fs19/blob/master/11_eval4-algo/code/11-eval4-algo__nested-cv_verbose1.ipynb)
+and the other by [Max Kuhn and Kjell
+Johnson](https://tidymodels.github.io/rsample/articles/Applications/Nested_Resampling.html).  
+This experiment seeks to answer two questions:
 
 1.  What’s the fastest implementation of each method?  
 2.  How many *repeats*, given the size of the training set, should we
     expect to need to obtain a reasonably accurate out-of-sample error
     estimate?
 
-While researching this technique, I found two *methods* of performing
-nested cross-validation — one authored by [Sabastian
-Raschka](https://github.com/rasbt/stat479-machine-learning-fs19/blob/master/11_eval4-algo/code/11-eval4-algo__nested-cv_verbose1.ipynb)
-and the other by [Max Kuhn and Kjell
-Johnson](https://tidymodels.github.io/rsample/articles/Applications/Nested_Resampling.html).
-
 With regard to the question of speed, I’ll be testing
 implementations of both methods from various packages which include
 {tune}, {mlr3}, {h2o}, and {sklearn}.