
Commit 50d1c32

TSNE benchmark docs (#122)
* Add CIFAR_10 dataset loading and available for benchmarking
* Remove line according to PEP8
* Add docs for TSNE benchmark
* Add epsilon dataset to TSNE benchmark
1 parent 5c78370 commit 50d1c32

4 files changed (+43, -9 lines)

README.md

Lines changed: 1 addition & 0 deletions
@@ -113,6 +113,7 @@ The configuration of benchmarks allows you to select the frameworks to run, sele
 |**[PCA](https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html)**|pca|:white_check_mark:|:x:|:white_check_mark:|:white_check_mark:|:x:|
 |**[Ridge](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Ridge.html)**|ridge|:white_check_mark:|:x:|:white_check_mark:|:white_check_mark:|:x:|
 |**[SVM](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html)**|svm|:white_check_mark:|:x:|:white_check_mark:|:white_check_mark:|:x:|
+|**[TSNE](https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html)**|tsne|:white_check_mark:|:x:|:x:|:white_check_mark:|:x:|
 |**[train_test_split](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html)**|train_test_split|:white_check_mark:|:x:|:x:|:white_check_mark:|:x:|
 |**[GradientBoostingClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html)**|gbt|:x:|:x:|:x:|:x:|:white_check_mark:|
 |**[GradientBoostingRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html)**|gbt|:x:|:x:|:x:|:x:|:white_check_mark:|

configs/sklearn/performance/tsne.json

Lines changed: 18 additions & 9 deletions
@@ -24,15 +24,24 @@
 "y": "data/mnist_y_test.npy"
 }
 },
-{
-"source": "npy",
-"name": "cifar_10",
-"training":
-{
-"x": "data/cifar_10_x_train.npy",
-"y": "data/cifar_10_y_train.npy"
-}
-}
+{
+"source": "npy",
+"name": "cifar_10",
+"training":
+{
+"x": "data/cifar_10_x_train.npy",
+"y": "data/cifar_10_y_train.npy"
+}
+},
+{
+"source": "npy",
+"name": "epsilon_30K",
+"training":
+{
+"x": "data/epsilon_30K_x_train.npy",
+"y": "data/epsilon_30K_y_train.npy"
+}
+}
 ],
 "workload-size": "medium"
 }
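
For orientation, here is a minimal Python sketch of how dataset entries shaped like the ones in this hunk could be read. The helper name `training_paths` and the inline JSON literal are hypothetical and are not taken from the benchmark runner. Only the keys `source`, `name`, `training`, `x`, `y` and the file paths come from the diff, and the enclosing structure of `tsne.json` is not visible here.

```python
import json

# Hypothetical stand-in for the dataset list touched by this hunk; the real
# file wraps this list in more structure that the diff does not show.
DATASETS_JSON = """
[
  {
    "source": "npy",
    "name": "cifar_10",
    "training": {
      "x": "data/cifar_10_x_train.npy",
      "y": "data/cifar_10_y_train.npy"
    }
  },
  {
    "source": "npy",
    "name": "epsilon_30K",
    "training": {
      "x": "data/epsilon_30K_x_train.npy",
      "y": "data/epsilon_30K_y_train.npy"
    }
  }
]
"""


def training_paths(datasets):
    """Map each npy dataset name to its (x, y) training file paths."""
    paths = {}
    for entry in datasets:
        # Entries with other sources are skipped; this sketch only handles "npy".
        if entry.get("source") != "npy":
            continue
        training = entry["training"]
        paths[entry["name"]] = (training["x"], training["y"])
    return paths


paths = training_paths(json.loads(DATASETS_JSON))
for name, (x_path, y_path) in paths.items():
    print(name, x_path, y_path)
```

The returned paths could then be handed to `numpy.load`, assuming the `.npy` files have already been generated under `data/`.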

cuml_bench/README.md

Lines changed: 12 additions & 0 deletions
@@ -18,6 +18,7 @@ You can launch benchmarks for each algorithm separately. The tables below list a
 - [PCA](#pca)
 - [Ridge Regression](#ridge)
 - [SVC](#svc)
+- [TSNE](#tsne)
 - [train_test_split](#train_test_split)

 #### General
@@ -141,6 +142,17 @@ You can launch benchmarks for each algorithm separately. The tables below list a
 | tol | float | 1e-16 | Tolerance passed to sklearn.svm.SVC |
 | probability | action | True | Use probability for SVC |

+### TSNE
+
+| parameter Name | Type | default value | description |
+| ----- | ---- |---- |---- |
+| n-components | int | 2 | Dimension of the embedded space |
+| early-exaggeration | float | 12.0 | This factor increases the attractive forces between points <br/>and allows points to move around more freely finding their nearest neighbors more easily |
+| learning-rate | float | 200.0 | The learning rate for t-SNE is usually in the range [10.0, 1000.0] |
+| angle | float | 0.5 | Angular size. This is the trade-off between speed and accuracy |
+| min-grad-norm | float | 1e-7 | If the gradient norm is below this threshold, the optimization is stopped |
+| random-state | int | 1234 | Determines the random number generator |
+
 #### train_test_split

 | parameter Name | Type | default value | description |

sklearn_bench/README.md

Lines changed: 12 additions & 0 deletions
@@ -24,6 +24,7 @@ You can launch benchmarks for each algorithm separately. The tables below list a
 - [PCA](#pca)
 - [Ridge Regression](#ridge)
 - [SVC](#svc)
+- [TSNE](#tsne)
 - [train_test_split](#train_test_split)

 ### General
@@ -152,6 +153,17 @@ You can launch benchmarks for each algorithm separately. The tables below list a
 | tol | float | 1e-16 | Tolerance passed to sklearn.svm.SVC |
 | probability | action | True | Use probability for SVC |

+### TSNE
+
+| parameter Name | Type | default value | description |
+| ----- | ---- |---- |---- |
+| n-components | int | 2 | Dimension of the embedded space |
+| early-exaggeration | float | 12.0 | This factor increases the attractive forces between points <br/>and allows points to move around more freely finding their nearest neighbors more easily |
+| learning-rate | float | 200.0 | The learning rate for t-SNE is usually in the range [10.0, 1000.0] |
+| angle | float | 0.5 | Angular size. This is the trade-off between speed and accuracy |
+| min-grad-norm | float | 1e-7 | If the gradient norm is below this threshold, the optimization is stopped |
+| random-state | int | 1234 | Determines the random number generator |
+
 ### train_test_split

 | parameter Name | Type | default value | description |
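
To make the TSNE parameter table above concrete, here is a rough Python sketch of a command-line parser that exposes those six parameters with the documented defaults. It is an illustration only: the `--flag` spellings, `build_tsne_parser`, and `tsne_kwargs` are assumptions and may not match how the benchmark script actually defines its arguments. The keyword names in the resulting dictionary follow the constructor parameters of `sklearn.manifold.TSNE`.

```python
import argparse


def build_tsne_parser():
    """Hypothetical parser mirroring the documented TSNE parameters and defaults."""
    parser = argparse.ArgumentParser(description="TSNE benchmark (illustrative sketch)")
    parser.add_argument("--n-components", type=int, default=2,
                        help="Dimension of the embedded space")
    parser.add_argument("--early-exaggeration", type=float, default=12.0,
                        help="Factor that increases attractive forces between points")
    parser.add_argument("--learning-rate", type=float, default=200.0,
                        help="Usually in the range [10.0, 1000.0]")
    parser.add_argument("--angle", type=float, default=0.5,
                        help="Angular size; trade-off between speed and accuracy")
    parser.add_argument("--min-grad-norm", type=float, default=1e-7,
                        help="Stop optimizing when the gradient norm falls below this")
    parser.add_argument("--random-state", type=int, default=1234,
                        help="Seed for the random number generator")
    return parser


def tsne_kwargs(args):
    """Translate parsed arguments into keyword arguments named like TSNE's parameters."""
    return {
        "n_components": args.n_components,
        "early_exaggeration": args.early_exaggeration,
        "learning_rate": args.learning_rate,
        "angle": args.angle,
        "min_grad_norm": args.min_grad_norm,
        "random_state": args.random_state,
    }


# Parsing an empty argument list yields the defaults from the table.
params = tsne_kwargs(build_tsne_parser().parse_args([]))
print(params)
```

With scikit-learn installed, the dictionary could be passed on as `sklearn.manifold.TSNE(**params)`. Whether the benchmark forwards its arguments this way is not shown in the diff.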
