"In the following data set, we simulate patients with Myeloid Leukemia. We analyze two features, Progression and Mutational Signature. Patients with faster progression and a higher mutational signature are considered to have Acute Myeloid Leukemia (AML). "
"Before we train our k-means method, we need to split the annotated data into two subsets. The goal is to enable unbiased validation. We train on the first 200 annotated data points and measure the quality on the following 50."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2fcb456d-90c9-4cc2-90c6-9c69a43e72bb",
+   "metadata": {
+    "id": "2fcb456d-90c9-4cc2-90c6-9c69a43e72bb"
+   },
+   "outputs": [],
+   "source": [
+    "train_data = data[:200]\n",
+    "validation_data = data[200:250]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "22c17985-c857-4170-84ce-66e96f8e4971",
+   "metadata": {
+    "id": "22c17985-c857-4170-84ce-66e96f8e4971"
+   },
+   "source": [
+    "## Training\n",
+    "With the selected data, we can train our k-means model."
+    "After training and validation of the classifier, we can reuse it to process other data sets.\n",
+    "It is uncommon to classify training and validation data, as those should only be used for building the classifier. Here we apply the classifier to the remaining data points."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "fa01dc06-1a8a-4f79-a6c9-de7c54087f54",
+   "metadata": {
+    "id": "fa01dc06-1a8a-4f79-a6c9-de7c54087f54"
+   },
+   "outputs": [],
+   "source": [
+    "remaining_data = data[250:]\n",
+    "\n",
+    "prediction = kmeans.predict(remaining_data)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f35dce87-89e0-4fec-ba16-020188f6bdb3",
+   "metadata": {
+    "id": "f35dce87-89e0-4fec-ba16-020188f6bdb3"
+   },
+   "outputs": [],
+   "source": [
+    "predicted_colors = [colors[i-1] for i in prediction]\n",
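For readers who want to try the split / train / predict flow from the cells above outside the notebook, here is a minimal, self-contained sketch. It is not the notebook's code: `simulate_patients`, `fit_kmeans`, `nearest` and `predict` are hypothetical helpers written with the standard library only, the synthetic data is an assumption standing in for the notebook's `data`, and the plain Lloyd iteration is a simplified stand-in for whatever `kmeans` object the notebook trains. Only the slicing (`data[:200]`, `data[200:250]`, `data[250:]`) is taken from the diff.

```python
import random


def simulate_patients(n, seed=0):
    """Hypothetical stand-in for the notebook's `data`:
    (progression, mutational_signature) pairs from two synthetic groups."""
    rng = random.Random(seed)
    points = []
    for i in range(n):
        # Alternate between two well-separated group centers (assumed values).
        cx, cy = (2.0, 2.0) if i % 2 == 0 else (8.0, 8.0)
        points.append((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)))
    return points


def nearest(point, centers):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda k: (point[0] - centers[k][0]) ** 2
        + (point[1] - centers[k][1]) ** 2,
    )


def fit_kmeans(points, k=2, iterations=20):
    """Plain Lloyd iterations; returns the list of cluster centers."""
    centers = list(points[:k])  # simple deterministic initialization
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Recompute each center as the mean of its group;
        # keep the previous center if a group is empty.
        centers = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
            if g
            else centers[j]
            for j, g in enumerate(groups)
        ]
    return centers


def predict(points, centers):
    """Assign each point the 0-based index of its nearest center."""
    return [nearest(p, centers) for p in points]


data = simulate_patients(300)

# Same slicing as in the notebook cells.
train_data = data[:200]
validation_data = data[200:250]
remaining_data = data[250:]

centers = fit_kmeans(train_data, k=2)
validation_labels = predict(validation_data, centers)
prediction = predict(remaining_data, centers)
```

In this sketch the labels are 0-based indices into `centers`, so any lookup table built from them (for example a list of plot colors) should be indexed accordingly.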