Commit d801389

export data notebook and restructure folders (#1129)
Co-authored-by: ezekielemerson <eemerson2325@gmail.com>
1 parent 5b61ad9 commit d801389

6 files changed

+398
-1060
lines changed

examples/README.md

Lines changed: 54 additions & 63 deletions
Large diffs are not rendered by default.

examples/basics/export_data.ipynb

Lines changed: 343 additions & 0 deletions
@@ -0,0 +1,343 @@
1+
{
2+
"nbformat": 4,
3+
"nbformat_minor": 0,
4+
"metadata": {},
5+
"cells": [
6+
{
7+
"metadata": {},
8+
"source": [
9+
"<td>\n",
10+
" <a target=\"_blank\" href=\"https://labelbox.com\" ><img src=\"https://labelbox.com/blog/content/images/2021/02/logo-v4.svg\" width=256/></a>\n",
11+
"</td>"
12+
],
13+
"cell_type": "markdown"
14+
},
15+
{
16+
"metadata": {},
17+
"source": [
18+
"<td>\n",
19+
"<a href=\"https://colab.research.google.com/github/Labelbox/labelbox-python/blob/master/examples/basics/export_data.ipynb\" target=\"_blank\"><img\n",
20+
"src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"></a>\n",
21+
"</td>\n",
22+
"\n",
23+
"<td>\n",
24+
"<a href=\"https://github.com/Labelbox/labelbox-python/tree/master/examples/basics/export_data.ipynb\" target=\"_blank\"><img\n",
25+
"src=\"https://img.shields.io/badge/GitHub-100000?logo=github&logoColor=white\" alt=\"GitHub\"></a>\n",
26+
"</td>"
27+
],
28+
"cell_type": "markdown"
29+
},
30+
{
31+
"metadata": {},
32+
"source": [
33+
"# Export data\n",
34+
"How to export data for projects, datasets, slices, and models, with examples for each type of v2 export along with details on optional parameters and filters."
35+
],
36+
"cell_type": "markdown"
37+
},
38+
{
39+
"metadata": {},
40+
"source": [
41+
"!pip install -q \"labelbox[data]\""
42+
],
43+
"cell_type": "code",
44+
"outputs": [],
45+
"execution_count": null
46+
},
47+
{
48+
"metadata": {},
49+
"source": [
50+
"import labelbox as lb"
51+
],
52+
"cell_type": "code",
53+
"outputs": [],
54+
"execution_count": null
55+
},
56+
{
57+
"metadata": {},
58+
"source": [
59+
"# API Key and Client\n",
60+
"See the developer guide for [creating an API key](https://docs.labelbox.com/reference/create-api-key)."
61+
],
62+
"cell_type": "markdown"
63+
},
64+
{
65+
"metadata": {},
66+
"source": [
67+
"API_KEY = \"\"\n",
68+
"client = lb.Client(api_key=API_KEY)"
69+
],
70+
"cell_type": "code",
71+
"outputs": [],
72+
"execution_count": null
73+
},
74+
{
75+
"metadata": {},
76+
"source": [
77+
"## Export data rows from a project\n",
78+
"For complete details on the supported filters and parameters, including how they are used and what information is included, please see the [Export overview](https://docs.labelbox.com/reference/label-export#optional-parameters-and-filters) developer guide.\n",
79+
"\n",
80+
"### Parameters\n",
81+
"When you export data rows from a project, you may choose to include or exclude certain attributes, including:\n",
82+
"- `attachments`\n",
83+
"- `metadata_fields`\n",
84+
"- `data_row_details`\n",
85+
"- `project_details`\n",
86+
"- `label_details`\n",
87+
"- `performance_details`\n",
88+
"- `interpolated_frames`\n",
89+
" - Only applicable for video data rows.\n",
90+
"\n",
91+
"### Filters\n",
92+
"When you export data rows from a project, you can specify the included data rows with the following filters:\n",
93+
"- `last_activity_at`\n",
94+
"- `label_created_at`\n",
95+
"- `data_row_ids`\n",
96+
"\n",
97+
"#### Filter details\n",
98+
"You can set the range for `last_activity_at` and `label_created_at` in the following formats: \n",
99+
"- `YYYY-MM-DD`\n",
100+
"- `YYYY-MM-DD hh:mm:ss`\n",
101+
"- `YYYY-MM-DDThh:mm:ss\u00b1hhmm` (ISO 8601)\n",
102+
"\n",
103+
"The ISO 8601 format allows you to specify the timezone, while the other two formats assume the timezone from the user's workspace settings.\n",
104+
"\n",
105+
"The `last_activity_at` filter captures the creation and modification of labels, metadata, workflow status, comments, and reviews.\n",
106+
"\n",
107+
"If you wish to specify data rows to export, uncomment the `data_row_ids` filter and provide a list of applicable IDs. The data rows must be part of a batch attached to the project in question. You can provide up to 2,000 data row IDs."
108+
],
109+
"cell_type": "markdown"
110+
},
111+
{
112+
"metadata": {},
113+
"source": [
114+
"# Insert the project ID of the project from which you wish to export data rows.\n",
115+
"PROJECT_ID = \"\"\n",
116+
"project = client.get_project(PROJECT_ID)"
117+
],
118+
"cell_type": "code",
119+
"outputs": [],
120+
"execution_count": null
121+
},
122+
{
123+
"metadata": {},
124+
"source": [
125+
"# Set the export params to include/exclude certain fields. \n",
126+
"export_params= {\n",
127+
" \"attachments\": True,\n",
128+
" \"metadata_fields\": True,\n",
129+
" \"data_row_details\": True,\n",
130+
" \"project_details\": True,\n",
131+
" \"label_details\": True,\n",
132+
" \"performance_details\": True,\n",
133+
" \"interpolated_frames\": True\n",
134+
"}\n",
135+
"\n",
136+
"# Note: Filters follow AND logic, so typically using one filter is sufficient.\n",
137+
"filters= {\n",
138+
" \"last_activity_at\": [\"2000-01-01 00:00:00\", \"2050-01-01 00:00:00\"],\n",
139+
" \"label_created_at\": [\"2000-01-01 00:00:00\", \"2050-01-01 00:00:00\"],\n",
140+
" # \"data_row_ids\": [\"<data_row_id>\", \"<data_row_id>\"] \n",
141+
"}\n",
142+
"\n",
143+
"export_task = project.export_v2(params=export_params, filters=filters)\n",
144+
"export_task.wait_till_done()\n",
145+
"\n",
146+
"if export_task.errors:\n",
147+
" print(export_task.errors)\n",
148+
"\n",
149+
"export_json = export_task.result\n",
150+
"print(\"results: \", export_json)"
151+
],
152+
"cell_type": "code",
153+
"outputs": [],
154+
"execution_count": null
155+
},
156+
{
157+
"metadata": {},
158+
"source": [
159+
"## Export data rows from a dataset\n",
160+
"For complete details on the supported filters and parameters, including how they are used and what information is included, please see the [Export overview](https://docs.labelbox.com/reference/label-export#optional-parameters-and-filters) developer guide.\n",
161+
"\n",
162+
"### Parameters\n",
163+
"When you export data rows from a dataset, you may choose to include or exclude certain attributes, including:\n",
164+
"- `attachments`\n",
165+
"- `metadata_fields`\n",
166+
"- `data_row_details`\n",
167+
"- `project_details`\n",
168+
"- `label_details`\n",
169+
"- `performance_details`\n",
170+
"- `interpolated_frames`\n",
171+
" - Only applicable for video data rows.\n",
172+
"- `project_ids`\n",
173+
" - Accepts a list of project IDs. If provided, the labels created _in these projects_ on the exported data rows will be included. \n",
174+
"- `model_run_ids`\n",
175+
" - Accepts a list of model run IDs. If provided, the labels and predictions created _in these model runs_ will be included. \n",
176+
"\n",
177+
"### Filters\n",
178+
"When you export data rows from a dataset, you can specify the included data rows with the following filters:\n",
179+
"- `last_activity_at`\n",
180+
"- `label_created_at`\n",
181+
"- `data_row_ids`\n",
182+
"\n",
183+
"See the _Export data rows from a project_ section above for additional details on each filter. "
184+
],
185+
"cell_type": "markdown"
186+
},
187+
{
188+
"metadata": {},
189+
"source": [
190+
"# Insert the dataset ID of the dataset from which you wish to export data rows.\n",
191+
"DATASET_ID = \"\"\n",
192+
"dataset = client.get_dataset(DATASET_ID)"
193+
],
194+
"cell_type": "code",
195+
"outputs": [],
196+
"execution_count": null
197+
},
198+
{
199+
"metadata": {},
200+
"source": [
201+
"# Set the export params to include/exclude certain fields.\n",
202+
"export_params= {\n",
203+
" \"attachments\": True,\n",
204+
" \"metadata_fields\": True,\n",
205+
" \"data_row_details\": True,\n",
206+
" \"project_details\": True,\n",
207+
" \"label_details\": True,\n",
208+
" \"performance_details\": True,\n",
209+
" \"interpolated_frames\": True,\n",
210+
" # \"project_ids\": [\"<project_id>\", \"<project_id>\"],\n",
211+
" # \"model_run_ids\": [\"<model_run_id>\", \"<model_run_id>\"] \n",
212+
"}\n",
213+
"\n",
214+
"# Note: Filters follow AND logic, so typically using one filter is sufficient.\n",
215+
"filters= {\n",
216+
" \"last_activity_at\": [\"2000-01-01 00:00:00\", \"2050-01-01 00:00:00\"]\n",
217+
"}\n",
218+
"\n",
219+
"export_task = dataset.export_v2(params=export_params, filters=filters)\n",
220+
"export_task.wait_till_done()\n",
221+
"\n",
222+
"if export_task.errors:\n",
223+
" print(export_task.errors)\n",
224+
"\n",
225+
"export_json = export_task.result\n",
226+
"print(\"results: \", export_json)"
227+
],
228+
"cell_type": "code",
229+
"outputs": [],
230+
"execution_count": null
231+
},
232+
{
233+
"metadata": {},
234+
"source": [
235+
"## Export data rows from a slice\n",
236+
"For complete details on the supported filters and parameters, including how they are used and what information is included, please see the [Export overview](https://docs.labelbox.com/reference/label-export#optional-parameters-and-filters) developer guide.\n",
237+
"\n",
238+
"### Parameters\n",
239+
"When exporting from a slice, you can apply the same parameters as exporting from a dataset.\n",
240+
"\n",
241+
"### Filters\n",
242+
"No filters are applicable to exports from a slice. All the data rows of the slice must be exported."
243+
],
244+
"cell_type": "markdown"
245+
},
246+
{
247+
"metadata": {},
248+
"source": [
249+
"# Insert the Catalog slice ID of the slice from which you wish to export data rows.\n",
250+
"CATALOG_SLICE_ID = \"\"\n",
251+
"catalog_slice = client.get_catalog_slice(CATALOG_SLICE_ID)"
252+
],
253+
"cell_type": "code",
254+
"outputs": [],
255+
"execution_count": null
256+
},
257+
{
258+
"metadata": {},
259+
"source": [
260+
"# Set the export params to include/exclude certain fields.\n",
261+
"export_params = {\n",
262+
" \"attachments\": True,\n",
263+
" \"metadata_fields\": True,\n",
264+
" \"data_row_details\": True,\n",
265+
" \"project_details\": True,\n",
266+
" \"label_details\": True,\n",
267+
" \"performance_details\": True,\n",
268+
" \"interpolated_frames\": True,\n",
269+
" # \"project_ids\": [\"<project_id>\", \"<project_id>\"],\n",
270+
" # \"model_run_ids\": [\"<model_run_id>\", \"<model_run_id>\"]\n",
271+
"}\n",
272+
"\n",
273+
"export_task = catalog_slice.export_v2(params=export_params)\n",
274+
"export_task.wait_till_done()\n",
275+
"\n",
276+
"if export_task.errors:\n",
277+
" print(export_task.errors)\n",
278+
"\n",
279+
"export_json = export_task.result\n",
280+
"print(\"results: \", export_json)"
281+
],
282+
"cell_type": "code",
283+
"outputs": [],
284+
"execution_count": null
285+
},
286+
{
287+
"metadata": {},
288+
"source": [
289+
"## Export data rows from a model run\n",
290+
"For complete details on the supported filters and parameters, including how they are used and what information is included, please see the [Export overview](https://docs.labelbox.com/reference/label-export#optional-parameters-and-filters) developer guide.\n",
291+
"\n",
292+
"### Parameters\n",
293+
"- `attachments`\n",
294+
"- `metadata_fields`\n",
295+
"- `data_row_details`\n",
296+
"- `interpolated_frames`\n",
297+
" - Only applicable for video data rows.\n",
298+
"- `predictions`\n",
299+
" - If true, all predictions made in the model run will be included for each data row in the export.\n",
300+
"\n",
301+
"### Filters\n",
302+
"No filters are applicable to exports from a model run. All the data rows of the model run must be exported.\n"
303+
],
304+
"cell_type": "markdown"
305+
},
306+
{
307+
"metadata": {},
308+
"source": [
309+
"# Insert the model run ID of the model run from which you wish to export data rows.\n",
310+
"MODEL_RUN_ID = \"\"\n",
311+
"model_run = client.get_model_run(MODEL_RUN_ID)"
312+
],
313+
"cell_type": "code",
314+
"outputs": [],
315+
"execution_count": null
316+
},
317+
{
318+
"metadata": {},
319+
"source": [
320+
"# Set the export params to include/exclude certain fields.\n",
321+
"export_params = {\n",
322+
" \"attachments\": True,\n",
323+
" \"metadata_fields\": True,\n",
324+
" \"data_row_details\": True,\n",
325+
" \"interpolated_frames\": True,\n",
326+
" \"predictions\": True\n",
327+
"}\n",
328+
"\n",
329+
"export_task = model_run.export_v2(params=export_params)\n",
330+
"export_task.wait_till_done()\n",
331+
"\n",
332+
"if export_task.errors:\n",
333+
" print(export_task.errors)\n",
334+
"\n",
335+
"export_json = export_task.result\n",
336+
"print(\"results: \", export_json)"
337+
],
338+
"cell_type": "code",
339+
"outputs": [],
340+
"execution_count": null
341+
}
342+
]
343+
}
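
The filter rules described in the notebook's "Filter details" cell (an ISO 8601 timestamp with an explicit offset, and at most 2,000 data row IDs per export) can be sketched with the Python standard library alone. This is a minimal sketch, not part of the commit: the helper names `format_filter_range` and `chunk_data_row_ids` are hypothetical and do not exist in the Labelbox SDK, and running one export per chunk of IDs is an assumption about how a caller might stay under the limit.

```python
from datetime import datetime, timedelta, timezone

MAX_DATA_ROW_IDS = 2000  # limit stated in the notebook's filter details


def format_filter_range(start, end):
    """Return [start, end] as YYYY-MM-DDThh:mm:ss±hhmm strings.

    Both arguments must be timezone-aware datetimes, so the result does
    not depend on the workspace timezone setting.
    """
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    if start > end:
        raise ValueError("start must not be later than end")
    # %z renders the UTC offset as +hhmm or -hhmm.
    fmt = "%Y-%m-%dT%H:%M:%S%z"
    return [start.strftime(fmt), end.strftime(fmt)]


def chunk_data_row_ids(ids, size=MAX_DATA_ROW_IDS):
    """Split data row IDs into lists of at most `size` items each."""
    ids = list(ids)
    return [ids[i:i + size] for i in range(0, len(ids), size)]


# Hypothetical date range: UTC start, UTC-05:00 end.
start = datetime(2023, 1, 1, tzinfo=timezone.utc)
end = datetime(2023, 6, 30, 23, 59, 59, tzinfo=timezone(timedelta(hours=-5)))

filters = {"last_activity_at": format_filter_range(start, end)}
print(filters)
```

The resulting `filters` dictionary has the same shape as the one passed to `project.export_v2(params=export_params, filters=filters)` in the notebook. Each list returned by `chunk_data_row_ids` could be supplied as the `data_row_ids` filter of a separate export.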

examples/basics/labels.ipynb

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@
3838
"metadata": {},
3939
"source": [
4040
"#### *** This section explains how to use the Label object, but it is recommended that you use bulk export for exporting labels *** \n",
41-
"* [Bulk export examples](https://github.com/Labelbox/labelbox-python/tree/master/examples/label_export)\n",
41+
"* [Export data](https://github.com/Labelbox/labelbox-python/tree/master/examples/basics/export_data.ipynb)\n",
4242
"* [Label format documentation](https://docs.labelbox.com/data-model/en/index-en#label)"
4343
],
4444
"cell_type": "markdown"
