Commit 2f74fe2

Author: Matt Sokoloff
Commit message: added deps for colab
1 parent ca1130e commit 2f74fe2

16 files changed: +750 -316 lines changed

examples/basics/basics.ipynb

Lines changed: 36 additions & 22 deletions
@@ -13,15 +13,15 @@
  "id": "smaller-syndication",
  "metadata": {},
  "source": [
- "#### Quick install instructions\n",
+ "### Quick install instructions\n",
  "The quick version is basically just\n",
  "1. `!pip install labelbox`\n",
  "2. `export LABELBOX_API_KEY=\"<your_api_key>\"`\n",
  "* Get this from the UI under (Account -> API -> Create API Key)\n",
- "\n",
+ "* You can also set the api_key below in the notebook.\n",
  "\n",
  "This only works for cloud deployments.\n",
- "* For more details : https://docs.labelbox.com/python-sdk/en/index-en#labelbox-python-sdk"
+ "* For more details : https://docs.labelbox.com/python-sdk/en/index-en#labelbox-python-sdk\n"
  ]
  },
  {
@@ -34,6 +34,16 @@
  " * https://docs.labelbox.com/python-sdk/en/index-en#fundamental-concepts"
  ]
  },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "indie-bracket",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "!pip install labelbox"
+ ]
+ },
  {
  "cell_type": "code",
  "execution_count": 5,
@@ -42,7 +52,8 @@
  "outputs": [],
  "source": [
  "from labelbox import Client\n",
- "from labelbox import Project, Dataset"
+ "from labelbox import Project, Dataset\n",
+ "import os"
  ]
  },
  {
@@ -74,15 +85,19 @@
  "PROJECT_ID = \"ckk4q1viuc0w20704eh69u28h\"\n",
  "DATASET_ID = \"ckk4q1vjznyhu087203wlghfr\"\n",
  "PROJECT_NAME = \"Sample Project\"\n",
- "DATASET_NAME = \"Example Jellyfish Dataset\""
+ "DATASET_NAME = \"Example Jellyfish Dataset\"\n",
+ "# Set this if running in colab. Otherwise it should work if you have the LABELBOX_API_KEY set.\n",
+ "API_KEY = os.environ[\"LABELBOX_API_KEY\"]\n",
+ "# Only update this if you have an on-prem deployment\n",
+ "ENDPOINT = \"https://api.labelbox.com/graphql\" "
  ]
  },
  {
  "cell_type": "markdown",
  "id": "chinese-playing",
  "metadata": {},
  "source": [
- "#### Client\n",
+ "### Client\n",
  "* Starting point for all db interactions"
  ]
  },
@@ -93,9 +108,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "#Client is used for all DB interactions.\n",
- "#This is usually the starting point for all usage.\n",
- "client = Client()"
+ "client = Client(api_key = API_KEY, endpoint = ENDPOINT)"
  ]
  },
  {
@@ -136,7 +149,7 @@
  "id": "popular-nylon",
  "metadata": {},
  "source": [
- "#### Fields\n",
+ "### Fields\n",
  "* All db objects have fields (look at the source code to see them https://github.com/Labelbox/labelbox-python/blob/develop/labelbox/schema/project.py)\n",
  "* These fields are attributes of the object"
  ]
@@ -187,7 +200,7 @@
  "id": "viral-power",
  "metadata": {},
  "source": [
- "#### Pagination\n",
+ "### Pagination\n",
  "* Queries that return a list of database objects return them as a PaginatedCollection\n",
  "* The goal here is to limit the data being returned to only the necessary data."
  ]
@@ -232,17 +245,18 @@
  }
  ],
  "source": [
- "#Iterate over them to get the items out.\n",
+ "# Note that if you selected a project_id without any labels this will raise StopIteration\n",
+ "# Iterate over them to get the items out.\n",
  "next(labels_paginated_collection)\n",
- "#Be careful not to call list(paginated_collection) on a large collection"
+ "# Be careful not to call list(paginated_collection) on a large collection"
  ]
  },
  {
  "cell_type": "markdown",
  "id": "widespread-startup",
  "metadata": {},
  "source": [
- "#### Query parameters\n",
+ "### Query parameters\n",
  "* Query with the following conventions:\n",
  " * `DbObject.Field`"
  ]
@@ -273,21 +287,21 @@
  " (Project.description == \"new description field\")\n",
  "))\n",
  " \n",
- "#The above two queries return PaginatedCollections because the filter parameters aren't guarenteed to be unique.\n",
- "#So even if there is one element returned it is in a paginatedCollection.\n",
+ "# The above two queries return PaginatedCollections because the filter parameters aren't guarenteed to be unique.\n",
+ "# So even if there is one element returned it is in a paginatedCollection.\n",
  "print(projects)\n",
  "print(next(projects, None))\n",
  "print(next(projects, None))\n",
  "print(next(projects, None))\n",
- "#We can see there is only one."
+ "# We can see there is only one."
  ]
  },
  {
  "cell_type": "markdown",
  "id": "french-toner",
  "metadata": {},
  "source": [
- "#### Querying Limitations\n",
+ "### Querying Limitations\n",
  "* The DbObject used for the query must be the same as the DbObject returned by the querying function. \n",
  "* eg. is not valid since get_project returns a Project but we are filtering on a Dataset\n",
  "> `>>> projects = client.get_projects(where = Dataset.name == \"dataset_name\")`\n"
@@ -298,7 +312,7 @@
  "id": "defensive-bidder",
  "metadata": {},
  "source": [
- "#### Relationship\n",
+ "### Relationship\n",
  "* This solves the above problem of querying by a relationship\n",
  "* You can find all realtionships of a DB object in the source code\n",
  " * eg. for a Project ( https://github.com/Labelbox/labelbox-python/blob/develop/labelbox/schema/project.py))"
@@ -322,9 +336,9 @@
  }
  ],
  "source": [
- "#Dataset has a Relationship to a Project so we can use the following\n",
+ "# Dataset has a Relationship to a Project so we can use the following\n",
  "list(dataset.projects())\n",
- "#This will return all projects that are attached to this dataset"
+ "# This will return all projects that are attached to this dataset"
  ]
  },
  {
@@ -354,7 +368,7 @@
  "id": "metric-speaker",
  "metadata": {},
  "source": [
- "#### Delete\n",
+ "### Delete\n",
  "* Most DBObjects support deletion"
  ]
  },
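
Note on the pagination cells in basics.ipynb: the notebook comments warn that `next()` raises StopIteration on an empty collection and that `list()` on a large collection is expensive. A minimal sketch of both guards follows, written against plain Python iterators. The helper names `first_or_none` and `take` are hypothetical and are not part of the labelbox SDK; using them on a PaginatedCollection assumes it behaves as an iterator, which the notebook's own `next(projects, None)` calls suggest.

```python
from itertools import islice


def first_or_none(iterator):
    """Return the next item, or None if the iterator is exhausted."""
    return next(iterator, None)


def take(iterator, n):
    """Return at most n items as a list, leaving the rest unconsumed."""
    return list(islice(iterator, n))


if __name__ == "__main__":
    # An empty iterator stands in for a project without labels.
    print(first_or_none(iter([])))        # None
    # A long iterator stands in for a large paginated collection.
    print(take(iter(range(1000)), 3))     # [0, 1, 2]
```

With these, `first_or_none(labels_paginated_collection)` would return None instead of raising when a project has no labels, under the iterator assumption stated above.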

examples/basics/data_rows.ipynb

Lines changed: 59 additions & 28 deletions
@@ -20,30 +20,62 @@
  "* A datarow is a member of a dataset \n",
  "* A datarow cannot exist without belonging to a dataset.\n",
  "* Datarows are staged to be labeled by attaching the dataset that they are members of to a project\n",
- " * See dataset notebook on information about datasets\n",
- "---\n",
- "To run this notebook on your own data, set the variables in the next cell"
+ " * See dataset notebook on information about datasets"
  ]
  },
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "rural-fellow",
+ "id": "posted-nation",
  "metadata": {},
  "outputs": [],
  "source": [
- "PROJECT_ID = \"ckk4q1viuc0w20704eh69u28h\""
+ "!pip install labelbox"
  ]
  },
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "filled-disaster",
+ "id": "beautiful-ready",
  "metadata": {},
  "outputs": [],
  "source": [
  "from labelbox import Project, Dataset, DataRow, Client\n",
- "import uuid"
+ "import uuid\n",
+ "import os"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "legendary-harvard",
+ "metadata": {},
+ "source": [
+ "* Set the following cell with your data to run this notebook"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "rural-fellow",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Pick a project that has a dataset attached\n",
+ "PROJECT_ID = \"ckk4q1viuc0w20704eh69u28h\"\n",
+ "# Set this if running in colab. Otherwise it should work if you have the LABELBOX_API_KEY set.\n",
+ "API_KEY = os.environ[\"LABELBOX_API_KEY\"]\n",
+ "# Only update this if you have an on-prem deployment\n",
+ "ENDPOINT = \"https://api.labelbox.com/graphql\" "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "proof-detective",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "client = Client(api_key = API_KEY, endpoint = ENDPOINT)"
  ]
  },
  {
@@ -53,11 +85,10 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "client = Client()\n",
  "project = client.get_project(PROJECT_ID)\n",
  "dataset = next(project.datasets())\n",
- "#This is the same as\n",
- "#-> dataset = client.get_dataset(dataset_id)"
+ "# This is the same as\n",
+ "# -> dataset = client.get_dataset(dataset_id)"
  ]
  },
  {
@@ -86,7 +117,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "#Url\n",
+ "# Url\n",
  "print(\"Associated dataset\", datarow.dataset())\n",
  "print(\"Associated label(s)\", next(datarow.labels()))\n",
  "print(\"External id\", datarow.external_id)"
@@ -99,7 +130,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "#External ids can be a reference to your internal datasets\n",
+ "# External ids can be a reference to your internal datasets\n",
  "datarow = dataset.data_row_for_external_id(datarow.external_id)\n",
  "print(datarow)"
  ]
@@ -123,8 +154,8 @@
  "dataset = client.create_dataset(name = \"testing-dataset\")\n",
  "dataset.create_data_row(row_data = \"https://picsum.photos/200/300\")\n",
  "\n",
- "#It is reccomended that you use external ids but optional.\n",
- "#These are useful for users to maintain references to a data_row.\n",
+ "# It is reccomended that you use external ids but optional.\n",
+ "# These are useful for users to maintain references to a data_row.\n",
  "dataset.create_data_row(row_data = \"https://picsum.photos/200/300\", external_id = str(uuid.uuid4()))\n"
  ]
  },
@@ -135,7 +166,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "#Bulk create datarows\n",
+ "# Bulk create datarows\n",
  "task1 = dataset.create_data_rows([{DataRow.row_data : \"https://picsum.photos/200/300\"}\n",
  " , {DataRow.row_data : \"https://picsum.photos/200/300\"}])"
  ]
@@ -147,7 +178,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "#Local paths\n",
+ "# Local paths\n",
  "local_data_path = '/tmp/test_data_row.txt'\n",
  "with open(local_data_path, 'w') as file:\n",
  " file.write(\"sample data\")\n",
@@ -162,7 +193,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "#You can mix local files with urls\n",
+ "# You can mix local files with urls\n",
  "task3 = dataset.create_data_rows([{DataRow.row_data : \"https://picsum.photos/200/300\"}, local_data_path])"
  ]
  },
@@ -173,8 +204,8 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "#Note that you cannot set external_ids at this time when uploading from local files.\n",
- "#To do this you have to first\n",
+ "# Note that you cannot set external_ids at this time when uploading from local files.\n",
+ "# To do this you have to first\n",
  "item_url = client.upload_file(local_data_path)\n",
  "task4 = dataset.create_data_rows([{DataRow.row_data : item_url, DataRow.external_id : str(uuid.uuid4())}])"
  ]
@@ -186,7 +217,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "#Blocking wait until complete\n",
+ "# Blocking wait until complete\n",
  "task1.wait_till_done()\n",
  "task2.wait_till_done()\n",
  "task3.wait_till_done()\n",
@@ -210,7 +241,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "#Useful for resigning urls\n",
+ "# Useful for resigning urls\n",
  "new_id = str(uuid.uuid4())\n",
  "datarow.update(external_id = new_id)\n",
  "print(datarow.external_id, new_id)\n"
@@ -223,12 +254,12 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "#We can also attach metadata (See metadata tutorial for more)\n",
- "#Metadata is visible for all projects with this datarow attached\n",
+ "# We can also attach metadata (See metadata tutorial for more)\n",
+ "# Metadata is visible for all projects with this datarow attached\n",
  "datarow.create_metadata(meta_type = \"TEXT\", meta_value = \"LABELERS WILL SEE THIS \")\n",
- "#See more information here:\n",
- "#https://docs.labelbox.com/en/import-data/attachments\n",
- "#Note that meta_value must always be a string (url to a video/image or a text value to display)"
+ "# See more information here:\n",
+ "# https://docs.labelbox.com/en/import-data/attachments\n",
+ "# Note that meta_value must always be a string (url to a video/image or a text value to display)"
  ]
  },
  {
@@ -247,7 +278,7 @@
  "outputs": [],
  "source": [
  "datarow.delete()\n",
- "#Will remove from the dataset too"
+ "# Will remove from the dataset too"
  ]
  },
  {
@@ -257,7 +288,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "#Bulk delete a list of datarows (in this case all of them we just uploaded)\n",
+ "# Bulk delete a list of datarows (in this case all of them we just uploaded)\n",
  "DataRow.bulk_delete(list(dataset.data_rows()))"
  ]
  }
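
Note on the new `API_KEY = os.environ["LABELBOX_API_KEY"]` cells in both notebooks: a dictionary-style lookup on `os.environ` raises KeyError when the variable is unset, which is the likely situation in a fresh Colab runtime. A minimal sketch of a more forgiving lookup follows. The helper `resolve_api_key` and its parameters are hypothetical and not part of the commit or the labelbox SDK; it only uses the standard library.

```python
import os


def resolve_api_key(explicit_key=None, env_var="LABELBOX_API_KEY", environ=None):
    """Prefer an explicitly supplied key, then fall back to the environment.

    Raises ValueError with a readable message if neither source has a key.
    """
    # Allow a custom mapping to be injected so the helper is easy to test.
    environ = os.environ if environ is None else environ
    key = explicit_key or environ.get(env_var)
    if not key:
        raise ValueError(
            f"No API key found: pass one explicitly or set {env_var}."
        )
    return key


if __name__ == "__main__":
    # A key typed into the notebook takes precedence over the environment.
    print(resolve_api_key("typed-in-key", environ={"LABELBOX_API_KEY": "env-key"}))
```

In a notebook this could stand in for the direct `os.environ[...]` lookup, for example `API_KEY = resolve_api_key()`, so that a missing variable produces an actionable message rather than a bare KeyError.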
