Commit da2b96e

Merge pull request #7 from code-kern-ai/dev
Dev
Johannes Hötter authored
2 parents 3efeb39 + e33badf, commit da2b96e

8 files changed: 207 additions and 23 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
 .vscode/
+secrets.json

 # Jupyter
 *.ipynb

README.md

Lines changed: 16 additions & 4 deletions
@@ -1,13 +1,14 @@
 ![kern-python](https://uploads-ssl.webflow.com/61e47fafb12bd56b40022a49/62766400bd3c57b579d289bf_kern-python%20Banner.png)
 [![Python 3.9](https://img.shields.io/badge/python-3.9-blue.svg)](https://www.python.org/downloads/release/python-390/)
+[![pypi 0.0.3](https://img.shields.io/badge/pypi-0.0.3-yellow.svg)](https://pypi.org/project/kern-sdk/0.0.3/)

 # Kern AI API for Python

 This is the official Python SDK for Kern AI, your IDE for programmatic data enrichment and management.

 ## Installation

-You can set up this library via either running `$ pip install kern-sdk`, or via cloning this repository and running `$ pip install -r requirements.txt` in your repository.
+You can set up this library via either running `$ pip install kern-sdk`, or via cloning this repository and running `$ pip install -r requirements.txt` in this repository.

 ## Usage
 Once you installed the package, you can access the application from any Python terminal as follows:
@@ -24,13 +25,23 @@ client = Client(username, password, project_id)
 # client = Client(username, password, project_id, uri="http://localhost:4455")
 ```

-Alternatively, you can also set up a `secrets.json` file and load it via `Client.from_secrets_file`. If you use a `secrets.json`, you can also use the CLI commands directly (e.g. `kern pull`).
+Alternatively, you can provide a `secrets.json` file in your repository, looking as follows:
+```json
+{
+    "user_name": "your-username",
+    "password": "your-password",
+    "project_id": "your-project-id"
+}
+```
+Again, if you run on your local machine, you should provide also `"uri": "http://localhost:4455"`.

 Now, you can easily fetch the data from your project:
 ```python
-df = client.fetch_export()
+df = client.get_record_export()
 ```

+Alternatively, you can also just run `kern pull` in your CLI given that you have provided the `secrets.json` file.
+
 The `df` contains data of the following scheme:
 - all your record attributes are stored as columns, e.g. `headline` or `running_id` if you uploaded records like `{"headline": "some text", "running_id": 1234}`
 - per labeling task three columns:
@@ -42,7 +53,8 @@ With the `client`, you easily integrate your data into any kind of system; may i

 ## Roadmap
 - [ ] Register information sources via wrappers
-- [ ] Fetch project statistics
+- [ ] Add project upload
+- [x] Fetch project statistics


 If you want to have something added, feel free to open an [issue](https://github.com/code-kern-ai/kern-python/issues).

cli.py

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+from kern import Client
+import sys
+
+
+def pull():
+    client = Client.from_secrets_file("secrets.json")
+    project_name = client.get_project_details()["name"]
+    download_to = f"{project_name}.json"
+    client.get_record_export(download_to=download_to)
+
+
+def main():
+    cli_args = sys.argv[1:]
+
+    # currently only need to easily pull data;
+    # in the near future, this might be expanded
+    cli_arg = cli_args[0]
+    if cli_arg == "pull":
+        pull()
+
+
+if __name__ == "__main__":
+    main()
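
Review note on `cli.py`: as committed, `main()` indexes `cli_args[0]` directly, so running `kern` with no arguments would raise an `IndexError`, and an unknown command is silently ignored. A minimal sketch of one possible alternative using the standard-library `argparse` follows. The `COMMANDS` table, `build_parser`, and the stub `pull()` that returns a string are assumptions made for illustration and are not part of this commit.

```python
import argparse
from typing import Callable, Dict, List, Optional


def pull() -> str:
    # Stand-in for the real pull() in cli.py, which calls the Kern API.
    # It only returns a marker string so the sketch runs offline.
    return "pull"


# Hypothetical command table; only `pull` exists in this commit.
COMMANDS: Dict[str, Callable[[], str]] = {"pull": pull}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="kern")
    # `choices` makes argparse reject unknown commands with a usage message.
    parser.add_argument("command", choices=sorted(COMMANDS), help="command to run")
    return parser


def main(argv: Optional[List[str]] = None) -> str:
    # argv=None makes argparse fall back to sys.argv[1:].
    args = build_parser().parse_args(argv)
    return COMMANDS[args.command]()
```

With this shape, a missing or unknown command ends in argparse's usage error (exit status 2) instead of a traceback, and new commands only need an entry in the table.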

kern/__init__.py

Lines changed: 66 additions & 6 deletions
@@ -3,7 +3,8 @@
 from wasabi import msg
 import pandas as pd
 from kern import authentication, api_calls, settings, exceptions
-from typing import Optional
+from typing import Optional, Dict
+import json


 class Client:
@@ -20,7 +21,7 @@ class Client:
     """

     def __init__(
-        self, user_name: str, password: str, project_id: str, uri="https://app.kern.ai"
+        self, user_name: str, password: str, project_id: str, uri=settings.DEFAULT_URI
     ):
         settings.set_base_uri(uri)
         self.session_token = authentication.create_session_token(
@@ -33,16 +34,75 @@ def __init__(
             raise exceptions.get_api_exception_class(401)
         self.project_id = project_id

-    def fetch_export(self, num_samples: Optional[int] = None) -> pd.DataFrame:
+    @classmethod
+    def from_secrets_file(cls, path_to_file: str):
+        with open(path_to_file, "r") as file:
+            content = json.load(file)
+        uri = content.get("uri")
+        if uri is None:
+            uri = settings.DEFAULT_URI
+        return cls(
+            user_name=content["user_name"],
+            password=content["password"],
+            project_id=content["project_id"],
+            uri=uri,
+        )
+
+    def get_project_details(self) -> Dict[str, str]:
+        """Collect high-level information about your project: name, description, and tokenizer
+
+        Returns:
+            Dict[str, str]: dictionary containing the above information
+        """
+        url = settings.get_project_url(self.project_id)
+        api_response = api_calls.get_request(url, self.session_token)
+        return api_response
+
+    def get_record_export(
+        self, num_samples: Optional[int] = None, download_to: Optional[str] = None
+    ) -> pd.DataFrame:
         """Collects the export data of your project (i.e. the same data if you would export in the web app).

         Args:
             num_samples (Optional[int], optional): If set, only the first `num_samples` records are collected. Defaults to None.

         Returns:
-            pd.DataFrame: DataFrame containing your record data. For more details, see https://docs.kern.ai
+            pd.DataFrame: DataFrame containing your record data.
         """
-        url = settings.get_export_url(self.project_id, num_samples=num_samples)
-        api_response = api_calls.get_request(url, self.session_token)
+        url = settings.get_export_url(self.project_id)
+        api_response = api_calls.get_request(
+            url, self.session_token, **{"num_samples": num_samples}
+        )
         df = pd.read_json(api_response)
+        if download_to is not None:
+            df.to_json(download_to, orient="records")
+            msg.good(f"Downloaded export to {download_to}")
         return df
+
+    # TODO: issue #6
+    # def post_file_import(self, upload_from: str):
+    #     upload_from = f"{upload_from}_SCALE"
+    #     file_type = "records"
+    #     import_file_options = None
+    #     config_url = settings.get_config_url()
+    #     config_api_response = api_calls.get_request(config_url, self.session_token)
+    #     endpoint = config_api_response["KERN_S3_ENDPOINT"]
+
+    #     import_url = settings.get_import_url(self.project_id)
+    #     import_api_response = api_calls.post_request(
+    #         import_url,
+    #         {
+    #             "file_name": upload_from,
+    #             "file_type": file_type,
+    #             "import_file_options": import_file_options,
+    #         },
+    #         self.session_token,
+    #     )
+
+    #     credentials = import_api_response["Credentials"]
+    #     access_key = credentials["AccessKeyId"]
+    #     secret_key = credentials["SecretAccessKey"]
+    #     session_token = credentials["SessionToken"]
+
+    #     upload_task_id = import_api_response["uploadTaskId"]
+    #     return endpoint, access_key, secret_key, session_token, upload_task_id
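
Review note on `Client.from_secrets_file`: the method reads `user_name`, `password`, and `project_id` by direct key access, so an incomplete `secrets.json` would surface as a bare `KeyError`. The sketch below shows one way the loading step could be validated before the client is constructed. `load_secrets` and `REQUIRED_KEYS` are hypothetical names introduced here, and `DEFAULT_URI` simply mirrors the value of `settings.DEFAULT_URI` from this commit.

```python
import json
from typing import Any, Dict

# Mirrors settings.DEFAULT_URI as introduced in this commit.
DEFAULT_URI = "https://app.kern.ai"
# Keys that from_secrets_file reads unconditionally.
REQUIRED_KEYS = ("user_name", "password", "project_id")


def load_secrets(path_to_file: str) -> Dict[str, Any]:
    """Hypothetical helper: read secrets.json and check it before use."""
    with open(path_to_file, "r") as file:
        content = json.load(file)
    missing = [key for key in REQUIRED_KEYS if key not in content]
    if missing:
        raise ValueError(f"{path_to_file} is missing keys: {', '.join(missing)}")
    # Same fallback as the committed code: an absent or null uri means the default.
    if content.get("uri") is None:
        content["uri"] = DEFAULT_URI
    return content
```

The returned dictionary could then be unpacked into the constructor, for example `Client(**load_secrets("secrets.json"))`, since its keys match the constructor's parameter names.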

kern/api_calls.py

Lines changed: 10 additions & 6 deletions
@@ -1,12 +1,13 @@
 # -*- coding: utf-8 -*-
+import json
 from json.decoder import JSONDecodeError
 import pkg_resources
 from kern import exceptions
 import requests
 from typing import Any, Dict

 try:
-    version = pkg_resources.get_distribution("kern-python-client").version
+    version = pkg_resources.get_distribution("kern-sdk").version
 except pkg_resources.DistributionNotFound:
     version = "noversion"

@@ -17,24 +18,27 @@ def post_request(url: str, body: Dict[str, Any], session_token: str) -> str:
     return _handle_response(response)


-def get_request(url: str, session_token: str) -> str:
+def get_request(url: str, session_token: str, **query_params) -> str:
     headers = _build_headers(session_token)
-    response = requests.get(url=url, headers=headers)
+    response = requests.get(url=url, headers=headers, params=query_params)
     return _handle_response(response)


 def _build_headers(session_token: str) -> Dict[str, str]:
     return {
-        "Content-Type": "application/json",
-        "User-Agent": f"python-sdk-{version}",
-        "Authorization": f"Bearer {session_token}",
+        "content-type": "application/json",
+        "user-agent": f"python-sdk-{version}",
+        "authorization": f"Bearer {session_token}",
+        "identifier": session_token,
     }


 def _handle_response(response: requests.Response) -> str:
     status_code = response.status_code
     if status_code == 200:
         json_data = response.json()
+        if type(json_data) == str:
+            json_data = json.loads(json_data)
         return json_data
     else:
         try:
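
Review note on `_handle_response`: the added `json.loads` branch appears to handle a body that is JSON-encoded twice, where the first decode yields a string that itself contains JSON. The sketch below reproduces that situation with the standard library only. `decode_payload` is a hypothetical name, the sample payload is made up, and `isinstance` is used here as the more idiomatic spelling of the committed `type(json_data) == str` check.

```python
import json
from typing import Any


def decode_payload(payload: Any) -> Any:
    # If the first decode produced a string, assume it holds JSON and decode again.
    if isinstance(payload, str):
        return json.loads(payload)
    return payload


# Made-up payload, encoded twice to imitate a double-encoded response body.
inner = {"name": "demo-project", "description": "example"}
double_encoded = json.dumps(json.dumps(inner))

# First decode: comparable to what response.json() would return in this case.
first_pass = json.loads(double_encoded)
second_pass = decode_payload(first_pass)
```

Here `first_pass` is still a `str`, while `second_pass` is the original dictionary. Payloads that are already decoded pass through unchanged.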

kern/settings.py

Lines changed: 19 additions & 4 deletions
@@ -1,5 +1,6 @@
 # -*- coding: utf-8 -*-
 BASE_URI: str
+DEFAULT_URI: str = "https://app.kern.ai"


 def set_base_uri(uri: str):
@@ -23,7 +24,21 @@ def get_authentication_url() -> str:
     return f"{BASE_URI}/.ory/kratos/public/self-service/login/api"


-def get_export_url(project_id: str, **kwargs) -> str:
-    url = f"{BASE_URI}/api/project/{project_id}/export"
-    url = add_query_params(url, **kwargs)
-    return url
+def get_config_url():
+    return f"{BASE_URI}/api/config/"
+
+
+def get_project_url(project_id: str):
+    return f"{BASE_URI}/api/project/{project_id}"
+
+
+def get_records_url(project_id: str):
+    return f"{get_project_url(project_id)}/records"
+
+
+def get_export_url(project_id: str) -> str:
+    return f"{get_project_url(project_id)}/export"
+
+
+def get_import_url(project_id: str) -> str:
+    return f"{get_project_url(project_id)}/import"
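
Review note on `settings.py`: this commit moves query-string handling out of `get_export_url` and into `requests.get(..., params=...)`, and the URL builders now compose on top of `get_project_url`. The sketch below illustrates both ideas with the standard library. `with_query` is a hypothetical helper that is not in the SDK, and `BASE_URI` is hard-coded to the value of `DEFAULT_URI` for the sake of a self-contained example. It drops `None` values before encoding, which matches the behaviour the Requests documentation describes for `params`.

```python
from typing import Optional
from urllib.parse import urlencode

# Hard-coded for the sketch; the SDK sets this via settings.set_base_uri().
BASE_URI = "https://app.kern.ai"


def get_project_url(project_id: str) -> str:
    return f"{BASE_URI}/api/project/{project_id}"


def get_export_url(project_id: str) -> str:
    # Composed from the project URL, as in the refactored settings module.
    return f"{get_project_url(project_id)}/export"


def with_query(url: str, **query_params: Optional[object]) -> str:
    # Hypothetical helper: skip None values, then URL-encode the rest.
    filtered = {key: value for key, value in query_params.items() if value is not None}
    return f"{url}?{urlencode(filtered)}" if filtered else url
```

For example, `with_query(get_export_url("abc"), num_samples=10)` yields the export URL with `?num_samples=10` appended, while `num_samples=None` leaves the URL untouched.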

requirements.txt

Lines changed: 64 additions & 0 deletions
@@ -1,19 +1,83 @@
+appnope==0.1.3
+argon2-cffi==21.3.0
+argon2-cffi-bindings==21.2.0
+asttokens==2.0.5
+attrs==21.4.0
+backcall==0.2.0
+beautifulsoup4==4.11.1
 black==22.3.0
+bleach==5.0.0
 certifi==2021.10.8
+cffi==1.15.0
 charset-normalizer==2.0.12
 click==8.1.3
+debugpy==1.6.0
+decorator==5.1.1
+defusedxml==0.7.1
+entrypoints==0.4
+executing==0.8.3
+fastjsonschema==2.15.3
 idna==3.3
+ipykernel==6.13.0
+ipython==8.3.0
+ipython-genutils==0.2.0
+ipywidgets==7.7.0
+jedi==0.18.1
+Jinja2==3.1.2
+jsonschema==4.5.1
+jupyter==1.0.0
+jupyter-client==7.3.1
+jupyter-console==6.4.3
+jupyter-core==4.10.0
+jupyterlab-pygments==0.2.2
+jupyterlab-widgets==1.1.0
+kern-python-client @ file:///Users/jhoetter/repos/kern-python
+MarkupSafe==2.1.1
+matplotlib-inline==0.1.3
+minio==7.1.8
+mistune==0.8.4
 mypy-extensions==0.4.3
+nbclient==0.6.3
+nbconvert==6.5.0
+nbformat==5.4.0
+nest-asyncio==1.5.5
+notebook==6.4.11
 numpy==1.22.3
+packaging==21.3
 pandas==1.4.2
+pandocfilters==1.5.0
+parso==0.8.3
 pathspec==0.9.0
+pexpect==4.8.0
+pickleshare==0.7.5
 platformdirs==2.5.2
+prometheus-client==0.14.1
+prompt-toolkit==3.0.29
+psutil==5.9.0
+ptyprocess==0.7.0
+pure-eval==0.2.2
+pycparser==2.21
+Pygments==2.12.0
+pyparsing==3.0.9
+pyrsistent==0.18.1
 python-dateutil==2.8.2
 pytz==2022.1
+pyzmq==22.3.0
+qtconsole==5.3.0
+QtPy==2.1.0
 requests==2.27.1
+Send2Trash==1.8.0
 six==1.16.0
+soupsieve==2.3.2.post1
+stack-data==0.2.0
+terminado==0.15.0
 tinycss2==1.1.1
 tomli==2.0.1
+tornado==6.1
+traitlets==5.2.1.post0
 typing_extensions==4.2.0
 urllib3==1.26.9
 wasabi==0.9.1
+wcwidth==0.2.5
+webencodings==0.5.1
+widgetsnbextension==3.6.0

setup.py

Lines changed: 8 additions & 3 deletions
@@ -9,11 +9,11 @@
     long_description = file.read()

 setup(
-    name="kern-python-client",
-    version="0.0.1",
+    name="kern-sdk",
+    version="0.0.3",
     author="jhoetter",
     author_email="johannes.hoetter@kern.ai",
-    description="Official Python SDK for the Kern AI API",
+    description="Official SDK for the Kern AI API",
     long_description=long_description,
     long_description_content_type="text/markdown",
     url="https://github.com/code-kern-ai/kern-python",
@@ -44,4 +44,9 @@
         "urllib3==1.26.9",
         "wasabi==0.9.1",
     ],
+    entry_points={
+        "console_scripts": [
+            "kern=cli:main",
+        ],
+    },
 )
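
Review note on `setup.py`: the `console_scripts` entry `kern=cli:main` means the installed `kern` command imports a top-level module named `cli` and calls its `main` function. Whether `cli.py` is actually included in the built distribution (for example through a `py_modules` setting) is not visible in this diff and may be worth checking. The sketch below shows, in simplified form, how such a specification maps to a callable. `parse_console_script` and `load_entry_point` are hypothetical helpers, and the parsing ignores parts of the real entry point syntax such as extras.

```python
import importlib
from typing import Callable, Tuple


def parse_console_script(spec: str) -> Tuple[str, str, str]:
    # "kern=cli:main" -> command "kern", module "cli", attribute "main"
    command, _, target = spec.partition("=")
    module_name, _, attr = target.partition(":")
    return command.strip(), module_name.strip(), attr.strip()


def load_entry_point(spec: str) -> Callable:
    # Import the module and look up the attribute, roughly what the
    # generated console script does before calling the function.
    _, module_name, attr = parse_console_script(spec)
    module = importlib.import_module(module_name)
    return getattr(module, attr)
```

Calling `parse_console_script("kern=cli:main")` gives the three parts separately. `load_entry_point` would fail with `ModuleNotFoundError` if `cli` is not importable in the installed environment, which is the failure mode to watch for.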
