| 1 | +# Oracle Accelerated Data Science SDK (ADS) |
| 2 | + |
| 3 | +The [Oracle Accelerated Data Science (ADS) SDK](https://docs.oracle.com/en-us/iaas/tools/ads-sdk/latest/index.html) is maintained by the [Oracle Cloud Infrastructure Data Science service](https://docs.oracle.com/en-us/iaas/data-science/using/data-science.htm) team. It automates and simplifies common data science tasks and provides a data-scientist-friendly, Pythonic interface to Oracle Cloud Infrastructure (OCI) services, most notably OCI Data Science, Data Flow, Object Storage, and Autonomous Database. ADS gives you an interface for managing the lifecycle of machine learning models, from data acquisition through model evaluation, interpretation, and deployment. |
| 4 | + |
| 5 | +With ADS you can: |
| 6 | + |
| 7 | + - Read datasets from Oracle Object Storage, Oracle RDBMS (ATP/ADW/On-prem), AWS S3, and other sources into pandas DataFrames. |
| 8 | + - Easily compute summary statistics on your dataframes and perform data profiling. |
| 9 | + - Tune models using hyperparameter optimization with the `ADSTuner` tool. |
| 10 | + - Generate detailed evaluation reports of your model candidates with the `ADSEvaluator` module. |
| 11 | + - Save machine learning models to the [OCI Data Science Model Catalog](https://docs.oracle.com/en-us/iaas/data-science/using/models-about.htm). |
| 12 | + - Deploy those models as HTTP endpoints with [Model Deployment](https://docs.oracle.com/en-us/iaas/data-science/using/model-dep-about.htm). |
| 13 | + - Launch distributed ETL, data processing, and model training jobs in Spark with [OCI Data Flow](https://docs.oracle.com/en-us/iaas/data-flow/using/home.htm). |
| 14 | + - Train machine learning models in OCI Data Science [Jobs](https://docs.oracle.com/en-us/iaas/data-science/using/jobs-about.htm). |
| 15 | + - Manage the lifecycle of conda environments through the `ads conda` command line interface (CLI). |
| 16 | + |
| 17 | +## Installation |
| 18 | + |
| 19 | +You have various options when installing ADS. |
| 20 | + |
| 21 | +### Installing the oracle-ads base package |
| 22 | + |
| 23 | +```bash |
| 24 | + $ python3 -m pip install oracle-ads |
| 25 | +``` |
| 26 | + |
| 27 | +### Installing extra libraries |
| 28 | + |
| 29 | +To use ADS within a [Notebook Session](https://docs.oracle.com/en-us/iaas/data-science/using/manage-notebook-sessions.htm) of the OCI Data Science service: |
| 30 | + |
| 31 | +```bash |
| 32 | + $ python3 -m pip install oracle-ads[notebook] |
| 33 | +``` |
| 34 | + |
| 35 | +For machine learning tasks, run: |
| 36 | + |
| 37 | +```bash |
| 38 | + $ python3 -m pip install oracle-ads[boosted] |
| 39 | +``` |
| 40 | + |
| 41 | +To work on text-related tasks, run: |
| 42 | + |
| 43 | +```bash |
| 44 | + $ python3 -m pip install oracle-ads[text] |
| 45 | +``` |
| 46 | + |
| 47 | +For access to a broad set of data formats (for example, Excel and Avro), run: |
| 48 | + |
| 49 | +```bash |
| 50 | + $ python3 -m pip install oracle-ads[data] |
| 51 | +``` |
| 52 | + |
| 53 | +**Note** |
| 54 | + |
| 55 | +Multiple extra dependencies can be installed together. For example: |
| 56 | + |
| 57 | +```bash |
| 58 | + $ python3 -m pip install oracle-ads[notebook,boosted,text] |
| 59 | +``` |
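
As a quick sanity check after any of the install commands above, a sketch like the following reports which version of the package is present, using only the Python standard library. The helper name `installed_version` is an assumption made for illustration and is not part of ADS.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("oracle-ads")
    print(f"oracle-ads: {found or 'not installed'}")
```

Returning `None` instead of raising keeps the check usable in setup scripts that want to decide whether to install the package.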
| 60 | + |
| 61 | +## Documentation |
| 62 | + |
| 63 | + - [Oracle Accelerated Data Science SDK (ADS) Documentation](https://docs.oracle.com/en-us/iaas/tools/ads-sdk/latest/index.html) |
| 64 | + - [Oracle Cloud Infrastructure Data Science and AI services Examples](https://github.com/oracle/oci-data-science-ai-samples) |
| 65 | + - [Oracle AI & Data Science Blog](https://blogs.oracle.com/ai-and-datascience/) |
| 66 | + - [Oracle Cloud Infrastructure Documentation](https://docs.oracle.com/en-us/iaas/data-science/using/data-science.htm) |
| 67 | + |
| 68 | +## Examples |
| 69 | + |
| 70 | +### Load data from Object Storage |
| 71 | + |
| 72 | +```python |
| 73 | + import ads |
| 74 | + import pandas as pd |
| 75 | + from ads.common.auth import default_signer |
| 76 | + ads.set_auth(auth="api_key", profile="DEFAULT") |
| 77 | + bucket_name = "<bucket-name>" |
| 78 | + file_name = "<file-name>" |
| 79 | + namespace = "<namespace>" |
| 80 | + df = pd.read_csv(f"oci://{bucket_name}@{namespace}/{file_name}", storage_options=default_signer()) |
| 81 | +``` |
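
The `oci://<bucket>@<namespace>/<object>` URI used above can be assembled by a small helper when you read several objects from the same bucket. This is a minimal sketch: `oci_uri` is a hypothetical function name, not an ADS API, and the bucket, namespace, and object names are placeholders.

```python
def oci_uri(bucket_name: str, namespace: str, object_path: str) -> str:
    """Build an oci://<bucket>@<namespace>/<object> URI."""
    if not bucket_name or not namespace:
        raise ValueError("bucket_name and namespace must be non-empty")
    # Drop leading slashes so exactly one separator precedes the object path.
    return f"oci://{bucket_name}@{namespace}/{object_path.lstrip('/')}"


print(oci_uri("my-bucket", "my-namespace", "data/sales.csv"))
# oci://my-bucket@my-namespace/data/sales.csv
```

The resulting string can be passed as the first argument to `pd.read_csv` in place of the inline f-string.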
| 82 | + |
| 83 | +### Load data from ADB (simple) |
| 84 | + |
| 85 | +```python |
| 86 | + import pandas as pd |
| 87 | + import ads |
| 88 | + |
| 89 | + connection_parameters = { |
| 90 | + "user_name": "<username>", |
| 91 | + "password": "<password>", |
| 92 | + "service_name": "<service_name_{high|med|low}>", |
| 93 | + "wallet_location": "/full/path/to/my_wallet.zip", |
| 94 | + } |
| 95 | + # simple read of a SQL query into a dataframe with no bind variables |
| 96 | + df = pd.DataFrame.ads.read_sql( |
| 97 | + "SELECT * FROM SH.SALES", |
| 98 | + connection_parameters=connection_parameters, |
| 99 | + ) |
| 100 | +``` |
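
To keep credentials out of source code, the `connection_parameters` dictionary could be built from environment variables. The sketch below assumes the same four keys shown above; the environment variable names (`ADB_USER` and so on) and the helper name are hypothetical choices, not anything ADS defines.

```python
import os
from typing import Dict, Mapping

# Hypothetical environment variable names mapped to the dictionary keys used above.
_ENV_TO_KEY = {
    "ADB_USER": "user_name",
    "ADB_PASSWORD": "password",
    "ADB_SERVICE_NAME": "service_name",
    "ADB_WALLET_LOCATION": "wallet_location",
}


def connection_parameters_from_env(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build the connection_parameters dict, failing early if a value is missing."""
    missing = [name for name in _ENV_TO_KEY if not env.get(name)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {key: env[name] for name, key in _ENV_TO_KEY.items()}
```

The returned dictionary has the same shape as the literal one and can be passed as `connection_parameters=` unchanged.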
| 101 | + |
| 102 | +### Load data from ADB (using sql-injection-safe bind variables) |
| 103 | + |
| 104 | +```python |
| 105 | + df = pd.DataFrame.ads.read_sql( |
| 106 | + """ |
| 107 | + SELECT |
| 108 | + * |
| 109 | + FROM |
| 110 | + SH.SALES |
| 111 | + WHERE |
| 112 | + ROWNUM <= :max_rows |
| 113 | + """, |
| 114 | + bind_variables={ |
| 115 | + "max_rows": 100 |
| 116 | + }, |
| 117 | + connection_parameters=connection_parameters, |
| 118 | + ) |
| 119 | +``` |
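
To see why bind variables protect against SQL injection, the following self-contained sketch uses the standard library's `sqlite3` as a stand-in for ADB (an assumption made so the example runs anywhere; it does not use ADS). The table, column names, and `fetch_sales` helper are invented for illustration. Values are sent separately from the SQL text, so a malicious string is compared as data and never parsed as SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("EU", 10.0), ("US", 20.0), ("EU", 30.0)],
)


def fetch_sales(connection, region, max_rows):
    """Query with named bind variables instead of string formatting."""
    return connection.execute(
        "SELECT id, region, amount FROM sales "
        "WHERE region = :region ORDER BY id LIMIT :max_rows",
        {"region": region, "max_rows": max_rows},
    ).fetchall()


print(fetch_sales(conn, "EU", 10))             # [(1, 'EU', 10.0), (3, 'EU', 30.0)]
# The injection attempt is treated as a literal region name and matches nothing.
print(fetch_sales(conn, "EU' OR '1'='1", 10))  # []
```

Had the query been built with string formatting, the second call would have returned every row.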
| 120 | + |
| 121 | +## Contributing |
| 122 | + |
| 123 | +This project welcomes contributions from the community. Before submitting a pull request, please review our contribution guide. |
| 124 | + |
| 125 | +Find Getting Started instructions for developers in [README-development.md](./README-development.md). |
| 126 | + |
| 127 | +## Security |
| 128 | + |
| 129 | +Please consult the security guide for our responsible security vulnerability disclosure process. |
| 130 | + |
| 131 | +## License |
| 132 | + |
| 133 | +Copyright (c) 2020, 2022 Oracle and/or its affiliates. Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/ |