diff --git a/LEARN_MORE.md b/LEARN_MORE.md
new file mode 100644
index 0000000..cec5d4d
--- /dev/null
+++ b/LEARN_MORE.md
@@ -0,0 +1,70 @@
+# Churn Predictor: Learn More
+
+This document provides more detail about the sample churn predictor.
+You can learn more about building churn predictors with GraphLab Create in
+the [user guide](https://dato.com/learn/userguide/churn_prediction/quick-start.html).
+
+## How it works
+
+The `churn_predictor.py` script performs the following actions:
+
+```
+Load Data -> Prepare Data -> Train Model -> Explore Model
+```
+
+### Load Data
+
+This sample uses a table of product purchase data (541909 product purchases):
+
+| InvoiceNo | StockCode | Description                    | Quantity | InvoiceDate  | UnitPrice | CustomerID | Country        |
+| --------- | --------- | ------------------------------ | -------- | ------------ | --------- | ---------- | -------------- |
+| 536365    | 85123A    | WHITE HANGING HEART T-LIGH...  | 6        | 12/1/10 8:26 | 2.55      | 17850      | United Kingdom |
+| 536365    | 71053     | WHITE METAL LANTERN            | 6        | 12/1/10 8:26 | 3.39      | 17850      | United Kingdom |
+| 536365    | 84406B    | CREAM CUPID HEARTS COAT HANGER | 8        | 12/1/10 8:26 | 2.75      | 17850      | United Kingdom |
+| 536365    | 84029G    | KNITTED UNION FLAG HOT WAT...  | 6        | 12/1/10 8:26 | 3.39      | 17850      | United Kingdom |
+| 536365    | 84029E    | RED WOOLLY HOTTIE WHITE HEART. | 6        | 12/1/10 8:26 | 3.39      | 17850      | United Kingdom |
+
+
+### Prepare Data
+
+Once the data is loaded, you typically want to clean it up or transform it in various ways.
+The Churn Prediction toolkit uses GraphLab Create's TimeSeries data structure.
+The sample code parses the `InvoiceDate` text column and converts
+it to Python datetime objects. From this dataset, we create a TimeSeries.
+
+
+### Train Churn Predictor Model
+
+Next, we randomly split the data into a training set and validation set,
+and define the churn period and boundary. These settings define time
+after which a user will be considered "churned". To learn more,
+check out the [user guide](https://dato.com/learn/userguide/churn_prediction/quick-start.html#how-is-churn-defined).
+
+We use [`gl.churn_predictor.create`](https://dato.com/products/create/docs/generated/graphlab.churn_predictor.create.html)
+to build the prediction model on the training set.
+
+
+### Explore the Model
+
+Once you've trained the model, you can use it to make churn forecasts.
+The sample script launches an interactive web-based view for exploring
+and evaluating the model:
+
+![Screenshot of Exploration](/assets/churn-explore.png)
+
+![Screenshot of Evaluation](/assets/churn-evaluate.png)
+
+Find more information about how to use your model
+in the [user guide](https://dato.com/learn/userguide/churn_prediction/using-a-trained-model.html).
+
+
+## Use your own data
+
+You can replace the sample data provided in this project with your own
+dataset. At minimum, your data must contain a time column, a user id column, and a feature column.
+
+You will need to replace the file name and column names in the
+`churn_predictor.py` script to refer to your own data set.
+
+For more information about setting up a churn predictor model,
+check out the [user guide](https://dato.com/learn/userguide/churn_prediction/quick-start.html).
diff --git a/README.md b/README.md
index 2db36b9..9bb52b0 100644
--- a/README.md
+++ b/README.md
@@ -27,8 +27,9 @@ customers who are likely to stop using your product or service.
    python download_data.py
    ```
 
-4. Making sure you are working in a Python environment with GraphLab Create installed,
-   run the `churn_predictor.py` script to build and explore the model on your machine:
+4. Make sure you are working in a Python environment with GraphLab Create installed
+   (e.g. if you installed GraphLab Create using the Dato Launcher, you can use the Launcher to open a GraphLab Create Terminal).
+   Then, run the `churn_predictor.py` script to build and explore the model on your machine:
 
    ```bash
    python -i churn_predictor.py
@@ -37,17 +38,33 @@ customers who are likely to stop using your product or service.
    The `-i` flag causes Python to drop into an interactive interpreter
    after the script executes.
 
-   Once the model has been created, a browser window should open
-   to let you explore and interact with your model.
-
    Alternatively, you can also run the provided IPython Notebook:
 
    ```bash
    ipython notebook churn_predictor.ipynb
    ```
 
+   Once the model has been created, a browser window should open
+   to let you explore and interact with your model.
+
+## Learn More and Next Steps
+
+Once you have the sample project running, you can try the following:
+
+ - [Learn more about how the sample works](./LEARN_MORE.md#how-it-works)
+ - [Try it on your own data set](./LEARN_MORE.md#use-your-own-data)
+
+To find out more about building churn prediction models, check out the
+[user guide](https://dato.com/learn/userguide/churn_prediction/quick-start.html)
+or [API documentation](https://dato.com/products/create/docs/graphlab.toolkits.churn_predictor.html).
+
 ## Troubleshooting
 
 If you are having trouble, please
 [create a Github Issue](https://github.com/dato-code/sample-churn-predictor/issues/new)
 or start a discussion on the [user forum](http://forum.dato.com/).
+
+
+## Acknowledgments
+
+The online retail data set used in this sample was obtained from the [UCI Machine Learning Repository](http://archive.ics.uci.edu/ml/datasets/Online+Retail).
diff --git a/assets/churn-evaluate.png b/assets/churn-evaluate.png
new file mode 100644
index 0000000..96af317
Binary files /dev/null and b/assets/churn-evaluate.png differ
diff --git a/assets/churn-explore.png b/assets/churn-explore.png
new file mode 100644
index 0000000..1cb763b
Binary files /dev/null and b/assets/churn-explore.png differ
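
Note on the "Prepare Data" and "Train Churn Predictor Model" sections of `LEARN_MORE.md`: the sketch below illustrates the two ideas described there using only the Python standard library, not GraphLab Create. It assumes the `InvoiceDate` text is month/day/two-digit-year with 24-hour time (consistent with the `12/1/10 8:26` sample rows), and it assumes a simple reading of "churned": a customer seen before the boundary who has no activity during the churn period that follows it. The helper names `parse_invoice_date` and `label_churned` are hypothetical, and the toolkit's actual rules may differ.

```python
from datetime import datetime, timedelta

# Assumed format of the InvoiceDate text column:
# month/day/two-digit year, 24-hour time.
INVOICE_DATE_FORMAT = "%m/%d/%y %H:%M"


def parse_invoice_date(text):
    """Convert an InvoiceDate string such as '12/1/10 8:26' to a datetime."""
    return datetime.strptime(text, INVOICE_DATE_FORMAT)


def label_churned(events, boundary, churn_period):
    """Label each customer as churned (True) or not churned (False).

    events: iterable of (customer_id, datetime) pairs.
    A customer seen before `boundary` is labeled churned when they have
    no event in the window [boundary, boundary + churn_period).
    This is an illustrative definition, not the toolkit's implementation.
    """
    window_end = boundary + churn_period
    seen_before = set()
    active_in_window = set()
    for customer_id, when in events:
        if when < boundary:
            seen_before.add(customer_id)
        elif when < window_end:
            active_in_window.add(customer_id)
    return {cid: cid not in active_in_window for cid in seen_before}


# Hypothetical rows: the first CustomerID and timestamp appear in the
# sample table; the other rows are made up for illustration.
rows = [
    (17850, "12/1/10 8:26"),
    (17850, "1/15/11 9:00"),
    (13047, "12/1/10 8:34"),
]
events = [(cid, parse_invoice_date(text)) for cid, text in rows]
labels = label_churned(
    events,
    boundary=datetime(2011, 1, 1),
    churn_period=timedelta(days=30),
)
print(labels)
```

With your own data, the format string and the boundary and churn period values would need to match your time column and business definition of churn.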