This example shows how to deploy a scikit-learn classifier trained on the famous iris data set.
1. Create a Python file `trainer.py`.
2. Use scikit-learn's `LogisticRegression` to train your model.
3. Add code to pickle your model (you can use other serialization libraries such as joblib; see the sketch after the script below).
4. Upload it to S3 (boto3 will need access to valid AWS credentials).
```python
# trainer.py

import boto3
import pickle

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Train the model

iris = load_iris()
data, labels = iris.data, iris.target
training_data, test_data, training_labels, test_labels = train_test_split(data, labels)

model = LogisticRegression(solver="lbfgs", multi_class="multinomial")
model.fit(training_data, training_labels)
accuracy = model.score(test_data, test_labels)
print("accuracy: {:.2f}".format(accuracy))

# Upload the model

pickle.dump(model, open("model.pkl", "wb"))
s3 = boto3.client("s3")
s3.upload_file("model.pkl", "my-bucket", "sklearn/iris-classifier/model.pkl")
```
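If you prefer joblib for serialization (often faster for models containing large NumPy arrays), the upload step might look like the sketch below. This is a hypothetical variant, not part of the example; you would also need to load the model with `joblib.load` in `predictor.py`:

```python
# Hypothetical alternative: serialize with joblib instead of pickle
from joblib import dump

dump(model, "model.pkl")
s3 = boto3.client("s3")
s3.upload_file("model.pkl", "my-bucket", "sklearn/iris-classifier/model.pkl")
```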
Run the script locally:
```bash
# Install scikit-learn and boto3
$ pip3 install scikit-learn boto3

# Run the script
$ python3 trainer.py
```
1. Create another Python file `predictor.py`.
2. Add code to load and initialize your pickled model.
3. Add a prediction function that will accept a sample and return a prediction from your model.
```python
# predictor.py

import pickle
import numpy as np


model = None
labels = ["setosa", "versicolor", "virginica"]


def init(model_path, metadata):
    global model
    model = pickle.load(open(model_path, "rb"))


def predict(sample, metadata):
    measurements = [
        sample["sepal_length"],
        sample["sepal_width"],
        sample["petal_length"],
        sample["petal_width"],
    ]

    label_id = model.predict(np.array([measurements]))[0]
    return labels[label_id]
```
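Before deploying, you can sanity-check `predictor.py` locally by calling its functions directly. A minimal sketch, assuming the `model.pkl` produced by `trainer.py` is in the working directory:

```python
# Quick local test (run from the directory containing model.pkl and predictor.py)
from predictor import init, predict

init("model.pkl", None)  # metadata is unused by this predictor, so None is fine
sample = {"sepal_length": 5.2, "sepal_width": 3.6, "petal_length": 1.4, "petal_width": 0.3}
print(predict(sample, None))  # expected: "setosa"
```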
Create a `requirements.txt` file to specify the dependencies needed by `predictor.py`. Cortex will automatically install them into your runtime once you deploy:
```text
# requirements.txt

numpy
```
You can skip dependencies that are pre-installed to speed up the deployment process. Note that `pickle` is part of the Python standard library, so it doesn't need to be included.
Create a `cortex.yaml` file and add the configuration below. A `deployment` specifies a set of resources that are deployed together. An `api` provides a runtime for inference and makes our `predictor.py` implementation available as a web service that can serve real-time predictions:
```yaml
# cortex.yaml

- kind: deployment
  name: iris

- kind: api
  name: classifier
  predictor:
    path: predictor.py
    model: s3://cortex-examples/sklearn/iris-classifier/model.pkl
```
`cortex deploy` takes the declarative configuration from `cortex.yaml` and creates it on your Cortex cluster:
```bash
$ cortex deploy

creating classifier api
```
Track the status of your deployment using `cortex get`:
```bash
$ cortex get classifier --watch

status   up-to-date   available   requested   last update   avg latency
live     1            1           1           8s            -

endpoint: http://***.amazonaws.com/iris/classifier
```
The output above indicates that one replica of the API was requested and is available to serve predictions. Cortex will automatically launch more replicas if the load increases and spin down replicas if there is unused capacity.
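You can also bound this autoscaling behavior. A hypothetical sketch, assuming your Cortex version supports `min_replicas` and `max_replicas` under `compute` (key names vary across versions, so check the documentation for yours):

```yaml
# Hypothetical: bound autoscaling for the classifier api
- kind: api
  name: classifier
  predictor:
    path: predictor.py
    model: s3://cortex-examples/sklearn/iris-classifier/model.pkl
  compute:
    min_replicas: 1
    max_replicas: 5
```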
We can use `curl` to test our prediction service:
```bash
$ curl http://***.amazonaws.com/iris/classifier \
    -X POST -H "Content-Type: application/json" \
    -d '{"sepal_length": 5.2, "sepal_width": 3.6, "petal_length": 1.4, "petal_width": 0.3}'

"setosa"
```
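The same request can be made from Python, for example with the `requests` library (substitute the endpoint URL printed by `cortex get` for the placeholder):

```python
import requests

sample = {"sepal_length": 5.2, "sepal_width": 3.6, "petal_length": 1.4, "petal_width": 0.3}

# Replace *** with your cluster's endpoint host from `cortex get`
response = requests.post("http://***.amazonaws.com/iris/classifier", json=sample)
print(response.json())  # expected: "setosa"
```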
Add a `tracker` to your `cortex.yaml` and specify that this is a classification model:
```yaml
# cortex.yaml

- kind: deployment
  name: iris

- kind: api
  name: classifier
  predictor:
    path: predictor.py
    model: s3://cortex-examples/sklearn/iris-classifier/model.pkl
  tracker:
    model_type: classification
```
Run `cortex deploy` again to perform a rolling update to your API with the new configuration:
```bash
$ cortex deploy

updating classifier api
```
After making more predictions, your `cortex get` command will show information about your API's past predictions:
```bash
$ cortex get classifier --watch

status   up-to-date   available   requested   last update   avg latency
live     1            1           1           16s           28ms

class        count
setosa       8
versicolor   2
virginica    4
```
This model is fairly small, but larger models may require more compute resources. You can configure this in your `cortex.yaml`:
```yaml
- kind: deployment
  name: iris

- kind: api
  name: classifier
  predictor:
    path: predictor.py
    model: s3://cortex-examples/sklearn/iris-classifier/model.pkl
  tracker:
    model_type: classification
  compute:
    cpu: 0.5
    mem: 1G
```
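If your cluster has GPU instances, you could request a GPU here as well. A minimal sketch, assuming your Cortex version supports a `gpu` key under `compute`:

```yaml
# Hypothetical: request a GPU in addition to CPU and memory
compute:
  cpu: 0.5
  mem: 1G
  gpu: 1
```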
Adding compute resources may help reduce your inference latency. Run `cortex deploy` again to update your API with this configuration:
```bash
$ cortex deploy

updating classifier api
```
Run `cortex get` again:
```bash
$ cortex get classifier --watch

status   up-to-date   available   requested   last update   avg latency
live     1            1           1           16s           24ms

class        count
setosa       8
versicolor   2
virginica    4
```
If you trained another model and want to A/B test it against your previous model, add another `api` to your configuration and specify the new model:
```yaml
- kind: deployment
  name: iris

- kind: api
  name: classifier
  predictor:
    path: predictor.py
    model: s3://cortex-examples/sklearn/iris-classifier/model.pkl
  tracker:
    model_type: classification
  compute:
    cpu: 0.5
    mem: 1G

- kind: api
  name: another-classifier
  predictor:
    path: predictor.py
    model: s3://cortex-examples/sklearn/iris-classifier/another-model.pkl
  tracker:
    model_type: classification
  compute:
    cpu: 0.5
    mem: 1G
```
Run `cortex deploy` to create the new API:
```bash
$ cortex deploy

creating another-classifier api
```
`cortex deploy` is declarative, so the `classifier` API is unchanged while `another-classifier` is created:
```bash
$ cortex get --watch

api                  status   up-to-date   available   requested   last update
classifier           live     1            1           1           5m
another-classifier   live     1            1           1           8s
```
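With both APIs live, you can compare their predictions from a client. A minimal sketch using `requests` (the endpoint host is assumed from `cortex get`):

```python
import requests

sample = {"sepal_length": 5.2, "sepal_width": 3.6, "petal_length": 1.4, "petal_width": 0.3}

# Replace *** with your cluster's endpoint host from `cortex get`
for api_name in ["classifier", "another-classifier"]:
    url = "http://***.amazonaws.com/iris/{}".format(api_name)
    prediction = requests.post(url, json=sample).json()
    print("{}: {}".format(api_name, prediction))
```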
To serve batch predictions, first implement `batch-predictor.py` with a `predict` function that can process an array of samples:
```python
# batch-predictor.py

import pickle
import numpy as np


model = None
labels = ["setosa", "versicolor", "virginica"]


def init(model_path, metadata):
    global model
    model = pickle.load(open(model_path, "rb"))


def predict(sample, metadata):
    measurements = [
        [s["sepal_length"], s["sepal_width"], s["petal_length"], s["petal_width"]]
        for s in sample
    ]

    label_ids = model.predict(np.array(measurements))
    return [labels[label_id] for label_id in label_ids]
```
Next, add the `api` to `cortex.yaml`:
```yaml
- kind: deployment
  name: iris

- kind: api
  name: classifier
  predictor:
    path: predictor.py
    model: s3://cortex-examples/sklearn/iris-classifier/model.pkl
  tracker:
    model_type: classification
  compute:
    cpu: 0.5
    mem: 1G

- kind: api
  name: another-classifier
  predictor:
    path: predictor.py
    model: s3://cortex-examples/sklearn/iris-classifier/another-model.pkl
  tracker:
    model_type: classification
  compute:
    cpu: 0.5
    mem: 1G

- kind: api
  name: batch-classifier
  predictor:
    path: batch-predictor.py
    model: s3://cortex-examples/sklearn/iris-classifier/model.pkl
  compute:
    cpu: 0.5
    mem: 1G
```
Run `cortex deploy` to create the batch API:
```bash
$ cortex deploy

creating batch-classifier api
```
`cortex get` should show all three APIs now:
```bash
$ cortex get --watch

api                  status   up-to-date   available   requested   last update
classifier           live     1            1           1           10m
another-classifier   live     1            1           1           5m
batch-classifier     live     1            1           1           8s
```
Make a batch prediction by sending an array of samples to the `batch-classifier` endpoint:

```bash
$ curl http://***.amazonaws.com/iris/batch-classifier \
    -X POST -H "Content-Type: application/json" \
    -d '[
        {"sepal_length": 5.2, "sepal_width": 3.6, "petal_length": 1.5, "petal_width": 0.3},
        {"sepal_length": 7.1, "sepal_width": 3.3, "petal_length": 4.8, "petal_width": 1.5},
        {"sepal_length": 6.4, "sepal_width": 3.4, "petal_length": 6.1, "petal_width": 2.6}
    ]'

["setosa","versicolor","virginica"]
```
Run `cortex delete` to spin down your APIs:
```bash
$ cortex delete

deleting classifier api
deleting another-classifier api
deleting batch-classifier api
```
Running `cortex delete` will free up cluster resources and allow Cortex to scale down to the minimum number of instances you specified during cluster installation. It will not spin down your cluster.
Any questions? Chat with us.