Deploy machine learning models to production

Cortex deploys your machine learning models to your cloud infrastructure


How it works

Define your deployment using declarative configuration

```yaml
# cortex.yaml

- kind: api
  name: my-api
  model: s3://my-bucket/my-model.zip
  request_handler: handler.py
  compute:
    min_replicas: 5
    max_replicas: 20
```

Customize request handling before and after inference

```python
# handler.py

def pre_inference(sample, metadata):
    # Python code to transform the request payload before inference
    return sample

def post_inference(prediction, metadata):
    # Python code to transform the model's output before responding
    return prediction
```
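To illustrate how these hooks wrap a model, here is a minimal sketch of the request lifecycle. Only the hook names come from the handler example above; the `predict` function and `handle_request` wiring are stand-ins, not Cortex's actual internals:

```python
def predict(sample):
    # Stand-in for the deployed model: sums the feature values.
    return sum(sample.values())

def pre_inference(sample, metadata):
    # Cast every incoming feature to float before inference.
    return {k: float(v) for k, v in sample.items()}

def post_inference(prediction, metadata):
    # Wrap the raw model output in a JSON-friendly response.
    return {"prediction": prediction}

def handle_request(payload, metadata=None):
    # pre_inference -> model -> post_inference, the order in which
    # Cortex applies the hooks around inference.
    sample = pre_inference(payload, metadata)
    prediction = predict(sample)
    return post_inference(prediction, metadata)

print(handle_request({"a": 1, "b": 2, "c": 3}))  # {'prediction': 6.0}
```

The handler file only needs to define the hooks; the surrounding plumbing is managed by Cortex.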

Deploy to your cloud infrastructure

```shell
$ cortex deploy

Deploying ...
Ready! https://abc.amazonaws.com/my-api
```

Serve real-time predictions via scalable JSON APIs

```shell
$ curl -d '{"a": 1, "b": 2, "c": 3}' \
    https://abc.amazonaws.com/my-api

{"prediction": "def"}
```

Supported frameworks

Key features

Deployments as code

Deployments are defined using declarative configuration

Autoscaling

Cortex automatically scales APIs for production workloads
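A common way such autoscaling works is to scale the replica count with load, clamped to the configured bounds. This is an illustrative sketch, not Cortex's actual policy; `min_replicas` and `max_replicas` mirror the `compute` fields in the cortex.yaml example, while the requests-per-replica target is an assumption:

```python
import math

def desired_replicas(in_flight_requests, target_per_replica,
                     min_replicas, max_replicas):
    # Scale replicas with in-flight load, then clamp to the bounds
    # declared in the deployment configuration.
    wanted = math.ceil(in_flight_requests / target_per_replica)
    return max(min_replicas, min(max_replicas, wanted))

print(desired_replicas(120, 10, min_replicas=5, max_replicas=20))  # 12
```

Under this policy, a quiet API idles at `min_replicas` and a traffic spike can never push the deployment past `max_replicas`.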

Rolling updates

Deployed APIs update without any downtime
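The zero-downtime property can be pictured as replacing replicas one at a time, so live replicas always remain to serve traffic. This is a simplified model of a rolling update, not Cortex's actual mechanism:

```python
def rolling_update(replicas, new_version):
    # Replace one replica at a time; at every step the other
    # replicas keep serving, so capacity never drops to zero.
    for i in range(len(replicas)):
        still_serving = replicas[:i] + replicas[i + 1:]
        assert still_serving, "at least one replica must stay up"
        replicas[i] = new_version  # bring up the replacement
    return replicas

fleet = ["v1", "v1", "v1"]
print(rolling_update(fleet, "v2"))  # ['v2', 'v2', 'v2']
```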

Cloud native

Cortex can be deployed on any AWS account in minutes