Deploy machine learning models to production

Cortex deploys your machine learning models to your cloud infrastructure


How it works

Define your deployment using declarative configuration

```yaml
# cortex.yaml

- kind: api
  name: my-api
  model: s3://my-bucket/
  request_handler:
  compute:
    min_replicas: 5
    max_replicas: 20
```

Customize request handling before and after inference

```python
def pre_inference(sample, metadata):
    # Python code

def post_inference(prediction, metadata):
    # Python code
```
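As a minimal sketch of what such a handler might look like: the function names and signatures come from the snippet above, while the label list, the field ordering, and the output shape are hypothetical choices for illustration.

```python
# Hypothetical class labels; a real handler would use the model's own labels.
LABELS = ["abc", "def", "ghi"]

def pre_inference(sample, metadata):
    # Order the incoming JSON fields into the feature layout the model
    # expects (here: alphabetical by key, purely for illustration).
    return [sample[key] for key in sorted(sample)]

def post_inference(prediction, metadata):
    # Map the class index produced by the model to a human-readable label.
    return {"prediction": LABELS[int(prediction[0])]}
```
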

Deploy to your cloud infrastructure

```bash
$ cortex deploy
Deploying ...
Ready!
```

Serve real-time predictions via scalable JSON APIs

```bash
$ curl -d '{"a": 1, "b": 2, "c": 3}' \

{ "prediction": "def" }
```
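The same request can be made from application code. This is a sketch, not Cortex-specific: the endpoint URL is a placeholder for whatever `cortex deploy` reports, and only the Python standard library is used.

```python
import json
import urllib.request

def predict(url, sample):
    # POST a JSON sample to the deployed API and decode the JSON prediction.
    req = urllib.request.Request(
        url,
        data=json.dumps(sample).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

For example, `predict("http://<your-api-endpoint>/my-api", {"a": 1, "b": 2, "c": 3})` would return the decoded prediction dictionary.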

Supported frameworks

Key features

Deployments as code

Deployments are defined using declarative configuration


Autoscaling

Cortex automatically scales APIs for production workloads

Rolling updates

Deployed APIs update without any downtime

Cloud native

Cortex can be deployed on any AWS account in minutes