Since Cortex’s inception, we’ve exclusively supported AWS for cloud deployments. Our reasoning has been that, as a small team, it was best for us to focus on getting the AWS experience as close to perfect as possible before we branched out into supporting other clouds.
As we’ve grown both in community and in full-time team size over the last year, the time is finally right to bring Cortex to other clouds.
With Cortex 0.24, the Cortex cluster will officially be runnable on a second public cloud—GCP.
With this announcement, I want to talk a little bit about how our thinking around multi-cloud support has changed over the last year, how we’re planning on rolling out GCP support, and what the future of multi-cloud support on Cortex looks like.
Multi-cloud support as a feature
When we first began work on Cortex, we understood cloud support in fairly simple terms. AWS seemed to have the most users and to be the most popular among bigger companies running large inference workloads, and it was the cloud we were personally most familiar with, so we chose it.
Adding support for other clouds, in our minds, was mostly about making Cortex usable for more people, not about adding new features.
What we didn’t see at the time, and what we’ve seen a lot of over the last year, is that cloud usage among ML teams is fluid, and only becoming more so. While many teams run their entire stack on one cloud, we increasingly see teams who run different parts of their pipelines on different clouds.
For example, it might be financially advantageous for a team to run training on GCP while deploying APIs on AWS.
Before, we saw the value of multi-cloud support mainly in being able to migrate pipelines across clouds easily, but now, teams are emphasizing interoperability between clouds within a single pipeline. This shift makes multi-cloud support even more crucial to ML infrastructure moving forward.
Announcing beta support for GCP in Cortex
In Cortex 0.24 (releasing December 8th, 2020), Cortex will support GCP. We’re treating GCP support as an open beta, which means that at initial release, core functionality—automated cluster management, packaging/deployment, and log streaming—will be available on GCP, along with the following (there’s a minimal deployment sketch after the list):
- Support for all three predictor types (Python, TensorFlow, ONNX)
- GPU and CPU inference
- Live reloading and multi-model caching
- Cluster autoscaling
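To give a concrete sense of the workflow, here’s a minimal sketch of a deployment based on Cortex’s Python predictor interface, the kind of API that should run the same way on GCP as on AWS. The API name, model choice, and compute values here are hypothetical, and the exact GCP cluster commands and config keys are beta details, so treat the specifics as illustrative rather than final:

```python
# predictor.py -- a minimal Python predictor, deployable unchanged on AWS or GCP.
#
# A hypothetical companion cortex.yaml might look like:
#
#   - name: text-classifier        # hypothetical API name
#     kind: RealtimeAPI
#     predictor:
#       type: python
#       path: predictor.py
#     compute:
#       cpu: 1
#       gpu: 1                     # GPU inference is supported in the GCP beta

class PythonPredictor:
    def __init__(self, config):
        # Runs once per replica at startup; load the model here.
        # `config` holds any user-defined values from cortex.yaml.
        from transformers import pipeline  # hypothetical model choice
        self.analyzer = pipeline(task="sentiment-analysis")

    def predict(self, payload):
        # Runs once per request; `payload` is the parsed request body.
        return self.analyzer(payload["text"])[0]
```

Deploying the API uses the same `cortex deploy` workflow as on AWS; the GCP-specific piece is the cluster configuration itself (presumably things like project, zone, and instance type), which we’ll document fully with the release.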
GCP deployments, however, will not have full feature parity with AWS deployments, at least not immediately. Features like replica autoscaling, request metrics, prediction monitoring, custom SSL certs, and batch workloads will not initially work out of the box on GCP.
To develop that parity, we need some help from the community.
We need your help
We want to make Cortex’s GCP experience as smooth as it is on AWS, and for that to happen, we need your feedback.
If you’re working on GCP and want to try Cortex, we want to hear about how we can make Cortex better for you. What features would you like us to prioritize? What are the GCP quirks you hate dealing with the most?
We want to bring Cortex’s GCP support to feature parity with AWS in a way that prioritizes the right features and implements them with respect for the specifics of developing on GCP—and we’d greatly appreciate your help in doing so.
If you’d like to try Cortex on GCP, you can watch Cortex on GitHub or sign up for our release newsletter to be notified when Cortex 0.24 releases next Tuesday. If you have any feedback, we’d love to hear it via GitHub Issues, our public Gitter channel, or a direct email to [email protected].