Google’s JAX: Flexible, High-Performance Machine Learning
Server-side batching: Scaling inference throughput in machine learning (Caleb Kaiser, December 2020)
How we served 1,000 models on GPUs for $0.47 (Caleb Kaiser, December 2020)
Designing a machine learning platform for both data scientists and engineers (Caleb Kaiser, December 2020)
Netflix's Metaflow: Reproducible machine learning pipelines (Caleb Kaiser, December 2020)
How to serve batch predictions with TensorFlow Serving (Caleb Kaiser, December 2020)
How to build a pipeline to retrain and deploy models (Caleb Kaiser, December 2020)
How to deploy Transformer models for language tasks (Caleb Kaiser, December 2020)
Cortex 0.24: Announcing multi-cloud support (Caleb Kaiser, December 2020)
How we scale machine learning model deployment on Kubernetes (Caleb Kaiser, December 2020)
Why we built a serverless machine learning platform—instead of using AWS Lambda (Caleb Kaiser, November 2020)
Why we don’t deploy machine learning models with Flask (Caleb Kaiser, November 2020)
How to deploy machine learning models from a notebook to production (Caleb Kaiser, November 2020)
Why we use YAML—not notebooks—for machine learning (Caleb Kaiser, November 2020)
5 Lessons Learned Building an Open Source MLOps Platform (Caleb Kaiser, November 2020)
Machine learning doesn't have to be expensive (Caleb Kaiser, November 2020)
A/B testing machine learning models in production
How to deploy 1,000 models on one CPU with TensorFlow Serving
How to reduce the cost of machine learning inference
How to deploy PyTorch Lightning models to production
Improve NLP inference throughput 40x with ONNX and Hugging Face
Catching poachers with machine learning