all things engineering

#kubernetes #devops
Scaling Kubernetes to 2,500 Nodes
We've been running Kubernetes for deep learning research for over two years. While our largest-scale workloads manage bare cloud VMs directly, Kubernetes provides a fast iteration cycle, reasonable scalability, and a lack of boilerplate which makes it ideal for most of our experiments. We now operate several Kubernetes clusters (some
·blog.openai.com·