I Stopped Using Kubernetes. Our DevOps Team Is Happier Than Ever

I Stopped Wearing Shoes. My Wife Is Happier Than Ever

I used to own 10,000 pairs of shoes, so I had to buy an extra house just to have room to store all of them. My fridge was completely filled with shoes, and my wife dreaded even walking around in our home.

After going barefoot, I was able to sell the other house and now have room for groceries in my fridge, and my kids can now eat.

This is why shoes are bad.

7 hours ago vegardstenvik

> We were managing 47 Kubernetes clusters across three cloud providers.

Not a Kubernetes guy, so perhaps an ignorant question. Why would you run 47 clusters?

I thought the point of a Kubernetes cluster is you just throw your workload at it and be happy?

I get you want a few for testing and development etc., and perhaps failover to another provider or similar. But 47?

7 hours ago magicalhippo

> Why would you run 47 clusters?

Entirely possible for an enterprise-y or B2B use-case - some clients might want rigid data / network isolation in a separate account / VPC, plus it reduces the blast radius instead of running everything in one big cluster. There are ways of achieving this in a single cluster with a lot of added complexity, and spinning up a new VPC + K8s might be easier if you have the Terraform modules ready to go.

6 hours ago theovermage

It's not an ignorant question! Running 47 different clusters is insane.

7 hours ago reissbaker

Speaking in absolute numbers without any reference point makes no sense.

6 hours ago benterix

The article states they moved all of their stateless stuff to ECS, stateful stuff to Docker containers on EC2, batch jobs to AWS Batch, and event-driven stuff to AWS Lambda. Previously they ran 47 different K8s clusters on 3 different clouds.

Given that all of their needs could be satisfied with 4 AWS services, at nearly half a million dollars a year less than their previous setup, I think running 47 different clusters in 3 clouds was insane.

3 hours ago reissbaker

Yes, in this context I don't see any reason for that number of separate clusters.

2 hours ago benterix

While I'm pretty sure the article is clickbait (can't tell, paywalled too soon), having many clusters ain't dumb nowadays.

The automation of Kubernetes maintenance is great nowadays, in any setting: bare metal, on-prem, or managed public cloud. Leveraging that makes it easier to manage multiple clusters and give each team/project its own cluster than to implement proper mechanisms on a single cluster, like proper RBAC between projects, network boundaries, etc.

Nowadays you can easily move that complexity one layer up and treat whole clusters as some volatile component that is defined in code and is also not a snowflake.

That way you can let each team/project have core things different and managed by themselves, like implementing their contrasting opinions on network meshes, or CRDs that would otherwise be in conflict etc.

The overhead is not that huge, or at least doesn't have to be. My test clusters with multiple environments of my apps consume around 4GB of memory in total (aside from my apps themselves), and that includes all the k8s stuff, log aggregation, metrics aggregation and so on. You don't even have to manage your own control plane: a cloud provider can give you a shared one (like Azure has in two lower tiers), or you can use services that provide just the control plane, while nodes are on whatever hardware (like Scaleway Kosmos).

So yeah, it's not for everyone, but it surely can grow to those numbers of clusters, especially if you multiply by dev/staging/qa/prod for each team and add some infrastructure for actually testing infra/IaC.

Although why they had such overhead is a mystery to me; it would be cool to see that part described.

6 hours ago szszrk

Indeed… I wonder what the path not taken, a more optimal k8s rearchitecting, would look like.

7 hours ago blueboo

Simplicity always wins over complexity. I don't think the problem here is Kubernetes, but more like the way they used it. Any system can be made utterly complex, if you don't take the time to make it simple.

7 hours ago mdavid626

> Any system can be made utterly complex

Kubernetes makes it easier to end up with something utterly complex.

3 hours ago sshine

Strange article. Saying "99.99% uptime maintained" and that they had 4 major outages in a week is kind of strange, since 99.99% uptime only allows for about 4 minutes of downtime a month...
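
For reference, the downtime budget implied by an availability figure can be checked with a quick calculation. This is only a sketch: the function name `downtime_budget_minutes` and the 30-day month are assumptions made for illustration, not anything taken from the article.

```python
def downtime_budget_minutes(availability_percent: float, period_days: float = 30.0) -> float:
    """Minutes of downtime allowed over a period at the given availability.

    Assumes a 30-day month by default; the article does not say which
    measurement window it uses.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


budget = downtime_budget_minutes(99.99)
print(f"{budget:.2f} minutes per 30-day month")  # 4.32 minutes per 30-day month
```

So "four nines" leaves a bit over four minutes a month, which four major outages in one week would almost certainly exceed if both figures covered the same period.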

7 hours ago nosefrog

It would be weird if these two points were talking about the same time frame.

7 hours ago detaro

If I read the article correctly, that number relates to statistics collected only after they ditched k8s.

7 hours ago avhception

Can we normalize sending the archive link by default? (in the submission)

7 hours ago mdavid626

If they have 3 cloud providers and say 5 regions that's potentially 15 clusters. My assumption is that they're running in even more regions.
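
That estimate is just one cluster per provider/region pair. A rough sketch of the count, where the provider and region names are hypothetical placeholders and the 5 regions are the guess from the comment, not a number from the article:

```python
from itertools import product

# Hypothetical provider and region names, for illustration only.
providers = ["provider-a", "provider-b", "provider-c"]
regions = ["region-1", "region-2", "region-3", "region-4", "region-5"]

# One cluster per (provider, region) pair.
clusters = list(product(providers, regions))
print(len(clusters))  # 15
```

Adding a per-environment dimension (for example dev, staging, qa, prod) would multiply the total again.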

6 hours ago mmusc

Edit: https://archive.is/GoiDF, thanks @nosefrog!

Um. Interesting. I don't think anyone should be operating 47 different Kubernetes clusters for an application. You should probably max out at three: production, staging, and dev — if you even need a dev cluster (ideally you can just run your dev server locally); you can probably also get away with colocating staging and production in the same cluster, but in different namespaces or using different sets of services/labels, and ultimately just run one Kubernetes cluster.

They mentioned they run on three different cloud providers at the same time (...why...?), but even then, I'm not clear how that results in forty-seven different K8s clusters. 47 isn't even divisible by three!

Sadly the rest of the article post-paywall doesn't explain anything about how they ended up in that mess. Apparently they have "8 senior DevOps engineers," and you... really shouldn't be operating nearly 6x as many clusters as you have senior DevOps engineers, in my opinion.

7 hours ago reissbaker

This is a useless article without explaining what their use case is exactly. Running 47 Kubernetes clusters sounds weird as hell.

7 hours ago rapsey

Paywall

7 hours ago gberger

A DevOps team sounds so ridiculous to me. I stopped reading this right at the beginning. They should probably try to reduce their complexity first.

7 hours ago Jazgot

It seems like that's what they did?