
Stop Designing Your Web Application for Millions of Users When You Don't Have 100

At a previous job, there was an argument over a code review where I had done some SQL queries that fixed a problem but were not optimal. The other side was very much "this won't work for 1000 devices! we will not approve it!" whereas my stance was "we have a maximum of 25 devices deployed by our only customer, who is going to leave us next week unless we fix this problem today". One of the most disheartening weeks of my software development life.

(That was also the place where I had to have a multi-day argument over the precise way to define constants in Perl, because the approaches supposedly varied in performance. Except it was a long-running mod_perl server process and the constants were only defined at startup, so it made absolutely zero difference once it had been running for an hour or more.)

2 hours ago · zimpenfish

This is why everyone uses microservices and React. We don't know how these work, but we are netfligs/farcebook level companiez, so we must.

40 minutes ago · lofaszvanitt

> I had to have a multi-day argument over the precise way to define constants in Perl

Couldn't you have used whatever the other person was suggesting even if the change was pointless?

an hour ago · meribold

These days, I would, yeah, but this was a long time ago and I was a lot more invested in not letting nonsense go past without a fight.

34 minutes ago · zimpenfish

Totally agree with your view.

It's the big-design-upfront moment again. Maybe because of the current economy we need to focus more on being profitable in the short term. I think it's great to always optimize for now and test against specs (specs in the sense of customer requirements).

2 hours ago · renatovico

I really don't think that's the case. If you ask the CEO/CTO of a startup, they would fire the guy who took the latter approach instead of the middle one. Longer-term stability and development velocity are very important concerns in engineering management. This is pure inexperience; it wouldn't take too long for an experienced engineer to set up anyway. They probably did it 5 times last month and have a library of knowledge and templates. I can't call a guy who isn't capable of setting up a project this way swiftly "senior" without seeing it as a problem.

2 hours ago · throwaway48540

What is the middle approach in this scenario? What specifically are the templates you refer to?

2 hours ago · dambi0

The middle approach would be implementing the infrastructure in a basic, simplistic way that still provides benefit and can be expanded.

2 hours ago · throwaway48540

> The middle approach would be implementing the infrastructure

That's great if you're implementing it, but it doesn't really work when you're coming into an existing infrastructure (or codebase) that other people manage.

an hour ago · zimpenfish

To spend time thinking about performance, and then not write the code.

44 minutes ago · Arnt

How does that relate to the scenario and what of the templates?

an hour ago · dambi0

I mean, sure, you would want to do that, but the above was a very specific situation with an existing feature set and an imminent novel problem.

an hour ago · mewpmewp2

I actually like having room for optimization, especially when running my own infra, servers included.

As an example, I can think of half a dozen things I can currently optimize just in the DB layer, but my time is being spent (sensibly!) in other areas that are customer facing and directly impacting them.

So fix what needs to be fixed, but if there was a major load spike due to onboarding of new clients/users, I could in a matter of hours have the DB handling 100x the traffic. That's a nice ace in the back of my pocket.

And yes, if I had endless time I'd have resolved all issues.

2 hours ago · bbarnett

Usually a good trick is to run small deployments at high logging levels. Then, as soon as there are performance issues, you can dial down the logging and get the hours of respite needed to actually make a bigger algorithmic improvement.
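In Python, the dial-down can be as simple as changing a logger's level at runtime; a minimal sketch (the logger name and helper function are made up for illustration):

```python
import logging

# Verbose logging by default on a small deployment: every request can log
# at DEBUG because the volume is tiny.
logger = logging.getLogger("app")
logger.addHandler(logging.StreamHandler())
logger.setLevel(logging.DEBUG)

def dial_down_for_load():
    """Cheap respite under load: DEBUG/INFO records are skipped before
    formatting, buying time for a real algorithmic fix."""
    logger.setLevel(logging.WARNING)

logger.debug("per-request detail")  # emitted while running verbose
dial_down_for_load()
logger.debug("per-request detail")  # now filtered out entirely
```

The win comes from the level check happening before message formatting and I/O, so the cost of the verbose calls drops to almost nothing.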

2 hours ago · willvarfar

If that optimization is mere hours of work, I would go for it outright. BTW when you have an overwhelming wave of signups, you are likely to have more pressing and unexpected issues, and badly lack spare hours.

Usually serious gains that are postponed would require days and weeks of effort. Maybe mere hours of coding proper, but much longer time testing, migrating data, etc.

2 hours ago · nine_k

I think that's the pitfall: there are infinite things a skilled developer can do within "mere hours of work".

The key is to find which ones are the most effective use of one's limited hours.

I developed a small daily game and it has now grown to over 10K DAU, so now I've started going back to pick the low-hanging fruit that just didn't make sense to touch when I had just 10s of players a day.

2 hours ago · xandrius

We might all be operating under different ideas of what "matter of hours" means. Oftentimes in a project, at least for me, the range of things that I can start and finish in mere hours is not actually infinite, but rather constrained. Only the simplest and smallest things can be done that fast or faster. So many other things just take longer, at least a day and a half, and can take extra long when other teams are involved. More experienced developers learn how to break things down into increments, so while an entire big feature might take weeks to do, you're making clear chunks of progress in much smaller units. Those chunks are probably not mostly sized in mere-hours pieces, though...

You're right that priority matters. Just beware of priority queue starvation. Still, if some newly discovered bug isn't urgent, even if I think it'd be rather easy (under an hour) to fully address, I'd rather not break my current flow, and just keep working on the thing I had earlier decided was highest priority. A lot of the time something will prevent direct progress and break the flow anyway, having smaller items available to quickly context switch to and finish is a good use of those gap times.

The "DB handling 100x the traffic" example above isn't quite well defined. I wonder if it's making queries return 100x faster? Or is it making sure the queries return at roughly their current speeds even if there's 100x more traffic? Either way, I can make arguments for doing the work proactively rather than reactively, but I'd at least write down the half a dozen things. Then maybe someone else can do them, and maybe those things can be done in around half a dozen tiny increments of 30 minutes or less each, instead of all at once in hours.

12 minutes ago · Jach

> if there was a major load spike due to onboarding of new clients/users

This company was a hardware company with reasonably complex installation procedures - even going full whack, I doubt they could have added more than 20 new devices a week (and even then there'd be a hefty lead time to get the stuff manufactured and shipped etc.)

2 hours ago · zimpenfish

like != need

2 hours ago · soco

People seriously underestimate the number of clients you can serve from a single monolith + SQL database on a VPS or physical hardware. Pretty reliable as well, not many moving parts, simple to understand, fast to set up and keep up to date. Use something like Java, C#, Go or Rust. If you need to scale you can either scale vertically (bigger machines) or horizontally (load balancer).

The SQL database is probably the hardest part to scale, but depending on your type of app there is a lot of room for optimizing indices or adding caching.
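As a rough illustration of how much a single index changes a query plan (the table and column names here are invented), using SQLite:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)])

# Without an index, filtering on customer_id has to scan all 1000 rows.
before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42").fetchone()[3]

# One CREATE INDEX turns the full scan into an index seek.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42").fetchone()[3]

print(before)  # a SCAN over the table
print(after)   # a SEARCH using idx_orders_customer
```

The same EXPLAIN-before-you-guess habit works on Postgres and MySQL, just with their own `EXPLAIN` syntax.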

In my last company we could easily develop and provide on-call support for multiple production-critical deployments with only 3 engineers that way. We got so few calls that I had trouble remembering everything and had to look it up.

2 hours ago · ManBeardPc

Scaling databases is easy. But you can really blow performance out of the water if you don't know SQL and use an ORM the naive way. Microservices and things like GraphQL make it worse. Now you are doing joins in memory in your FF-ing GraphQL endpoint instead of in the database.
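A minimal sketch of that in-memory-join (N+1) pattern versus letting the database do the join, with made-up tables, in Python/SQLite:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'first'), (2, 2, 'second'), (3, 1, 'third');
""")

# Naive ORM-style access: one query for posts, then one query per post for
# its author -- the classic N+1 pattern (N+1 round trips in general).
posts = conn.execute("SELECT id, author_id, title FROM posts").fetchall()
naive = [(title,
          conn.execute("SELECT name FROM authors WHERE id = ?",
                       (author_id,)).fetchone()[0])
         for _, author_id, title in posts]

# Letting the database do the join: one round trip, same result.
joined = conn.execute("""
    SELECT p.title, a.name
    FROM posts p JOIN authors a ON a.id = p.author_id
    ORDER BY p.id
""").fetchall()

assert naive == joined
```

With 3 rows the difference is invisible; with 10,000 rows over a network it is the difference between one query and 10,001.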

A simple cheap database should be able to handle millions of rows and a handful of concurrent users. Meaning that if your users drop by a few times a week, you can have hundreds/thousands of those before your server gets busy. Run two cheap VMs so you can do zero-downtime deployments. Put a load balancer in front of it. Get a managed database. Grand total should be under $100 per month. If you are really strapped for cash, you can get away with under $40.

2 hours ago · jillesvangurp

And a big, serious database server can handle an insane number of rows and concurrent users. Stack Overflow famously runs on [1] a single database server (plus a second for failover, plus another pair for the rest of the Stack Exchange network).

[1] Or used to run; this factoid is from many years ago, at its peak popularity.

40 minutes ago · poincaredisk

Vertical scaling is easy, but horizontal scaling is something that gets complex very fast for SQL databases. More tools, more setup, more things that can go wrong and that you have to know. If you don't have a shard key like a tenant_id, joins easily become something that involves the network.
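A shard key like tenant_id usually boils down to a stable routing function; a toy sketch (the shard count and names are hypothetical):

```python
import hashlib

NUM_SHARDS = 4  # hypothetical number of database shards

def shard_for_tenant(tenant_id: str) -> int:
    """Stable routing: the same tenant always lands on the same shard, so
    joins within a tenant stay local to one database."""
    digest = hashlib.sha256(tenant_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# All of a tenant's rows live together; it's the cross-tenant queries that
# turn into scatter-gather operations over the network.
assert shard_for_tenant("acme") == shard_for_tenant("acme")
```

Note the usual caveat: a plain modulo makes re-sharding painful, which is exactly the kind of complexity that arrives once you leave vertical scaling behind.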

Managed databases ofc take a lot of that work away from you, but some customers want or need on-premise solutions due to legal requirements or not wanting to get locked into proprietary offerings.

42 minutes ago · ManBeardPc

People also underestimate how financially predictable this setup is - you purchase a VPS or bare metal box on DigitalOcean/Hetzner/OVH for e.g. $50/mo, and that price will likely stay the same for the next 5 years. Try that with any of the cloud providers.

This part is often neglected when running a company, where owners usually hope infra costs will decrease over time or remain proportional to company income. However, I'm still waiting to see that.

32 minutes ago · dig1

> The SQL database is probably the hardest part to scale

Use an ORM and SQLite, I bet you a beer that you won’t hit the perf ceiling before getting bored of your project.

2 hours ago · brtkdotse

I agree that SQLite is awesome for many use cases. Running in the same process allows for very low latency. No additional server to manage is also a big plus. Concurrency can be an issue with writes, though. Plus, if you need to scale your backend out, you can't share the same database.
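For the write-concurrency point, SQLite's WAL journal mode is the usual mitigation (readers keep reading while a single writer commits); a minimal sketch:

```python
import os
import sqlite3
import tempfile

# WAL needs a file-backed database; the default rollback journal blocks
# readers while a write is in progress, WAL mostly does not.
path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(path)

mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

# Wait up to 5s on a locked database instead of failing immediately.
conn.execute("PRAGMA busy_timeout=5000")

conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("INSERT INTO events (payload) VALUES ('hello')")
conn.commit()
```

Even with WAL there is still only one writer at a time, so write-heavy workloads eventually outgrow it, which is the honest limit of the single-file approach.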

ORMs are very hit-and-miss for me. Had bad experiences with performance issues and colleagues who don't understand SQL itself, leading to many problems. Micro-ORMs that just do the mapping, reduce boilerplate and otherwise get out of your way are great though.

an hour ago · ManBeardPc

It really depends on the project. For a CRUD application? Sure. For an application that processes hundreds of events per second, stores them, and makes them queryable using various filters? Can get tough real quick.

37 minutes ago · poincaredisk

That's where SQLite really shines - used it in a massively scaled facial recognition system, with several hundred simultaneous cameras, a very large FR database with over 10K people, each having dozens of face entries, and all the events generated from live video of large public venues. SQLite was never a bottleneck, not at all.

24 minutes ago · bsenftner

Single SQLite DB, or did you shard it?

10 minutes ago · leosanchez

This topic lacks nuance.

I agree in focusing on building things that people want, as well as iterating and shipping fast. But guess what? Shipping fast without breaking things requires a lot of infrastructure. Tests are infrastructure. CI and CD are infrastructure. Isolated QA environments are infrastructure. Monitoring and observability are infrastructure. Reproducible builds are infrastructure. Dev environments are infrastructure. If your team is very small, you cannot ship fast, safely, without these things. You will break things for customers, without knowing, and your progress will grind to a halt while you spend days trying to figure out what went wrong and how to fix it, instead of shipping, all while burning good will with your customers. (Source: have joined several startups and seen this first hand.)

There is a middle ground between "designing for millions of users" and "building for the extreme short term." Unfortunately, many non-technical people and inexperienced technical people choose the latter because it aligns with their limited view of what can go wrong during normal growth. The middle ground is orienting the pieces of your infrastructure in the right direction and growing them as needed. All the things I mentioned as infrastructure above can be implemented relatively simply, but they set the groundwork for future secure growth.

Planning is not the enemy and should not be conflated with premature optimization.

2 hours ago · myprotegeai

Nowadays there are a lot of tools to set up this infra in a standard way very quickly though - in terms of CI/CD, tests, e2e tests, etc.

41 minutes ago · mewpmewp2

There is no "standard way", only rough principles, and it all depends on the unique DNA of the company (cloud, stack, app deployment targets, etc.). Yes, there are a lot of tools, and experienced infrastructure engineers spend a lot of time integrating them. Oftentimes they won't work without enormous effort, because of early "move fast & break things" design decisions made by the org. My experience has been that a startup only uses these tools from early on if it has experienced engineers and a management that understands the importance of building on a reliable foundation.

13 minutes ago · myprotegeai

Yes but...

Still make some effort to build as if this were a professional endeavor: use that proof-of-concept code to test ideas, but rewrite it following reasonable code quality and architecture practices so you don't go into production without the ability to make those important scaling changes (for if/when you get lucky and get a lot of attention).

If your code is tightly coupled, functions are 50+ lines long, and objects are mutated everywhere (including in places you don't even realize), then making those important scaling changes will be difficult and slow. Then you might be tempted to say, "We should have built for 1 million users." Instead, you should be saying, "We should have put a little effort into the software architecture."

There are two languages that start with "P" which seem to often end up in production like this.

an hour ago · michaelteter

Figure out your data structure definitions early, along with where those structures come from and where they’re going, and write disposable code around them that builds them and gets them where they need to be. Stable data definitions make it easy to replace bits and pieces of your application as you go. Especially if you view mutability not as the default, but as a performance optimization you can reach for if you need it. (You often do not)
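One way to sketch this in Python, treating mutability as opt-in rather than the default (the field names are made up):

```python
from dataclasses import dataclass, replace

# A stable, immutable data definition: the disposable code around it can be
# rewritten freely as long as this shape stays put.
@dataclass(frozen=True)
class Order:
    order_id: int
    customer: str
    total_cents: int

order = Order(order_id=1, customer="alice", total_cents=1250)

# "Mutation" derives a new value instead of changing the old one in place,
# so nothing elsewhere can be surprised by a change it didn't see.
discounted = replace(order, total_cents=1000)

assert order.total_cents == 1250      # original untouched
assert discounted.total_cents == 1000
```

If profiling later shows the copying matters, dropping `frozen=True` on one hot type is the performance optimization you reach for, not the starting point.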

an hour ago · sevensor


[caveat: haven't read the text, because if you click the partners link on the nag box, it looks like I need to object to each "legitimate interest" separately and I've not got time for that - congratulations darrenhorrocks.co.uk on keeping your user count lower, so you don't have to design for higher!]

The problem often comes from people solving the problem that they want to have, not the ones that they currently have. There is a pervasive view that if your site/app goes viral and you can't cope with the load, you lose the advantage of that brief glut of attention and might never get it again, if there is a next time some competing site/app might get the luck instead. There is some truth in this, so designing in a way that allows for scaling makes some sense, but perhaps many projects give this too much priority.

Also, designing with scaling in mind from the start makes it easier to implement later, if you didn't you might need a complete rewrite to efficiently scale. Of course keeping scaling in mind might mean that you intend a fairly complete redo at that point, if you consider the current project to be a proof of concept of other elements (i.e. the application's features that are directly useful to the end user), the difference being that in this state you are at least aware of the need rather than it being something you find out when it might already be too late to do a good job.

One thing that a lot of people who overengineer for scale from day 1, with a complex mesh of containers running a service-based design, miss when they say "with a monolith all you can do is throw hardware at the problem" is that scaling your container count is essentially throwing (virtual) hardware at the problem too. It is a valid short-term solution in both cases, and until you need to regularly run at the higher scale day-in, day-out, the simpler monolith will likely be more efficient and reduce running costs.

You need to find the right balance of “designing with scalability in mind”, so it can be implemented quickly when you are ready, which is not easy to judge so people tend to err on the side of just going directly for the massively scalable option despite the potential costs of that.

2 hours ago · dspillett

>It looks like I need to object to each “legitimate interest” separately and I've not got time for that

I absolutely don't understand why some websites do this. Either don't show them or don't make them annoying to disable. Let me explain:

Legitimate interest is one of the lawful reasons for processing personal data. They don't have to ask for your permission. Usually adspam cookies are not in your legitimate interest, so they have to resort to another lawful basis, which is user consent. But they claim "legitimate interest" covers these cookies, so why even ask?

But on the other hand, I often stubbornly disable legitimate interest cookies, and not once have I broken a website this way. This is suspicious - "legitimate interest" means that the cookie is crucial to doing what you want to do on the website, for example a session cookie or a language-selection cookie. If the website works normally without a "legitimate interest" cookie, then the interest was not legitimate at all. I assume this is just some trick abused by advertisers to work around the GDPR, and I wish them all a 4%-of-global-turnover fine.

4 minutes ago · poincaredisk

Guilty! I spent so much time recently asking myself questions and trying to optimize my app and stack:

- what if people upload files that big, what about the bandwidth cost?
- what if they trigger thousands of events X and it costs $0.X per thousand?

Fast forward months, and it's a non-issue despite having paying customers. Not only did I grossly exaggerate the individual user's resource consumption, but I also grossly exaggerated the need for top-notch k8s auto-scaling this and that. Turns out you can go a long way with something simpler...

2 hours ago · 255kb

Yeah, I have to keep fighting the feeling that I'm not doing things the "right" way, as they are so unlike the $JOBs I had before, with availability zones, auto-scaling, CDNs, etc.

I have the most adhoc, dead simple and straightforward system and I can sleep peacefully at night, while knowing I will never pay more than $10 a month, unless I decide to upgrade it. Truly freeing (and much easier to debug!)

2 hours ago · xandrius

When you have more micro-services than users.

2 hours ago · 1GZ0

This phenomenon needs a term, how about Premature Architecture?

2 hours ago · dimitar

I like it a lot. It goes well with the premature celebration from startupers. "We raised 100M USD, we made it!", when the company is on the verge of collapse every day, is losing 100M USD per year, and has no business model other than buying something for 2 USD and selling it for 1 USD.

2 hours ago · rvnx

Well, if that isn't a problem for a future me, I don't know what is.

39 minutes ago · mewpmewp2

Premature architectural optimization.

2 hours ago · ssdd333n6v

Or in other words, you're not Google or Facebook. You almost certainly don't need the level of performance and architectural design those companies need for their billion+ user systems.

And it doesn't help that a lot of people seem to drastically underestimate the amount of performance you can get from a simpler setup too. Like, a simple VPS can often do fine with millions of users a day, and a mostly static informational site (like your average blog or news site) could just put it behind Cloudflare and call it a day in most cases.

Alas, the KISS principle seems to have gone the way of the dodo.

an hour ago · CM30

Designing for low latency (even if only for a few clients) can be worth it though. Each action taking milliseconds vs. each action taking seconds will lead to vastly different user experiences and will affect how users use the application.

an hour ago · the8472

I agree with this if you are talking fb/google scale; you will very likely not even get close to that, ever. But millions of users, depending on how it hits the servers over time, is really not very much. I run a site with well over 1m active users, but it runs on $50/mo worth of VPSs with an LB and devving for it doesn't take a millisecond longer than for a non scaling version.

2 hours ago · anonzzzies

The truth is in the middle. I've now had two startups where I came in to fix things where the system would fall over if there were more than 1 user. Literally: no transactional logic and messy interactions with the database, combined with front-end engineers making a mess of doing a backend.

In one case the database was "Mongo Realm", which was something our Android guy randomly picked. No transactions, no security, and 100% of the data was synced client-side. Also there was no iOS or web UI. It was the easiest decision ever to scrap it, because it was slow, broken, and there wasn't really a lot there to salvage. And I needed those other platforms supported. It's the combination of over- and under-engineering that is problematic. There were some tears, but about six months later we had replaced 100% of the software with something that actually worked.

In both cases, I ended up just junking the backend system and replacing it with something boring but sane. In both cases getting that done was easy and fast. I love simple. I love monoliths. So no Kubernetes or any of that micro services nonsense. Because that's the opposite of simple. Which usually just means more work that doesn't really add any value.

In a small startup you should spend most of your time iterating on the UX and your product. Like, really quickly. You shouldn't get too attached to anything you have. The questions that should be in the back of your mind are 1) how much time would it take a competent team to replicate what you have? and 2) would they end up with a better product?

Those questions should lead your decision making. Because if the answers are "not long" and "yes", you should just wipe out the technical debt you have built up and do things properly. Because otherwise somebody else will do it for you if it really is that good of an idea.

I've seen a lot of startups that get hung up on their own tech when it arguably isn't that great. They have the right ideas and vision but can't execute because they are stuck with whatever they have. That's usually when I get involved actually. The key characteristic of great UX is that things are simple. Which usually also means they are simple to realize if you know what you are doing.

Cumulative effort does not automatically add up to value; often it actually becomes the main obstacle to creating value. Often the most valuable outcome of building software is actually just proving the concept works. Use that to get funding, customer revenue, etc. A valid decision is to then do it properly and get a good team together to do it.

2 hours ago · jillesvangurp

> In both cases, I ended up just junking the backend system and replacing it with something boring but sane. In both cases getting that done was easy and fast.

This kind of rewrite is usually quick and easy not because of the boring architecture (which can only carry the project from terrible velocity to decent velocity) but because the privilege of hindsight reduces work: the churn of accumulated changes and complications and requirements of the first implementation can be flattened into one initially well designed solution, with most time-consuming discussions and explorations already done and their outcomes ready to copy faithfully.

an hour ago · HelloNurse

No, I don't think I will let you share my personal data with 200 select partners (:

an hour ago · bcye

This also applies when calculating losses from paying third party services.

2 hours ago · signaru

I've seen a bunch of things.

Sometimes you have people who try to build a system composed of a bunch of microservices, but the team size means that you have more services than people. That's a recipe for failure, because you probably also need to work with Kubernetes clusters and manage shared code libraries between some of the services, and suddenly you're dealing with a hard-to-debug distributed system (especially if you don't have the needed tracing and APM).

Other times I've seen people develop a monolithic system for something that will need to scale, but develop it in a way where you can only ever have one instance running (some of the system state is stored in memory). Suddenly, when you need to introduce a key-value store like Valkey or a message queue like RabbitMQ, or scale out horizontally, it's difficult, and you instead deal with HTTP thread exhaustion, DB connection pool exhaustion, and issues where the occasional DB connection hangs for ~50 seconds and stops everything, because a lot of the system is developed for sequential execution instead of eventual consistency.
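A sketch of the alternative: keep per-instance state behind a narrow interface from day one, so an external store can be swapped in later without rewriting handlers (all names here are hypothetical; a production version would back the interface with Valkey or similar):

```python
from typing import Optional, Protocol

class SessionStore(Protocol):
    """Narrow seam: handlers never touch process memory directly, so moving
    state out of the process later is a one-line change at startup."""
    def get(self, key: str) -> Optional[str]: ...
    def set(self, key: str, value: str) -> None: ...

class InMemoryStore:
    """Fine for a single instance and for tests; NOT safe once you run two
    instances behind a load balancer."""
    def __init__(self) -> None:
        self._data: dict = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value

def handle_login(store: SessionStore, session_id: str, user: str) -> None:
    # The handler depends only on the protocol, not on where state lives.
    store.set(session_id, user)

store = InMemoryStore()
handle_login(store, "sess-1", "alice")
assert store.get("sess-1") == "alice"
```

The point isn't the extra class; it's that the "only one instance can ever run" property stops being baked into every handler.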

Yet other times you have people who read about SOLID and DRY and build an enterprise architecture where the project itself doesn't have any tools or codegen to make writing code easier, only guidelines. If you need to add a DB table and work with the data, suddenly you need: MyDataDto <--> MyDataResource <--> MyDataDtoMapper <--> MyDataResourceService <--> MyDataService <--> MyDataDao <--> MyDataMapper/Repository, with additional logic for auditing and validation, some interfaces in the middle to "make things easier" (which break IDE navigation, because it goes to where the method is defined instead of the implementation you care about), and handlers for cleaning up related data. All of that might be useful in some capacity, but it makes your velocity plummet. Even more so when the codebase is treated as a "platform" with a lot of bespoke logic due to "not invented here" syndrome, instead of just using common validation libraries etc.

Other times people use the service-layer pattern above liberally and end up with hundreds of DB calls (the N+1 problem) instead of just selecting what they need from a DB view, because they want the code to be composable. Before long you have to figure out how to untangle that structure of nested calls, and just throw an in-memory cache in the middle to at least save on the 95% of duplicated calls, so that filling out a table in the UI doesn't take 30 seconds.
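The "cache in the middle" band-aid can be as crude as memoizing the per-row lookup (function and names invented for illustration):

```python
from functools import lru_cache

calls = 0  # counts how often the "database" is actually hit

@lru_cache(maxsize=1024)
def fetch_user_name(user_id: int) -> str:
    """Stands in for a per-row DB lookup buried in a service-layer call chain."""
    global calls
    calls += 1
    return f"user-{user_id}"

# Rendering a table that references the same few users hundreds of times:
rows = [fetch_user_name(uid) for uid in [1, 2, 1, 1, 2, 3] * 100]

assert len(rows) == 600
assert calls == 3  # only three distinct lookups reach the "database"
```

It works, but it's a symptom fix; the view/join that should have existed in the first place would make both the cache and the nested calls unnecessary (and avoids the cache-invalidation problem this sketch quietly ignores).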

At this point I'm just convinced that I'm cursed to run into all sorts of tricky to work with codebases (including numerous issues with DB drivers, DB pooling libraries causing connections to hang, even OpenJDK updates causing a 10x difference in performance, as well as other just plain weird technical issues), but on the bright side at the end of it all I might have a better idea of what to avoid myself.

Damned if you do, damned if you don't.

The sanest collection of vague architectural advice I've found is the 12 Factor App: https://12factor.net/ - plus choosing the right tools for the job (Valkey, RabbitMQ, instead of just putting everything into your RDBMS; additional negative points if it's Oracle), as well as leaning in the direction of modular monoliths (one codebase initially; feature flags for enabling/disabling your API, scheduled processes, things like sending e-mails, etc., which can be deployed as separate containers, or all run in the same one locally for development or on your dev environments), with as many of the dependencies as possible runnable locally.
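A 12-factor-style config sketch in Python (the variable names and defaults are made up): everything that varies between deploys comes from the environment, and one feature flag decides which roles a container runs.

```python
import os

# Config from the environment, with sane defaults for local development.
DATABASE_URL = os.environ.get("DATABASE_URL", "sqlite:///local.db")
SEND_EMAILS = os.environ.get("SEND_EMAILS", "false").lower() == "true"
WORKER_MODE = os.environ.get("WORKER_MODE", "all")  # 'api', 'scheduler', or 'all'

# The same image runs as the API container, the scheduler container, or
# everything at once on a laptop -- the modular-monolith deployment story.
run_api = WORKER_MODE in ("api", "all")
run_scheduler = WORKER_MODE in ("scheduler", "all")
```

Locally you export nothing and get the defaults; in production each container sets `WORKER_MODE` and the rest via its environment.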

For the most part, you should optimize for developers, so that they can debug issues easily, change the existing code (loose coupling) while not drowning in a bunch of abstractions, as well as eventually scale, which in practice might mean adding more RAM to your DB server and adding more parallel API containers. KISS and YAGNI for the things that let you pretend that you're like Google. The most you should go in that direction is having your SPA (if you don't use SSR) and API as separate containers, instead of shipping everything together. That way routing traffic to them also becomes easier, since you can just use Caddy/Nginx/Apache/... for that.

2 hours ago · KronisLV

> a system composed of a bunch of microservices but the team size means that you have more services than people

The thing I keep trying to get people to recognize in internet discussions of microservices is that they're a solution to the organizational problems of very large companies. The right size is one "service" per team but keeping the team size below the "two pizza limit" (about eight people, including the line manager and anyone else who has to be in all the meetings like scrum masters etc).

If your website needs to scale to hundreds of developers, then you need to split up services in order to get asynchronous deployment so that teams can make progress without deadlocking.

Scaling for a high number of users does not require microservices. It does as you say require multiple instances which is harder to retrofit.

> additional negative points for it being Oracle

Amen.

2 hours ago · pjc50

All infrastructure please.

I'm currently wrestling with a stupid orchestration problem - DNS external to my domain controllers - because the architecture astronaut thought we'd need to innovate on a pillar of the fucking internet.

an hour ago · bravetraveler

I dunno, I've lived the other side of this where people made boneheaded choices early on, the product suddenly got traction, and then we were locked into lousy designs. At my last company, there were loads of engineers dedicated to re-building an entire parallel application stack with a view to an eventual migration.

A relatively small amount of upfront planning could have saved the company millions, but I guess it would have meant less work for engineers so I suppose I should be glad that firms keep doing this.

2 hours ago · ForHackernews

Attempting to please everyone pleases no one.

2 hours ago · Dalewyn

Do you mean, don't consider accessibility, security and privacy because statistically speaking your 100 customers won't care about those things?

2 hours ago · injidup

You must have read a different blog post, because I don't see anything arguing against accessibility, security or privacy in this one?

2 hours ago · 42lux

No, your only 100 customers. And here your customers are not huge corporations with thousands of actual users each, but just individual users.

But certainly, you should care about security and privacy even if you have just one customer.