Comment by jwr
21 hours ago
I don't get this scalability craze either. Computers are stupid fast these days and unless you are doing something silly, it's difficult to run into CPU speed limitations.
I've been running a SaaS for 10 years now. Initially on a single server, after a couple of years moved to a distributed database (RethinkDB) and a 3-server setup, not for "scalability" but to get redundancy and prevent data loss. Haven't felt a need for more servers yet. No microservices, no Kubernetes, no AWS, just plain bare-metal servers managed through ansible.
I guess things look different if you're using somebody else's money.
It's not about scalability. It's about copying what the leaders in a space do, regardless of whether it makes sense or not. It's pervasive in most areas of life.
One of the silliest things you can do to cripple your performance is build something that is artificially over distributed, injecting lots of network delays between components, all of which have to be transited to fulfill a single user request. Monoliths are fast. Yes, sometimes you absolutely have to break something into a standalone service, but that’s rare.
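As a rough illustration of how those hops add up (the hop count and per-hop latencies below are made-up numbers, not measurements from any real system):

    # Toy model: the same request served by in-process calls vs. serial RPC hops.
    # All figures are hypothetical, just to show the arithmetic.
    in_process_call_ms = 0.001   # roughly a microsecond per function call, being generous
    network_hop_ms = 1.5         # roughly 1.5 ms round trip per intra-datacenter RPC
    hops = 8                     # request passes serially through 8 components

    print(f"monolith call overhead:    ~{hops * in_process_call_ms:.3f} ms")
    print(f"distributed call overhead: ~{hops * network_hop_ms:.1f} ms")

Eight in-process calls are effectively free; eight serial network hops are already ~12 ms before any actual work gets done.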
I've noticed a strong correlation between artificially over-distributing and not understanding things like the CAP theorem. So you end up with a slow system that has also picked up a bunch of unsolvable distributed systems problems on its fast path.
(Most distributed systems problems are solvable, but only if the person that architected the system knows what they're doing. If they know what they're doing, they won't over-distribute stuff.)
Yes, that too. If you look at the commits for Heisenbugs associated with the system, you have a good chance of seeing artificial waits injected to “fix” things.
You can solve just about any distributed systems problem by accepting latency, but nobody wants to accept latency :)
...despite the vast majority of latency issues being extremely low-hanging fruit, like "maybe don't have tens of megabytes of data required to do first paint on your website" or "hey maybe have an index in that database?".
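For the missing-index case, a minimal self-contained sketch with SQLite (the table, columns, and row count are invented for illustration):

    import sqlite3, time

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
    conn.executemany("INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                     [(i % 10_000, i * 0.01) for i in range(500_000)])

    def lookup():
        start = time.perf_counter()
        conn.execute("SELECT SUM(total) FROM orders WHERE customer_id = ?", (42,)).fetchone()
        return time.perf_counter() - start

    no_index = lookup()    # full table scan
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    with_index = lookup()  # index lookup
    print(f"without index: {no_index*1000:.1f} ms, with index: {with_index*1000:.2f} ms")

Same query, same data; the only change is one CREATE INDEX.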
There's no need to deploy separate services on separate machines.
I ran a SaaS for 10 years. Two products. Profitable from day 1 as customers paid $500/month and it ran on a couple of EC2 instances as well as a small RDS database.
Another thing one has to consider is the market size and time window of your SaaS. No sense in building for scalability if the business opportunity is only 100 customers and only for a few years.
> unless you are doing something silly, it's difficult to run into CPU speed limitations.
Yes, but it's not difficult to do something silly without even noticing until too late. Implicitly (and unintentionally) calling something with the wrong big-O, for example.
That said, anyone know what's up with the slow deletion of Safari history? It's clearly O(n), but as shown in this blog post, it still only deletes at a rate of 22 items per 10 seconds: https://benwheatley.github.io/blog/2025/06/19-15.56.44.html
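On the accidental big-O point, this is the classic way it sneaks in; the function names and data sizes are just illustrative:

    import time

    def dedupe_slow(items):
        # "x not in out" scans a list, so the loop is O(n^2) overall
        out = []
        for x in items:
            if x not in out:
                out.append(x)
        return out

    def dedupe_fast(items):
        # set membership is O(1) on average, so this is O(n)
        seen, out = set(), []
        for x in items:
            if x not in seen:
                seen.add(x)
                out.append(x)
        return out

    data = list(range(5_000)) * 2
    for fn in (dedupe_slow, dedupe_fast):
        start = time.perf_counter()
        fn(data)
        print(fn.__name__, f"{time.perf_counter() - start:.3f}s")

Both are one innocuous-looking loop; only one of them falls over when the input grows.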
>> Yes, but it's not difficult to do something silly without even noticing until too late. Implicitly (and unintentionally) calling something with the wrong big-O, for example.
On a non-scalable system you're going to notice that big-O problem and correct it quickly. On a scalable system you're not going to notice it until you get your AWS bill.
Also, instead of having a small team of people to fight scalable infrastructure configuration, you could put 1-2 full time engineers on performance engineering. They'd find big-O and constant factor problems way before they mattered in production.
Of course, those people's weekly status reports would always be "we spent all week tracking down a dumb mistake, wrote one line of code and solved a scaling problem we'd hit at 100x our current scale".
That's equivalent to waving a "fire me" flag at the bean counters and any borderline engineering managers.
For how many users, and at what transaction rate?
Not disagreeing that you can do a lot on a lot less than in the old days, but your story would be much more impactful with that information. :)
Scalability isn't just about CPU.
It's just as much about storage and IO and memory and bandwidth.
Different types of sites have completely different resource profiles.
Microservices are not a solution for scalability. There are multiple options for building scalable software; even a monolith or a modular monolith behind a properly load-balanced setup will drastically reduce the complexity compared to microservices and still reach massive scale. The only bottleneck will be the DB.
Microservices take an organizational problem:
The teams don't talk, and always blame each other
and add distributed systems problems and additional organizational problems:
Each team implements one half of dozens of bespoke network protocols, but they still don't talk, and still always blame each other. Also, now they have access to weaponizable uptime and latency metrics, because each team "owns" the server half of one network endpoint, but not the client half.
Is it about scalability, or about resiliency?
Or is it about outsourcing problems?
There are a lot of off-the-shelf microservices that can solve difficult problems for me, like Keycloak for user management. Isn't that a good reason?
Or Grafana for log visualization?
Should I build that into the monolith too? Or should I just skip it?