How many HTTP requests/second can a single machine handle? (2024)

1 month ago (binaryigor.com)

From the article: “Huge machine - 8 CPUs, 16 GB of memory”

That’s barely more than a Raspberry Pi (4 vs 8 cores)? Huge machines today have 20+ TB of RAM and hundreds of cores. Even top-end consumer machines can have 512 GB of RAM!

I do agree with the author that single machines can scale far beyond what most orgs/companies need, but I think they may be underestimating how far that goes by orders of magnitude.

  • A large number of issues filed against open-source servers are people wondering why perf is so bad when they give it a single core. Single-core performance hasn’t improved much in the last 10-15 years, but more and more cores are available. It blows me away that cores are still expensive enough that people need to worry about it.

    • Intel's single core performance has 3.4xed in 15 years (980X vs 285K)

      Single-core perf doubled roughly every 8 years (3.4x over 15 years works out to a doubling about every 8.5 years), multicore every 6 years, and GPUs every 3 years!

  • True :) I did it on purpose to show that even with these modest resources you can achieve amazing performance - better than most systems would ever need

  • At Hetzner, you can spend less than double what DigitalOcean charges for 8 cores and get ten times as many cores on a single machine.

Well, you have to understand what you're testing.

With a test like this, you're really testing two different things:

1. How fast your database is,

2. How fast your frontend is

Since the query is simple, your frontend is basically a DB access layer and should be taking no time. And since the table is indexed the query should also take no time.

The only other interesting question is whether the database can handle the number of connections and whether the storage can keep up. The app is using connection pools, but the actual size of the database machine is never mentioned...which is a problem. How big is the DB instance? A small instance could be crushed by 80 connections. A database on a hard drive may not be able to handle the load either (though since the data volume is small, it could be that everything ends up cached anyway).

So this is sort of interesting, but sort of not interesting.

  • It's all described in the blog post and there is a link to the source code as well :)

    Both the app and db are hosted on the same machine - they are sharing resources. This fact, type of storage and other details of the setup are contained in this section: https://binaryigor.com/how-many-http-requests-can-a-single-m...

    I think you're right that I didn't mention the details of the db connection pool; they are here: https://github.com/BinaryIgor/code-examples/blob/master/sing...

    Long story short, there's a Hikari Connection Pool with initial 10 connections, resizable to 20.
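
    In HikariCP terms that's roughly the following (a sketch - the JDBC URL and credentials are placeholders, not the values from the repo):

        import com.zaxxer.hikari.HikariConfig;
        import com.zaxxer.hikari.HikariDataSource;

        public class PoolSetup {
            public static HikariDataSource dataSource() {
                HikariConfig config = new HikariConfig();
                // Placeholder connection details, not the article's actual values
                config.setJdbcUrl("jdbc:postgresql://localhost:5432/app");
                config.setUsername("app");
                config.setPassword("secret");
                // Keep 10 idle connections around, allow growth up to 20
                config.setMinimumIdle(10);
                config.setMaximumPoolSize(20);
                return new HikariDataSource(config);
            }
        }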

    • 60,000 is near the theoretical max for a 5-tuple with all but the client port fixed (there are only ~64k possible client ports). If you are going to test with this many connections per client, you are hopefully using multiple IPs per client or multiple server IPs.

  • Postgres with unmodified default settings can handle thousands of requests like that per second on relatively small hardware. The connection pool is a potential bottleneck, but one you should be able to avoid. I think the default limit for Postgres is something like 100 connections; that's plenty with a pool in front of it.
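
    If in doubt, you can read that limit off the server and keep the pool's max size well below it - roughly like this (a sketch, assuming the Postgres JDBC driver and an already-configured DataSource):

        import java.sql.Connection;
        import java.sql.ResultSet;
        import java.sql.Statement;
        import javax.sql.DataSource;

        class ConnectionLimitCheck {
            // Prints the server's max_connections so the pool can be sized safely below it
            static void printMaxConnections(DataSource ds) throws Exception {
                try (Connection conn = ds.getConnection();
                     Statement stmt = conn.createStatement();
                     ResultSet rs = stmt.executeQuery("SELECT current_setting('max_connections')")) {
                    if (rs.next()) {
                        System.out.println("Postgres max_connections = " + rs.getString(1));
                    }
                }
            }
        }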

A single machine can handle much, much more if you use sqlite and batch updates/inserts.

Honestly, unless you're bandwidth/uplink limited (e.g. running a CDN), a single machine will take you really far.

Also simpler systems tend to have better uptime/reliability. Doesn't get much simpler than a single box.

  • On my pretty modest dev machine with 12 CPUs, I once managed to achieve 14k RPS with Go+SQLite in a write+read test on a real project I was developing (it used a framework so there was also some overhead due to all the abstractions). I didn't even batch anything. The only problem was, I quickly found that SQLite's WAL checkpointer couldn't keep up with the write rates, the WAL file quickly grew to 100s of GBs (this is actually a known issue and is mentioned in their docs), so I had to add a special goroutine to monitor the size of the WAL file and force checkpointing manually when it got too big.

    So when people say 1k is "highload" and requires a whole cluster, I'm not sure what to think of it. You can squeeze so much more out of a single fairly modest machine.

    • SQLite has some sharp edges for sure. Honestly, even basic batching - wrapping all inserts/updates in a transaction that commits every 100ms - will get you to 30,000+ updates a second on a 4-core shared-CPU VPS (assuming NVMe drives); a rough JDBC sketch is at the end of this comment.

      That's the other thing: AWS tends to have really dated SSDs.

      Honestly, it's like the industry has jumped the shark. 1k is not a lot of load. It's like when people say a single writer means you can't be performant - it's the opposite: most of the time a single writer lets you batch, and batching is where the magic happens.
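
      Roughly what that batching looks like over JDBC (a sketch - Java rather than Go, and the table name, driver and batch contents are made up):

          import java.sql.Connection;
          import java.sql.DriverManager;
          import java.sql.PreparedStatement;
          import java.sql.Statement;
          import java.util.List;

          public class BatchedWriter {
              public static void main(String[] args) throws Exception {
                  // Assumes the org.xerial sqlite-jdbc driver is on the classpath
                  try (Connection conn = DriverManager.getConnection("jdbc:sqlite:app.db")) {
                      try (Statement s = conn.createStatement()) {
                          s.execute("PRAGMA journal_mode=WAL");
                          s.execute("CREATE TABLE IF NOT EXISTS events(id INTEGER PRIMARY KEY, payload TEXT)");
                      }
                      conn.setAutoCommit(false);

                      // Stand-in for ~100ms worth of buffered writes
                      List<String> pending = List.of("a", "b", "c");
                      try (PreparedStatement ps = conn.prepareStatement("INSERT INTO events(payload) VALUES (?)")) {
                          for (String p : pending) {
                              ps.setString(1, p);
                              ps.addBatch();
                          }
                          ps.executeBatch();
                      }
                      conn.commit(); // one commit per batch instead of one per row

                      // If the WAL grows without bound, force a checkpoint now and then
                      conn.setAutoCommit(true); // leave transaction scope so the checkpoint isn't blocked
                      try (Statement s = conn.createStatement()) {
                          s.execute("PRAGMA wal_checkpoint(TRUNCATE)");
                      }
                  }
              }
          }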

Sometimes it confuses me how much we are just sort of treading water on the server performance front: the C10K problem was solved in 1999. WhatsApp was hosting a million TCP connections per box in 2011.

It is not all that hard to hit 10k requests/second on modern hardware. 100k requests/second is achievable with some careful technology choices.

I applaud the author’s curiosity but hope they realize this is like comparing the 0-60 performance of a Cadillac SUV vs a Ford Excursion.

A low-end ARM processor (like a Raspberry Pi) can crank out 1,000 requests a second with a CGI-style program handling the requests - using a single CPU core. Of course, this doesn't happen with traditional CGI (actual performance with traditional CGI will be more like 20-50/s or worse); a minimal sketch of the in-process style follows at the end of this comment.

Like the stereotypical drivers of such vehicles, the industry has become so fat and stupid that an x86 system handling 500 requests/sec actually sounds impressive. Sadly, considering the bloated nature of modern stacks, it kinda is.
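
To be concrete about the "not traditional CGI" part: the throughput comes from handling requests in one long-lived process rather than fork/exec per request. A minimal JDK-only sketch of that style (the port and response are arbitrary, and this isn't tied to any measured number):

    import com.sun.net.httpserver.HttpServer;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;

    public class TinyServer {
        public static void main(String[] args) throws Exception {
            // One long-lived process handles every request - no fork/exec per request,
            // which is where classic CGI loses most of its throughput
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
            server.createContext("/", exchange -> {
                byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(body);
                }
            });
            server.setExecutor(null); // use the default dispatcher thread
            server.start();
        }
    }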

  • True :) My main motivation was to at least realistically nudge us in the right (simpler) direction - away from the still-popular microservices architectures deployed to multi-machine Kubernetes clusters to handle, on average, 5 req/s.

database on the same machine as the application server, RPS limits enforced via

            var issuedRequests = i + 1;
            if (issuedRequests % REQUESTS_PER_SECOND == 0 && issuedRequests < REQUESTS) {
                System.out.println("%s, %d/%d requests were issued, waiting 1s before sending next batch..."
                    .formatted(LocalDateTime.now(), issuedRequests, REQUESTS));
                Thread.sleep(1000);
            }

don't take any conclusions away from this post, friends

  • That's intentional - I wanted to test at the REQUESTS_PER_SECOND max in every test case.

    Same with the db - I wanted to see what kind of load a system (not just an app) deployed to a single machine can handle.

    It can obviously be optimized even further; I didn't try to do that in the article.

    • Based on that code snippet, and making some (possibly unjustified) assumptions about the rest of the code, your actual request rate could be as low as 50% of your claimed request rate:

      Suppose it takes 0.99s to send REQUESTS_PER_SECOND requests. Then you sleep for 1s. Result: you send REQUESTS_PER_SECOND requests every 1.99s. (If sending the batch of requests takes longer than a second, the situation gets even worse.) One possible fix is sketched at the end of this comment.

      The issue GP has with app and DB on the same box is a red herring -- that was explicitly the condition under test.
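
      One way to tighten the pacing is to sleep only for whatever is left of the one-second window instead of a flat 1000 ms. A sketch (REQUESTS_PER_SECOND, the loop structure and sendRequest() are stand-ins, not the article's actual code):

          import java.time.Duration;

          public class PacedLoadSketch {
              static final int REQUESTS_PER_SECOND = 500; // hypothetical target rate
              static final int SECONDS = 10;

              public static void main(String[] args) throws InterruptedException {
                  for (int second = 0; second < SECONDS; second++) {
                      long batchStart = System.nanoTime();
                      for (int i = 0; i < REQUESTS_PER_SECOND; i++) {
                          sendRequest();
                      }
                      // Sleep only for the remainder of the 1s window, so send time
                      // doesn't silently dilute the intended rate
                      long elapsedMs = Duration.ofNanos(System.nanoTime() - batchStart).toMillis();
                      long remainingMs = 1000 - elapsedMs;
                      if (remainingMs > 0) {
                          Thread.sleep(remainingMs);
                      }
                      // If remainingMs <= 0, the generator can't keep up and the run should be flagged
                  }
              }

              static void sendRequest() {
                  // No-op placeholder; the real client would fire an async HTTP request here
              }
          }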

    • I mean, the details are far beyond what can be effectively communicated in an HN comment, but if your loadgen tool is doing anything like sleep(1000ms), it is definitely not generating a sound requests-per-second load against its target

      And furthermore, if the application and DB are co-located on the same machine, you're co-mingling service loads and definitely not measuring or capturing any kind of useful load numbers in the end

      tl;dr: these benchmarks/results are ultimately unsound; it's not about optimization, it's about validity

      If you want to benchmark the application, then either you (a) mock the DB at as close to zero cost as you can, or (b) point all application endpoints at the same shared (separate-machine) DB instance, and make sure each benchmark run executes exactly the same set of queries against a DB instance that is 100% equivalent to the other runs, resetting in between each run

This entire post could be 3 paragraphs of test conditions, 2 paragraphs of narrative, and one graph. It would have been more informative to boot.

A picture would have been worth quite a bit more than a thousand words.

A toy example but it's an interesting read nonetheless. We also host our monolith app on a few bare metal machines (considerably beefier than the example however) and it works well, although the app does considerably more queries (and more complex queries) than this. Transaction locking issues are our bane though.

  • How many queries do you usually handle? Why a few? One doesn't suffice? What resources do they have?

Personally, I use Cloudflare Workers not because one host couldn't handle the traffic (it could), but because the maintenance is a breeze

Obviously at high load (1k+ TPS), plain servers are way cheaper than serverless, so the tradeoff can start to swing

> External volume for the database - it does not write to the local file system (we use DigitalOcean Block Storage)

Is this common? Why not use the local filesystem? Actually, I thought that using anything other than the local filesystem for the database was a no-no. Am I missing something?

  • Databases on cloud providers are usually not on file systems local to the instance, because instances are expected to fail at any time.

    Block storage is meant to be reliable, so databases go there. Yes it's slower but you don't lose data.

    Generally, the only time you want a local database in the cloud is if it's being used for short-lived data meaningful only to that particular instance in time.

    Or it can work if your database rarely changes and you make regular backups that are easy to revert to, like for a blog.

    • Databases have tools to work with storage or servers that can fail. You would need to use replication between multiple database servers and a backup method to some other storage.

      Databases with high availability and robust storage were possible before the cloud.

I always find that my regular CRUD apps kind of grow into something not-so-cruddy due to a single feature (realtime communication, a bursty usage profile, large batch jobs to precompute something too expensive to do at request time) and the architecture just explodes from there.

Also, it always feels like I need a second instance at the very least for redundancy, but then we have to ensure they're stateless and that batch jobs are sharded across them (or only run on one), and again we hit an architecture explosion. I wish I were more comfortable just dropping a single Spring Boot instance on a VM and calling it a day; Spring Boot has a lot of bells and whistles and you can get pretty far without the architecture explosion, but it is almost inevitable.

  • One of the reasons I have completely dropped interpreted languages (Python, Java, JS) and followed the "back to compiled software" hype. I am now writing my software purely in Go and Rust, and again using pipes, queues, and temp storage to connect smaller programs, tools, and services. The Unix philosophy was revolutionary and (for me) the ultimate solution for software organisation. But one never knows what they had until they lose it, so I treat the few years of experimentation with interpreted alternatives as a positive.

Load testing: how many HTTP requests/second can a single machine handle while doing <insert thing/things>?

> very_high_load: 4000 requests per second - 4 machines x 1000 RPS

This is an incredibly naive article.

Another way to look at this is the TechEmpower benchmarks. They test on big machines, but you can get > 1M req/s across a wide variety of environments.

Needs (2024)

  • Needs (2000) really. This contrived test is using tiny VPSes (even the "big" machine is tiny), slow network-mounted DB storage, nothing like a production stack that a real API server would use. Bespoke simple profiling mechanism. Nothing wrong with OP learning the basics and experimenting, but there's nothing of value in the findings.

    • Regarding the machines' sizes - I did it on purpose, to showcase that even with these limited resources you can still achieve way better performance than most systems will ever need.

      I know that you can have significantly bigger machines; network-mounted DB storage, on the other hand, is not slow - it's designed specifically for this kind of use case
