Comment by saidinesh5

1 day ago

> If you have a simple job to do that fits in an AWS Lambda, why not deploy it that way, scalability is essentially free. But the real advantage is that by writing it as a Lambda you are forced to think of it in stateless terms.

What you are describing is already an example of premature optimization. The moment you are thinking of a job in terms of "fits in an AWS Lambda" you are automatically stuck with "Use S3 to store the results" and "use a queue to manage the jobs" decisions.

You don't even know if that job is the bottleneck that needs to scale. For all you know, writing a simple monolithic script to deploy onto a VM/server would be a much simpler deployment. Just use the RAM/filesystem as the cache. Write the results to the filesystem/database. When the time comes to scale, you know exactly which parts of your monolith are the bottleneck and need to be split out. For all you know, you can simply replicate your monolith, shard the inputs, and the scaling is already done. Or just use the DB's replication functionality.
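To make the "replicate the monolith and shard the inputs" idea concrete, here is a rough Python sketch. Everything in it is hypothetical and only for illustration: `shard_for`, `run_job`, `process`, the JSON cache layout, and the placeholder "work" are my assumptions, not anyone's actual deployment.

```python
import hashlib
import json
from pathlib import Path


def shard_for(key: str, num_shards: int) -> int:
    """Map an input key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def run_job(key: str, cache_dir: Path) -> dict:
    """Compute a result for `key`, using the local filesystem as the cache."""
    name = hashlib.sha256(key.encode("utf-8")).hexdigest() + ".json"
    cache_file = cache_dir / name
    if cache_file.exists():
        # Cache hit: reuse the result written by an earlier run.
        return json.loads(cache_file.read_text())
    # Placeholder for the real work (assumption: the job is a pure function of its input).
    result = {"key": key, "length": len(key)}
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(result))
    return result


def process(keys, shard_index: int, num_shards: int, cache_dir: Path) -> list:
    """Run only the jobs that belong to this replica's shard."""
    return [
        run_job(k, cache_dir)
        for k in keys
        if shard_for(k, num_shards) == shard_index
    ]
```

Each replica would run the same script with a different `shard_index`. Since the hash is stable, every key goes to exactly one replica without any coordination between them.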

To put things into perspective, even a cheap Raspberry Pi or entry-level cloud VM gives you thousands of Postgres queries per second. Most startups I worked at NEVER hit that number. Yet their deployment stories started off with "let's use lambdas, s3, etc.". That's just added complexity. And a lot of bills, if it weren't for the "free cloud credits".

> The moment you are thinking of a job in terms of "fits in an AWS Lambda" you are automatically stuck with "Use S3 to store the results" and "use a queue to manage the jobs" decisions.

I think the most important one you get is that inputs/outputs must always be < 6 MB in size. It makes sense as a limitation for Lambda's scalability, but you will definitely dread it the moment a 6.1 MB use case makes sense for your application.
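A minimal sketch of the kind of guard you end up writing for this, assuming the payload is JSON and taking the limit to be 6 * 1024 * 1024 bytes (that exact byte count is my assumption based on the 6 MB figure above). `payload_size` and `fits_inline` are hypothetical helper names, not part of any AWS SDK.

```python
import json

# Assumption: the 6 MB figure from the comment, interpreted as 6 MiB.
ASSUMED_LIMIT_BYTES = 6 * 1024 * 1024


def payload_size(obj) -> int:
    """Size in bytes of the object once serialized as UTF-8 JSON."""
    return len(json.dumps(obj).encode("utf-8"))


def fits_inline(obj, limit_bytes: int = ASSUMED_LIMIT_BYTES) -> bool:
    """True if the serialized payload could be passed directly as input/output."""
    return payload_size(obj) <= limit_bytes
```

Anything that fails `fits_inline` has to go through external storage instead, and that is exactly where the "Use S3 to store the results" decision gets forced on you.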

  • The counterargument to this point is also incredibly weak: It forces you to have clean interfaces to your functions, and to think about where the application state lives, and how it's passed around inside your application.

    That's equivalent to paying attention in software engineering 101. If you can't get those things right on one machine, you're going to be in a world of hurt dealing with something like lambda.

    • I'd say the real advantage is that if you need to change it you don't have to deploy your monolith. Of course, the relative benefit of that is situationally dependent, but I was recently burned by a team that built a new replication handler we needed into their monolith. Every time it had a bug we had to wait for the fix, because the monolith only got deployed once a week. I begged them to put it into a lambda, but every week it was "we'll get it right next week", for months. So it does happen.