Comment by slopinthebag

1 day ago

It has to be satire, right? Like, you aren't out of touch on this. I get engineers maybe making the argument that $300k/year on cloud is the same as 1.5 devops engineers managing in-house solutions, but for just JSON parsing????

For numbers like that, I can never tell whether it's just a vastly larger-scale dataset than any that I've seen as a non-FAANG engineer, OR a hilariously wasteful application of "mAnAgEd cLoUd sErViCeS" to a job that I could do on a $200/month EC2 instance with one sinatra app running per core. This is a made-up comparison of course, not a specific claim. But I've definitely run little $40/month k8s clusters that replaced $800/month paid services and never even hit 60% CPU.

  • Right, this is roughly my mental situation, too. I guess that streaming JSON things can eat up compute way faster than I had any intuition for!

I wonder if you've ever worked on a web service at scale. JSON serialization and deserialization is notoriously expensive.

  • They got a 1000x speed up just by switching languages.

    I highly doubt the issue was serialization latency, unless they were doing something stupid like reserializing the same payload over and over again.

    • Well, for starters, they replaced the RPC call with an in-process function call. But my point is that anybody who's surprised that working with JSON at scale is expensive (because hey, it's just JSON!) shouldn't be; it's a well-known cost.


  • It can be, but $500k/year is absurd. It's like they went from the most inefficient system it's possible to build to a regular, normal system that an average programmer could manage.

    I have no idea if they are doing orders of magnitude more processing, but I crunch through 60GB of JSON data in about 3000 files regularly on my local 20-thread machine using nodejs workers to do deep and sometimes complicated queries and data manipulation. It's not exactly lightning fast, but it's free and it crunches through any task in about 3 or 4 minutes or less.

    The main cost is downloading the compressed files from S3, but if I really wanted to I could process it all in AWS. It also could go much faster on better hardware. If I have a really big task I want done quickly, I can start up dozens or hundreds of EC2 instances to run the task, and it would take practically no time at all... seconds. Still has to be cheaper than what they were doing.

  • Would it be better or worse if I had that experience and still said it's stupid?

    • You didn't say it was stupid. If you had, I would have just ignored the comment. But you expressed a level of surprise that led me to believe you're unfamiliar with how much of a pain in the ass JSON parsing is.
