Comment by lll-o-lll

2 days ago

> If you're doing anything non-trivial (say, 200+ events/workflow) and you need to run only a couple hundred of them concurrently all day, you're going to spend millions on infra, and it's still going to absolutely suck.

Where are the “millions” on infra going? It’s a handful of services and a Postgres?

> Their sales team is also absolutely appalling and desperate.

You said “on-prem”. It’s open source; why are you dealing with their sales team?

> If you're doing anything non-trivial (say, 200+ events/workflow) and you need to run only a couple hundred of them concurrently all day…

If “millions” were required to obtain such tiny scale, I’d agree there’d be a massive problem. No one would use Temporal; it would be a complete waste of resources. If this were true.

We also hit scaling problems with Temporal.

Postgres doesn't scale at all for our workload, so you're into Cassandra.

For a medium-sized deployment, you're looking at 200+ vCPUs, and then let's say standard dev/uat/prod. So now you're at 600 CPUs. Now you need two geographic regions (dev can stay in one place), so now you're at 800. Want a failover cluster for prod? Have another 200 CPUs.

And 200 CPUs is a medium deployment: assume something like 36 CPUs per Cassandra node, then say 4-8 per instance of matching, worker, history, and frontend. Then there are all the other components around it: ingress controller, service mesh, etc.

There's a million a year, easy, for a small deployment.
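A rough tally of the figures above, as a sketch only. The step labels and variable names are made up for illustration, the increments are the running totals exactly as stated, and the hourly figure is just what "a million a year" works out to per vCPU, not a quoted price from any provider.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760

# vCPU increments as stated in the comment (illustrative labels).
steps = [
    ("dev/uat/prod at 200 each", 600),
    ("second geographic region", 200),  # stated running total: 600 -> 800
    ("prod failover cluster", 200),     # "have another 200"
]

total_vcpus = sum(count for _, count in steps)

# All-in spend per vCPU-hour that would add up to one million per year.
annual_budget = 1_000_000
implied_rate = annual_budget / (total_vcpus * HOURS_PER_YEAR)

print(f"total vCPUs: {total_vcpus}")
print(f"implied all-in rate: {implied_rate:.3f} per vCPU-hour")
```

That rate has to cover everything (compute, storage, network, licences, people), so whether it counts as "easy" depends on what gets included.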

Our prod one is 4x this size.

Not a couple hundred in one day: a couple hundred being started, concurrently, every second of the day. Each with ~200 events.

We need a 12-node Cassandra cluster for this, with 64-CPU nodes. So no, it's not a couple of services and a Postgres.
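For a sense of scale, a back-of-the-envelope sketch of that load. It assumes "a couple hundred" means exactly 200 starts per second and "~200 events" means exactly 200; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

starts_per_second = 200    # assumption: "a couple hundred" taken as 200
events_per_workflow = 200  # assumption: "~200 events" taken as 200

events_per_second = starts_per_second * events_per_workflow
workflows_per_day = starts_per_second * SECONDS_PER_DAY
events_per_day = events_per_second * SECONDS_PER_DAY

# Cassandra cluster size as stated: 12 nodes with 64 CPUs each.
cassandra_cpus = 12 * 64

print(f"{events_per_second:,} events/sec")
print(f"{workflows_per_day:,} workflows/day")
print(f"{events_per_day:,} events/day")
print(f"{cassandra_cpus} Cassandra CPUs")
```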

We deal with the sales team because we are an enterprise, and they want to extract money from us.

  • We’re all enterprise.

    If you have 200 WFs/sec each with 200 events, it sounds to me like you have a sizeable amount of work flowing through this system. 17 million workflows per day? Can I call these transactions?

    Do these transactions add value to your business? Do you need durable execution for all these workloads?

    Temporal is just a tool, and like any tool it can be misused. For the classic “book a hotel + airline, handle the partial failures” case, 17 million bookings a day would imply you should be thrilled with Temporal.

    If you are using it to perform WAF in a firewall, you would be less thrilled. The scale you are describing, and the fact that you aren’t super excited about the incredible amount of money pouring in, makes me question whether the use cases fit the tool.

The same with any "open-source" enterprise ($$$) software. It sucks to run yourself. Docs on running/errors are non-existent. Their helm charts are broken. Instead of degraded performance, it just fails.

  • Yeah, they've had so much VC cash pumped in lately they really need to pump the SaaS side of the business.

  • With all due respect – if that’s the attitude, you have no business running anything on-prem. And that’s fine, there’s a reason the various cloud providers are the go-to for many businesses.

    • It's not an attitude, it's an opinion that comes from experience. Operational burden/overhead is a real thing. Just like knowing that German cars will cost $$$ in maintenance. It doesn't mean I shouldn't drive.