
Comment by stephenlf

1 day ago

Can’t wait to see the next iteration of this idea with “Logs are all you need for durable workflows.”

Yep. But we all know that one machine can and will fail (or be patched and restarted), so the log needs to be distributed.

Different workflows should probably go in different buckets or "topics" for clarity. Since it's distributed, the system must guarantee that the log items are stored in the same ordering ("offsets") among the nodes.
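To make the topics-and-offsets idea concrete, here is a rough in-memory sketch. Everything in it (`TopicLog`, `ReplicatedLog`, the replica count) is a name made up for illustration, not any real system's API, and it assumes a single writer pushing to every replica synchronously, which sidesteps the hard parts (consensus, failures, partitions).

```python
from collections import defaultdict


class TopicLog:
    """Append-only log on one node; each topic has its own offset sequence."""

    def __init__(self):
        self._topics = defaultdict(list)

    def append(self, topic, record):
        entries = self._topics[topic]
        entries.append(record)
        return len(entries) - 1  # offset of the record just written

    def read(self, topic, from_offset=0):
        # Returns (offset, record) pairs starting at from_offset.
        return list(enumerate(self._topics[topic]))[from_offset:]


class ReplicatedLog:
    """Hypothetical wrapper: writes every record to all replicas in the same order."""

    def __init__(self, replica_count=3):
        self.replicas = [TopicLog() for _ in range(replica_count)]

    def append(self, topic, record):
        # Assumption: one writer, synchronous replication, no node failures.
        offsets = {replica.append(topic, record) for replica in self.replicas}
        if len(offsets) != 1:
            raise RuntimeError("replicas disagree on offset")
        return offsets.pop()


log = ReplicatedLog()
log.append("orders", "step-1 done")
log.append("orders", "step-2 done")
log.append("billing", "invoice sent")
```

A restarted workflow would call `read("orders", from_offset=...)` on any replica to pick up where it left off, which only works because every replica assigns the same offset to the same record.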

Not a bad way to do things.

In all seriousness, I’d take an “S3 is all you need for durable workflows” and use it in data processing applications that move data from S3 -> S3 with no other dependencies.

Are logs all you need for durable workflows? I'm confused here. How would you persist and query nested or related data over logs? By logs, I assume you mean something like Elasticsearch or Meilisearch?

Pardon my ignorance in following up on what is most likely sarcasm, but is this not Kafka's claim to fame?

I am joining a new project and need to know to what extent Kafka is still a part of the future for new big data projects. It doesn't seem like there are alternatives at the high end but instead the question is when other technologies (that are easier to manage, require less compute, etc.) max out.

Shortly followed by:

"Sockets are all you need for durable workflows" and then finally "Kernel primitives are all you need for durable workflows."

But seriously, part of being a professional is using the right tool for the job.