Comment by halfcat

7 months ago

This is very cool. How does this compare to tools like Celery and Dagster?

For example, we might use Celery when we need to run specific tasks on specific workers (e.g. to access a database that sits inside a corporate network and isn't reachable from the cloud). It seems like we could do something like that with DBOS queues or events, but is there a concept of running multiple workers in different locations?
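To make the "specific tasks on specific workers" idea concrete, here is a rough sketch in plain Python. It uses neither Celery nor DBOS; `RoutedQueues`, the location names, and the task functions are all hypothetical, and a real setup would have separate worker processes pulling from a shared broker or database rather than in-memory deques.

```python
from collections import defaultdict, deque


class RoutedQueues:
    """Hypothetical illustration: one FIFO queue per worker location."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def enqueue(self, location, func, *args):
        # Tag each task with the location whose workers may run it.
        self._queues[location].append((func, args))

    def run_worker(self, location):
        # A worker drains only the queue for its own location.
        results = []
        queue = self._queues[location]
        while queue:
            func, args = queue.popleft()
            results.append(func(*args))
        return results


# Made-up tasks: one needs the corporate network, one does not.
def read_internal_db(table):
    return f"rows from {table}"


def resize_image(name):
    return f"resized {name}"


queues = RoutedQueues()
queues.enqueue("on_prem", read_internal_db, "orders")
queues.enqueue("cloud", resize_image, "cat.png")
queues.enqueue("on_prem", read_internal_db, "customers")

on_prem_results = queues.run_worker("on_prem")
cloud_results = queues.run_worker("cloud")
```

The point is only that the database tasks never reach the cloud worker, which is the property the question is asking about.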

Compared to Dagster, a DBOS workflow seems like a de facto DAG: it runs the steps in the right order and is resumable if a step fails. Is the difference that Dagster's steps are more granular, in the sense that we can re-run a specific step in isolation? In other words, a DBOS workflow will retry failed steps, but can we submit a request to "only re-run step 2"? This comes up often in ETL-style work, where steps 1 and 2 complete but we need to resubmit steps 3 and 4 of a pipeline.
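As a sketch of the "resubmit steps 3 and 4" case, the plain-Python toy below checkpoints each step's output and can re-run from a chosen step. It is not how DBOS or Dagster implement this; `run_pipeline`, `rerun_from`, the `checkpoints` dict (a stand-in for a database table), and the four ETL steps are all assumptions made for illustration.

```python
def run_pipeline(steps, checkpoints, initial, rerun_from=None):
    """Run steps in order, reusing saved outputs where they exist.

    checkpoints maps a 1-based step number to its saved output.
    rerun_from discards saved outputs from that step onward.
    Returns the final value and the step numbers actually executed.
    """
    if rerun_from is not None:
        for number in list(checkpoints):
            if number >= rerun_from:
                del checkpoints[number]

    executed = []
    value = initial
    for number, step in enumerate(steps, start=1):
        if number in checkpoints:
            # Step already completed: reuse its recorded output.
            value = checkpoints[number]
        else:
            value = step(value)
            checkpoints[number] = value
            executed.append(number)
    return value, executed


# Made-up ETL steps.
def extract(_):
    return [1, 2, 3]


def clean(rows):
    return [r for r in rows if r > 1]


def transform(rows):
    return [r * 10 for r in rows]


def load(rows):
    return sum(rows)


steps = [extract, clean, transform, load]
checkpoints = {}

# First run executes every step and records each output.
first_result, first_executed = run_pipeline(steps, checkpoints, None)

# Second run keeps steps 1 and 2 and re-executes only steps 3 and 4.
second_result, second_executed = run_pipeline(
    steps, checkpoints, None, rerun_from=3
)
```

Note this is "re-run from step 3", not "re-run only step 3": downstream checkpoints are dropped as well, since their inputs may have changed.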

Dagster also provides nice visual feedback, where even a non-technical user can see that step 3 failed, and right-click it to materialize it. Maybe I need to play with the OpenTelemetry piece to see how that compares.

Celery and Dagster have their own drawbacks (they're heavier, need more infrastructure, and have a steeper learning curve), so I'm just trying to see where the edges are and how far we could take a tool like DBOS. From an initial look, it seems like it could address a lot of these scenarios with less complexity.


KraftyOne 7 months ago

Compared to Celery, DBOS provides a similar queuing abstraction (docs: https://docs.dbos.dev/python/tutorials/queue-tutorial). DBOS tries to spread queued tasks across all workers (on DBOS Cloud, the workers autoscale with load), but there isn't yet support for running specific tasks on specific workers. Would love to learn more about that use case!

Compared to Dagster (or Prefect or Airflow), exactly like you said, a DBOS workflow is basically a more flexible and lightweight DAG. The visualization piece is something we're actively developing on top of OpenTelemetry, so look for some cool new viz features by the end of the month! I'm interested in the "retry step 2 only" and "retry from step 2" use cases and would love to learn more about them. We don't currently support them, but easily could, because it's all just Postgres tables under the hood. If you're building in this space, please reach out, would love to chat!