Comment by duped

3 days ago

> Generating dependence-free subsets for parallel processing.

I'm not sure how this is defined (*), but graph-cutting approaches to concurrent task scheduling are both pessimistic (poor utilization of available resources) and (iirc) NP-hard, so you pay a big cost upfront.

On the other hand, if you know the indegree/outdegree of each node at the time it is visited (meaning the graph is static), you can run Kahn's algorithm concurrently and put each node into a shared work queue as it becomes ready. You can optimize that further by having per-thread work queues and stealing between them. Depending on what the nodes represent, there are even more optimizations and heuristics. Concurrent task scheduling is a hard problem.
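A rough sketch of the shared-queue variant described above, in Python. It assumes an edge (x, y) means x depends on y. The function name `run_in_dependency_order`, the `work` callback, and the `num_workers` parameter are made up for illustration and do not come from any library. It also assumes `work` does not raise. There are no per-thread queues or work stealing here, just one `queue.Queue` shared by all workers.

```python
import queue
import threading

STOP = object()  # sentinel telling a worker to exit


def run_in_dependency_order(edges, work, num_workers=4):
    """Call work(node) for every node; (x, y) in edges means x depends on y."""
    dependents, indegree = {}, {}
    for x, y in edges:
        dependents.setdefault(y, []).append(x)
        dependents.setdefault(x, [])
        indegree[x] = indegree.get(x, 0) + 1
        indegree.setdefault(y, 0)

    ready = queue.Queue()    # shared work queue of runnable nodes
    lock = threading.Lock()  # guards indegree and the counters below
    state = {"in_flight": 0, "done": 0}

    # Seed the queue with every node that has no unmet dependencies.
    for node, deg in indegree.items():
        if deg == 0:
            ready.put(node)
            state["in_flight"] += 1
    if state["in_flight"] == 0:
        if indegree:
            raise ValueError("cycle detected: no node is ready")
        return

    def worker():
        while True:
            node = ready.get()
            if node is STOP:
                return
            work(node)
            newly_ready = []
            with lock:
                state["done"] += 1
                for dep in dependents[node]:
                    indegree[dep] -= 1
                    if indegree[dep] == 0:
                        newly_ready.append(dep)
                # This node is finished; its newly ready dependents are queued.
                state["in_flight"] += len(newly_ready) - 1
                finished = state["in_flight"] == 0
            for dep in newly_ready:
                ready.put(dep)
            if finished:
                # Nothing is queued or running, so no node can become ready.
                for _ in range(num_workers):
                    ready.put(STOP)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    if state["done"] != len(indegree):
        raise ValueError("cycle detected: some nodes never became ready")
```

In CPython, threads like these mostly help when `work` is I/O-bound; the scheduling logic would be the same with processes or another runtime.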

(*) Imagine the graph:

(a, b) (a, c) (b, d) (c, d)

Is it possible to get nodes b and c in parallel "subsets" in your library?

Yes, it would produce dependence-free subsets. I just ran your sample (assuming (a, b) means a depends on b).

  Topologically sorted sets: [ { d }  { b c }  { a }  ]
  Topologically sorted list: [ d b c a ]
  Nodes: [ a b c d ]
  Dependency tree:
  [ d ]
    d -> [ b c ]
      b -> [ a ]
        a ->
      c -> [ a ]
        a ->

The dependence-free subset finding is probably not exhaustive or optimal. I haven't worked out a formal proof. For now it's opportunistic and best effort.

  • How are the subsets defined?

    • At every round of the algorithm, all nodes with 0 in-degree (i.e. they do not depend on any other node) are collected as a dependence-free subset.

      They serve as the root set for the rest of the graph in the current round. The dependent nodes reached from the root set have their in-degrees decremented. When a node's in-degree reaches 0, it is added to the next root set.

      I'm using double buffering to hold the current root set for processing and to collect the next root set for the following round, instead of a queue as in Kahn's algorithm. At the end of each round I simply swap the two buffers, which is very efficient. When the next root set is empty, all nodes have been processed (assuming the graph has no cycles).
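The rounds described above can be sketched roughly as follows. This is a minimal Python illustration, not the library's actual code. `topo_layers` is a hypothetical name, an edge (x, y) is assumed to mean x depends on y, and the initial root set is sorted only to make the output deterministic.

```python
from collections import defaultdict


def topo_layers(edges):
    """Return dependence-free subsets in order; (x, y) means x depends on y."""
    dependents = defaultdict(list)  # y -> nodes that depend on y
    indegree = {}                   # node -> number of unmet dependencies
    for x, y in edges:
        dependents[y].append(x)
        indegree[x] = indegree.get(x, 0) + 1
        indegree.setdefault(y, 0)

    current = sorted(n for n, d in indegree.items() if d == 0)  # root set
    next_set = []                                               # second buffer
    layers = []
    processed = 0
    while current:
        layers.append(list(current))  # copy, since the buffer is reused
        processed += len(current)
        for node in current:
            for dep in dependents[node]:
                indegree[dep] -= 1
                if indegree[dep] == 0:
                    next_set.append(dep)
        # Swap the two buffers, then empty the one that held the old root set.
        current, next_set = next_set, current
        next_set.clear()

    if processed != len(indegree):
        raise ValueError("cycle detected: some nodes never reached in-degree 0")
    return layers


print(topo_layers([("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]))
# prints [['d'], ['b', 'c'], ['a']]
```

Each inner list can be handed to parallel workers as a batch, since no node in a layer depends on another node in the same layer.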
