Comment by dec0dedab0de

3 days ago

I really don’t like overloading pipes like this. I would rather chain methods, the way the Django ORM does.

You could reassign on every line, but it would look nicer with chained methods.

  pipeline = task(get_data, branch=True)
  pipeline = pipeline | task(step1, workers=20)
  pipeline = pipeline | task(step2, workers=20)
  pipeline = pipeline | task(step3, workers=20, multiprocess=True)

edit:

I would be tempted to do something like this:

  steps = [task(step1, workers=20),
           task(step2, workers=20),
           task(step3, workers=20, multiprocess=True)]
  pipeline = task(get_data, branch=True)

  for step in steps:
      pipeline = pipeline.__or__(step)

According to the docs, | is syntactic sugar for the .pipe method.

  pipeline = task(get_data, branch=True).pipe(
      task(step1, workers=20)).pipe(
      task(step2, workers=20)).pipe(
      task(step3, workers=20, multiprocess=True))

That's probably the chained-method approach for those who prefer it.
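
For anyone wondering how an operator can be sugar for a method in the first place, here is a minimal sketch. It is not Pyper's implementation: the `Stage` class and its `pipe` method are hypothetical stand-ins, and the "stages" are plain functions applied in order.

```python
class Stage:
    """Toy stand-in for a pipeline; NOT Pyper's actual class."""

    def __init__(self, *funcs):
        self.funcs = funcs

    def pipe(self, other):
        # Return a new Stage that runs self's functions, then other's.
        return Stage(*self.funcs, *other.funcs)

    def __or__(self, other):
        # `a | b` simply delegates to `a.pipe(b)`.
        return self.pipe(other)

    def __call__(self, value):
        # Apply each function in order, feeding the result forward.
        for func in self.funcs:
            value = func(value)
        return value


add_one = Stage(lambda x: x + 1)
double = Stage(lambda x: x * 2)

with_operator = add_one | double
with_method = add_one.pipe(double)

print(with_operator(3), with_method(3))  # both compute (3 + 1) * 2 = 8
```

With a design like this, the operator and the method are interchangeable, so the choice between them is purely stylistic.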

This style looks pretty good to me:

    pipeline = task(...)
    pipeline |= task(...)

So does this style:

    steps = [task(...), task(...)]
    pipeline = functools.reduce(operator.or_, steps)
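
As a sanity check that the `reduce` form and the `|=` form build the same thing, here is a rough sketch using a toy class rather than real Pyper tasks. `Step` is a hypothetical stand-in that only records the order in which steps were joined.

```python
import functools
import operator


class Step:
    """Hypothetical stand-in that records the order steps were joined in."""

    def __init__(self, *names):
        self.names = names

    def __or__(self, other):
        return Step(*self.names, *other.names)


steps = [Step("get_data"), Step("step1"), Step("step2")]

# reduce folds left to right: (steps[0] | steps[1]) | steps[2]
folded = functools.reduce(operator.or_, steps)

# Same result with augmented assignment; `|=` falls back to __or__
# here because Step does not define __ior__.
chained = steps[0]
for step in steps[1:]:
    chained |= step

print(folded.names)   # ('get_data', 'step1', 'step2')
print(chained.names)  # ('get_data', 'step1', 'step2')
```

One caveat with the `reduce` form: without an initial value, `functools.reduce` raises `TypeError` on an empty list, so it assumes there is at least one step.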

But it appears you can just change "task" to "Task" and then:

    pipeline = pyper.Pipeline([Task(...), Task(...)])