Comment by _0ffh
1 year ago
The trick is to make sure the recursive dependency stays linear; that's how you enable parallel training.
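
(Editorial note: a minimal sketch of what this likely refers to, in the spirit of linear RNNs / state-space models. The claim is that a linear recurrence h_t = a_t * h_{t-1} + x_t is an associative update, so all hidden states can be computed with a parallel prefix scan instead of a sequential loop. The function names and the use of jax.lax.associative_scan are illustrative assumptions, not something stated in the comment.)

    # Sketch: why a *linear* recurrence admits parallel training.
    # The update h -> a*h + x composes associatively, so a parallel
    # prefix scan recovers every h_t in O(log T) depth.
    import jax
    import jax.numpy as jnp

    def combine(left, right):
        # Compose two linear updates: applying (a_l, x_l) then (a_r, x_r)
        # gives h -> a_r*(a_l*h + x_l) + x_r = (a_r*a_l)*h + (a_r*x_l + x_r).
        a_l, x_l = left
        a_r, x_r = right
        return a_r * a_l, a_r * x_l + x_r

    def linear_recurrence_parallel(a, x):
        # All hidden states at once via an associative scan (h_0 = 0).
        _, h = jax.lax.associative_scan(combine, (a, x))
        return h

    def linear_recurrence_sequential(a, x):
        # Reference implementation: the ordinary step-by-step loop.
        h, out = 0.0, []
        for a_t, x_t in zip(a, x):
            h = a_t * h + x_t
            out.append(h)
        return jnp.stack(out)

    a = jax.random.uniform(jax.random.PRNGKey(0), (8,))
    x = jax.random.normal(jax.random.PRNGKey(1), (8,))
    print(jnp.allclose(linear_recurrence_parallel(a, x),
                       linear_recurrence_sequential(a, x), atol=1e-5))  # True

A nonlinear recurrence (e.g. with a tanh inside the step) has no such associative composition, which is why keeping the dependency linear is what unlocks the parallelism.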