Comment by NitpickLawyer
19 hours ago
A good rule of thumb is that PP (prompt processing) is compute-bound, while TG (token generation) is bound by (V)RAM bandwidth.
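To put rough numbers on that rule of thumb, here is a back-of-the-envelope sketch, not a benchmark. It assumes a dense model at batch size 1, where generating one token reads every weight from memory once and processing one prompt token costs about 2 FLOPs per parameter. The function names and the hardware figures (400 GB/s, 20 TFLOPS, a 7B model at roughly 4 bits per weight) are made up for illustration, and KV cache traffic and attention cost are ignored.

```python
def estimate_tg_tokens_per_s(model_bytes: float, mem_bandwidth_bytes_per_s: float) -> float:
    """Rough ceiling on token generation speed.

    Assumption: each generated token streams all model weights from (V)RAM once,
    so memory bandwidth is the limit.
    """
    return mem_bandwidth_bytes_per_s / model_bytes


def estimate_pp_tokens_per_s(n_params: float, peak_flops: float) -> float:
    """Rough ceiling on prompt processing speed.

    Assumption: a dense model needs about 2 FLOPs per parameter per token,
    and prompt tokens are batched well enough that compute is the limit.
    """
    return peak_flops / (2.0 * n_params)


if __name__ == "__main__":
    # Hypothetical setup: 7B parameters at ~0.5 bytes/param (4-bit quantization),
    # 400 GB/s memory bandwidth, 20 TFLOPS of usable compute.
    n_params = 7e9
    model_bytes = n_params * 0.5
    tg = estimate_tg_tokens_per_s(model_bytes, 400e9)
    pp = estimate_pp_tokens_per_s(n_params, 20e12)
    print(f"TG ceiling: ~{tg:.0f} tok/s (bandwidth bound)")
    print(f"PP ceiling: ~{pp:.0f} tok/s (compute bound)")
```

Real numbers land below both ceilings, but the gap between the two estimates is why faster memory helps TG and more compute helps PP.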