Comment by NitpickLawyer
1 day ago
A good rule of thumb is that PP (Prompt Processing) is compute bound while TG (Token Generation) is (V)RAM speed bound.
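To put rough numbers on that rule of thumb, here is a back-of-the-envelope sketch. It is not a benchmark: it assumes a dense model, about 2 FLOPs per parameter per token, and that every generated token re-reads all the weights once. It ignores KV cache traffic, attention cost, and batching. The function name and the hardware figures are made up for illustration.

```python
def estimate_tokens_per_second(params_billion, bytes_per_param,
                               mem_bandwidth_gbs, compute_tflops):
    """Return rough upper bounds (pp_tps, tg_tps) for a dense model.

    Assumptions (not measurements): ~2 FLOPs per parameter per token,
    and one full read of the weights per generated token.
    """
    n_params = params_billion * 1e9
    model_bytes = n_params * bytes_per_param
    flops_per_token = 2 * n_params

    # PP: prompt tokens are processed together, so the weights are read
    # once for many tokens and the compute ceiling is what limits speed.
    pp_tps = compute_tflops * 1e12 / flops_per_token

    # TG: each new token needs a pass over all the weights, so the
    # memory bandwidth ceiling is what limits speed.
    tg_tps = mem_bandwidth_gbs * 1e9 / model_bytes

    return pp_tps, tg_tps


# Hypothetical setup: 7B parameters at 2 bytes each, 1000 GB/s, 100 TFLOPS.
pp, tg = estimate_tokens_per_second(7, 2, 1000, 100)
print(f"PP ceiling: ~{pp:.0f} tok/s, TG ceiling: ~{tg:.0f} tok/s")
```

Under these assumptions, doubling memory bandwidth doubles the TG ceiling and leaves PP alone, while doubling compute does the opposite.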