
Comment by swyx

2 days ago

Overall I REALLY like this paper and effort, but this part sounds like a bit of bullshit. They don't have the ability to implement retries and backoffs to deal with rate limits?

Because they used wall-clock time, not compute time, FLOPs, or watts, to standardize: 24 hours and 36 hours of compute.

They could build a system that gives them equal compute time by not counting time spent waiting on rate limits and such, but they chose not to.
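
For what it's worth, here is a rough sketch of what such a system could look like: retry with exponential backoff on rate-limit errors, and subtract the time spent sleeping from the budget. This is only an illustration, not how the paper's harness works. `RateLimited`, `BudgetClock`, and `call_with_backoff` are made-up names, and the defaults (5 retries, 1 s base delay, 60 s cap) are arbitrary assumptions.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error a client would raise when it is rate limited."""


class BudgetClock:
    """Tracks elapsed time, excluding time spent sleeping in backoff."""

    def __init__(self, budget_s, clock=time.monotonic, sleep=time.sleep):
        self.budget_s = budget_s
        self._clock = clock
        self._sleep = sleep
        self._start = clock()
        self.waited_s = 0.0  # total time spent waiting on rate limits

    def used_s(self):
        # Wall-clock time since start, minus time spent in backoff sleeps.
        return (self._clock() - self._start) - self.waited_s

    def remaining_s(self):
        return self.budget_s - self.used_s()

    def wait(self, seconds):
        # Sleep, and record how long the sleep actually took.
        before = self._clock()
        self._sleep(seconds)
        self.waited_s += self._clock() - before


def call_with_backoff(fn, budget, max_retries=5, base_s=1.0, cap_s=60.0):
    """Call fn(), retrying on RateLimited with capped exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimited:
            if attempt == max_retries:
                raise
            # Exponential backoff with full jitter; the sleep itself is
            # not charged to the budget.
            delay = min(cap_s, base_s * 2 ** attempt)
            budget.wait(random.uniform(0, delay))
```

Note that in this sketch only the backoff sleeps are excluded. The time a failed request takes before it gets rejected still counts against the budget, so it is an approximation of "equal compute time", not an exact measure.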