Comment by zozbot234

1 month ago

I don't think anyone knows for sure how much mileage/scalability LLMs have left. Given what we do know, I suspect that if you can afford to spend more compute on even longer training runs, you can still get much better results than current SOTA, even for "simple" domains like text/language.

airstrike 1 month ago

I think we're pretty much out of "spend more compute on even longer training runs" at this point.