Comment by djsjajah
15 hours ago
> including all previous experiments
How far back do you go? What about experiments into architecture features that didn’t make the cut? What about pre-transformer attention?
But more generally, why are you so sure that the team that built Gemini didn’t exclusively use TPUs while they were developing it?
I think that one of the reasons that Gemini caught up so quickly is that they have so much compute at a fraction of the price of everyone else.