Comment by pertymcpert
5 months ago
Indeed. I wonder what the architecture for Claude and Grok3 is. If they're still dense models, the MoE excitement over R1 was a tad premature...