Comment by wiz21c
3 months ago
The SWE-Bench result is disappointing not because it is lower than Claude's, but because improving in all the other domains of knowledge didn't help. So does this mean that this is actually a MoE model, in the sense that one expert doesn't talk to the others?