Comment by peab
3 days ago
I think it's more that they push changes quickly without exhaustively testing them. Compare that to Google, which sits on a model for years for fear of hurting its reputation, or OpenAI and Anthropic, which extensively red-team their models.
Why does Grok keep "failing" in the same direction if it's just a testing issue?