← Back to context

Comment by debugnik

20 hours ago

It also modified many of the tests to make them pass in mischievous ways. You can't trust a test suite to catch regressions if the new version doesn't use the same test suite.

Do you have some examples?

  • Ah, I just learnt that you don't. Jarred's comment saying exactly that: https://news.ycombinator.com/item?id=48133806

    • I'll actually concede that, on a slower skim, some changes to the test suite and fixtures that first seemed suspicious to me indeed align with what those tests were doing previously, and I wish I could retract that comment.

      I still think it's not such an impressive test suite as it's being claimed; which, if this actually works out, should say more about Claude's skill than the people driving it.

      3 replies →

I think demonstrating broken behavior in the new build would be interesting if you have a non passing test from the original suite