Comment by debugnik

20 hours ago

It also modified many of the tests to make them pass in mischievous ways. You can't trust a test suite to catch regressions if the new version doesn't use the same test suite.

7 comments

davidatbu 20 hours ago

Do you have some examples?

davidatbu 16 hours ago
Ah, I just learnt that you don't. Jarred's comment says exactly that: https://news.ycombinator.com/item?id=48133806
- debugnik 8 hours ago
  
  I'll actually concede that, on a slower skim, some changes to the test suite and fixtures that first seemed suspicious to me indeed align with what those tests were doing previously, and I wish I could retract that comment.
  I still think the test suite isn't as impressive as is being claimed, which, if this actually works out, should say more about Claude's skill than about the people driving it.
  

torben-friis 19 hours ago

I think demonstrating broken behavior in the new build would be interesting, if you have a non-passing test from the original suite.