Comment by mjaniczek

3 months ago

It's entirely happy paths right now; it would be best to allow the test runner to also test for failures (check expected stderr and return code), then we could write those missing tests.
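
A rough sketch of what such a failure-aware check could look like, written in Python only for illustration. The `Expectation` type and `run_case` helper are hypothetical names, not anything from the actual FAWK test runner, and the substring match on stderr is just one possible choice:

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Expectation:
    """What a test case expects from the process (hypothetical shape)."""
    stdout: str = ""
    stderr_contains: str = ""  # substring match, so exact wording can evolve
    returncode: int = 0


def run_case(cmd, expected):
    """Run cmd and return a list of mismatch messages; an empty list means pass."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    problems = []
    if proc.returncode != expected.returncode:
        problems.append(
            f"return code: expected {expected.returncode}, got {proc.returncode}"
        )
    if expected.stderr_contains not in proc.stderr:
        problems.append(
            f"stderr: expected to contain {expected.stderr_contains!r}, "
            f"got {proc.stderr!r}"
        )
    if proc.stdout != expected.stdout:
        problems.append(f"stdout: expected {expected.stdout!r}, got {proc.stdout!r}")
    return problems


if __name__ == "__main__":
    # Stand-in for an interpreter invocation that fails: it writes an error
    # message to stderr and exits with code 2.
    failing_cmd = [
        sys.executable,
        "-c",
        "import sys; sys.stderr.write('parse error\\n'); sys.exit(2)",
    ]
    print(run_case(failing_cmd, Expectation(stderr_contains="parse error", returncode=2)))
```

With something like this, a failing program becomes a normal test case: the test declares the stderr fragment and exit code it expects, and anything else counts as a mismatch.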

I think you can find a test somewhere in there with commented-out code saying "FAWK can't do this yet, but yadda yadda yadda".

It's funny because I'm evaluating LLMs for just this specific case (covering tests) right now, and it does that a lot.

I say "we need 100% coverage on that critical file". It runs for a while, tries to cover it, fails, then stops and says "Success! We covered 60% of the file (the rest is too hard). I added a comment." 60% was already the coverage before the LLM ran.