Comment by iamflimflam1

1 month ago

Definitely. What's missing from a lot of these discussions is the absolutely essential need for evals.

The only way to “know” what is the best (or better) approach is to have a significant number of test cases that you can measure performance against.

At the moment, for a lot of people, the state of the art is “let’s try a different prompt and see if the answer on my one example is better.”
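
For what it's worth, even a tiny harness beats eyeballing one example. Here is a rough sketch of the idea, not a real eval framework. Everything in it is made up for illustration: the test cases, the exact-match scoring, and the two `variant_*` functions, which are canned stand-ins for "prompt A + model" and "prompt B + model" rather than real model calls.

```python
from typing import Callable

# Hypothetical test cases as (input, expected answer) pairs.
# A real eval set would need far more cases than this.
TEST_CASES = [
    ("2 + 2", "4"),
    ("3 * 3", "9"),
    ("10 - 7", "3"),
    ("8 / 2", "4"),
]


def accuracy(answer_fn: Callable[[str], str], cases: list[tuple[str, str]]) -> float:
    """Fraction of cases where the answer exactly matches the expected string."""
    if not cases:
        raise ValueError("need at least one test case")
    correct = sum(
        1 for question, expected in cases if answer_fn(question).strip() == expected
    )
    return correct / len(cases)


def compare(
    variants: dict[str, Callable[[str], str]], cases: list[tuple[str, str]]
) -> dict[str, float]:
    """Score every variant on the same cases so the numbers are comparable."""
    return {name: accuracy(fn, cases) for name, fn in variants.items()}


# Canned stand-ins for real model calls (assumption: swap in your own).
def variant_a(question: str) -> str:
    return "4"  # always gives the same answer, right on 2 of the 4 cases


def variant_b(question: str) -> str:
    canned = {"2 + 2": "4", "3 * 3": "9", "10 - 7": "3", "8 / 2": "5"}
    return canned.get(question, "")  # right on 3 of the 4 cases


scores = compare({"prompt_a": variant_a, "prompt_b": variant_b}, TEST_CASES)
print(scores)  # {'prompt_a': 0.5, 'prompt_b': 0.75}
```

The point is that both variants are scored on the same fixed set, so "better" becomes a number you can track instead of an impression from a single answer. Exact match is only one possible metric; most real tasks need something more forgiving.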
