Comment by llmmadness
12 days ago
Totally! In some of the cases (we used LLMs to help us generate these) the target word is not clear enough for a human either. So for some of these it turns into more of a guessing game than a flinch measurement.
Agreed, the expectation would be that the flinch measurement becomes stronger. If you are interested in making it better feel free to reach out on the repo!
No comments yet
Contribute on Hacker News ↗