Comment by embedding-shape
5 hours ago
Yeah, I suppose it is, I haven't finished my dissertation on it yet, I'll get right on that :)
Ever since they became available I've tried them every now and then, both with AI-generated trash and my own pre-LLM writing, and had about 0% success in getting them to accurately report what the text actually is. Maybe my writing style and which specific LLM you use matter a lot. I'm sure these platforms' training data comes mostly from the mainstream models, so as soon as you use anything else, they get trivially lost. But again, I don't have any evidence or proof behind this; it's based only on the times I've tried to evaluate them myself in the past.