Comment by enjeyw
10 days ago
I'm building Collie (https://collie.ink/).
It's a tool that helps teachers detect student assignments written by AI. Unlike other solutions out there, it's a full web-based text editor that analyses not just the final assignment, but every keystroke made during the writing process.
My theory is that analysing the final text only is a futile struggle: billions are being pumped into making LLM text look more human, so any assessment based on the final text alone is guesswork at best.
I'm curious what folks think! Especially teachers, devs, and anyone navigating this space...
I can't help but immediately think of counteracting software that asks an LLM for variations of a paragraph, a phrase, or a few synonyms, and then types them the way a human would: with pauses, typos, cursor navigation, and pieces rearranged via copy-paste.
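A minimal sketch of what such mimic software's core loop might look like. Everything here is invented for illustration: the timing model, typo rate, and event format are guesses, not measured human behaviour.

```python
import random

def human_keystrokes(text, wpm=40, typo_rate=0.03, seed=0):
    """Generate (key, delay_seconds) events that type `text` with
    human-like jitter and occasional typo-then-backspace corrections.
    All parameters are illustrative, not calibrated to real typists."""
    rng = random.Random(seed)
    base = 60.0 / (wpm * 5)  # rough average seconds per character
    events = []
    for ch in text:
        # Occasionally hit a "neighbouring" key, then correct it.
        if ch.isalpha() and rng.random() < typo_rate:
            wrong = chr((ord(ch.lower()) - 97 + 1) % 26 + 97)
            events.append((wrong, rng.uniform(0.5, 1.5) * base))
            events.append(("<backspace>", rng.uniform(1.0, 3.0) * base))
        delay = rng.uniform(0.5, 1.5) * base
        if ch in ".!?":
            delay += rng.uniform(0.5, 2.0)  # thinking pause at sentence ends
        events.append((ch, delay))
    return events

def replay(events):
    """Reconstruct the final text from a keystroke event stream."""
    out = []
    for key, _ in events:
        if key == "<backspace>":
            out.pop()
        else:
            out.append(key)
    return "".join(out)
```

The point is how little it takes to produce plausible-looking event streams; a real detector would have to model far richer signals (burst rhythms, revision patterns, navigation) than per-key delays.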
Not that your software is going to be useless. But as long as there is an incentive to cheat, new and better tools that facilitate cheating will crop up. Something else should change.
Yeah, it's a good call-out. I think it's a (more) winnable battle though.
For both a keystroke-based AI detector and software designed to mimic human keystroke patterns, performance will be determined by the size of the dataset of genuine human keystroke patterns each one has. The detector has an inherent leg-up here because it constantly collects more data as the tool is used, whereas the mimic software has no built-in loop for collecting those inputs.
Interesting idea! Could someone use the software to train an LLM prompt that gets around it, by learning what passes and what doesn't and then training the LLM on that?
1 reply →
I got burned by software like this when I pasted in an essay I had transcribed via Whisper while driving, and software like this decided I had pasted AI content lol
1 reply →