Comment by userbinator
6 days ago
This looks like it'll be useful for CAPTCHA purposes.
According to the researchers, “the triggers are not contextual so humans ignore them when instructed to solve the problem”—but AIs do not.
Not all humans, unfortunately: https://en.wikipedia.org/wiki/Age_of_the_captain
In all fairness, most developers are equally impacted by this.
This comes up frequently in a variety of discussions, most notably execution speed and security. Developers will frequently reason about things for which they have no evidence, no expertise, and no prior practice, and come up with invented bullshit that doesn't even remotely apply. This should be expected, because there is no standard qualification to become a software developer, and most developers cannot measure things or follow a discussion containing 3 or more unresolved variables.
I wonder what the role of RLHF is in this. It seems to be one of the more labor-intensive, proprietary, dark-matter aspects of the LLM training process.
Just like some humans may be conditioned by education to assume that all questions posed in school are answerable, RLHF might focus on "happy path" questions where thinking leads to a useful answer that gets rewarded, and the AI might learn to attempt to provide such an answer no matter what.
What is the relationship between the system prompt and the prompting used during RLHF? Does RLHF use many kinds of prompts, so that the system is more adaptable? Or is the system prompt fixed before RLHF begins and then used in all RLHF fine-tuning, so that RLHF has a more limited scope and is potentially more efficient?
It feels like reading news nowadays. Lots of noise, nothing relevant.
I tried the Age of the Captain on Gemini and ChatGPT and both gave smarmy answers of "ahh, this is a classic gotcha". I managed to get ChatGPT to then do some interesting creative inference, but Gemini decided to be boring.
Cool example in that link, thanks!
I don't expect an elementary student to be programming or diagnosing diseases either. Comparing the hot garbage that is GenAI to elementary kids is a new one for me.