Comment by ben_w
7 months ago
> Since the risks are blindingly obvious
Blindingly obvious to thee and me.
Without test results like in the o1 report, we get more real-life failures like this Canadian lawyer: https://www.theguardian.com/world/2024/feb/29/canada-lawyer-...
And these New York lawyers: https://www.reuters.com/legal/new-york-lawyers-sanctioned-us...
And those happened despite the GPT-4 report and the message that appears when you use it, which was some variant — I forget exactly how it was initially phrased and presented — of "this may make stuff up".
I have no doubt there are similar issues with people actually running buggy code — some fully automated version of "rm -rf /". The only reason I'm not seeing headlines about it is that "production database goes offline" or "small company fined for GDPR violation" is not as newsworthy.