Comment by falloutx

18 hours ago

Because it's generated by an AI. All of their posts usually feel like 2 sentences stretched into 20 paragraphs.

At this point, this is mostly a PR stunt as the company prepares for its IPO. It's like saying, "Guys, look, we used these docs to make our models behave well. Now if they don't, it's not our fault."

  • That, and the catastrophic-risk framing is where this really loses me. We're discussing models that supposedly threaten "global catastrophe" or could "kill or disempower the vast majority of humans." Meanwhile, Opus 4.5 can't reliably call a Python CLI after reading its 160 lines of code. It confuses itself on escape characters, writes workaround scripts that subsequent instances also can't execute, and after I explicitly tell it "Use header_read.py on Primary_Export.xlsx in the repo root," it latches onto some random test case buried in documentation it read "just in case" and runs the script on the files mentioned there instead.

    To me, it's as ridiculous as claiming my metaphorical son poses a legitimate risk of committing mass murder when he can't even operate a spray bottle.

    • If they advertised these LLMs as just another tool in your repertoire, like Bash, imagine how that would go.