Comment by shwouchk

21 days ago

I played around with it quite a bit. It is both impressive and scary. Most importantly, it tends to indiscriminately use dependencies from random tiny repos, and often enough not the correct ones, for major projects. Buyer beware.

This is something I've noticed as well with different AIs. They seem to disproportionately trust data read from the web. For example, I asked one to check whether some obvious phishing pages were scams, and multiple times I got just a summary of the content, as if it were authoritative. Several times I've gotten some random Chinese repo with 2 stars presented as if it were the industry standard solution, since that's what it said in the README.

On an unrelated note, it also suggested I use the "Strobe" protocol for encryption and sent me to https://strobe.cool which is ironic considering that page is all about making one hallucinate.

  • >They seem to disproportionately trust data read from the web.

    I doubt LLMs have anything like what we would conceptualize as trust. They have information, which is regurgitated because it is activated as relevant.

    That being said, many humans don't really have a strong concept of information validation as part of day-to-day action and thinking. Developmental theory talks about this in terms of 'formal operational' thinking and 'personal epistemology' - basically, how thinking happens and how knowledge is conceptualized in those models. Learning Sciences research generally talks about Piaget and formal operational thinking before adulthood, and about stages of personal epistemology in higher education.

    Research consistently suggests that about 50% of adults are not able to consistently operate in the formal thinking space. The behavior you are talking about is also typical of 'absolutist' epistemic perspectives, where answers are right or wrong and aren't meaningfully evaluated - just identified as relevant or not. In that view, information is credible because it comes from a trusted source - most often an authority figure - and evaluating it is not the role of the person who knows it.

  • > ... sent me to...

    Oh wow, that was great. If I then look at my own body parts (like my palm) that I know are not moving, it's particularly disturbing. That's a really well-done effect; I've seen something similar but nothing quite like that.

  • >On an unrelated note, it also suggested I use the "Strobe" protocol for encryption and sent me to https://strobe.cool which is ironic considering that page is all about making one hallucinate.

    That's not hallucination. That's just an optical illusion.

Thanks for flagging this! That isn't a behavior I've seen before in testing, and I'd love to dig into it more to see what's happening.

Would you be able to drop me an email? My address is my HN login @github.com.

(I work on the product team for Copilot coding agent.)

Given that PRs run actions in a more trusted context for private repos, this is a bit concerning.

  • As we've built Copilot coding agent, we've put a lot of thought and work into our security story.

    One of the things we've done here is to treat Copilot's commits like commits from a first-time contributor to an open source project.

    When Copilot pushes changes, your GitHub Actions workflows won't run by default, and you'll have to click the "Approve and run workflows" button in the merge box.

    That gives you the chance to review Copilot's code before it runs in Actions and has access to your secrets.

    (Source: I'm on the product team for Copilot coding agent.)
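
To make the gate described above concrete, here is a toy model of it in Python. This is only a sketch of the behavior as stated in the comment (untrusted pushes do not trigger workflows until a human approves them). The `Push` class, the `UNTRUSTED_AUTHORS` set, and both function names are hypothetical and are not part of any GitHub API.

```python
from dataclasses import dataclass


@dataclass
class Push:
    """A push to a pull request branch (hypothetical model, not a GitHub API type)."""
    author: str
    approved_by_maintainer: bool = False


# Assumption: authors treated like first-time contributors.
UNTRUSTED_AUTHORS = {"copilot"}


def workflows_may_run(push: Push) -> bool:
    """Return True if workflows (and so repository secrets) may be exposed to this push."""
    if push.author.lower() in UNTRUSTED_AUTHORS:
        # Untrusted pushes wait for an explicit human approval.
        return push.approved_by_maintainer
    return True


def approve(push: Push) -> Push:
    """Model a maintainer approving the run after reviewing the diff."""
    return Push(author=push.author, approved_by_maintainer=True)
```

The point of the design is ordering: the human review happens before any workflow sees secrets, so the default for an unreviewed push is "do not run".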

So like the typical junior developer, then.

  • No, lol. Even the enthusiastic junior developer would go around pestering people asking if the dependency is OK.

  • No, not at all. Why do people keep saying shit like this, these thought-terminating sentences? Try to see the glass of Kool Aid please. People are trying to understand how to communicate important, valuable things about failure states and you're advocating ignorance.

    • Because the marketing started with "This is literally the singularity and will take over everything and everyone's jobs".

      Then people realized that was BS, so the marketing moved on to "This will enhance everyone's jobs, as a companion that will help everyone".

      People also realized that was pure BS. A few more marketing rebrands later and we're at the current situation, where they try to equate it to the lowest possible rung of employee they can think of, because surely Junior == Incompetent Idiot You Can't Trust Not To Waste Your Time†. The funny part is that these systems have been objectively and undeniably only getting better since the early days of the hype bubble, yet the push now is that they're "basically junior level!". Speaks volumes IMO, how those goalposts keep getting moved whenever people actually use these systems for real work.

      ---

      † IMO my time with every single Junior I've ever worked with has been among the best moments of my career. It allowed space for me to grow my own knowledge, while I got to talk to and help someone extremely passionate, if a bit overeager. This stance on Juniors is, frankly, baffling to me because it's so far from my experience of how they tend to work. Oftentimes they're a million times better than those "10x rockstars" you hear about all the time.
