Comment by iammjm

3 days ago

I believe the issue of proving who is and who isn't really human on the Internet will be a really important issue in the coming years, especially without sacrificing people's right to privacy and anonymity in the process.

I hope I'm wrong but I don't think a privacy friendly alternative is going to exist. It's going to go the way of show me your drivers license to use my site.

  • Why wouldn't criminals like they do now just use stolen identities? If someone verifies they are a person that doesn't mean they're not leaving their PC on with some AI that uses their credentials either.

    • The point of these systems is not to ban any possibility of fake accounts. The point is to add friction so that creating accounts is harder than banning them, so criminals can’t recreate them at scale. Otherwise bans take seconds to overcome and a single person can run 10000 automated identities.

  • Invite trees approximately solve this problem. I don't need to know who you are to know that someone in good standing in the community invited you.

    • And that if you misbehave you get booted out and whoever invited you gets dinged. If they get dinged enough they become a leaf rather than a branch.

  • No credential will be sufficient, this is basically an unsolvable enforcement problem. That doesn't obviate the utility of rules and norms, but there's no airtight system which will hold back AI generated content.

    • Verifiable credentials have been an idea for a long time now. It wouldn't be that hard to solve. Sign everything you post with a verifiable credential. Implement support on all social media sites. The question is whether the forum implementers, governing bodies, and social media site owners want to try to build a solution like this or not.

      3 replies →

  • I feel like we need a distributed system/protocol that allows people to have pseudonyms not linked to their real identity, but with a shared reputation/trust score, so if you’re a bad actor using a pseudonym your real identity and all your other sock puppets are penalized too.

    I know very little about this but sense that some combination of buzzwords like homomorphic encryption, zk-snarks, and yes, blockchains could be useful.

    Of course this would present problems if any of your identities were ever compromised and your reputation destroyed.

    • Driving everything by reputation-weighted identities just creates echo-chambers you then cannot escape.

      The most useful time for the blowhard spout off at me is at the moment it makes me most uncomfortable. Because the blowhard probably has a valid point at some level, he’s just being an ass about it.

      When we meet that moment with discipline, are able to identify and respond to the kernels of truth and ignore the chaff belted out, focus on the merits of the argument irrespective of the source of an adversarial viewpoint, we thrive.

      I like the blowhards just the way they are, unruly and insolent.

      1 reply →

  • Problem is, if a token is anonymous, then it follows that it can be bought and sold. Which breaks the original use case of the token, right?

  • That is exactly what will happen. The sad thing is, it needs to happen. I've found myself advocating for this lately, when 10 years ago, I wouldn't have even considered taking that position.

    If Web3-like session-signing had taken off enough to become OS or even browser-native, we would have had a fighting chance of remaining mostly anonymous. But that just didn't happen, and isn't going to happen. Mostly because fraud ruined Web3.

I think we're going to have to make some choices.

A completely anonymous stranger has no way to prove that they're human that can't be imitated by an AI. We've even seen that, in some cases, AIs can look more human to humans than real humans do.

The only solution I can think of to that problem is some sort of provenance system. Even before AI, if some random person told me a thing, I'd ignore them; If my most trusted friend told me something, I'd believe them.

We're going to need a digital equivalent. If I see a post/article/comment I need my tech to automatically check the author and rank it based on their position in my trust network. I don't necessarily need to know their identity, but I do need to know their identity relative to me.

  • Reputation tracking is the key. The most simple option is open-invite invite-only spaces: Any user can invite more users, but only users with an invite can participate. Most Discord servers work like this, secret societies like the Oddfellows do, as does the other site.

    If you keep track of the invite tree, you can "prune" it as needed to reduce moderation load: low quality users don't tend to be the source of high-quality users, and in the cases where they are, those high quality users tend find other people willing to vouch for them faster than their inviter catches a ban.

    • The open-invite system works well in many cases. It works particularly well in-person but even there you can get drift over time. Our fraternity unanimously agreed on every single initiate who joined; the cohort today is still very different from the one 20 years ago.

      In online systems the scales quickly get too big for open-invite. There needs to be a way to automatically update the trust network at a fine grain.

      The one that jumps to mind is an inference system; when I +/- a comment, I'm really noting that I trust or distrust the author. It can be general or on a specific topic (eg I trust the author to tell the truth or I trust the author to make me laugh). I could also infer that other people with similar trust patterns are likely trustworthy. And I could likely infer that people who are trusted by people I trust are trustworthy.

    • >secret societies like the Oddfellows do

      yes and they're all full of suckers. In the best case which is already bad you get a pretentious online night club like Clubhouse, in the worst case you get Epstein's island.

      These walled off societies always attract people who are drawn to exclusivity, are run like dystopian island communities or high school cliques and tend to, in a William Gibon 'anti-marketing', way be paradoxically even more vapid.

      No you need actual open access and reputation systems. A good blueprint is something like well functioning academic communities. It's a combination of eliminating commercial motives, strict rules, high importance on reputation and correctness, peer review, and arguably also real identities and faces.

I don't think the real issue is LLM posts. The issue with low quality on the Internet has always been quantity. The problem always has been humans who post too much, humans that use software to post too much, and now it's humans who use LLMs to post too much.

The problem with a medium that is completely free and unrestricted is that whomever posts the most sort of wins. I could post this opinion 30-40 times in this thread, using bots and alternative accounts, and completely move the discussion to be only this.

Someone using an LLM is craft a reply is not a problem on it's own. Using it craft a low-effort reply in 3 seconds just to get out is the problem.

  • > Someone using an LLM is craft a reply is not a problem on it's own.

    No, someone using an LLM to craft a reply is a problem in its own. I want to hear what a human has to say, not a human filtered through a computer program. No grammar editing, nothing. Give me your actual writing or I'm not interested.

    • Do you though? Like what real difference does it make to you? Can you even tell if this has been passed through an LLM or not? If you can't tell, why does it matter?

      I don't want to be robo-slopped at en masse or be fed complete fabrications but neither of those actually require an LLM. If you're going to use an LLM to gather your thoughts, I don't see a problem with that.

      8 replies →

  • If you had the LLM write the comment, then it wasn't your thoughts.

    I sometimes wonder if people aren't forgetting why we're on this platform.

    The goal is to have an interesting discourse and maybe grow as a human by broadening your horizon. The likelihood of that happening with llms talking for you is basically nil, hence... Why even go through the motion at that point? It's not like you get anything for upvotes on HN

    • > If you had the LLM write the comment, then it wasn't your thoughts.

      But what if I provided the LLM my thoughts? That's actually how I use LLMs in my life -- I provide it with my thoughts and it generates things from those thoughts.

      Now if I'm just giving it your comment and asking it to reply, then yes, those aren't my thoughts. Why would I do that? I think the answer goes back to my original point.

      If I'm telling you my thoughts and then you go and tell a friend those thoughts, would you say those are still my thoughts even though I wasn't the one expressing them directly to your friend?

      4 replies →

  • Amusingly your comment carries some of the tropes of AI authorship ("is not a problem on it's own....is the problem") but it's not shaped like a profound insight is being discovered in every line is what makes it human.

    How much of AI writing will pass under the radar when the big companies aren't all maximizing to generate the most engagement hacking content in a chatbot UI? Maybe it'll still stand out for being low quality, but I'm not sure. There's lots of low quality human authored content.

    Not sure where my comment is going, I just kinda rambled.

    • > Amusingly your comment carries some of the tropes of AI authorship

      It was trained on 30 years of my posts on the Internet, I'm sure some part of it sounds just like me.

I'm going to guess we'll eventually settle onto a psuedo-anonymous cert system like HTTPS where some companies are entrusted with verification and if that company says "That's definitely a human" it'll fly - not a great solution, of course, but I really can't see a non-chain-of-custody/trust based approach to the problem and those might only slightly compromise anonymity in optimal scenarios but some compromise is inevitable.

Will it be? Or is the solution to move to smaller, trusted networks where there's less need for proof. Unfortunately I think the age of large scale open discussion forums like HN is coming to an end.

  • I think this is the most likely and best path. There's no stopping the flood of bots, the dead internet theory is beyond just a theory at this point.

    Best we can do, for the internet and ourselves, is to move away from it and into smaller networks that can be more effectively moderated, and where there is still a level of "human verification" before someone gets invited to participate.

    I don't like what that will do to being able to find information publicly, though. The big advantage of internet forums (that have all but disappeared into private discords) is search ability/discoverability. Ran into a problem, or have a question about some super niche project or hobby? Good chance someone else on the net also has it and made a post about it somewhere, and the post & answers are public.

    Moving more and more into private communities removes that, and that is a great loss IMO.

    • > Moving more and more into private communities removes that, and that is a great loss IMO

      It is a great loss. Unfortunately this is a result of unchecked greed and an attitude of technological progress at any cost. Frankly we enabled this abuse by naively trying to maintain a free and open internet for people. Maybe we should have been much more aggressively closed off from the start, and not used the internet to share so freely.

  • The utility of those larger sites is coming to an end, but most people aren't discerning or ambitious enough to leave and seek out the smaller places you mentioned. Places like this will remain but will join Facebook, Reddit, and Twitter as shadows of their prior useful selves. The smaller, better sites won't have to worry about attracting the masses and therefore worsening, because the masses have finally settled.

> I believe the issue of proving who is and who isn't really human on the Internet will be a really important issue in the coming years

On a site like HN it's kinda easy to vet for at least those that already had thousands of karma before ChatGPT had its breakthrough moment a few years ago.

Now an AI could be asked to "Use my HN account and only write in my style" and probably fool people but I take it old-timers (HN account wise) wouldn't, for the most part, bother doing something that low. Especially not if the community says it's against the guidelines.

If it becomes one, then that will be the end of sites like Hacker News.

This site, at its core, is fundamentally too low-bandwidth, too text-only, and too hands-off-moderated to be able to shoulder the burden of distinguishing real human-sourced dialog from text generated by machines that are optimized to generate dialog that looks human-sourced. Expect the consequence to be that the experience you are having right now will drastically shift.

My personal guess: sites like this will slop up and human beings will ship out, going to sites where they have some mechanism for trust establishment, even if that mechanism is as simple and lo-fi as "The only people who can connect to this site are ones the admin, who is Steve and we all know Steve, personally set up an account for." This has, of course, sacrificed anonymity. But I fundamentally don't see an attestation-of-humanity model that doesn't sacrifice anonymity at some layer; the whole point of anonymity on the Internet was that nobody knew you were a dog (or, in this case, a lobster), and if we now care deeply about a commenter's nephropid (or canid) qualities, we'll probably have to sacrifice that feature.

I'd rather keep the feature, pesonally.

Sam Altman would love to sell you a solution to the fire that he dumped gasoline on.

https://en.wikipedia.org/wiki/World_(blockchain)

  • This issue (human attestation) is the product of these AI companies. They are poisoning the well, only to sell the cure. This may not have initially been the plan of many of these companies, but it is the eventual end goal of all of them. Very similar to war profiteers, selling both the problem and the solution simultaneously has yet to be illegalized, but has long been masterfully capitalized, and will be vigourously because nobody will stop it.

    Years ago (around 2020, when GPT-2 and 3 became publicly available) I noticed and was incredibly critical of how prevalent LLM-generated content was on reddit. I was permanently banned for "abusing reports" for reporting AI-generated comments as spam. Before that, I had posted about how I believed that the the fight against bots was over because the uncanny valley of text generation had been crossed; prior to the public availability of LLMs, most spam/bot comments were either shotgunned scripts that are easily blockable by the most rudimentary of spam filters, generated gibberish created by markov chains, or simply old scraped comments being reposted. The landscape of bot operation at the time largely relied on gaming human interaction, which required carefuly gaming temporal-relevance of text content, coherence of text content (in relation to comment chains), and the most basic attempt at appearing to be organic.

    After LLMs became publicly available, text content that was temporally, contextually, and coherently relevant could be generated instantly for free. This removed practically every non-platform-imposed friction for a bot to be successful on reddit (and to generalize, anywhere that people interact). Now the onus of determining what is and isn't organic interaction is squarely on the platform, which is a difficult problem because now bot operators have had much of their work freed up, and can solely focus on gaming platform heuristics instead of also having to game human perception.

    This is where AI companies come in to monetize the disaster they have created; by offering fingerprinting services for content they generate, detection services for content made by themselves and others, and estimations of human authenticity for content of any form. All while they continue to sell their services that contradict these objectives, and after having stolen literally everything that has ever been on the internet to accomplish this.

    These people are evil. Not these companies - they are legal constructions that don't think or feel or act. These people are evil.

  • It's not clear to me how this is verifiable without constant hardware supervision. Even that'll get cracked, just like DVD encryption back in the day.

    You almost need dedicated hardware that can't run any other software except a mechanical keyboard and make it communicate over an analog medium - something terribly expensive and inconvenient for AI farms to duplicate.

    • I started promoting the idea of hardware verification about 6 years ago. Didn't get any traction and I doubt I ever will.

      I think Apple is the only company that would even be able to do that. You have to control the full stack to the pixels or speaker.

    • One physical robot with four wheels, a camera, and a 101 up/down "fingers" to match the keyboard can roll between physical machines and type on mechanical hardware keyboards. This brings the ceiling of how many accounts you can control down to the number of computers you have, but that's not a high price to pay.

Maybe it will push people to seek out more in-person interactions, which would be a good thing.

you could sell physical items at any store where you have to show your ID and you get one for the age group you are.

that kills two birds with one stone, you can then show everywhere online you are human and how old you are without the services needing any personal information about you, and the sellers don't know what you use that id tag for.

  • People who are posting AI comments or setting up AI bots are... people. They can show their ID. If a website owner doesn't have a way to ban that specific human and the bad guy can always get another voucher, it's sort of meaningless.

    In fact, even if you can ban the human for life, I'm not sure it solves anything. There are billions of people out there and there's money to be made by monetizing attention. AI-generated content is a way to do that, so there's plenty of takers who don't mind the risk of getting booted from some platform once in a blue moon if it makes them $5k/month without requiring any effort or skill.

  • Perhaps not only just show your id to get your "Over age X verification object", but your ID also gets irreversibly altered (like a punch card) that makes it one-time-use only.

    That might make it less likely someone would ever sell it because to get a new one might take a very long "cool-down" time and it'd severely hamper the seller.

  • what’s to keep people from selling or giving away those id tags? seems like a nefarious entity could buy them in bulk

I like Mitchell's Vouch idea. At the end of the day, it's all about trust. Anything else is an abstraction attempting to replicate some spectrum of trust.

https://github.com/mitchellh/vouch

  • I think we’ll see a return to smaller groups and implementing a lot of systems the way we do it IRL. I think you could definitely do a more fine-grained system that progressively adds less score to contacts the further away they are. In combination with some type of accumulating reputation system, you’d have both a force to keep out unknown IDs, but also a reason for one to stick to their current ID even though it’s anonymous.

    Adding this type of rep system would destroy a lot of what is so cool about the internet though. There’d probably be segregation based on rep if it’s very visible, new IDs drowning in a sea of noise. Being anonymous but with a record isn’t the same as posting for the very first time as a completely blank identity and still being given an audience. Making online comms more like real life would alleviate some problems but would also lose part of the reason they’re used in the first place. I don’t see much any other way to do it besides maybe a state-provided anonymous identity provider (though that’s risky for a number of reasons), but it’s going to be sad to see things go.

> especially without sacrificing people's right to privacy and anonymity in the process

I'm afraid the ship has sailed on this one. What other solutions have you heard of apart from the dystopian eyeball-scanning, ID-uploading, biometrics-profiling obvious ones?

(knowing that of course, neither of those actually solve the problem)