Comment by dang

1 day ago

The LLM posts that I'm looking at are definitely coming from normal-user IP addresses. There are exceptions, but the rate of those doesn't seem higher than usual. Outright bot/agent posting, as far as we can tell (and we may be wrong!), seems to have down-ticked since the flurry earlier this year (openclaw and so on).

My gut feeling is that this issue isn't much affected by voting rings, which is too bad, because we have a lot of experience with those. If all that was needed here was another round of work on the ring detector, I would be less worried.

It's a moving and blurry picture, but judging by the users that tomhow and I interact with—which is a lot of users! though still only a small sample—the overwhelming majority of these posts are coming from real people with good intentions, who have no idea of the mismatch between what they're posting and the culture of the community.

> coming from normal-user IP addresses

This is the standard now for astroturfing online. Build up a profile over time with varied interactions, sometimes over years, and then sell it for a few hundred dollars via BlackHatWorld. I've not seen HN listed, but Reddit definitely follows this pattern.

If you think the IPs are normal, you can check whether people are proxying by looking at the IP their DNS lookups come from (they may not have proxied UDP), SIMD score (server CPUs cluster differently from consumer ones), residential proxy lists (there are a bunch of these), invalid WebGPU setups, etc. Maybe this kind of detection is against the HN way of doing things, but I've definitely seen reCAPTCHA on the login before and it employs a bunch of these checks. Happy to help!
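To make the residential proxy list check concrete, here is a minimal sketch in Python. It assumes you already have a list of proxy CIDR ranges from somewhere. `PROXY_RANGES` and `is_listed_proxy` are hypothetical names, and the ranges shown are reserved documentation blocks, not real proxy data.

```python
import ipaddress

# Hypothetical proxy ranges for illustration only. These are the RFC 5737
# documentation blocks; a real list would come from a proxy-list provider.
PROXY_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24")
]


def is_listed_proxy(ip: str) -> bool:
    """Return True if the address falls inside any listed range."""
    addr = ipaddress.ip_address(ip)
    # An IPv6 address is never "in" an IPv4 network, so mixed lists are safe.
    return any(addr in net for net in PROXY_RANGES)
```

A hit would only be one signal among several, since residential proxy lists go stale quickly and the same addresses get reassigned to ordinary users.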

  • Adding to this: some of those proxied connections will be HTTP/1.1 and not HTTP/2.0 like normal clients. Sometimes the MSS of their TCP SYN packets will be just a little lower than 1460. Some of them are also missing the sec-fetch-mode client header. Blocking HTTP/1.1 on the non-API port/URL should slow down some of the nonsense; leave the API alone, since many API clients still use HTTP/1.1.

    In nginx, as an example, inside the location block for the non-API URL:

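        # Reject anything that is not HTTP/2. Note this also rejects HTTP/3, where $server_protocol is HTTP/3.0.
        if ($server_protocol != HTTP/2.0) { return 403 'Browser Error.'; }

        # Reject requests whose Sec-Fetch-Mode header is missing or does not match one of the listed values.
        if ($http_sec_fetch_mode !~ (cors|no-cors|navigate) ) { return 403 'Error: Flux Capacitor Under-Current.'; }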
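
If you would rather log these signals than block outright, the three hints above can be collected per request. This is a rough sketch in Python, not anything HN runs. `RequestSignals` and `proxy_hints` are hypothetical names, and it assumes the SYN MSS has already been captured somewhere upstream (nginx does not expose it by default). Treating any MSS below 1460 as a hint is also an assumption: plenty of legitimate connections (PPPoE, IPv6, VPNs) advertise less.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RequestSignals:
    http_version: str               # e.g. "HTTP/1.1" or "HTTP/2.0"
    syn_mss: Optional[int]          # MSS from the TCP SYN, if it was captured
    sec_fetch_mode: Optional[str]   # value of the Sec-Fetch-Mode header, if sent


def proxy_hints(sig: RequestSignals) -> List[str]:
    """Return the names of the weak proxy hints present on one request."""
    hints = []
    if sig.http_version in ("HTTP/1.0", "HTTP/1.1"):
        hints.append("http1")
    # Assumed threshold: 1460 is the usual MSS for IPv4 over plain Ethernet.
    if sig.syn_mss is not None and sig.syn_mss < 1460:
        hints.append("low_mss")
    if not sig.sec_fetch_mode:
        hints.append("no_sec_fetch_mode")
    return hints
```

None of these is conclusive alone, so counting how many fire on the same account over time is probably more useful than acting on a single request.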