Ask HN: Did HN just start using Google recaptcha for logins?

3 years ago

In ten years I've never been asked to solve a captcha to log in. Is this new? What happened?

I know that my input into conversations is not some critical feature of HN or anything, but this is enough of a barrier to keep me from bothering to log in on most occasions.

Seems odd to enable google to track users logging into HN, but maybe it's always been this way and for some reason recaptcha is just flagging accounts from my network today.

No recent changes, but we do sometimes turn captchas on for logins when HN is under some kind of (possible) attack or other. That's been happening for a few hours. Hopefully it goes away soon.

Btw I also fume when I have to work as an unpaid manual image recognizer, so I'm open to alternatives.

  • Aside from 3rd party code, perhaps one middle-of-the-road idea would be a table of a few hundred factoids and then code that makes multiple-choice checkbox puzzles from them, like

    - Select everything that is a color (I'm sure there are more clever open-ended questions, and maybe sometimes switch up "is" with "is not"):

    - Red

    - Blue

    - Monkey

    - Violet

    - Armchair

    People say that bots can learn such things but if every site had their own in-house tool then bots would have to keep track of thousands of site specific puzzles. Each site could even rotate through a dozen sets of different puzzle types and pause the ones that get learned. This would avoid sending cookies to a third party or depending on 3rd party code thus mitigating some corporate capture.

    Bonus complexity: Don't use Alpha-Numeric characters. Use something like "figlet" [1] and cycle through a few of its ASCII art fonts.

    [1] - https://github.com/xero/figlet-fonts
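
    A minimal sketch of the factoid-table idea above, not a vetted anti-bot measure. The table contents, the names `FACTOIDS`, `make_puzzle`, and `check`, and the five-option layout are all assumptions made for illustration; a real deployment would keep the table server-side and draw from a much larger pool.

```python
import random

# Hypothetical factoid table: category -> members. A real site would keep
# a few hundred entries and rotate or retire the ones bots have learned.
FACTOIDS = {
    "color": ["Red", "Blue", "Violet", "Green"],
    "mammal": ["Monkey", "Otter", "Badger"],
    "piece of furniture": ["Armchair", "Bookshelf", "Stool"],
}


def make_puzzle(rng, n_options=5):
    """Build one checkbox puzzle; returns (prompt, options, answer_set)."""
    category = rng.choice(sorted(FACTOIDS))
    negate = rng.random() < 0.5  # sometimes swap "is" for "is not"
    members = FACTOIDS[category]
    others = [w for c, ws in FACTOIDS.items() if c != category for w in ws]
    # Always include at least one member and at least one non-member,
    # so the correct selection is never empty.
    n_members = rng.randint(1, min(len(members), n_options - 1))
    options = rng.sample(members, n_members)
    options += rng.sample(others, n_options - n_members)
    rng.shuffle(options)
    verb = "is not" if negate else "is"
    prompt = f"Select everything that {verb} a {category}"
    answer = {o for o in options if (o in members) != negate}
    return prompt, options, answer


def check(answer, selected):
    """A submission passes only if it ticks exactly the right boxes."""
    return set(selected) == answer


if __name__ == "__main__":
    # random.Random is fine for a demo; a server would likely want
    # random.SystemRandom() so puzzles are not predictable.
    prompt, options, answer = make_puzzle(random.Random())
    print(prompt)
    for option in options:
        print("  [ ]", option)
```

    The server would keep `answer` in the session and compare it with the submitted checkboxes via `check`.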

  • Hi dang, I'm not sure you're still going to read this message since it's been many hours.

    First, I'm sorry to hear HN was under attack. That's never fun.

    Second, I understand your reasons for temporarily turning on the CAPTCHA, even though as a user I really dislike it - especially reCAPTCHA.

    Given the latter, I hope you will consider alternatives. Regardless though, it would be nice to add a message to the login page explaining that the CAPTCHA is temporary because the website is under attack. That would allow me to keep 3rd-party stuff blocked by uBO on the login page and still know what's going on. I would probably just keep the pages I'm interested in on a tab and come back to them later, when the CAPTCHA is gone.

    In any case, as always, thanks for your work keeping this forum alive and healthy.

  • One concern with Google Recaptcha on HN is that it seems a good number of HN users want to be pseudonymous, possibly including towards Google. Always-perfect browser OPSEC is hard in practice.

    (Condolences on the attack/headache.)

  • It looks like the attack is login based since that's where your captcha is. Allow a single captcha-free attempt to login successfully from a /24. If the login fails then put the /24 on captcha for X hours. That way most login attempts that are legit won't see the captcha. Also, HN crowd I think prefers hcaptcha.

    Lastly, what I would do is have users pick a login image: in addition to the password, they have to pick the correct image. So it would still be the process I suggested, except a failed login is allowed one time so long as the correct login image is selected. Also, the login images would be slow to load during times of attack, on purpose, to identify clients that are guessing before the image is served and to slow down their attack. I would also maintain a list of IP+UA pairs that have repeatedly logged in successfully, to exempt or prioritize them depending on the attack.
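
    The /24 rule from the first paragraph of this comment could be sketched roughly as follows. This is an illustration under assumptions, not how HN works: the class name `CaptchaPolicy`, the in-memory dict, the 6-hour value for "X hours", and the IPv4-only handling are all made up here, and the login-image part is not covered.

```python
import ipaddress
import time

CAPTCHA_HOURS = 6  # stands in for the "X hours" above; the value is an assumption


class CaptchaPolicy:
    """Tracks which IPv4 /24 networks must solve a captcha before logging in."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._flagged_until = {}  # /24 network -> unix time when the flag expires

    @staticmethod
    def _network(ip):
        # strict=False masks the host bits: 203.0.113.7 -> 203.0.113.0/24
        return ipaddress.ip_network(f"{ip}/24", strict=False)

    def needs_captcha(self, ip):
        until = self._flagged_until.get(self._network(ip))
        return until is not None and self._clock() < until

    def record_login(self, ip, success):
        # One failed login puts the whole /24 on captcha for CAPTCHA_HOURS.
        if not success:
            expires = self._clock() + CAPTCHA_HOURS * 3600
            self._flagged_until[self._network(ip)] = expires


if __name__ == "__main__":
    policy = CaptchaPolicy()
    print(policy.needs_captcha("203.0.113.7"))    # first attempt is captcha-free
    policy.record_login("203.0.113.7", success=False)
    print(policy.needs_captcha("203.0.113.200"))  # same /24 is now challenged
```

    Passing the clock in as a parameter is only there so the expiry can be tested without waiting; a real service would also need shared storage and some cleanup of expired entries.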

  • > we do sometimes turn captchas on for logins when HN is under some kind of (possible) attack

    I don't think people are disputing the necessity, just the mechanism used.

    The other services (hCaptcha) are effectively drop-in replacements with minimal code changes.

  • I have an idea for a button that will slow down bots while being less inconvenient for humans.

    I'll send details in an email.

  • I vote "What's the output of the following Arc snippet?"

    Be sure to include a few macros, otherwise the JS crowd will still be able to reverse engineer their way in.

  • How about if Dang assesses our humanity? That way we don't have to do image recognition stuff and neither does Dang! A win-win if I say so!

  • If possible, implement WebAuthn even if only for human verification.

    Bots will not have access to TouchID, Windows Hello, or a Yubikey but most humans have one of those in the device in front of them right now.

    Fallback to captcha for edge cases, but then at least /most/ people can skip it.

    Example: https://cloudflarechallenge.com/

    • Those can all easily be emulated in software, if you're determined enough.

      There's nothing about the WebAuthn protocol that forces hardware backed key storage, other than everyone collectively agreeing it's a good idea. A bot author would just ignore that.

      Firefox already includes this functionality, gated by flag (security.webauth.webauthn_enable_softtoken).

  • How about genuinely-long delays between login attempts? 5 seconds slows down a bot, 15-30 seconds could make many login attacks unrealistic.

    Also: OTP 2FA?
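
    A rough sketch of the delay idea, assuming attempts are keyed by the account name being tried rather than by client, since clients are hard to identify reliably. The class name `LoginThrottle`, the in-memory dict, and the 15-second value are illustrative assumptions, not a description of any real site.

```python
import time

MIN_DELAY_SECONDS = 15  # lower end of the 15-30 second range suggested above


class LoginThrottle:
    """Enforces a minimum gap between login attempts for the same key."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._last_attempt = {}  # key (e.g. username) -> time of last allowed attempt

    def try_attempt(self, key):
        """Return 0.0 if the attempt may proceed now, else the seconds left to wait."""
        now = self._clock()
        last = self._last_attempt.get(key)
        if last is not None and now - last < MIN_DELAY_SECONDS:
            # Rejected attempts do not reset the timer.
            return MIN_DELAY_SECONDS - (now - last)
        self._last_attempt[key] = now
        return 0.0


if __name__ == "__main__":
    throttle = LoginThrottle()
    print(throttle.try_attempt("alice"))  # allowed immediately
    print(throttle.try_attempt("alice"))  # must wait close to 15 seconds
```

    Keying by account limits guessing against one account, but it does nothing against an attacker who spreads attempts across many accounts.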

    • It’s not easy to tell two login attempts are from one bot. This kind of workaround unfortunately doesn’t work in practice. Otherwise of course this whole problem wouldn’t exist.

  • I'd love to know more. Historically, what kind of attacks do you see? What is their goal or what do they get out of attacking HN?

  • Unfortunately, this breaks apps, Materialistic in my case.

    • Yes, the mobile apps are all third-party and this is one of the downsides.

      I'll whitelist your account for now (i.e. until the server restarts). If anyone else wants that, email hn@ycombinator.com and I'll do it as soon as I'm back online.

      (It looked like the attack had died down but then it un-died back up again)

What boggles me about this is:

I do NOT consent to working for free for Google to train their AI.

I'd be willing to solve any CAPTCHA the product of which would be open source, or even useless.

But Google is a for-profit company which uses the solutions to create proprietary software and profit off of it, they won't pay me, and I have no way to opt-out of working for them because the most useful places of the Internet use their CAPTCHAs.

(Yes, I can intentionally put wrong solutions into their CAPTCHAs to poison their data, but I'm afraid they get so many valid solutions that they can just calculate the wrong ones out.)

  • I'm pretty sure Google's AI has already reached the information theoretic limits for recognizing fire hydrants etc. so you're not really training it anymore

    What bothers me about recaptcha (other than the obvious first order task) is that I believe it's used to penalize people who don't let google track them, and by extension to make other browsers look worse. It's an abuse of their market power.

    • I am not sure that's the "intent" but it sure is a suspiciously advantageous (for them) side effect.

      Like how I gave up using protonmail because my emails kept getting classified as spam by anyone using gmail or gmail-backed organizational email.

  • There is basically no chance the captchas are actually being used for generating training data at this point. The puzzles have not changed for ages. Like, five years? How many billions (trillions?) of labels do you think they have for buses and traffic lights at this point?

    If there was an economic value to using captcha solutions for labeling, somebody would be rotating novel tasks into the mix. But they don't seem to be.

    (And if the goal of running the service was to generate labels, they would not have built solutions to make it possible to pass the captchas without a puzzle, like recaptcha v3.)

    So rest assured, your work in solving the captchas is totally useless, just like you wanted!

    • Lately I've seen captchas that ask me to identify things in images that are clearly generated with AI. Like frogs without backs.

      I think at this point it is clear we are not training image recognition so much as providing them with free scoring for their image generation algorithms.

  • > I do NOT consent to working for free for Google to train their AI.

    It's not for free.

    In this case, you get access to HN when it is under attack.

    If you don’t consent to those terms, that's your choice, you can wait and come back later.

    • Or we can complain, suggest alternatives, and hope that it motivates a change. Hacker News is, after all, a place for conversation—people are entitled to express an opinion.

  • That's an interesting take. Everything costs money. You know the reason why the CAPTCHA service is free is because they have value in the results of the CAPTCHA, right? You're not viewing ads or paying for this service. I'd prefer not to help Google either, but nothing is truly free.

    • Do you know how reCAPTCHA started? Digitizing old analog books. Probably just as commercial, but it feels better than training an ML algorithm for an international conglomerate.

  • If your goal is to avoid Google using your data, putting in bad data that is filtered out accomplishes that, right?

    I don't personally have an opinion on HN using the captcha, but their reasoning is pretty obvious, and almost certainly comes from a good place (reducing any spam on the site). That said, you're welcome to your opinions, it just seems like you have an option, based on your stated goal.

    • Even if it does accomplish that they will still have coaxed me into doing work for them even though I'm not consenting to working for Google.

      Consider it like this:

      If someone forced you to do physical work against your will, you wouldn't like it any more just because they throw away the product of your labor in the end.

      It would just make it more obscene.

  • If you don't consent to it then don't fill it out. Plus CAPTCHAs haven't been used for ML training for many years now.

  • This just shows how little consent-based ethics matter (they break down immediately when the other party simply defects).

Are there no viable alternatives more respectful of privacy than Google's recaptcha? Seems like an anti-user choice to me.

Upon upvoting this submission, I was prompted to log in, which included a captcha. Possibly due to use of a VPN. Are you using a VPN?

  • I currently avoid VPNs, since it is my understanding using them can collapse all your Tor Browser circuits, ruining the utility of the tool.

    https://web.archive.org/web/20211120193211/https://matt.trau...

    If you need to geo-shift or pirate or something less risky than being a literal dissident, I recommend being VERY careful not to do it where you sleep in case you slip up.

    (And remember: your DNS queries go to the exit node or the VPN, not the ISP.)

  • I do often use a VPN, but not this time, I'm just connecting from MIT's network.

    As dang says, it sounds like it's just a short-term attack mitigation. Glad to hear it's not intended as a permanent feature.

  • I just got it and I'm not on a VPN. Would be nice if dang or someone just announced that this is in place now.

Hi Folks!

I was not prompted to do a CAPTCHA logging in on the clearnet, but this may be my last batch of posts I do if that ever changes... for I will never consent to asking GOOG if I can post here.

Let us set aside the myriad of issues with visual CAPTCHAs and how they exclude folks with disabilities such as blindness.

There are other solutions like hCaptcha[1] that do not use GOOG, a company which has strayed so far from its "Don't be evil" mission that they went from supporting Mozilla via the search deal to moving the Chrome team into the same building, poaching key employees, and aggressively pushing folks so young they can't ask for help without violating COPPA to switch to a browser that would allow them to monitor them from cradle to the grave.

I greatly sympathize with the goal of an authentic dialog... trust me.

But using GOOG to accomplish it is not going to do that.

(The true threats to HN, like any democratic space, come not from automatons, but human beings. Only when you stop giving undue attention to the wrong... metrics... will you find the ecstatic truths you claim to seek.)

- "Greg"

I use Safari almost universally, and therefore can't pass a recaptcha challenge. For the moment I remain logged in, but I suppose once the cookie dies, I won't be able to get back here.

  • > I use Safari almost universally, and therefore can't pass a recaptcha challenge

    That doesn't track. Millions of people use Safari daily and pass recaptcha challenges. You probably have a content blocker set too aggressively, or something like that, preventing you from passing. It's not Safari.

Gross, yep. And since I'm using a VPN I'm thrown into the "give them the most painful captchas just to be sure" bucket.

Do what you have to do. Keeping the platform safe is your job. Others might put different values above yours.

If you want a drop-in replacement, Cloudflare has Turnstile, as mentioned. Otherwise, go fully behind Cloudflare. The CDN won't help much due to the nature of the content, but WAF rules can be used to easily turn on an invisible captcha based on rules.

Considering the articles and general sentiment we have towards Google and AI training, if this is true HN failed to read the room.

Edit: Yep, it's real. Seriously?

Edit 2: Understandable, see https://news.ycombinator.com/item?id=34313452.

Can confirm incognito login had to match images of cars. Dang please do not make me match images of cars, busses, and traffic lights. Please!