Research Findings:

  • reCAPTCHA v2 is not effective in preventing bots and fraud, despite its intended purpose
  • reCAPTCHA v2 can be defeated by bots 70-100% of the time
  • reCAPTCHA v3, the latest version, is also vulnerable to attacks and has been beaten 97% of the time
  • reCAPTCHA interactions impose a significant cost on users, with an estimated 819 million hours of human time spent on reCAPTCHA over 13 years, which corresponds to at least $6.1 billion USD in wages
  • Google has potentially profited $888 billion from cookies [created by reCAPTCHA sessions] and $8.75–32.3 billion per each sale of their total labeled data set
  • Google should bear the cost of detecting bots, rather than shifting it to users
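As a rough sanity check on the cost figures in the list above, dividing the wage estimate by the hours gives the hourly rate the estimate implies. The two input numbers come from the summary; the variable names are illustrative only.

```python
# Figures quoted in the summary above.
HOURS_SPENT = 819e6      # hours of human time spent on reCAPTCHA
WAGE_COST_USD = 6.1e9    # lower-bound wage value of that time, in USD
YEARS = 13               # period the estimate covers

# Hourly wage implied by the two headline numbers.
implied_hourly_wage = WAGE_COST_USD / HOURS_SPENT
# Average hours spent per year over the period.
hours_per_year = HOURS_SPENT / YEARS

print(f"Implied wage: ${implied_hourly_wage:.2f}/hour")
print(f"Average per year: {hours_per_year / 1e6:.0f} million hours")
```

This works out to roughly $7.45 per hour and about 63 million hours per year.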

“The conclusion can be extended that the true purpose of reCAPTCHA v2 is a free image-labeling labor and tracking cookie farm for advertising and data profit masquerading as a security service,” the paper declares.

In a statement provided to The Register after this story was filed, a Google spokesperson said: “reCAPTCHA user data is not used for any other purpose than to improve the reCAPTCHA service, which the terms of service make clear. Further, a majority of our user base have moved to reCAPTCHA v3, which improves fraud detection with invisible scoring. Even if a site were still on the previous generation of the product, reCAPTCHA v2 visual challenge images are all pre-labeled and user input plays no role in image labeling.”

  • BarqsHasBite@lemmy.world
    3 months ago

    I kinda figured. It was annoying to do one, but then they wanted you to do two or three and that’s absurd. Whenever it comes up now, I usually just close out.

    • Bezier@suppo.fi
      3 months ago

      they wanted you to do two or three and that’s absurd

      Yea how about 20

      • LucidNightmare@lemm.ee
        3 months ago

        VPN? Google will just go in a loop with these things, so I just stopped using Google completely.

        • Bezier@suppo.fi
          3 months ago

          No. But it’s also not like I get 20 constantly, it was just the worst I’ve seen. Usually it’s 2 to 5, I think.

          I assume they’re just collecting data on how many users are willing to do.

          • LucidNightmare@lemm.ee
            3 months ago

            One time I did five in a row, because I use VPNs for everything, and realized after the fifth time that it would have been easier to just use Bing, so I do that first now. Google has turned into my last last resort, which is quite funny, because that’s where Bing used to be. Lmao

        • I Cast Fist@programming.dev
          3 months ago

          Whenever I’m in a private window the captchas just keep on coming. Trying to reset your Steam password via the program will also trigger an infinite loop of captchas; you HAVE to use a browser.

      • Dudewitbow@lemmy.zip
        3 months ago

        If you have to do that many, you either have some privacy setting on or are on a flagged IP assigned by a VPN.

          • Dudewitbow@lemmy.zip
            3 months ago

            It’s abnormal to them because VPNs are often also used by bad actors. Your use is not abnormal, but other people are misusing them, which makes it worse for everyone else.

            • Landsharkgun@midwest.social
              3 months ago

              Wow, way to blame individuals who take basic precautions instead of the corporations who are blatantly invading your privacy. Good job making the world a better place, bud.

          • catloaf@lemm.ee
            3 months ago

            Most people don’t, most bots do. You look more like a bot, so you get extra challenges.

      • sramder@lemmy.world
        3 months ago

        I tried to order some components on Digikey a few months ago and I’m still mentally scarred. Probably did a few hundred of those things over the course of 2 weeks.

    • Fisch@discuss.tchncs.de
      3 months ago

      Some captchas have also just become obvious AI training. “Click on the living being in this image”, “Select every image of the same object as in this example image”. And the images you have to select look obviously AI-generated.

    • dinckel@lemmy.world
      3 months ago

      At a certain point I did like 10 of them, and then ended up closing the page, cause it never let me in, all because I was on a vpn

    • CosmoNova@lemmy.world
      3 months ago

      Funny thing is they stop asking if you do them really slowly. Almost as if to tell you that you’re too inefficient to even be an unpaid intern or something. Anyway, if they annoy you, take your time.

  • Churbleyimyam@lemm.ee
    3 months ago

    Getting served a captcha often results in me closing the tab. I’m not doing stupid puzzles for you.

          • hddsx@lemmy.ca
            3 months ago

            What do you mean? I am a fleshy human and do fleshy human things like being made of flesh.

            • xavier666@lemm.ee
              3 months ago

              Time to take a knife and check for sure

              Seriously /s Don’t harm yourself!

              • hddsx@lemmy.ca
                3 months ago

                I disassembled my tail using a knife and it reassembled itself. Based on new data, my name is Rafael Cruz.

              • AlolanYoda@mander.xyz
                3 months ago

                Harm yourself?

                Take the knife and harm the people responsible for this travesty. The laws of robotics prevent robots from harming humans: if you manage to harm them, then that means either you’re human or they’re not!

      • tyler@programming.dev
        3 months ago

        It knows they’re wrong which is why I don’t really think this article is accurate. Is it training if it already has the answers? Probably not.

        • MajinBlayze@lemmy.world
          3 months ago

          That’s why it gives you a panel of 9 images. It would have a high confidence on some images, and a low confidence on others. When you pick the correct images and don’t pick incorrect ones it uses the ones it’s confident about as “validation” while taking the feedback on low confidence images to update the training data.

          What this does mean in practice is that the only ones actually being “graded” are the ones bots can solve anyway.
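The mechanism described in the comment above can be sketched roughly as follows. This is a guess at the general shape, not Google's actual implementation: the `Tile` type, the `grade_panel` function, and the 0.9 confidence threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    image_id: str
    predicted_match: bool  # model's current guess: does the tile show the target?
    confidence: float      # model's confidence in that guess, 0.0 to 1.0


def grade_panel(tiles, selected_ids, threshold=0.9):
    """Grade only high-confidence tiles; collect answers for the rest as labels."""
    passed = True
    harvested = {}
    for tile in tiles:
        user_says_match = tile.image_id in selected_ids
        if tile.confidence >= threshold:
            # Control tile: the user's answer must agree with the model.
            if user_says_match != tile.predicted_match:
                passed = False
        else:
            # Uncertain tile: keep the user's answer as a candidate label.
            harvested[tile.image_id] = user_says_match
    # Answers from a failed attempt are discarded as untrustworthy.
    return passed, (harvested if passed else {})


panel = [Tile("a", True, 0.98), Tile("b", False, 0.97), Tile("c", True, 0.55)]
print(grade_panel(panel, {"a"}))       # (True, {'c': False})
print(grade_panel(panel, {"a", "c"}))  # (True, {'c': True})
print(grade_panel(panel, {"b"}))       # (False, {})
```

Under this sketch, either answer for the uncertain tile "c" passes, which is consistent with the point that only the tiles the model already handles are actually graded.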

          • Petter1@lemm.ee
            3 months ago

            It seems exactly like that, I experimented with it by trying to leave the one I think it has low confidence unchecked, and it often worked.

        • AmidFuror@fedia.io
          3 months ago

          My understanding is different from others here. I thought they served the same Captcha to many people at once and use the majority response to decide who is answering correctly.

          • catloaf@lemm.ee
            3 months ago

            That’s true, or at least it used to be back when they were using it for OCR. I have no reason to believe it’s changed.

        • Vox@lemmy.world
          3 months ago

          It’s why they ask you to do multiple: 1-2 of them are the control group, and they are training on the others.

          • tyler@programming.dev
            3 months ago

            You’re implying they give you multiple. I hardly ever get multiple, pretty much only if I ‘fail’ the first one.

            • Miaou@jlai.lu
              3 months ago

              If they have a good fingerprint on you they don’t need the control group. That’s why you get 5+ captchas when using a VPN or Tor.

        • Rolando@lemmy.world
          3 months ago

          If they gave two captchas, one which they knew the answer and one which they didn’t, they could use the second for training. (Even if you’re paying someone, you want to do that sort of thing when crowdsourcing data, because you never know if the paid person is just screwing around.)

    • snooggums@midwest.social
      3 months ago

      I haven’t done an image one in years for the same reason.

      My general internet usage has plummeted between ads and captchas and all the other modern website bullshit, which is why I am here so much.

  • Mubelotix@jlai.lu
    3 months ago

    I bypassed 35,000 Google reCAPTCHA v2 challenges using bots. Don’t ever rely on this for security.

      • Gizmokid2005@lemmy.world
        3 months ago

        Except, that’s most of its ad copy on Google’s own website?

        reCAPTCHA uses an advanced risk analysis engine and adaptive challenges to keep malicious software from engaging in abusive activities on your website. Meanwhile, legitimate users will be able to login, make purchases, view pages, or create accounts and fake users will be blocked.

        It’s literally billed as a security measure for a website.

        https://www.google.com/recaptcha/about/

        • theherk@lemmy.world
          3 months ago

          I see your perspective, but I don’t consider that security in the context of software, which may also explain why they don’t use that word, though I readily admit that it is technically security of a sort. The term usually implies authentication, authorization, and isolation.

  • HiramFromTheChi@lemmy.world
    3 months ago

    There’s nothing that can express my disdain for Google’s reCaptcha.

    😒 We’re training its AI models
    😒 It’s free labor for Google
    😒 Sometimes it wants the corner of an object, sometimes it doesn’t
    😒 Wildly inconsistent
    😒 Always blurry and hard to see
    😒 Seemingly endless
    😒 It’s the robot asking us humans if we’re the robots

  • polonius-rex@kbin.run
    3 months ago

    Google should bear the cost of detecting bots, rather than shifting it to users

    how?

      • siph@lemmy.world
        3 months ago

        Considering the article states that reCAPTCHA v2 and v3 can be broken/bypassed by bots 70-100% of the time, they are obviously not the solution.

        • radivojevic@discuss.online
          3 months ago

          “Google should bear the cost”

          Google should shut it down and make sites roll their own verification. Give everyone a month to implement a new solution on millions of websites.

          • AeroLemming@lemm.ee
            3 months ago

            This is unironically the answer. You can’t make a general-purpose captcha solver AI if every website or group of websites uses a completely different kind of captcha.

        • conciselyverbose@sh.itjust.works
          3 months ago

          At what cost?

          100% success rate isn’t even moderately useful if it costs $5 per pass. The discussion is completely pointless without a concrete, documented analysis of the actual hardware and energy costs involved.

        • polonius-rex@kbin.run
          3 months ago

          how do you get the metric of 70-100% of the time?

          the best bots doing it 70-100% of the time is very different to the kind of bot your average spammer will have access to

          • siph@lemmy.world
            3 months ago

            Did you read the article or the TL;DR in the post body?

            The paper, released in November 2023, notes that even back in 2016 researchers were able to defeat reCAPTCHA v2 image challenges 70 percent of the time. The reCAPTCHA v2 checkbox challenge is even more vulnerable – the researchers claim it can be defeated 100 percent of the time.

            reCAPTCHA v3 has fared no better. In 2019, researchers devised a reinforcement learning attack that breaks reCAPTCHAv3’s behavior-based challenges 97 percent of the time.

            So yeah, while these are research numbers, it wouldn’t be surprising if many larger bots have access to ways around that - especially since those numbers are from 2016 and 2019 respectively. Surely it is even easier nowadays.

            • polonius-rex@kbin.run
              3 months ago

              researchers were able to defeat reCAPTCHA v2 image challenges 70 percent of the time

              that doesn’t answer the question?

              researchers devised a reinforcement learning attack that breaks reCAPTCHAv3’s behavior-based challenges 97 percent of the time

              i’d argue “bespoke system, deployed in a very limited context, built by researchers at the top of their field” is kind of out of reach for most people? and any bot network scaled up automatically becomes easier to detect the further you scale it

               

              the cost of just paying humans to break these is already at or below pennies per challenge

          • siph@lemmy.world
            3 months ago

            Maybe a billion dollar company has the budget to come up with something?

            Looking at the numbers in this post, reCAPTCHA exists to make Google money, not to keep bots out.

            I’d rather have no reCAPTCHA than the current state.

  • daniskarma@lemmy.dbzer0.com
    3 months ago

    I don’t really get where this article is going. They are all over the place.

    Let’s start with a “fuck Google”. They are an evil company. But:

    • Other captchas are also not very effective against bots. Arguably most traditional systems would be worse than reCAPTCHA at fighting bots.

    • reCAPTCHA agent validation, while a privacy violation, is faster than solving any other captcha, and if you are hit with the puzzle it is not much more time-consuming than any other captcha.

    • That profit number is very questionable and they know it. Anyway, that’s not much different from, and probably less profitable than, most Google services.

    It’s also ridiculous that someone can say in the same article that the image puzzle can be solved by bots 100% of the time and that it is a scheme to get human labor to solve the puzzle. Am I the only one seeing the logical failure here?

    And what’s the purpose of all this? Just let bots roam free? Are they trying to sell another solution? What’s the point?

    I hate Google as much as the next guy. But I don’t really share this article’s spirit.

    If I were to make a point, it would be that people and companies should stop making registration-only sites and dynamic sites when static websites are enough for their purposes, and only go for registration or other bot-vulnerable kinds of sites if there is no way around it. But if you need to make a service that is vulnerable to bots, you need to protect it, and sadly there are no great solutions out there. If your site is small and not specifically targeted by anyone malicious, you can get by with simpler solutions. But bigger or targeted sites really can’t get around needing Google or Cloudflare, and have to assume that it will only mitigate the damage.

    But if anyone knows a better and more ethical solution to prevent bot spam for a service that really needs to have registrations, please tell me.

    • conciselyverbose@sh.itjust.works
      3 months ago

      Also worth noting that Google has always been extremely open about the fact that they use recaptcha for that purpose. It’s never been a secret.

      Their service to the website owners is the meaningful reduction in effectiveness of bots in places where bots are harmful. The website’s service to you is the content that it’s being used to protect (and the stuff that has reCAPTCHA on it is stuff like games where there’s a competitive advantage, things like search engines where there’s a meaningful cost to heavy bot use, and login pages where there’s a real security cost to mass bot use). I use a VPN, which increases the rate of captchas a lot, and I think it’s a pretty reasonable way to do things, personally.

  • Petter1@lemm.ee
    3 months ago

    Why is this not news to me? How did so many people not know that? Should I have spread the word more, even if all the people I told were like “yea, yea, of course, but, what can I do? 🤷🏻‍♀️”?

  • umbraroze@lemmy.world
    3 months ago

    reCAPTCHA is exploiting users for profit

    Well duh.

    reCAPTCHA started out as a clever way to improve the quality of OCRing books for Distributed Proofreaders / Project Gutenberg. You know, giving to the community, improving access to public-domain texts. Then Google acquired them. Text CAPTCHAs got phased out. No more of that stuff, just computer vision rubbish to improve Google’s own AI models and services.

    If they had continued to depend on tasks that directly help the community, Google would at least have had to constantly make sure the community’s concerns were met. But if they only have to answer to themselves for the quality of the data and nobody else even gets to see it, well, of course it turned into yet another mildly neglected Google project.

  • snooggums@midwest.social
    3 months ago

    “The conclusion can be extended that the true purpose of reCAPTCHA v2 is a free image-labeling labor and tracking cookie farm for advertising and data profit masquerading as a security service,” the paper declares.

    I thought this was known since it came out. It seemed even more obvious when the images leaned in heavily to traffic related pictures like stoplights.

  • gradyp@awful.systems
    3 months ago

    I honestly thought it was common knowledge that these things were essentially free labor for training AI.

    • dan@upvote.au
      3 months ago

      The original reCAPTCHA from Carnegie Mellon University was helping to digitize books. It showed one known word and one unknown word, and if enough people answered the second word with the same answer, that’d be marked as the correct value.
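The known-word/unknown-word scheme described in the comment above can be sketched like this. The function names and the vote thresholds (`min_votes`, `min_share`) are assumptions made for illustration, not details of the original system.

```python
from collections import Counter


def check_and_record(control_answer, control_truth, unknown_answer, votes):
    """Accept the user if the known word is right, then record their guess for the unknown word."""
    if control_answer.strip().lower() != control_truth.lower():
        return False  # failed the control word; ignore the other answer
    votes[unknown_answer.strip().lower()] += 1
    return True


def consensus(votes, min_votes=3, min_share=0.6):
    """Return the agreed transcription, or None if agreement is too weak."""
    total = sum(votes.values())
    if total < min_votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count / total >= min_share else None


votes = Counter()
submissions = [
    ("morning", "upon"),
    ("morning", "upon"),
    ("mourning", "open"),  # wrong control word, so this guess is not counted
    ("morning", "upan"),
    ("morning", "upon"),
]
for control, unknown in submissions:
    check_and_record(control, "morning", unknown, votes)

print(consensus(votes))  # upon
```

Here three of the four accepted answers agree, so "upon" is taken as the transcription of the unknown word.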

  • cygnus@lemmy.ca
    3 months ago

    Gonna have to disagree hard with this, based on extensive first-hand experience (web dev). I’ve added CAPTCHA to dozens (hundreds?) of web forms, and it all but eliminates spam.

    • vastard@lemmynsfw.com
      3 months ago

      My experience matches yours. I don’t enjoy putting reCAPTCHA v3 on my sites, but it takes contact form spam from 70-80 messages per day to 0-2.

      I’d switch to other services if they could be as effective. If anybody has real-world experience with another option working I’d love to hear it.
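For context, a v3 setup like the one described above usually scores each submission server-side. The sketch below assumes a response dict already parsed from reCAPTCHA's `siteverify` endpoint, with its documented `success`, `score`, and `action` fields; the helper name, the `contact_form` action name, and the 0.5 cutoff are illustrative choices, not anyone's actual configuration.

```python
def accept_submission(verify_response, expected_action="contact_form", min_score=0.5):
    """Decide whether to accept a form post, given a parsed siteverify response."""
    # The token must be valid for this site at all.
    if not verify_response.get("success", False):
        return False
    # The token must have been issued for the action we expect.
    if verify_response.get("action") != expected_action:
        return False
    # Scores range from 0.0 (likely a bot) to 1.0 (likely human).
    return verify_response.get("score", 0.0) >= min_score


likely_human = {"success": True, "score": 0.9, "action": "contact_form"}
likely_bot = {"success": True, "score": 0.1, "action": "contact_form"}
print(accept_submission(likely_human))  # True
print(accept_submission(likely_bot))    # False
```

Raising `min_score` blocks more spam at the cost of rejecting more real users, so the cutoff is a per-site judgment call.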

    • rbits@lemm.ee
      3 months ago

      Right, so similar to locks? Usually can be easily bypassed if you know how, but it at least filters out the people who aren’t determined enough to put in the effort.

      • cygnus@lemmy.ca
        3 months ago

        Basically, yeah. The vast majority of spambots are simple and lazy.

  • cley_faye@lemmy.world
    3 months ago

    reCAPTCHA v2 visual challenge images are all pre-labeled and user input plays no role in image labeling

    That’s funny, because when I’m faced with this, I keep adding or removing one of the images at random and it keeps accepting them as OK.