• thingsiplay@beehaw.org
    arrow-up
    40
    ·
    2 months ago

    Reddit has become one of the internet’s largest open archives of authentic, relevant, and always up-to-date human conversations about anything and everything.

    Reddit CEO Steve Huffman says

    But he refuses to pay the users, or at least the moderators, who built Reddit into what it is now. Instead, Reddit pushes more advertisements and sells data to AI companies for millions of dollars.

    • 30p87@feddit.de
      arrow-up
      18
      ·
      2 months ago

      It also hinders mods, users and especially disabled people from doing their work.

    • narc0tic_bird@lemm.ee
      arrow-up
      16
      ·
      2 months ago

      And he will continue to do so as long as people keep using the platform. Seems to work well for him.

      • abbadon420@lemm.ee
        arrow-up
        3
        ·
        2 months ago

        I can’t even really blame them, to be honest. It’s just a shame it has to be this way.

    • ASaltPepper@lemmy.one
      English
      arrow-up
      6
      ·
      2 months ago

      I’ve made a note to ask about pay whenever I see mod postings. Luckily I’m finding more and more of them, so there are lots of opportunities to remind mods that they’re lining Reddit’s pockets for free.

  • BurningnnTree@lemmy.one
    English
    arrow-up
    38
    ·
    edit-2
    2 months ago

    So you’re saying instead of tacking “site:reddit.com” onto my Google search, I can now use ChatGPT to get the same information, except without the original context, and it will often be wrong? Amazing!

    And this also means that companies will fill Reddit with fake comments promoting their brand to ensure that their brand gets mentioned in ChatGPT responses, right? Can’t wait!

    • dean [any pronouns]@beehaw.org
      English
      arrow-up
      1
      ·
      2 months ago

      don’t worry, google also has a partnership with reddit! why doesn’t reddit just have an open api like they used to? good question!

      • thingsiplay@beehaw.org
        arrow-up
        2
        ·
        2 months ago

        The license does not apply to posts and replies on Reddit, right? Thank god I created a blog to post about any stuff that I want, without license or restrictions from Reddit. I started it before the AI breakthrough and before what happened to Reddit. But even so, do AI tools understand such a license text and evaluate whether they can or cannot use the material?

        • onlinepersona@programming.dev
          English
          arrow-up
          5
          ·
          2 months ago

          From what I understand, LLMs are just large heuristic machines. They gather a lot of statistics on token order and answer with whatever should statistically rank higher than the other options. There’s no “understanding”. So to answer your question: no, they don’t understand the license.
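
          To make the "statistics on token order" idea concrete, here is a toy sketch in Python. It is not how a real LLM works internally (those are neural networks trained on far more context than one previous word); it only counts which word follows which and returns the most frequent follower. The function names and the sample sentence are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count how often each token follows each other token."""
    tokens = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def most_likely_next(counts, token):
    """Return the most frequent follower of `token`, or None if unseen."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical sample text, chosen only to show the counting.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigrams(corpus)

# "cat" follows "the" twice and "mat" follows it once, so "cat" wins.
print(most_likely_next(model, "the"))
```

          Nothing in this counting step looks at what the text means or what license it was published under, which is the point being made about "understanding".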

          Content is most likely scraped wholesale from websites, possibly run through some cleanup to filter out absolute garbage, and fed into an LLM to train it. An LLM can be tricked into revealing its training data (e.g. by asking it to repeat “fruit” forever). It’s in those cases that copyright infringement is detected and action can be, and has been, taken. There are court cases currently under review, the most popular being the one against GitHub Copilot for infringing on the license of source code it ingested.

          Anti Commercial-AI license

        • Kichae@lemmy.ca
          English
          arrow-up
          4
          ·
          2 months ago

          do AI tools understand such a license text and evaluate if they can or cannot use the material?

          So, this is the fun part: AI tools don’t auto-ingest material to process it. The developers choose the materials to feed into the models.

          And while the tech bros can understand your licenses, they don’t give a flying fuck, because they think they’ll be billionaires beyond consequences by the time anyone discovers that their work in particular has been ripped off.

          • thingsiplay@beehaw.org
            arrow-up
            2
            ·
            2 months ago

            Well, the companies and developers don’t decide on every single piece of material. For example, what I expect is that they program the scraper with rules to respect the licenses of individual projects (such as on GitHub, probably). And I assume those scraper tools are AI tools themselves, programmed with AI tool assistance on top of that. There are multiple AI layers!

            At this point, I don’t think that any developer knows exactly what the AI tools are fed with, if they use automatically scraped public sources from the internet.