• Match!!@pawb.social
    5 months ago

    Typically, the citation is included with the software, possibly linked from a site or service, and/or included in their dataset repo (e.g. on huggingface.co).

    • Cosmic Cleric@lemmy.world
      5 months ago

      True, but they still have to cite my name, and I’m not sure they’re going to name every person whose comments they use to train their models.

      Granted, it relies on them honoring the license, but it’s still an easy thing to try.

      CC BY-NC-SA 4.0

      • Match!!@pawb.social
        5 months ago

        From a look at the metadata for LAION 5B, for example, the attribution (as well as the license, when present) is scraped along with the data.