A new report from plagiarism detector Copyleaks found that 60% of OpenAI’s GPT-3.5 outputs contained some form of plagiarism.

Why it matters: Content creators from authors and songwriters to The New York Times are arguing in court that generative AI trained on copyrighted material ends up spitting out exact copies.

  • General_Effort@lemmy.world
    4 months ago

    No. It’s not really clear what LLMs do, but it certainly depends on context.

    What they fundamentally do is continue a text. That’s what they were originally trained to do. Then they were fine-tuned to continue a chat log or respond to an instruction. To be able to do that, they have learned a lot. Unfortunately, we do not know what.
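
    As a rough illustration of what "continue a text" means, here is a toy sketch in Python. It is not how an LLM works internally: real models use neural networks over tokens, while this one only counts which word followed which in a tiny made-up corpus. The names `train_bigrams` and `continue_text` are hypothetical, chosen just for this example.

```python
import random
from collections import defaultdict


def train_bigrams(text):
    """Map each word to the list of words that followed it in the training text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def continue_text(followers, prompt, max_new_words=10, seed=0):
    """Extend the prompt one word at a time by sampling a word seen after the last one."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same continuation
    words = prompt.split()
    if not words:
        return prompt
    for _ in range(max_new_words):
        options = followers.get(words[-1])
        if not options:
            # The last word never had a successor in training, so stop here.
            break
        words.append(rng.choice(options))
    return " ".join(words)


# Made-up training text, purely for illustration.
corpus = "the cat sat on the mat and the dog sat on the rug"
model = train_bigrams(corpus)
print(continue_text(model, "the dog"))
```

    The sketch continues any prompt whose last word it has seen, whether or not the resulting sentence ever appeared in the corpus. Some continuations will be copied word for word from the training text and others will be new combinations, and the output alone does not tell you which.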

    If you ask for a summary of some text, it will give you one, regardless of whether the text even exists.

    The summary could be one written by a human that it has memorized. Or it could be complete nonsense that it is making up on the fly. You never know.