• NotAnotherLemmyUser@lemmy.world
    21 days ago

    I’m sure you understand this, but calling data anonymized doesn’t mean it can’t be deanonymized. Given the right kind of data, or enough context, someone can figure out who you are fairly quickly.

    Ex: You could “anonymize” GPS traces, but they would still show the house you live in and where you work unless you strip out a lot of the info.

    http://androidpolice.com/strava-heatmaps-location-identity-doxxing-problem/
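
    To make the GPS point concrete, here is a minimal sketch of the general idea (not the method from the linked article). The trace, the hour windows, and the helper name `infer_frequent_cell` are all made up for illustration. It assumes a trace with no user ID attached, and simply looks for the spot visited most often at night and during working hours.

```python
from collections import Counter

def infer_frequent_cell(points, hours, precision=3):
    """Return the most-visited rounded (lat, lon) cell among points whose hour is in `hours`."""
    cells = Counter(
        (round(lat, precision), round(lon, precision))
        for lat, lon, hour in points
        if hour in hours
    )
    if not cells:
        return None
    return cells.most_common(1)[0][0]

# Made-up trace with no user ID attached: (latitude, longitude, hour of day).
trace = [
    (40.7128, -74.0060, 23),
    (40.7129, -74.0061, 2),
    (40.7127, -74.0059, 5),
    (40.7580, -73.9852, 10),
    (40.7581, -73.9853, 14),
    (40.7579, -73.9851, 16),
    (40.7300, -73.9950, 19),
]

# Assumed time windows; real habits vary, so treat these as a rough guess.
NIGHT = set(range(0, 6)) | {22, 23}
WORKDAY = set(range(9, 18))

likely_home = infer_frequent_cell(trace, NIGHT)
likely_work = infer_frequent_cell(trace, WORKDAY)
print("likely home cell:", likely_home)
print("likely work cell:", likely_work)
```

    Dropping the user ID does nothing here, because the pattern of places is itself the identifier. Coarsening the coordinates or cutting the points near the start and end of each trip would help, which is the "strip out a lot of the info" part.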

    Now with LLMs, sure, you could “anonymize” which user said or asked for what… but if something identifying is sent in the request itself, it won’t be hard to deanonymize that data.

    • XiozTzu@lemmy.world
      21 days ago

      So you would rather submit your non-anonymized data? Because those bastards will find a way to deanonymize it. Is Apple doing the right thing or not?

      • NotAnotherLemmyUser@lemmy.world
        21 days ago

        What? No. I would rather use my own local LLM where the data never leaves my device. And if I had to submit anything to ChatGPT I would want it anonymized as much as possible.

        Is Apple doing the right thing? Hard to say; any answer here will just be an opinion. There are pros and cons to this decision, and it’s up to the end user to decide whether the benefits of using ChatGPT are worth the cost of their data. I can see some useful use cases for this tech, and I don’t blame Apple for wanting to strike while the iron is hot.

        There’s not much you can really do to strip out identifying data from prompts/requests made to ChatGPT. Any anonymization of that part of the data is on OpenAI to handle.
        Apple can obfuscate which user is asking for what, as well as specific location data, but if I’m using the LLM and I tell it to write up a report while including my full name in my prompt/request… that’s all going directly into OpenAI’s servers and logs, which they can eventually use to help refine/retrain their model at some point.
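
        As a rough illustration of why scrubbing the prompt itself is hard, here is a minimal sketch of a pattern-based scrubber. This is not how Apple or OpenAI handle requests; the patterns, the `scrub` helper, and the example prompt are all hypothetical. Structured identifiers like emails and phone numbers are easy to catch, but a free-text name goes straight through.

```python
import re

# Simplified, assumed patterns; real-world emails and phone numbers are messier.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def scrub(prompt):
    """Replace email addresses and phone-like digit runs with placeholders."""
    prompt = EMAIL.sub("[EMAIL]", prompt)
    return PHONE.sub("[PHONE]", prompt)

# Made-up prompt containing a name, an email address, and a phone number.
prompt = "Write a report for Jane Example (jane@example.com, 555-123-4567) about Q3 sales."
scrubbed = scrub(prompt)
print(scrubbed)

# The structured identifiers are gone, but the full name is still there.
print("name still present:", "Jane Example" in scrubbed)
```

        Catching names would take something like a named-entity recognizer, and even that would miss context such as a job title or a street that narrows things down to one person.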

    • Zos_Kia@lemmynsfw.com
      21 days ago

      I don’t know about the US, but in European GDPR parlance, if it can be reversed then it is NOT anonymized and it is illegal to claim otherwise. The correct term is pseudonymized.
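
      A minimal sketch of what that distinction looks like in practice, assuming user IDs are replaced with a keyed hash. The key, the records, and the `pseudonymize` helper are made up for illustration, and this is not a statement of what any particular company does. The pseudonyms hide the email address, but the records stay linkable, and whoever holds the key can map them back.

```python
import hashlib
import hmac

# Hypothetical secret held by whoever processes the data.
SECRET_KEY = b"hypothetical-key-held-by-the-data-controller"

def pseudonymize(user_id):
    """Replace a user ID with a keyed hash (HMAC-SHA256, truncated for readability)."""
    return hmac.new(SECRET_KEY, user_id.encode(), hashlib.sha256).hexdigest()[:16]

# Made-up request log: (user ID, what they asked).
records = [
    ("alice@example.com", "asked about tax forms"),
    ("bob@example.com", "asked about flights"),
    ("alice@example.com", "asked about a clinic"),
]

pseudonymized = [(pseudonymize(uid), text) for uid, text in records]

# The same user always gets the same pseudonym, so their requests remain linkable.
same_user = pseudonymized[0][0] == pseudonymized[2][0]
print("first and third record linkable:", same_user)

# Anyone holding the key and a list of candidate users can reverse the mapping.
known_users = ["alice@example.com", "bob@example.com", "carol@example.com"]
lookup = {pseudonymize(uid): uid for uid in known_users}
reidentified = [(lookup[p], text) for p, text in pseudonymized]
print("re-identified records match originals:", reidentified == records)
```

      Since the mapping can be reversed with additional information (here, the key plus a user list), this is pseudonymization in the sense described above rather than anonymization.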