• sugar_in_your_tea@sh.itjust.works
    English
    arrow-up
    10
    ·
    2 months ago

    That’s how a lot of people handle deleted data in a database: it’s literally just a flag on the row. That’s why there’s a recommendation to edit Reddit posts before deleting them, so the original text is actually overwritten and can’t simply be restored by flipping the flag back.
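
To make the flag idea concrete, here is a minimal sketch of a soft delete using Python's built-in `sqlite3`. The `comments` table, the `is_deleted` column, and the helper names are all hypothetical and chosen for illustration; nothing here reflects Reddit's actual schema.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical comments table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE comments ("
        "id INTEGER PRIMARY KEY, "
        "body TEXT NOT NULL, "
        "is_deleted INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def soft_delete(conn, comment_id):
    """'Delete' a comment by setting a flag; the body stays in the row."""
    conn.execute("UPDATE comments SET is_deleted = 1 WHERE id = ?", (comment_id,))


def restore(conn, comment_id):
    """Undo a soft delete by clearing the flag."""
    conn.execute("UPDATE comments SET is_deleted = 0 WHERE id = ?", (comment_id,))


def overwrite_then_delete(conn, comment_id, filler="[removed]"):
    """Replace the body first, so a later restore only brings back the filler."""
    conn.execute("UPDATE comments SET body = ? WHERE id = ?", (filler, comment_id))
    soft_delete(conn, comment_id)


def visible_bodies(conn):
    """Return the bodies a normal page view would show, in id order."""
    rows = conn.execute(
        "SELECT body FROM comments WHERE is_deleted = 0 ORDER BY id"
    )
    return [row[0] for row in rows]
```

With a plain `soft_delete`, calling `restore` brings the original text back untouched. With `overwrite_then_delete`, the same `restore` only recovers the filler text, at least as far as this one table is concerned.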

    • fishpen0@lemmy.world
      English
      arrow-up
      1
      ·
      2 months ago

      Every time someone says something like this I have to explain CDC and regular old backups. There’s no way in hell Reddit doesn’t keep cold and hot backups of their shit. And while Reddit is unlikely to be doing CDC for SOC 2 or other compliance reasons, it’s the easiest way to capture data for analytics purposes.

      CDC stands for change data capture. It’s generally done by streaming the database’s change log (the transaction or write-ahead log) to a bucket or a service like Kafka, where you can fast forward and rewind the log to see the state of the DB at any point in time. Even if you edit your comments, the original text is likely still sitting in a Kafka topic or a Snowflake bucket outside of the DB or cache used for the presentation layer.
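
The rewind idea can be sketched in a few lines of plain Python. This is a toy model, not Kafka's API: the `ChangeEvent` type, its field names, and the `replay` function are hypothetical, and the log is assumed to be a list already sorted by offset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change; a stand-in for a record in a change log."""
    offset: int                 # position in the log, strictly increasing
    op: str                     # "insert", "update", or "delete"
    key: int                    # primary key of the affected row
    body: Optional[str] = None  # new row content, None for deletes


def replay(log, up_to_offset):
    """Rebuild table state by applying events up to and including an offset."""
    state = {}
    for event in log:
        if event.offset > up_to_offset:
            break  # log is assumed sorted, so nothing later applies
        if event.op in ("insert", "update"):
            state[event.key] = event.body
        elif event.op == "delete":
            state.pop(event.key, None)
    return state


# Hypothetical history of one comment: posted, edited, then deleted.
log = [
    ChangeEvent(offset=0, op="insert", key=1, body="original text"),
    ChangeEvent(offset=1, op="update", key=1, body="[removed]"),
    ChangeEvent(offset=2, op="delete", key=1),
]
```

Replaying to offset 2 shows an empty table, which is what the live site would show, but replaying to offset 0 still yields the original text. Editing before deleting only adds another event to the log; it does not erase the earlier one.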

      Zero large-scale websites operate with a truly single data store. There is always another layer that your user operations don’t touch.