Functional safety engineer, dad, husband, pilot, musician. Not necessarily in that order.

  • 2 Posts
  • 48 Comments
Joined 1 year ago
Cake day: June 11th, 2023

  • Then why do you think manufacturers still list these failure rates (to be sure, it is marked as a limit, not an actual rate)? I’m not being sarcastic or facetious, but genuinely curious. Do you know for certain that it doesn’t happen regularly? During a scrub, these are the kinds of errors that are quietly corrected (although the scrub log would list them), as they are during normal operation (also logged).

    My theory is that they are being cautious and/or perhaps don’t have any high-confidence data that is more recent.


  • Hopfgeist@feddit.de to Selfhosted@lemmy.world: How to fix my ZFS pool mistakes
    3 months ago

    Bit error rates have barely improved since then, while disk capacities have kept growing. So the probability of an error when reading a substantial fraction of a disk is now higher than it was in 2013.

    But as others have pointed out, RAID is not, and never was, a substitute for a backup. Its purpose is to increase availability. And if that is critical to your enterprise, these things need to be taken into account, and it may turn out that raidz1 with 8 TB disks is fine for your application, or it may not. For private use, I wouldn’t fret, but make frequent backups.

    This article was not about total disk failure, but about the much more insidious undetected bit error.


  • Let’s do the math:

    The error rate of modern hard disks is usually on the order of one unrecoverable read error per 1E15 bits read; see for example the data sheet for the Seagate Exos 7E10. An 8 TB disk contains 6.4E13 (usable) bits, so when reading the whole disk you have roughly a 1 in 16 chance of an unrecoverable read error. That is OK with ZFS if all disks are working: the error correction will detect and correct it. But during a resilver it can be a big problem.
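
The arithmetic above can be checked with a short sketch. It assumes that errors are independent and that the data-sheet limit of 1E-15 per bit is the actual rate, which it probably is not. The resilver case (three surviving full disks in a 4-disk raidz1) is a hypothetical example, not something stated in the thread.

```python
import math


def p_at_least_one_error(bits_read: float, errors_per_bit: float) -> float:
    """Probability of at least one error when reading bits_read bits.

    Assumes independent errors at a constant per-bit rate:
    P = 1 - (1 - rate) ** bits, computed via log1p/expm1 for accuracy.
    """
    return -math.expm1(bits_read * math.log1p(-errors_per_bit))


ERROR_RATE = 1e-15           # data-sheet limit, treated here as the real rate
BITS_PER_DISK = 8e12 * 8     # 8 TB disk = 6.4E13 bits

# Reading one full disk once.
p_one_disk = p_at_least_one_error(BITS_PER_DISK, ERROR_RATE)

# Hypothetical resilver: all data on 3 surviving disks must be read.
SURVIVING_DISKS = 3          # assumption: 4-disk raidz1, disks completely full
p_resilver = p_at_least_one_error(SURVIVING_DISKS * BITS_PER_DISK, ERROR_RATE)

print(f"one full disk read: {p_one_disk:.3f} (about 1 in {1 / p_one_disk:.0f})")
print(f"resilver, {SURVIVING_DISKS} surviving disks: {p_resilver:.3f}")
```

Under these assumptions a single full read comes out at about 6 %, and the hypothetical resilver at roughly 17 %, which is why the degraded state is the risky one: there is no redundancy left to correct the error.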


  • To add, unlike “traditional” RAID, ZFS is also a volume manager and can have an arbitrary number of dynamic “partitions” sharing the same storage pool (literally called a “pool” in zfs). It also uses checksumming to determine if data has been corrupted. On redundant setups it will then quietly repair the corrupted parts with the redundant information while reading.
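
The checksum-and-repair idea can be illustrated with a toy model. This is only a sketch of the principle, not the ZFS on-disk format or its API: the class `ToyMirror`, its methods, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    """Checksum of a block (SHA-256 chosen here purely for illustration)."""
    return hashlib.sha256(data).digest()


class ToyMirror:
    """Hypothetical two-way mirror with checksums kept apart from the data."""

    def __init__(self) -> None:
        self.copies = [{}, {}]   # two "disks": block id -> data
        self.checksums = {}      # block id -> expected checksum

    def write(self, block_id: int, data: bytes) -> None:
        for copy in self.copies:
            copy[block_id] = data
        self.checksums[block_id] = checksum(data)

    def read(self, block_id: int) -> bytes:
        expected = self.checksums[block_id]
        # Find any copy whose data still matches the stored checksum.
        good = next(
            (c[block_id] for c in self.copies if checksum(c[block_id]) == expected),
            None,
        )
        if good is None:
            raise OSError(f"block {block_id}: all copies are corrupt")
        # Quietly rewrite every copy that no longer matches.
        for copy in self.copies:
            if copy[block_id] != good:
                copy[block_id] = good
        return good


mirror = ToyMirror()
mirror.write(0, b"important data")
mirror.copies[0][0] = b"important dbta"   # simulate silent corruption on disk 0
data = mirror.read(0)                     # returns good data, repairs disk 0
```

Without the separately stored checksum, the mirror could see that the two copies differ but would have no way to tell which one is correct.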




  • Hopfgeist@feddit.de (OP) to Selfhosted@lemmy.world: Different "geometries" for same disk model?
    8 months ago (edited)

    Sure, SCSI disks will show their defect lists (“primary defects”, as delivered by the factory, and grown defects, accumulated during use), and they all have a couple hundred primary defects. But I don’t see why that would affect the reported geometry, given that it is fictional anyway. And all disks have enough spare tracks to accommodate the defects and still offer the full specified number of sectors, even with a long list of grown defects. Incidentally, all the 4 TB disks are still “perfect” in that they have no grown defects.

    And yes, ever since LBA, nobody has used sectors and cylinders for anything.


  • I’m not touching that post again. But a small rant about typesetting in Lemmy: It seems there is no way whatsoever to put angle brackets in a “code” section. In an overzealous attempt to prevent HTML injection, everything in angle brackets is just removed when posting (although it remains there in the preview). In normal text, you can use the HTML entity “&lt;”, but not inside “code” segments, where the entity will be retained verbatim.



  • If you’re as paranoid as me about data integrity, SAS drives on a host adapter card in “Initiator Target” (IT) mode, with the write cache on the disks disabled, are the safest option. This will degrade performance when writing many small files concurrently, but not as badly as with SATA drives (that’s for spinning disks, of course, not SSDs). With a good error-correcting redundant system such as ZFS you can probably get away with an enabled write cache in most cases. Until you can’t.


  • RAID is generally a good thing but don’t get complacent, follow the 3-2-1 method

    To expand on that: Redundant drive setups and backups serve completely different purposes. The only overlap is in case of a single disk failure, where RAID (or similar) may save the data.

    Redundancy is all about reducing downtime in case of single hardware failures. Backups not only protect you from data loss in case of multiple simultaneous failures, but also from accidental deletion. Failures that require restoration of data almost always involve downtime. In short: You always need backups (unless it’s strictly a local cache, and easily recreatable), but if you want high availability, redundancy may help.

    The 3-2-1 rule for backups, in case you’re unfamiliar: 3 copies of important data, on 2 different media, with 1 off-site.
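
As a rough illustration, the rule can be written down as a check over a list of copies. The `Copy` type, its fields, and the example plans are hypothetical, and the sketch assumes the usual reading in which the live data counts as one of the three copies and a RAID array counts as a single copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str        # free-form label, e.g. "NAS pool"
    medium: str      # kind of storage, e.g. "hdd", "tape", "cloud"
    offsite: bool    # True if stored at a different location


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """At least 3 copies, on at least 2 different media, at least 1 off-site."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# A redundant pool is still only one copy, however many disks it has.
raid_only = [Copy("NAS pool (raidz1)", "hdd", offsite=False)]

full_plan = [
    Copy("NAS pool (raidz1)", "hdd", offsite=False),
    Copy("USB disk in the drawer", "hdd", offsite=False),
    Copy("encrypted cloud bucket", "cloud", offsite=True),
]

print(satisfies_3_2_1(raid_only))   # False
print(satisfies_3_2_1(full_plan))   # True
```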



  • Gold-plating the connectors is actually one of the few things that does make sense. When new, they won’t sound better, but they corrode less, which can, sometime in the future, make a difference, albeit a very slight one: surface oxidation can form a tiny capacitor. That said, I think you’d be hard-pressed to tell the difference from chrome-plated ones. But unlike lots of other esoteric “high-end” nonsense, this one has at least theoretical technical merit. And the micrometer-scale galvanic gold-plating isn’t expensive, either.


  • Most of the OnePlus series, including older models, is fully supported by LineageOS, and unlocking the bootloader is straightforward. Those were the most important reasons for me to go with OnePlus. For me and my family there was nothing else comparably well supported by Lineage with a good price/performance ratio. We currently use 6T and 8T models, which we bought used. The only downside for me is the lack of a notification light.