Increased Longevity multiple cells Option for flash and ssds etc. #60
Comments
The limiting factor will always be the device firmware. If a media failure causes the firmware to lose some bits in the translation layer or some similarly critical metadata region, all user data on the entire device (including all filesystems) will be lost instantly. Some large fleet studies suggest this happens in about 1 of 4 drive failures. Also, the firmware has to cooperate for this scheme to work. If the data is whitened or compressed by the firmware, it will not be possible to predict which bit patterns correspond to which cells. If the firmware relocates data on the media, then it might cause cells to alternate between storing high- and low-longevity data at different times. If the firmware detects a single-bit error in a cell, it may refuse to read the entire sector, so a slightly different analog value will result in digital data that is not readable at all. This places an upper bound on how much benefit can be obtained from data encodings. They are only useful in the boundary between cases where all of the following occur:
Most SSDs will fail on points 4 and 5, especially when they apply their own MLC cell wear mitigations in firmware. An SSD that fails on point 3 is typically not an SSD you want to use, since you'll need filesystem-level checksums in order to trust any data you read from such a device. You'd need an SSD that had firmware support for bypassing points 3-5 in order for this to work at all. Maybe something like a ZNS drive with some extended interface. But if you have such control over firmware, it would likely be better to make partitions that use different cell depths, and let the firmware take care of the details (picking good values, aligning bits to cell boundaries, preventing relocations between partitions of different cell depths, etc.). Then you create separate filesystems in each partition for the data of different longevities. That is far simpler than extending btrfs to do it on a per-file basis without requiring all metadata in the filesystem to be stored with the highest longevity method. Separate filesystems have separate metadata, so a sector fault in a low-longevity partition won't destroy a high-longevity filesystem.
Wow, impressive consideration. Part of my thought came from seeing a Hackaday post where someone made an SLC drive out of an MLC drive by just changing the firmware. I expected most drives to allow reading of corrupted data, probably in some debug mode. How would the message be passed to the drive controller that the user requests extra care? Drive manufacturers might choose different methods, but what is optimal? What sets of requests make sense? Making a partition with SLC, being more careful with the tables for wear leveling and error mitigation, doing extra CRC checks, writing data 3x... Program data can often be restored. OS data can be restored... Personal data, core firmware, UEFI, and the hashes of the OS and program data should be fortified.
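As a rough illustration of the "write data 3x" request mentioned above, here is a minimal Python sketch of bitwise majority voting over three stored copies. The function name `majority_vote` is hypothetical and it operates on plain byte strings; it assumes the drive actually hands back the damaged copy instead of refusing to read the sector, which the earlier comment points out most firmware will not do.

```python
def majority_vote(copies):
    """Recover data from three equally long copies by bitwise majority.

    Hypothetical helper for illustration: each output bit takes the value
    that at least two of the three copies agree on.
    """
    if len(copies) != 3:
        raise ValueError("exactly three copies are required")
    a, b, c = copies
    if not (len(a) == len(b) == len(c)):
        raise ValueError("copies must have the same length")
    # (x & y) | (x & z) | (y & z) is 1 exactly where two or more inputs are 1.
    return bytes((x & y) | (x & z) | (y & z) for x, y, z in zip(a, b, c))


if __name__ == "__main__":
    good = b"key material"
    damaged = bytearray(good)
    damaged[0] ^= 0xFF  # simulate one copy with a corrupted first byte
    print(majority_vote([good, bytes(damaged), good]) == good)
```

This only helps when the corruption hits different bits in different copies; if two copies are damaged at the same bit position, the vote silently returns the wrong value, so a checksum over the result would still be needed.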
Nowadays with TLC and QLC drives going for mega storage but with disregard for longevity, I propose an option to game the controller, or negotiate with the controller, to use only patterns that can last a long time.
So if a controller has 2^4 charge levels for a cell (the bit value changes if the charge degrades past 1/16th of the value), could we use 0010 and 1101 as 1 and 0, or 1111 and 0000, and effectively make it so we can correct more errors? If the drive returns 0001 or 1100, we can easily round them to the correct values.
I proposed not hitting all 1s or 0s to reduce stress on the drive.
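The rounding idea can be sketched in Python as nearest-level decoding. This is only a toy model under loud assumptions: it treats the 4-bit value read back from a cell as a charge level from 0 to 15 in plain binary order, and it assumes the host can see that raw value at all. Real firmware may use a different bit-to-level mapping, scramble the data, or refuse to return a sector with errors. The names `LEVEL_FOR_BIT`, `encode_bits`, and `decode_cell` are made up for illustration.

```python
# Toy model: one logical bit per 16-level cell, using the two levels
# suggested above (0010 for 1, 1101 for 0) so neither extreme is used.
LEVEL_FOR_BIT = {1: 0b0010, 0: 0b1101}


def encode_bits(bits):
    """Map logical bits to the 4-bit level written to each cell."""
    return [LEVEL_FOR_BIT[b] for b in bits]


def decode_cell(raw):
    """Round a raw 4-bit reading to the logical bit with the nearest level."""
    if not 0 <= raw <= 0b1111:
        raise ValueError("raw reading must fit in 4 bits")
    return min(LEVEL_FOR_BIT, key=lambda bit: abs(LEVEL_FOR_BIT[bit] - raw))


if __name__ == "__main__":
    cells = encode_bits([1, 0, 1])
    cells[0] = 0b0001  # level drifted down by one step
    cells[1] = 0b1100  # level drifted down by one step
    print([decode_cell(c) for c in cells])
```

With these two levels the decision boundary sits between levels 7 and 8, so in this model a reading can drift by up to five levels before it decodes to the wrong bit.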
This would be used for important files, like the journal, and encryption keys and certificates. May go further and use more than one cell and just have a character device pretend to be a block device or something like that.
I understand this may not be in scope for BTRFS, but it might be cool to have a flag for hyper important long term storage ... I might do my documents directory that way.