Wednesday, February 18, 2015

USB MicroSD Card Storage Solution using ZFS on Linux - Fast, Reliable & Inexpensive!

Recently a nearly brand new WD MyBook 4TB died unexpectedly, so I have been reorganizing my storage for the future.
ZFS is my first choice of filesystem going forward, and the results so far are very promising. In particular, I have had very decent results with deduplication and compression.

Interestingly, though, these features are new enough that many common system utilities like df do not support them yet, so using those utilities will get you inaccurate results. For instance, df cannot handle deduplication: once you start putting extra GBs of dedup'd files on the pool, it starts reporting the disk as much bigger than it really is. Also, because so much happens in memory, you never really know exactly how fast things are going, so the reported throughput is probably inflated too. The end result is that you don't really know how much space you have at any given time or how fast the pool is, yet you get the general sense that it's safe. At least I did: I was able to break and repair the zpool several times, simulate corruption and scrub it away, and resilver new disks.

To be as accurate as possible with these numbers, I wanted to make sure the files weren't simply sitting cached in memory without being written, so I kept checking zpool iostat -v (I saw what looked like write operations queued up and slowly dwindling down). Strangely, du -sh said the full file was already there. I md5sum'd it and it had the same signature as the original file. Finally, to settle it with certainty, I shut down zfs-fuse itself (which quickly unmounted the filesystems), disconnected and reconnected the drives, and brought zfs-fuse back up. Sure enough, the full file was there. It seems a bit like voodoo, but it works well enough for me.
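
The checksum comparison described above can also be scripted. What follows is only a minimal sketch, assuming Python 3 is available; the helper names file_md5 and copies_match are illustrative and not part of the original setup. As in the post, the comparison is most convincing after zfs-fuse has been restarted, so that the copy is read back from the cards rather than from cache.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at path, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, copy):
    """True when both files have identical MD5 digests."""
    return file_md5(original) == file_md5(copy)
```

With the file names from this post, the call would be copies_match("big_file_00.big_file", "/MYWINPOOL/big_file_00.big_file").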

I realize the numbers I achieved are probably not 100% accurate because of all the memory caching and compression, but that doesn't really matter for my needs. What I need is to survive a drive loss and keep functioning, without becoming sluggish, until the new drive is brought online. ZFS accomplishes all that and more: deduplication, compression, scrubbing (against bitrot) and snapshots. Plus it's free, so I am very content with ZFS. Here is my setup and results:

Acer C720 (w/2GB RAM) - ChromeOS with Crouton, chroot'd to Ubuntu 14.04 (Trusty)
Upgraded SSD to a 128 GB MyDigital SSD w/6GB SuperCache2
1 Vantec 10 port USB 3.0 hub ($45 from NewEgg)
(Update Feb 19, 2015 looks like the price is now $60 for this)
5 USB 3.0 MicroSD SDXC Card Readers  (5 x $5 AliExpress)
5 SanDisk MicroSD 128GB              (5 x $13 AliExpress)
Grand Total  $135
ZFS-FUSE  (apt-get install zfs-fuse)

Copying a normal file with cp averages approximately 300MB/s !!
The normal speed for these cards (formatted exFAT or FAT32) is approximately 29MB/s, so even with the raidz1 striping and redundancy the speed is roughly 10 times higher. Not bad! =)

time cp big_file_00.big_file /MYWINPOOL/
‘big_file_00.big_file’ -> ‘/MYWINPOOL/big_file_00.big_file’
real    0m1.030s
user    0m0.007s
sys    0m0.360s
(trusty)cronkilla@localhost:/MYWINPOOL$ du -sh big_file_00.big_file
300M    big_file_00.big_file
(Note: big_file was created via dd if=/dev/urandom)
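
The arithmetic behind these figures can be checked with a short sketch. The helper names are illustrative, and the numbers are taken directly from the cp timing above (300M in 1.030s of wall time) and the 29MB/s baseline; as the post already notes, a timing this short may largely reflect memory caching rather than what the cards sustain.

```python
def throughput_mb_s(size_mb, seconds):
    """Average throughput in MB/s for size_mb megabytes moved in `seconds`."""
    return size_mb / seconds


def speedup(new_rate, baseline_rate):
    """How many times faster new_rate is than baseline_rate."""
    return new_rate / baseline_rate


if __name__ == "__main__":
    rate = throughput_mb_s(300, 1.030)  # figures from the cp timing above
    print(f"cp throughput: {rate:.0f} MB/s")
    print(f"relative to 29 MB/s: {speedup(rate, 29):.1f}x")
```

This works out to about 291 MB/s, or roughly 10 times the single-card baseline.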

Sequential benchmark performance using /dev/zero (my alias bm) ranges from 490MB/s to 540MB/s.
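
Because the pool shown further down is configured with compression=zle (which collapses runs of zeros) and dedup=on, and because writes are cached in memory, a /dev/zero benchmark can overstate what the cards actually sustain. The following is a hedged alternative sketch, not a reconstruction of the bm alias (which is not shown in this post): it writes unique random blocks and includes an fsync in the timing. The name timed_write and its parameters are hypothetical, and how completely zfs-fuse honors fsync is an assumption here.

```python
import os
import time


def timed_write(path, total_bytes, block_size=1 << 20):
    """Write total_bytes of random data to path and return MB/s.

    Each block comes from os.urandom, so it is incompressible and unique
    (neither zle compression nor dedup can shrink it). Generating the data
    is excluded from the timing; the final flush and fsync are included.
    """
    elapsed = 0.0
    written = 0
    with open(path, "wb") as fh:
        while written < total_bytes:
            n = min(block_size, total_bytes - written)
            block = os.urandom(n)  # not timed
            start = time.perf_counter()
            fh.write(block)
            elapsed += time.perf_counter() - start
            written += n
        start = time.perf_counter()
        fh.flush()
        os.fsync(fh.fileno())  # ask the OS to push the data to the device
        elapsed += time.perf_counter() - start
    return written / elapsed / 1e6
```

A call such as timed_write("/MYWINPOOL/bench.bin", 300 * (1 << 20)) would write 300 MiB; the file name bench.bin is only an example.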

To obtain these results I raised max-arc-size in the ZFS configuration file (/etc/zfs/zfsrc) from 100 to 1000,
which had a big impact. I also changed a few other parameters:
max-arc-size = 1000
fuse-mount-options = default_permissions,big_writes,allow_other
#zfs-prefetch-disable   ### This was uncommented and I commented it out

(trusty)cronkilla@localhost:~$ sudo zpool history MYWINPOOL
History for 'MYWINPOOL':
2015-02-18.18:51:15 zpool create -f MYWINPOOL raidz1 /dev/disk/by-id/usb-Generic_STORAGE_DEVICE_FUNWAY091552-0:0 /dev/disk/by-id/usb-Generic_STORAGE_DEVICE_FUNWAY091125-0:0 /dev/disk/by-id/usb-Generic_STORAGE_DEVICE_FUNWAY091147-0:0 /dev/disk/by-id/usb-Generic_STORAGE_DEVICE_FUNWAY091067-0:0 /dev/disk/by-id/usb-Generic_STORAGE_DEVICE_FUNWAY090855-0:0
2015-02-18.18:53:26 zfs set compression=zle MYWINPOOL
2015-02-18.18:53:27 zfs set checksum=fletcher4 MYWINPOOL
2015-02-18.18:53:29 zfs set dedup=on MYWINPOOL
2015-02-18.18:53:30 zfs set xattr=off MYWINPOOL
2015-02-18.18:53:31 zfs set atime=off MYWINPOOL

Here is a quick glimpse of what zpool iostat -v looked like during a write test:
(trusty)cronkilla@localhost:~$ sudo zpool iostat -v
                                           capacity     operations    bandwidth
pool                                    alloc   free   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
MYWINPOOL                                151M   596G      0     27    834  1.89M
  raidz1                                 151M   596G      0     27    834  1.89M
    disk/by-id/usb-Generic_STORAGE_DEVICE_FUNWAY091552-0:0      -      -      0      9  13.7K   493K
    disk/by-id/usb-Generic_STORAGE_DEVICE_FUNWAY091125-0:0      -      -      0      9  13.7K   495K
    disk/by-id/usb-Generic_STORAGE_DEVICE_FUNWAY091147-0:0      -      -      0     10  13.7K   501K
    disk/by-id/usb-Generic_STORAGE_DEVICE_FUNWAY091067-0:0      -      -      0      9  14.6K   500K
    disk/by-id/usb-Generic_STORAGE_DEVICE_FUNWAY090855-0:0      -      -      0     11  13.7K   509K
--------------------------------------  -----  -----  -----  -----  -----  -----


Here is more detailed pool data from zdb, shown only through the first disk (the remaining disks are omitted):
(trusty)cronkilla@localhost:~$ sudo zdb
MYWINPOOL:
    version: 23
    name: 'MYWINPOOL'
    state: 0
    txg: 4
    pool_guid: 17797988667815477235
    hostid: 8323328
    hostname: 'localhost'
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 17797988667815477235
        create_txg: 4
        children[0]:
            type: 'raidz'
            id: 0
            guid: 16975055505754696246
            nparity: 1
            metaslab_array: 23
            metaslab_shift: 32
            ashift: 9
            asize: 644221501440
            is_log: 0
            create_txg: 4
            children[0]:
                id: 0
                guid: 13728714778704373774
                path: '/dev/disk/by-id/usb-Generic_STORAGE_DEVICE_FUNWAY091552-0:0'
                whole_disk: 0
                create_txg: 4
...  
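
As a rough cross-check of the capacity figures, the asize in the zdb output covers all five cards including parity, so an approximate usable figure for raidz1 is asize multiplied by (disks - parity) / disks. This sketch ignores RAID-Z padding and metadata overhead as well as any gains from compression or dedup; raidz_usable_bytes is an illustrative helper, not a ZFS API.

```python
GIB = 1 << 30


def raidz_usable_bytes(asize, n_disks, nparity):
    """Approximate usable bytes of a raidz vdev from its raw asize."""
    if not 0 < nparity < n_disks:
        raise ValueError("nparity must be between 1 and n_disks - 1")
    return asize * (n_disks - nparity) // n_disks


if __name__ == "__main__":
    # asize and nparity are taken from the zdb output above; 5 disks in the pool
    usable = raidz_usable_bytes(644221501440, n_disks=5, nparity=1)
    print(f"approx. usable: {usable / GIB:.0f} GiB")
```

For this pool the estimate comes to about 480 GiB, compared with the roughly 596G of raw space that zpool iostat reports.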


In conclusion, I found ZFS and MicroSD cards to be a particularly good pairing, since MicroSD cards are super cheap and ZFS already acts as a software RAID controller. ZFS provides resiliency against bitrot via scrubbing, RAID redundancy against drive failures, and snapshots, so ZFS w/MicroSD cards - in my opinion - makes for a nearly perfect complementary match.

As of Feb 19, 2015, this won't work directly on Windows, since Windows doesn't support ZFS, but there are VirtualBox / drag-and-drop Guest Additions pass-throughs.. =)

7 comments:

  1. As of Feb 21, 2015 this setup is still working well, yet I have noticed that every scrub produces repairs. So it does seem like there is much more bitrot going on with MicroSD cards than I previously knew about. So I definitely recommend daily scrubs for any MicroSD ZFS Pool.

    Also I want to mention that the other ports on my 10 port hub are used for ancillary backups: 4 are dedicated to a triply mirrored 128GB MicroSD and the 10th to yet another 4TB WD hard drive. This is also accompanied by offline backups to USB sticks, and uploads to multiple cloud backup accounts.
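
The daily scrub recommended in this comment could be automated along the following lines. This is a minimal sketch under stated assumptions: it relies only on the standard zpool scrub and zpool status -x subcommands, the helper names are hypothetical, the health check simply matches the "is healthy" wording that zpool status -x normally prints (treat that match as an assumption), and the script would need to run with root privileges from a daily scheduler such as cron.

```python
import subprocess


def scrub_command(pool):
    """Command line that starts a scrub of `pool`."""
    return ["zpool", "scrub", pool]


def status_command(pool):
    """Command line that prints a short health summary for `pool`."""
    return ["zpool", "status", "-x", pool]


def looks_healthy(status_output):
    """True if the `zpool status -x` text reports a healthy pool."""
    return ("is healthy" in status_output
            or "all pools are healthy" in status_output)


def start_scrub(pool, run=subprocess.run):
    """Start a scrub; raises CalledProcessError if zpool exits non-zero."""
    run(scrub_command(pool), check=True)


def pool_is_healthy(pool, run=subprocess.run):
    """Run `zpool status -x` for `pool` and interpret its output."""
    result = run(status_command(pool), check=True,
                 capture_output=True, text=True)
    return looks_healthy(result.stdout)
```

Note that zpool scrub returns immediately and the scrub continues in the background, so a health check is more meaningful when run some time later, for example before the next day's scrub starts.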

  2. It might be worth considering whether the bit rot of your microSDs which did not exist at the beginning was caused by ZFS's CoW nature which, in time, wore down the SD cards (which have limited writes). If this is true, then ZFS is fixing errors which it is causing itself.

    A better choice for a dumb flash storage medium, such as an SD card, would be the F2FS filesystem. However, as it lacks some cool ZFS/BTRFS features (for example compression and RAID), I looked into using F2FS on top of a ZFS ZVOL. After thinking about it, though, it seems that because of all those extra ZFS features, a ZVOL (unlike LVM) would not be transparent to F2FS, preventing its direct access to the SD card, and thus the benefits of F2FS would be lost.

  3. Actually my SD cards are brand new, so the write threshold isn't a concern. From my experience over time, it looks to me like MicroSD is simply an inferior storage medium that needs its fancy embedded microcontroller to constantly juggle away bitrot.

    In my opinion it is, generally speaking, not currently stable enough to be suitable for long term storage. But because it is cheap, the sizes and speed are decent, and it is very portable, it still makes sense if those things are important to you. I looked at F2FS and it seems interesting, but given the heavy presence of bitrot, the main thing that I think makes it unsuitable for microSD is that it has only a single shadow copy (mirror) for resiliency and does not self-heal.

    On microSD this is not enough, in my opinion. I now put everything on RAIDZ2 on my microSD ZPOOL, scrub daily, and have scheduled rar archives created daily with 10% recovery records. Additionally, I run PAR2 (Reed-Solomon) to create a further set of 10% recovery records for the rar archives, which back up all important directory structures. I also still have snapshots sent off of the ZPOOL. Even with this system I have had to restore the entire ZPOOL twice in the last 3 months because enough bitrot happened in sensitive areas to fault the entire pool. That was before I made hot swappable disks available and set autorecover on; since then, about 2 months ago, I have not had a problem.

    Replies
    1. I disagree that bit rot is an integral part of microSD on that scale. It can't be, and if it is, you have grounds for a warranty replacement. Moreover, I disagree that what we have here is bit rot at all. In my opinion the cards got damaged by excessive writes.

      I have several SD cards. An Adata Premiere was running my Arch Linux installation as root with a RAID1 setup (two partitions on one card) + swap. The other two cards, one SanDisk Premiere 16GB and one no-name 16GB, were used as storage with F2FS. After four months of this setup, I replaced the Adata with a SanDisk Extreme USB 3.0 stick and made a ZFS pool of all three cards. Results: the Adata is done for. ZFS reports errors instantly when data is written to it. The other two cards, even though much older, show no signs of damage and no ZFS errors. I am pretty sure this is because they received far fewer writes than the Adata.

      The symptoms you describe are consistent with the above. SD cards are just not meant for a write-intensive life, especially on CoW systems. This is why I would suggest using them only for applications where reads vastly outnumber writes, i.e. an HTTP server, media storage etc.

    2. As I said, the MicroSD cards I used are brand new, and the system is not write-intensive. This is a seldom used drive that is only sometimes used as a sandbox; generally it sees ancillary writes, maybe a few hundred a day.

      Try it yourself, not with USB sticks but with MicroSD cards: make a ZPOOL and see what ZFS finds after a scrub. For me it's usually about 100 checksum errors and about 10MB of block repairs; the errors clear and afterward the whole pool appears healthy.

      The speed is fine and everything works fine (as expected) otherwise.

      I have read many times online that MicroSD cards are crap storage that SSD manufacturers make. I never wanted to believe it, but after this whole bitrot issue I recalled how, when they first came out, I kept my MP3 files and a few movies on them, and over time all the music developed scratchy hiccups and the movies were corrupted too. I am pretty sure it happens on all microSD media regardless of whether you use the card or not; the bits simply flip by themselves.

  4. Could it be that they are not real media errors, but that the really cheap SD adapter controllers, or the adapters themselves, are returning flipped bits under heavy load? Also, you have to be careful buying SD cards from AliExpress/eBay as there are a lot of fakes.

  5. Yes, I'm fairly sure it was not the adapters, since I used decent quality full size USB 3 adapters that worked flawlessly with better quality chips. The SD cards were the cheap ones that probably did fail QA at some point. But I have tried higher quality, store bought, smaller SD cards and the results are the same. You can probably notice it yourself if you take an SD card and put a large movie on it. Watch the movie with a notepad and note any glitches and the times at which they occur, then let the card sit for about 6 months and watch it again. You will likely see many new glitches. That is bit rot, eating away the data.
