I manage a small set of Citrix XenServer hosts for various infrastructure functions. For storage, I’ve been running Openfiler for about three years now; since the last reboot, my uptime is 1614 days! It’s pretty solid, but the interface seems buggy, and there’s a lot in there I don’t use. When I do need to go change something, it’s been so long between uses that I have to re-read the documentation to figure out what the heck it’s doing. I’ve got a new XenServer cluster coming online soon, and I’ve been researching, thinking, and dreaming about what I’m going to use for VM storage this time.

Openfiler really has been mostly great. My server load always runs about 1.13, which somewhat bugs me, mostly due to conary (its package manager) running. Openfiler is almost never updated, which isn’t a bad thing, since the machine is inside our firewall with no internet access unless I set a specific NAT rule for it. I’m running it on an old Dell 310 server with two 2TB drives in RAID1; it’s got 4GB of RAM and boots from the same drives Openfiler runs its magic on (this server was originally implemented as a quick fix to get us off local Xen storage so we could do rolling restarts). That’s not a problem, but now, three years later, I notice the latest version IS THE SAME version I have installed and have been running for the last 1614 days… So maybe it’s time to find something new.

So I built out a nice Dell 530 server: dual 16GB flash cards, dual 120GB write-intensive SSDs, a bunch of 2TB SATA drives, dual six-core procs, 32GB of RAM, dual power supplies, and a nice RAID card. The system arrived, and I’d gotten a lot of good feedback about NAS4Free, both online (Googling, lots of Reddit threads) and from in-person recommendations. I was pretty excited about it, honestly. I’m a little unfamiliar with FreeBSD, but I’ve used it on and off in my now 20-year Linux career. I went ahead and installed it to the 16GB flash, as recommended. I disabled RAID on the server and set up all the drives as plain SATA. Booted the system and got rolling. It was really simple, seems easy to use, and does WAY more than I could ever actually want in a storage device. I set up a big LUN with ZFS and iSCSI, added the write-intensive SSDs as cache (roughly the layout sketched just after this list), installed all the recent updates, and was ready.. Then I read the documentation a bit:

  • iSCSI can’t make use of the SSD write cache.. Well, I guess the only way to get that speed would be an all-SSD LUN.
    • “A dedicated log device will have no effect on CIFS, AFP, or iSCSI as these protocols rarely use synchronous writes.”
  • Don’t use more than 50% of your storage space with ZFS and iSCSI.. WHAT?
    • “At 90% capacity, ZFS switches from performance- to space-based optimization, which has massive performance implications. For maximum write performance and to prevent problems with drive replacement, add more capacity before a pool reaches 80%. If you are using iSCSI, it is recommended to not let the pool go over 50% capacity to prevent fragmentation issues.”
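
For context, what I’d clicked together in the NAS4Free web UI boils down to something like the following under the hood. This is a sketch, not my exact build: the pool name, vdev layout, device names, and sizes are all placeholders:

```sh
# Pool over the 2TB SATA drives (FreeBSD-style device names, purely illustrative)
zpool create tank raidz2 da0 da1 da2 da3 da4 da5

# The write-intensive SSDs as a mirrored log device (SLOG) -- which, per the
# documentation quoted above, an iSCSI workload will rarely touch, since it
# rarely issues synchronous writes
zpool add tank log mirror ada0 ada1

# A zvol to export as the iSCSI LUN
zfs create -V 1T -o volblocksize=64k tank/vm-lun
```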

So this was some sad news: no write caching, and I can’t use more than 50% of my disk space. But I decided to press on, and went home for the night. The next morning I got a friendly email from my new server saying it had some critical updates. Cool, I thought, so I installed the updates, and now it wants to reboot. So I let NAS4Free reboot. Two days later, more critical updates and another reboot required.. This is a bad thing for me. I run servers that really need to be up 24/7/365. Yes, we run everything clustered and redundant, and we can reboot a server without anyone noticing, but not the entire storage device; that kills the point of having my VMs all stay up. It’s still okay, because we have a second VM cluster, which has “the sister machines” to all our cluster nodes going into it. I just don’t want to have to fully shut down a VM cluster so the storage host can reboot once or twice a week. Kudos to the NAS4Free guys, though; it’s a really good thing they’re so active, it’s just not going to be the device for me.

So, I ripped it apart: created 2x RAID1 SSD, a RAID10 set out of the 2TB drives, and installed my best friend Debian. Debian is rock solid; I only need to reboot for kernel updates, and those are few. I installed iscsitarget, set up my block devices using LVM, and bam! Within 30 minutes I had an iSCSI target set up and connected to Xen.
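
The rough shape of it, for the curious. Everything below is illustrative: the device name, volume size, and IQN are placeholders rather than my production values, but the moving parts are LVM plus the iSCSI Enterprise Target that Debian’s iscsitarget package provides:

```sh
# LVM on top of the hardware RAID10 array (the device name is an assumption)
pvcreate /dev/sdb
vgcreate vg_iscsi /dev/sdb
lvcreate -L 500G -n lv_vm0 vg_iscsi      # one logical volume per LUN

# iSCSI Enterprise Target: the iscsitarget package plus its kernel module
apt-get install iscsitarget iscsitarget-dkms
sed -i 's/ISCSITARGET_ENABLE=false/ISCSITARGET_ENABLE=true/' /etc/default/iscsitarget

# Export the logical volume as LUN 0 of a target (placeholder IQN)
cat >> /etc/iet/ietd.conf <<'EOF'
Target iqn.2014-06.local.example:storage.vm0
    Lun 0 Path=/dev/vg_iscsi/lv_vm0,Type=blockio
EOF

service iscsitarget restart
```

Type=blockio hands the logical volume to the initiator as a raw block device instead of going through the page cache (fileio), which is generally what you want for VM storage.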

Reliability? I see a lot of ZFS fanboys insisting that hardware RAID sucks, ZFS is awesome, good luck recovering your data, etc. I really haven’t had problems with RAID in the 15+ years I’ve been using it. We buy vendor-supported hardware; if something dies, Dell sends me a new one. I back up onsite and offsite. I haven’t had to restore from a backup (other than testing restores) in years. I think this will all be okay.

Next article, I’ll write about setting up my iSCSI target, since there aren’t many decent articles out there. It’s really pretty simple. I even have multipath I/O working.
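
As a teaser, the initiator side is mostly just logging in to the same target over two separate storage networks and letting device-mapper multipath merge the sessions. The addresses below are made up, and on XenServer most of this is handled for you once multipathing is enabled on the host, but the idea looks roughly like this:

```sh
# Discover the same target through both portal addresses (made-up IPs)
iscsiadm -m discovery -t sendtargets -p 10.0.10.10
iscsiadm -m discovery -t sendtargets -p 10.0.20.10

# Log in to every discovered portal
iscsiadm -m node --login

# Verify that multipath has coalesced the sessions into one multipathed device
multipath -ll
```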