Like most, I learn a lot more by doing things wrong before doing them right. Maybe I can save someone some of my learning pain, I mean curve!

Monday, September 21, 2009

zfs de-duplication has broken my heart, or ...

more accurately, my lack of attention to detail has broken my heart.  As I stated in my post on the FreeNAS forum "zfs de-duplication - is it working?" - "I absolutely hate it when reality doesn't match my pre-conceived ideas!!"

I'll blame it on the zfs zealots; surely it wasn't me. Surely these fire-brand-wielding zfs proselytes are to blame for connecting my wants, I mean needs, up to future features and not letting me see that the error of my ways, my desire for block-level de-duplication, was/is nothing but vaporware -- at least for now :(

While headway is being made on de-duplication for zfs, it is somewhere between alpha and beta land on Solaris (or OpenSolaris, or !@#$Solaris land, ...). Block-level de-duplication is nowhere near being part of zfs version 6 as currently implemented in FreeBSD 7. It isn't even part of version 13 that will be available in the next major FreeBSD release.
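For the skeptical (or the burned, like me), you can ask the system itself which pool versions and features it supports. The command below is standard ZFS; using it as a reality check is just my suggestion:

nas01:/# zpool upgrade -v

It prints each supported pool version along with the features that version added. De-duplication is conspicuously absent from the list on the v6 bits FreeNAS ships. Read it before you believe the zealots.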

I was torn about what to do. I considered going to Microsoft Windows Home Server and getting at least file-level de-duplication, or possibly cracking the piggy bank and going to Windows Server 2008. But in the end, after doing some scribbling on the back of a napkin, I figured out that I would be money ahead to buy another drive, re-build my array, and be OK (defined as enough space without block-level de-duplication) for another 12 - 18 months. So I decided to stay with FreeNAS, for now, until something better comes along, until I get bored, err, I'm rambling again.
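For the curious, the napkin math is simple. RAIDZ gives you n - 1 drives worth of usable space, and a fourth WD15EADS runs about $120 (a third of the $358.50 I paid for three; your street price may vary):

3 x 1.5 TB RAIDZ: (3 - 1) x 1.5 TB = 3.0 TB usable
4 x 1.5 TB RAIDZ: (4 - 1) x 1.5 TB = 4.5 TB usable

A 50% capacity bump for ~$120 beats a several-hundred-dollar Windows Server license every day of the week.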

Stay tuned until next time, when I discuss the trials and travails of getting some CF to IDE adapters to work!

And since it is NCAA Football season here in the States -- GEAUX TIGERS!!!

Bye for now,

lbe

Wednesday, September 2, 2009

Let's Tune 'er Up!!

I've had a little time since my last post to work on the tuning. Please consider this somewhere between coarse and medium tuning, and certainly not fine tuning!

As I stated in a previous post, using the 0.7RC1 straight install and enabling nothing more than "large read/write" and "use sendfile" in CIFS, I was able to achieve transfer rates of approximately 17 MB/sec. Performance with ftp was actually lower, ~9-10 MB/sec. In order to tune a system, one must know what the components can do. The components in this case were the raw drives, the zfs partition and samba/cifs. The tests of these components and their results are shown below.

Disk Test
I ran diskinfo from an ssh console against the first drive in my array (all three drives are the same - 1.5 TB WD Caviar Green WD15EADS).

nas01:/# diskinfo -tv ad4
ad4
512 # sectorsize
1500301910016 # mediasize in bytes (1.4T)
2930277168 # mediasize in sectors
2907021 # Cylinders according to firmware.
16 # Heads according to firmware.
63 # Sectors according to firmware.
ad:WD-WCAVY0511783 # Disk ident.

Seek times:
Full stroke: 250 iter in 7.376821 sec = 29.507 msec
Half stroke: 250 iter in 5.211402 sec = 20.846 msec
Quarter stroke: 500 iter in 8.364027 sec = 16.728 msec
Short forward: 400 iter in 3.197826 sec = 7.995 msec
Short backward: 400 iter in 3.506082 sec = 8.765 msec
Seq outer: 2048 iter in 0.774789 sec = 0.378 msec
Seq inner: 2048 iter in 0.571217 sec = 0.279 msec

Transfer rates:
outside: 102400 kbytes in 1.051929 sec = 97345 kbytes/sec
middle: 102400 kbytes in 1.142268 sec = 89646 kbytes/sec
inside: 102400 kbytes in 1.950994 sec = 52486 kbytes/sec

Clearly, disk performance is not the cause of the bottleneck: even the slowest sustained disk rate (~51 MB/sec on the inner tracks) is roughly three times my ~17 MB/sec transfer rate.

zfs Partition
I used dd to create a 1GB file using two different block sizes, 1 KB and 64 KB.


nas01:/mnt/vdev0mgmt# dd if=/dev/zero of=mytestfile.out bs=1K count=1048576
1048576+0 records in
1048576+0 records out
1073741824 bytes transferred in 47.028245 secs (22831850 bytes/sec)

nas01:/mnt/vdev0mgmt# dd if=/dev/zero of=mytestfile.out bs=64K count=16384
16384+0 records in
16384+0 records out
1073741824 bytes transferred in 11.612336 secs (92465619 bytes/sec)
Again, both of these cases exceeded my test case, though only barely with the 1 KB block size (~22 MB/sec vs. ~17 MB/sec). So this is not the culprit.
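In hindsight, I only exercised the write side of the zfs partition here. If you want the mirror-image read test, dd the file just created back into the bit bucket; just remember that with a 1 GB file and 2 GB of RAM, much of it will be served from cache:

nas01:/mnt/vdev0mgmt# dd if=mytestfile.out of=/dev/null bs=64K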

Network Transfer
I used iperf to test transfer rates in both directions, from server to workstation and vice versa. The workstation used is a quad-core Intel with 8 GB of RAM. Memory use during the tests never exceeded 6 GB, eliminating any workstation disk interaction. I ran iperf with two transaction sizes, 8 and 64 KB.
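One housekeeping note I didn't capture on screen: the receiving machine has to be running iperf in server mode before the client commands below will connect. That is just iperf -s on the FreeNAS box (or the same on the Windows box when testing the other direction):

nas01:/# iperf -s

The results are: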

C:\>iperf -l 8K -t 30 -i 2 -c 192.168.2.5
------------------------------------------------------------
Client connecting to 192.168.2.5, TCP port 5001
TCP window size: 8.00 KByte (default)
------------------------------------------------------------
[148] local 192.168.2.207 port 57889 connected with 192.168.2.5 port 5001
[ ID] Interval Transfer Bandwidth
[148] 0.0-30.0 sec 984 MBytes 275 Mbits/sec

C:\>iperf -l 64K -t 30 -i 2 -c 192.168.2.5
------------------------------------------------------------
Client connecting to 192.168.2.5, TCP port 5001
TCP window size: 8.00 KByte (default)
------------------------------------------------------------
[148] local 192.168.2.207 port 57890 connected with 192.168.2.5 port 5001
[ ID] Interval Transfer Bandwidth
[148] 0.0-30.3 sec 1.70 GBytes 482 Mbits/sec

Again, both of these cases exceeded my test case significantly, but they did show that transaction size has a lot to do with network efficiency.

samba/cifs Test
I used the Microsoft RoboCopy utility to copy a 1 GB file from and to the NAS server from my Windows workstation. The results are:

W:\tmp>robocopy . c: temp.file (READ)
Speed : 19140465 Bytes/sec.
Speed : 1095.226 MegaBytes/min.

Ended : Wed Sep 02 17:57:30 2009

This test more or less approximated my original test, though it was slightly faster (~19 MB/sec vs. ~17 MB/sec).

Conclusions:
  1. Neither the hard drives, the file system, nor the network were major contributors to the bottleneck.
  2. The bottlenecks, then, are "probably" in the kernel (stack, IPC, filesystem and network drivers) and in samba/cifs.

Being basically lazy, and definitely not a good scientist (sorry, teachers :( ), I surfed the web and found some tried-and-true tunings for samba/cifs as well as some items that seemed to make sense for the kernel. Note that testing has shown this configuration works for my server. I think the samba/cifs settings will likely help on any server, as they have for me over the years across multiple Linux and BSD distributions. The kernel tunings are likely to have a heavy dependency on the hardware that I use, namely the 1.6 GHz dual-core Atom 330, the Intel chipset (945GC northbridge and ICH7 southbridge), the Realtek NIC (RTL8111C) built into the MS-9832 motherboard, and its 2 GB of RAM. If your hardware is too dissimilar, you will "definitely" need to validate values on your own.
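One sanity check worth doing before crediting or cursing the NIC: confirm which driver it landed on and that it actually negotiated gigabit. On FreeBSD the RTL8111C is handled by the re(4) driver, so on my box it shows up as re0 (your interface name may differ, and the exact media string below is illustrative):

nas01:/# ifconfig re0 | grep media
        media: Ethernet autoselect (1000baseTX <full-duplex>)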

Here's what I did to tune my server.

samba/cifs tweaks
I added the following two lines to the auxiliary parameters on the services/cifs configuration page.

max xmit = 65535
socket options = TCP_NODELAY IPTOS_LOWDELAY SO_SNDBUF=65535 SO_RCVBUF=65535

I also set the send and receive buffers to 65535 on that same page to ensure that is what they actually are.
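If you want to confirm that Samba actually picked these up, testparm (which ships with Samba) will echo back the effective values; the grep pattern is just my shorthand:

nas01:/# testparm -sv | grep -E "max xmit|socket options"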

kernel tweaks
I harvested my kernel tunings from multiple locations, with references to their sources embedded as remarks below. These additions were made to my /cf/boot/loader.conf file since I am booting from a USB flash drive. I used the advanced file editor in the WebGUI to make these changes, since it takes care of mounting the flash drive read/write and then resets it to read-only.
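If you would rather skip the WebGUI and do it from an SSH session, the manual equivalent is roughly the following (/cf is where my embedded install mounts the flash; adjust to taste):

nas01:/# mount -uw /cf
nas01:/# vi /cf/boot/loader.conf
nas01:/# mount -ur /cf

Here is the full set of additions: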

# http://acs.lbl.gov/TCP-tuning/FreeBSD.html
kern.ipc.shmmax=67108864
kern.ipc.shmall=32768
# http://harryd71.blogspot.com/2008/10/tuning-freenas-zfs.html
vm.kmem_size_max="1024M"
vm.kmem_size="1024M"
vfs.zfs.prefetch_disable=1
# http://wiki.freebsd.org/ZFSTuningGuide
vfs.zfs.arc_max="100M"
# ups the spin-up timeout for drive recognition
hw.ata.to=15
# System tuning - Original -> 2097152
kern.ipc.maxsockbuf=16777216
# System tuning
kern.ipc.nmbclusters=32768
# System tuning
kern.ipc.somaxconn=8192
# System tuning
kern.maxfiles=65536
# System tuning
kern.maxfilesperproc=32768
# System tuning
net.inet.tcp.delayed_ack=0
# System tuning
net.inet.tcp.inflight.enable=0
# System tuning
net.inet.tcp.path_mtu_discovery=0
# http://acs.lbl.gov/TCP-tuning/FreeBSD.html
net.inet.tcp.recvbuf_auto=1
# http://acs.lbl.gov/TCP-tuning/FreeBSD.html
net.inet.tcp.recvbuf_inc=16384
# http://acs.lbl.gov/TCP-tuning/FreeBSD.html
net.inet.tcp.recvbuf_max=16777216
# System tuning
net.inet.tcp.recvspace=65536
# http://acs.lbl.gov/TCP-tuning/FreeBSD.html
net.inet.tcp.rfc1323=1
# http://acs.lbl.gov/TCP-tuning/FreeBSD.html
net.inet.tcp.sendbuf_auto=1
# http://acs.lbl.gov/TCP-tuning/FreeBSD.html
net.inet.tcp.sendbuf_inc=8192
# System tuning
net.inet.tcp.sendspace=65536
# System tuning
net.inet.udp.maxdgram=57344
# System tuning
net.inet.udp.recvspace=65536
# System tuning
net.local.stream.recvspace=65536
# System tuning
net.local.stream.sendspace=65536
# http://acs.lbl.gov/TCP-tuning/FreeBSD.html
net.inet.tcp.sendbuf_max=16777216
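After the reboot, it's worth spot-checking that the values actually stuck, since a mistyped loader tunable just silently keeps its default. Pick a few and query them; the output should come back looking something like this:

nas01:/# sysctl vfs.zfs.prefetch_disable kern.ipc.maxsockbuf net.inet.tcp.delayed_ack
vfs.zfs.prefetch_disable: 1
kern.ipc.maxsockbuf: 16777216
net.inet.tcp.delayed_ack: 0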

While I originally had the intent of testing the impact of the individual settings, I quickly grew bored with the reboots and rigor required (hence my reason for choosing an engineering vs. a scientific career :)). I can clearly say, though, that you "must" disable the zfs prefetch in order to get read rates up to the levels that I have achieved.

Tuned Results!
My first tests were to verify that I achieved my end goal of having read and write rates in excess of 35 MB/sec. My goal was achieved. Yeah!!!


C:\>robocopy w: . temp.file (READ)
Speed : 39179078 Bytes/sec.
Speed : 2241.844 MegaBytes/min.


C:\>robocopy . w: temp.file2 (WRITE)
Speed : 35574390 Bytes/sec.
Speed : 2035.582 MegaBytes/min.

While the previous test is verification of my goal, I wanted to see if the changes to the kernel network configuration changed the base network performance. They did, significantly.

C:\>iperf -l 8k -t 30 -i 2 -c 192.168.2.5
------------------------------------------------------------
Client connecting to 192.168.2.5, TCP port 5001
TCP window size: 8.00 KByte (default)
------------------------------------------------------------
[148] local 192.168.2.207 port 58332 connected with 192.168.2.5 port 5001
[ ID] Interval Transfer Bandwidth
[148] 0.0-30.0 sec 985 MBytes 276 Mbits/sec

C:\>iperf -l 64k -t 30 -i 2 -c 192.168.2.5
------------------------------------------------------------
Client connecting to 192.168.2.5, TCP port 5001
TCP window size: 8.00 KByte (default)
------------------------------------------------------------
[148] local 192.168.2.207 port 58334 connected with 192.168.2.5 port 5001
[ ID] Interval Transfer Bandwidth
[148] 0.0-30.3 sec 1.89 GBytes 537 Mbits/sec

nas01:/# iperf -l 8k -t 30 -i 2 -c 192.168.2.207
------------------------------------------------------------
Client connecting to 192.168.2.207, TCP port 5001
TCP window size: 65.0 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.2.5 port 55338 connected with 192.168.2.207 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-30.1 sec 2.11 GBytes 601 Mbits/sec

nas01:/# iperf -l 64k -t 30 -i 2 -c 192.168.2.207
------------------------------------------------------------
Client connecting to 192.168.2.207, TCP port 5001
TCP window size: 65.0 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.2.5 port 49701 connected with 192.168.2.207 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-30.1 sec 2.12 GBytes 604 Mbits/sec

In the workstation-to-server direction, the tuning left the small transaction rate essentially unchanged (275 to 276 Mbits/sec) but raised the large transaction rate from 482 to 537 Mbits/sec. In the server-to-workstation direction, I did not capture the untuned screens, but the results were approximately 550 Mbits/sec untuned and are now 601 Mbits/sec for the small transaction size and 604 Mbits/sec for the large.

As I sleepily look back at what I have done, I realize that I have not tested the NAS server to the point of determining the effect of filling the cache, since I have tested with file sizes of 1 GB and have 2 GB of RAM in the server. That will have to wait for another day.
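When that day comes, the test is simple enough: build a file comfortably larger than RAM and repeat the robocopy runs against it. Something like the following would do it (the file name is a placeholder, and 65536 x 64 KB = 4 GB, double my RAM):

nas01:/mnt/vdev0mgmt# dd if=/dev/zero of=bigtestfile.out bs=64K count=65536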

Again, as I warned earlier, you "must" test to make sure that these tweaks are amenable to your hardware. If you are running a 32-bit version of FreeNAS, there are many more kernel tweaks needed. If you are running with less memory, you will need to reduce some of the allocations. If you have a much larger server with many more clients, you will need to increase the allocations and will probably want a better NIC than my bargain-basement Realtek.

Realtek takes a beating in many of the NAS and network-centric forums; however, depending upon the usage patterns, the Realtek can be a more than capable GigE NIC, as this testing shows. It just bears getting the tuning right!!!

Bye for now, more errors are coming this way!!!

Wednesday, August 26, 2009

NAS - OS selection

To ZFS or not to ZFS; to Windows SIS or not to Windows SIS; to use a dedicated NAS appliance OS or not to .... Well, you get the point. This is what I have spent most of my brain cycles, yes all three of them, thinking about for this project.

In general, I want a solution that can, at the least, automagically de-duplicate files (Windows SIS), but I would love block-level de-duplication. My primary use for this server will be to serve up files to Windows clients (CIFS/SMB), with some potential for FTP or other solutions. And with a little time, I'll probably put some type of streaming solution in place.

As of now, I have decided to use FreeNAS 0.7rc1 with ZFS support. I believe it is the best combination of ease, performance and flexibility for me. (If something else works better for you, good. I'm happy for you. :)

With the initial install, things are working well booting from my SD card. Setup was pretty simple, though the FreeNAS documentation for ZFS is pretty non-existent at this time. In short, the procedure is (a rough command-line equivalent follows the list):
  1. Format all of your drives
  2. Create a virtual device (RAIDZ in my case) using all of the drives (whole disks are preferable to slices)
  3. Create a management object (there is probably a better name for this)
  4. Create a single ZFS share for CIFS.
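For reference, what the WebGUI does under the hood corresponds roughly to the following ZFS commands. The pool name matches my mount point, and ad4 matches the diskinfo test from my tuning post, but ad6 and ad8 are my guesses for the other two drives; check your own device names first (atacontrol list will show them):

nas01:/# zpool create -m /mnt/vdev0mgmt vdev0mgmt raidz ad4 ad6 ad8
nas01:/# zfs create vdev0mgmt/share

As near as I can tell, the "management object" in step 3 is the pool itself.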
I then enabled Active Directory support and connected the FreeNAS server to my home AD domain. Next, I created the two directories needed in the ZFS share and shared them out via CIFS. Additionally, I enabled the SSH service so that I could log in remotely since this will be a headless system. Voila! Done! All is great!

Well actually, all is good. I ran into some limitations in the existing interoperability between FreeNAS 0.7rc1 (based on FreeBSD 7.2-RELEASE), which contains ZFS version 6, and the Samba version used for CIFS. Currently, you cannot administer access control using the Windows tools. This will apparently be addressed with FreeBSD 7.3, which will provide ZFS version 13 and a newer Samba that supports the zfsacl module as a vfs object. For my relatively simple home setup, this is not a major problem. I logged into my FreeNAS server using SSH and modified the permissions using chown and chmod to do what I need, and things are working fine now. So while a bit of an aggravation, this does not impact functionality.
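For anyone in the same boat, the fix-up amounts to something like this from the SSH session; the directory, user and group names here are placeholders for mine, not anything FreeNAS creates for you:

nas01:/# chown -R lbe:users /mnt/vdev0mgmt/share/media
nas01:/# chmod -R 0775 /mnt/vdev0mgmt/share/media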

The following are a few brain droppings on other OSes that I considered. Please note, you will not find any religious zeal or bias here as it relates to Windows vs *nix vs Linux vs a Hamilton Beach Toaster. They are all good "tools" and should be used as appropriate. If you aren't intelligent enough to figure out "as appropriate", well then ....


Windows Server 2008 R2 - Given my desire to integrate with my home AD, currently running on a tried and reliable Windows 2003 Server that also acts as my current file server, this makes sense. However, I don't want to invest this kind of money, even if I get an MS employee friend to buy a copy for me. SIS works reliably and would certainly meet my needs.

Windows 7/Vista/XP - Honestly, these desktop OSes could be configured to meet my needs, but not my want for de-duplication.

OpenSolaris & eon - These two solutions are probably the pinnacle from a ZFS perspective; however, it wasn't clear that OpenSolaris provided full support for my hardware, and my Solaris skills are not what they used to be. So laziness kept me from pursuing this one too far.

Linux (anything from Fedora to ClarkConnect to OpenFiler) - There is a wealth of solutions out there based upon Linux. But not having kernel-level ZFS support (FUSE is an interesting approach, but not for me) or a ready alternative (like Btrfs) makes Linux unattractive to me.

Nexenta (hybrid Solaris kernel & GNU user space) - This caught my eye and I gave it a spin. You get the kernel from the originator of ZFS plus the GNU user-space commands that I am current on. This sounded great, until I installed NexentaCP 2 and had problems installing it to my SD card, then more problems trying to get the network configured properly. I'm sure both of these issues could be addressed with some knowledge; however, the documentation for Nexenta was practically non-existent. This made me seriously question whether I wanted to go this route.

Hence in the end, my decision to go with FreeNAS. It has ZFS, though still in the early stages. It provides de-duplication. A base install gives me reasonable performance, which I hope to improve with tuning. The WebGUI is pretty well thought out. The only need that I have had for the command line was setting up directory permissions. So my lack of experience with FreeBSD shouldn't be too much of a problem.

Stay tuned for more advice, err stories about errors, to come!

NAS - Hardware Assembled with Error, of course!

OK, I got all the pieces in and put them together. Voila, they all fit. Kind of. Sort of ....

The Chenbro case is a nice little case; however, I have not figured out how to remove the front cover. The online manual depicts a cover that is slightly different from my physical cover. Regardless of what the manual says, though, you do not need to remove the cover to install the motherboard and SD card reader. You need to release the cover and get it open just enough to remove the blank in the slim optical device slot and to remove the SD card reader holder, but that is it. There is plenty of play to do so. This gives full access to the mounting points for the motherboard.

While installing the motherboard, I realized an oversight, yes an error, that I made. The MS-9832 motherboard contains only one internal USB header block, so it is not possible to plug in both the front-panel USB ports and the SD card reader. Furthermore, I am missing the cable needed to connect the card reader at all. So for now, I have connected the two USB ports on the front of the chassis to the internal USB header and put my SD card in a nifty SD-to-USB converter. I will use the SD card for the OS.

The WD drives mounted easily in the hot-swap trays and slid in without a problem.

I am not installing a CD or DVD reader in this unit, to keep the overall power consumption down. Instead, I have a PATA CD/DVD reader/writer from LG sitting on top of the case with its cable running through the open side access. I will use this to load the operating system onto the SD card and will then remove it and button the case up for the long run.

All in all, the hardware came up fine!

Tuesday, August 18, 2009

NAS - Hardware Requirements

My hardware requirements are very few:
  • Low power consumption - +/- 30W at full load, <10W at idle
  • High Efficiency PSU
  • Quiet Case - fanless if possible
  • Minimum of 4 hot swap 3.5" drive bays
  • 3 Gbps SATA II Drives for the storage array
  • OS to be loaded on solid state device (SD)
  • GigE Network Interface
  • USB HID and minimal video support for OS installation

My selected hardware is:

  1. 1 - Chenbro ES34069
    Mini-ITX Case with 4-Hot Swappable Drive Bays, 120W PSU
    $171.85

  2. 1 - Chenbro E434440
    4-in-1 Card Reader - Card reader - 4 in 1 ( MMC, SD, miniSD )
    $13.84

  3. 3 - WD WD15EADS
    1.5TB SATA II Caviar Green Hard Disk
    $358.50

  4. 1 - MSI MS-9832-05S
    Industrial IM-945GC Mini-ITX Mainboard with 1.6 GHz dual-core Atom 330 processor, 2GB DDR2 667 RAM, 4 on-board SATA II (3 Gbps) ports, 2 GigE NICs
    $207.02

Total Cost to date - $751.21

I expect to add an 8GB Class 6 SD card (approximately $18) to the card reader upon which I will install the OS once selected.

That's all for now.

lbe

NAS - Getting Started

Over the years, I have mused about taking the time to share some of the things that I have designed, built and/or implemented back with the Internet community from which I have stolen, errr adopted, I mean learned so much. In the past, I decided not to because of all kinds of excuses. I am going to try to change that now!

For the past year, I have been musing in the background about buying or building a NAS server for home. I have made the decision to build, have selected the hardware, and will soon finalize the OS decision. I have searched the "whole" Internet, every little corner, every bit, maybe even a qubit or two. And remarkably, I have failed to find a solution that meets what I personally want. That isn't to say I haven't found a lot, just not the whole enchilada. So I have decided to start here.

Stay tuned!

lbe
