High Performance Professional Photo Storage on a 10GbE Synology NAS
I’m a professional commercial portrait photographer who needs a lot of safe data storage. Over the last few weeks I’ve been researching and setting up a new high-performance network attached storage system for myself that will hopefully last me a few years. After reading dozens of articles, blog posts, and forum threads, I thought I should put all that I’ve learned in one place so others can potentially benefit from it. So enjoy!
Disclaimer: I just want to say right off that I’m not sponsored or paid by any of the companies I mention in this article. All of this information is based on my experience or research I’ve done. End of disclaimer.
Here are some details about my specific workflow so that all of the thoughts and decisions that follow might have some context.
I’ve been shooting with a Pentax 645z for the past few years. It’s got its foibles, but for the money, the image quality is absolutely outstanding and used glass is inexpensive and plentiful especially compared to Phase One, Hasselblad, and Fuji.
I edit my RAW files in CaptureOne 12 after batch editing the metadata on import so that C1 sees the files as IQ250 DNGs (same Sony sensor). Otherwise it artificially locks them out. That’s a whole other story which I’ve written up over here: https://medium.com/@billwadman/my-pentax-645z-medium-format-conversion-in-a-nutshell-45d28861d54e
I use a single C1 session file, and I just navigate around my image folders within the library tool. Not the most efficient way of doing it, but in my experience it’s much better performance than using a catalog in C1 and more convenient than a session for each project. For overall catalog work, I miss Lightroom.
Almost all the images I work on end up as layered uncompressed TIFs. These end up being between 1–4GB per image. Compression on save takes a while with big files, so writing the uncompressed files straight to disk is faster. I don’t edit THAT many files, so I’m willing to give up the disk space for the speed when saving. This was more of a problem back when saving was modal in Photoshop, but old habits die hard.
The Old Storage Situation
When you’re working with tens of terabytes of data and a finite budget, storage can become a tricky problem to tackle. Every 5 years or so I end up getting close to outgrowing my current setup and need to rethink from scratch. Here’s where I was starting out:
I have a 1TB internal SSD in my iMac Pro with one APFS container for the OS and another for current working files. This would include client jobs where perhaps I was waiting on selects to retouch, or personal projects I wasn’t quite finished with. This drive is crazy fast (like 3GB/s), so I tend to work off of it whenever possible.
Next in line is a 20TB G-Tech array with two drives in RAID-0 for speed (aka Scary RAID) connected to my Mac via USB3. (It’s got Thunderbolt2 as well, but I didn’t have a TB2>TB3 adapter, and honestly the throughput of a couple of spinning drives is still well within the limits of the USB3 interface.) I was getting around 350MB/s read and write from this drive. This was my online archive of everything I would generally want access to on a day to day basis.
For backup, the G-Tech array was being cloned using SuperDuper over to a four bay USB3 Drobo from a few years ago. At the time I had two 8TB and two 4TB drives in there which gave me about 14.5TB of usable space.
I had one more level: Cold Storage. Two 4TB external drives that I kept in sync which held old client and personal work that I didn’t want to or couldn’t throw out, but that I didn’t need to access very often.
Problems with the Old System
First off, I was running out of space. The 20TB G-Tech array was at around 13.5TB full. You really don’t want to fill up hard drives. They get slower and crankier as they fill up. Plus you’ll remember that the Drobo I was backing it up to only has 14.5TB total, so things were getting to a boil.
Speaking of the Drobo, I started having problems with it (like a lot of people I reckon). It would randomly unmount while doing clones which was obnoxious, and then I had a failed drive and it was seemingly having a hard time rebuilding when I replaced it. Basically I didn’t trust it anymore. And data storage you don’t trust has to go. Plus, even if you swapped out all the drives for say 8TB ones, the maximum volume size on a Drobo like mine is 16TB, so it ends up chopping up installed storage into multiple volumes. Not ideal for backing up one big array.
What I need
I need a lot of storage space that I won’t outgrow anytime soon; ideally it would be expandable. I was thinking 30TB would allow me to move everything over and cover me for the next couple years. Oh, I’m also doing a bit more toying with video, so that also ups my future storage needs. It also needs to be fast enough to work off of. At least as fast as the G-Tech array I’ve got. So 300MB/s or so.
A multi drive desktop RAID array could work great, though there are surprisingly not THAT many options from reputable vendors. OWC sells a four bay, but it’s noisy and I’ve heard horror stories. LaCie has a couple of four bays as well, but they’re expensive and I feel like I’ll outgrow them too quickly. Something like a Pegasus Thunderbolt array could work; a 32TB eight bay array is about $3000. It would give me a lot of space, basically ~28TB in RAID-5, and I do prefer the idea of direct attached storage. I just like my drives connected right to my computer. I don’t access them from other machines and I don’t need to access them via the net from afar. Plus then the Mac sees them just as Mac drives. I get Spotlight and fast Finder browsing.
The thing is, that’s a lot of drives spinning and clacking right next to me. It would be nice if I didn’t have to have the drives on my desk. This iMac Pro is astonishingly quiet. (The only time I hear it at all is when I render videos in DaVinci Resolve and the fans spin up; otherwise it’s just about completely silent.) The tiny slow fan on my UPS under the desk is more audible. My two drive G-Tech array is loud enough that I unmount and turn it off when I’m not using it. Imagine that multiplied by four. No thanks. Plus, what if I end up with 24TB of data on this thing and need to upgrade? Then I have to buy a different, bigger one to copy it over to? Yuck.
I’d also like to simplify my backup drives as well. The Drobo array was handy because I had it, and single drives weren’t that big a few years ago, but they are now. For this transition process, I bought a 14TB WD EasyStore drive to make an extra copy of everything after the Drobo quasi-died and while I did my shuffling of data around. It was less than $300 with tax and shipping. So I think I want to get my backups to single drives instead of arrays.
So as I often do when I have a hard time making a technical decision, I called my friend Dan Gottesman for a second opinion. Basically Dan suggested that I just bite the bullet and invest in a networked storage solution. As you may recall, I prefer direct attached drives, but Dan convinced me that I’m at that inflection point of storage needs where I’ve got to get a bit fancier. Our mutual friend Angus Oborn made the jump a while back, so I got a third opinion from him and decided to make the leap. At this moment the Coronavirus lockdown is just starting and I can’t work, so I figured it’s a perfect time to give myself a project that might take a while to figure out and fine tune.
In the past, the main problem I had with network drives is the throughput. A one gigabit ethernet connection tops out at about 118 MB/s at best. Not NEARLY fast enough to actually work on without driving me crazy. However my iMac Pro happens to have an ace in the hole; It’s got a 10 gigabit ethernet port that I never thought I’d use. I use 802.11AC WiFi for my internet connection, which means that the ethernet port is just sitting there. 1GbE isn’t fast enough, but 10GbE is.
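For anyone wondering where a number like 118 MB/s comes from, here is a rough back-of-the-envelope sketch in Python. It assumes a standard 1500 byte MTU, IPv4, and TCP with the 12 byte timestamp option enabled; those are my assumptions for illustration, not something I measured, and it ignores SMB/AFP protocol overhead and disk speed entirely.

```python
# Rough ceiling on file-data throughput over Ethernet (a sketch, not a benchmark).
# Per-frame bytes on the wire that never carry file data:
# preamble + start delimiter (8), Ethernet header (14),
# frame check sequence (4), inter-frame gap (12).
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12

# Assumption: IPv4 header (20) + TCP header (20) + TCP timestamp option (12).
IP_TCP_HEADERS = 20 + 20 + 12


def max_goodput_mb_s(link_gbps, mtu=1500):
    """Best-case payload rate in MB/s (decimal megabytes) for a given link speed."""
    payload = mtu - IP_TCP_HEADERS          # file data carried per frame
    wire_bytes = mtu + ETHERNET_OVERHEAD    # bytes the frame occupies on the wire
    raw_mb_s = link_gbps * 1000 / 8         # 1 Gbit/s is 125 MB/s before overhead
    return raw_mb_s * payload / wire_bytes


for gbps in (1, 10):
    print(f"{gbps:>2}GbE: about {max_goodput_mb_s(gbps):.0f} MB/s of file data at best")
```

The point is simply that the 1GbE ceiling sits well below what two spinning drives in RAID-0 can deliver, while the 10GbE ceiling sits far above it.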
The New Setup
I ordered myself a Synology DS1819+ eight bay NAS, the optional 10GbE card, a 50ft CAT7 ethernet cable, and four 10TB Seagate IronWolf NAS drives to get me started. About $2400 in total.
When everything arrived, I ran the cable from my desk, all the way around some furniture and into the big closet where I keep my equipment. It’s got plenty of airflow in there so I’m not worried about heat. Installed the 10GbE card in the Synology, put it on my wire rack shelf with my printers, and plugged in the four new drives into bays 1–4.
Before I got started with my data, I first plugged the Synology’s first internal 1Gb ethernet port, which is set to grab an IP via DHCP, into my router so it could talk to my whole network and the internet. I connected to it via a browser over my WiFi, got used to the whole DSM interface, let it update the software, set up my account on the thing, etc. While this box can do a TON of different things (media server, virtual machines, Torrent/NZB downloader, and more), I plan to keep it VERY simple. Just a dumb file server connected directly to my Mac. Not even connected to the internet most of the time. Dan hooked me up with some local networking knowledge, so while I was in there I manually set the IP address of the 10GbE interface to 10.10.10.10 (see image) and set the MTU to 9000, otherwise known as Jumbo Frames, which really improves the throughput of data on a network file server over a share. On the Mac, I set up my ethernet port similarly, but with 10.10.10.11 as the IP address. When that was all done, I switched the cable so it ran directly from the Mac’s ethernet port straight into the 10GbE card in the Synology. I could then connect to the Synology through a web browser at 10.10.10.10 and I was ready to rock & roll.
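If you want to sanity check a direct-connect setup like this before plugging things in, here is a small Python sketch. The two IP addresses are the ones I used; the /24 prefix (a 255.255.255.0 netmask) is an assumption for the example since I haven’t spelled out the mask here, and the 40 bytes of headers per frame assumes plain IPv4 and TCP with no options.

```python
import ipaddress

NAS_IP = "10.10.10.10"
MAC_IP = "10.10.10.11"
PREFIX = 24  # assumption: 255.255.255.0 netmask on both ends


def same_subnet(a, b, prefix):
    """True if both addresses fall in the same network for the given prefix."""
    net_a = ipaddress.ip_interface(f"{a}/{prefix}").network
    net_b = ipaddress.ip_interface(f"{b}/{prefix}").network
    return net_a == net_b


def frames_needed(size_bytes, mtu):
    """How many frames it takes to move a file, assuming 40 bytes of IP/TCP headers."""
    payload = mtu - 40
    return -(-size_bytes // payload)  # ceiling division


print("Same subnet:", same_subnet(NAS_IP, MAC_IP, PREFIX))

two_gb_tif = 2 * 10**9  # roughly one of my layered TIFs
for mtu in (1500, 9000):
    print(f"MTU {mtu}: {frames_needed(two_gb_tif, mtu):,} frames for a 2GB file")
```

Both ends of the cable need the same MTU; the frame counts just illustrate why fewer, bigger frames mean less per-frame work for each machine.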
I’m not going to get into the details of setting up storage pools and volumes on the Synology; there’s plenty of that on the net which you can find, or RTM. I decided to use Synology’s single drive redundant SHR RAID type, which from the white papers I’ve read is similar to RAID-5 with just a tad more abstraction that allows more flexibility down the line with a very small performance penalty.
(UPDATE (2/5/2021): I have recently added two more 10TB drives for a total of six. One I added to the array as additional storage, and I’ve used the second to upgrade to SHR2 two drive redundancy. That conversion has been running for two weeks now and is only 64% complete. So if you can, buy extra drives when you’re setting it up.)
The software also offers traditional RAID levels like 0, 1, 10, 5, and 6, but I liked the idea that SHR is a bit more flexible if I need to start using larger drives in the future. Maybe at some point I’ll install another drive and upgrade to SHR2, which would allow me to lose two drives and still not lose my data, and maybe add an extra hot spare drive so it could rebuild on its own if it needed to. However, the rebuild time for two drive parity RAID setups like that can be a LONG time, and I’m going to keep backups on external drives, so single drive redundancy across my four drives is what I’m going to stick with for now.
When setting up the storage pool I used the newer and more modern BTRFS file system (I know the FS stands for file system, I’m just being clear for the reader). BTRFS has all kinds of data protection, snapshot, and checksum self-healing qualities, which are exactly the kinds of things I want from a file system. These are things that HFS+ does not have, and bit rot is real, people. I’ve also set up my system to do a Data Scrubbing once a month to make sure every bit is what it’s supposed to be. That’s all I’m going to get into on the Synology specifics for now so as not to reinvent the wheel.
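As a rough guide to how much space single drive redundancy leaves you, here is a Python sketch of the usual rule of thumb: usable space is the total of all drives minus the largest one. That matches RAID-5 with equal drives, and as far as I understand it is a fair approximation for SHR (and for the Drobo) with mixed drives, but treat it as an estimate and not Synology’s exact math. The TB to TiB conversion is there because drive makers count in powers of ten while the OS often reports powers of two.

```python
TB = 10**12   # what the label on the drive box means
TIB = 2**40   # what many operating systems report


def single_redundancy_usable_tb(drive_sizes_tb):
    """Rule of thumb: total capacity minus the largest drive."""
    if len(drive_sizes_tb) < 2:
        raise ValueError("need at least two drives for redundancy")
    return sum(drive_sizes_tb) - max(drive_sizes_tb)


def tb_to_tib(tb):
    """Convert decimal terabytes to binary tebibytes."""
    return tb * TB / TIB


setups = [
    ("four 10TB drives", [10, 10, 10, 10]),
    ("two 8TB + two 4TB drives", [8, 8, 4, 4]),
]
for label, drives in setups:
    usable = single_redundancy_usable_tb(drives)
    print(f"{label}: {usable}TB usable, roughly {tb_to_tib(usable):.1f}TiB as reported")
```

The second setup is the old Drobo mix, and the result lands close to the 14.5TB it reported to me.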
Here’s where things become a little more cloudy. There are a number of different protocols you could use to get to the data on the Synology from your Mac across the network. The most common are SMB (Microsoft’s Server Message Block) and AFP (Apple Filing Protocol, or if you’re old school, AppleTalk). From all of the research I’ve done, AFP is apparently being deprecated and Apple itself is suggesting that people move to SMB. That said, in my research I’ve found lots of people who use AFP happily. You can turn these protocols on and off in the Synology control panel and do some tests. Just be sure to set your SMB to SMB3. Older versions are slow as dirt.
I did some tests myself and here’s where I came out:
SMB3 gave me 298MB/s read and 485MB/s write.
AFP gave me 312MB/s read and 434MB/s write.
Very similar, all things considered. Now that I’ve got a bunch of data on there, things have moved more in SMB’s favor, so that’s what I’ve been sticking with for the time being, but we’ll see. I may revisit.
I’m not exactly sure why I’m getting demonstrably slower reads than writes. If anything I would expect the other way around. There’s a penalty in using single drive redundancy, but this is secondary storage for me, not what I’m working off all day long, so it’s fine for now. Throughput should improve by adding drives to the array, so maybe I’ll put one more in and see.
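If you’d like to run a similar comparison yourself without a dedicated benchmark app, here is a minimal Python sketch that writes one big file to a folder and reads it back. It’s not the tool I used for the numbers above, and it has real limits: the read pass can be served from the Mac’s memory cache, which inflates the read figure, so use a file size bigger than your RAM or remount the share between passes if you want honest reads. The function name and defaults are made up for this example.

```python
import os
import tempfile
import time


def measure_throughput(directory, size_mb=256, chunk_mb=8):
    """Write then read one large file in `directory`.

    Returns (write_mb_s, read_mb_s), where MB means 1024 * 1024 bytes.
    """
    chunk = os.urandom(chunk_mb * 1024 * 1024)  # random data, so nothing compresses it
    chunks = size_mb // chunk_mb
    fd, path = tempfile.mkstemp(dir=directory, suffix=".bench")
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # wait until the data has actually been sent
        write_seconds = time.perf_counter() - start

        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(len(chunk)):
                pass
        read_seconds = time.perf_counter() - start
    finally:
        os.remove(path)  # clean up the test file

    total_mb = chunks * chunk_mb
    return total_mb / write_seconds, total_mb / read_seconds


# Example (the share path is hypothetical, substitute your own mount point):
# write_speed, read_speed = measure_throughput("/Volumes/photos", size_mb=4096)
```

Run it once with the share mounted over SMB3 and once over AFP and compare the pairs of numbers.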
Some people complain about network shares disconnecting at random times, especially on MacOS. I ended up purchasing a well reviewed app in the Mac App Store called AutoMounter for $10 which sits up in your menu bar and automatically remounts them if they drop. Seems to work well. On wake every once in a while I’ll get an alert that one or all of the shares disconnected, but they’re mounted again automatically. Remember to pause AutoMounter if you want to manually unmount the shares for any reason, otherwise it’s going to comically remount them for you every time you try. Doh!
Oh, and as an aside, you can now even set up a Time Machine drive which is broadcast over SMB. I’ve done this with an old WD Red drive in bay 8 just as a little test and it seems to work just fine.
UPDATE (2/5/2021): Changes in the security model of MacOS on Intel mean that installing either of the two main iSCSI clients, GlobalSAN and ATTO, requires turning off System Integrity Protection (SIP), which is generally something I don’t recommend doing. Also, there are currently NO iSCSI clients for the new Apple Silicon based M1 Macs. I have not gotten a good answer as to whether there is a way to use new Apple storage APIs to write new iSCSI drivers or if the old and now deprecated kernel extension method is the only solution. Either way, at the moment, I think iSCSI might be a dead end on the Mac, sadly.
There is one OTHER way to connect to the Synology storage and that’s through iSCSI. Setup this way, the storage on the Synology shows up as a regular external disk on the Mac. Even prompts you to go into Disk Utility and format it HFS+ and everything. In some ways this is the best of both worlds for me personally because I’m only connecting from one computer, the Mac will think it’s a local drive, so I’ll get spotlight and trash even though it’s all still 30ft away in a closet and therefore much more quiet. But there are a few issues.
One is that MacOS requires a piece of software called an iSCSI initiator to connect to an iSCSI target like the Synology. The one people seem to recommend most is GlobalSAN Initiator, which costs $89. Not the end of the world, but an additional expense. Also, I don’t know how deep that software needs to burrow into the OS to work. What if in the next revision of MacOS (I’m still running Mojave, by the way; Catalina has no advantages for me and a whole lot of bugs) they lock down MacOS even more and iSCSI is no longer an option? Then I’m stuck with 20TB of data that I can’t access and have to move somehow. That makes me worry.
Two is that I’ve heard that you really should have an uninterruptible power supply on both ends of an iSCSI connection, because if either of them goes down you can corrupt the volume. Now, this is anecdotal and I’m having a hard time verifying the reality. Is it worse than unplugging an external drive that’s being accessed? Which most of the time will be fine, and occasionally will require First Aid in Disk Utility or at worst DiskWarrior to mend? Maybe, I’m not sure.
Three is that I’d then be using HFS+ or APFS (though Apple says APFS isn’t ideal for spinning disks in heavy write situations because of potential performance problems due to file fragmentation) and potentially losing out on all the great BTRFS data integrity advantages. That said, the RAID array underneath the iSCSI LUN as they call them would still be formatted BTRFS, but I’m not sure if that still helps my HFS+ formatted data chunk on top of it. Very hard to find answers to these kinds of technical questions.
GlobalSAN has a trial version though, so I did some tests and ended up with 254MB/s read and 407MB/s write. A bit slower than the standard share protocols actually, which surprised me a little. There is a vague statement in the press for the next version of the Synology DSM software that they have ‘supercharged’, or some similar word, the iSCSI implementation. So I’m keeping it all in the back of my head. I don’t really WANT to copy all my data over again if I decide to use an iSCSI drive instead of the standard NAS shares, but I could.
So for the time being I’m going to be using the G-Tech 20TB array and the 14TB WD external I mentioned earlier to back up the Synology data. There is a way to do this by plugging the drives directly into USB ports on the Synology and using an app on there called HyperBackup. However, I think the drives are then Linux formatted, and I’ve heard it can be quite slow. So for the time being at least I’m just going to keep these drives on my desk and plugged into my iMac. Every once in a while, or when I make changes to the Synology data I care about, I’ll use ChronoSync to keep everything in sync. That way I have an Apple formatted drive just sitting right next to me with everything on it.
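Since bit rot is one of my worries, it can be reassuring to spot check that a backup folder really matches the original. Here is a minimal Python sketch that compares two folder trees by SHA-256 checksum. To be clear, this is not what ChronoSync does internally and the function names are made up for the example; it’s just one simple way to verify a copy, and it will be slow on terabytes of TIFs because it reads every byte on both sides.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of a file, read in chunks so multi-gigabyte files don't fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def find_mismatches(source_root, backup_root):
    """List (relative_path, problem) for files missing from or different in the backup."""
    source_root = Path(source_root)
    backup_root = Path(backup_root)
    problems = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        relative = src.relative_to(source_root)
        dst = backup_root / relative
        if not dst.is_file():
            problems.append((str(relative), "missing"))
        elif file_digest(src) != file_digest(dst):
            problems.append((str(relative), "differs"))
    return problems


# Example (both paths are hypothetical):
# for name, problem in find_mismatches("/Volumes/photos/2020", "/Volumes/Backup/2020"):
#     print(problem, name)
```

Note it only checks in one direction: files that exist on the backup but not on the source are not reported.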
I also use Backblaze for online backup. Backblaze won’t back up network drives, so I lose that functionality, but I can turn on the backup drives that ARE plugged into my Mac and make sure they back up. However, with my crappy upstream Spectrum internet topping out at 20Mbps, it’s not like uploading 15TB in any reasonable timeframe is an option anyway.
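To put a number on "no reasonable timeframe", here is the arithmetic as a tiny Python sketch. It assumes the upstream link runs flat out around the clock with no protocol overhead, which is optimistic, so the real figure would be worse.

```python
def upload_days(data_tb, upstream_mbps):
    """Days needed to upload data_tb terabytes at a steady upstream_mbps megabits/s."""
    bits = data_tb * 10**12 * 8                # decimal terabytes to bits
    seconds = bits / (upstream_mbps * 10**6)   # megabits per second to bits per second
    return seconds / 86400                     # seconds in a day


print(f"15TB at 20Mbps: about {upload_days(15, 20):.0f} days of non-stop uploading")
```

That works out to a bit over two months of saturating my connection, and that’s before I shoot anything new.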
Any images I do ‘finish’ get exported as full-res high-quality JPEGs and uploaded to my dropbox. So I’ll still have everything I really care about if the zombie apocalypse happens.
So where am I at? Well I’ve got 28 or so terabytes of data storage in a relatively quiet box in my closet direct connected to my iMac Pro via 10Gb ethernet using SMB3 shares. I’ve been working off of it for a week or so now. It’s a little sluggish sometimes when jumping quickly between folders in CaptureOne, but not that different from a direct connected spinning hard drive. We’ve all gotten so used to the instantaneous feel of fast SSDs that it’s hard to go back. That said, I’ve got all of my data including the Cold Storage drive up on the Synology and available to me at all times without having to plug in and manually spin up loud drives on my desk. And with plenty of room to grow just by adding a $300 drive for another 10TB.
If you’ve read all of this and have some experience or optimizations I haven’t thought of, or questions to ask, please share them in the comments below. While there are a lot of people out there with these NAS boxes in their closets, not many of them are using 10Gig ethernet, or direct connecting, or using iSCSI unless they’ve got multiple virtualized Windows servers. So let’s stick together. We need all the shared knowledge we can get.