Ever since I dropped Dropbox in September of 2013 due to mounting privacy concerns, I’ve been searching for and testing various filesystem syncing solutions to take its place.
This post is a summary of the notes I made during this time.
It’s quite hard finding reviews online that go deeper than the very surface. Most sync software reviews can be summarised as:
Hey, look, here’s a new service that’s a dropbox killer / dropbox alternatives / disruptive in the sync space. Wow, you get N gigabytes of free space, that’s more than dropbox. Look, it seems to be syncing these three files I put in there. That’s great. The End.
I hope the notes in this post can fill a tiny little bit of the giant sinkhole left by the hundreds of vacuous reviews following the template above. Over the past two years, I have used seven different Dropbox alternatives on my primary file collections (just under 50G in total). During each period, I actually committed to the relevant syncing system as my main and only sycing tool and kept notes detailing how this went.
TL;DR In searching for a Dropbox-level solution to synchronise my files between my four computers, I’ve spent some quality time with btsync, syncthing, seafile, CloudStation, SpiderOak, Wuala, Dropbox and unison. This post summarises my experiences. Warning: There is no real winner, but there is quite an amount of hopefully useful information.
I need a system that is able to keep my two main working repositories of files in sync across two laptops and two workstations. I have a Synology DS213j low-power home NAS that can also be used as part of syncing solutions that support it.
The two repositories are:
- sync-1: 13 gigabytes spread over 100 thousand files. This repo changes quite often. (At the start of the adventures described below, this repo was 15G spread over 150 thousand files.)
- sync-2: 35 gigabytes spread over 80 thousand files. This repo changes less often.
I should be able to close one laptop, and continue working on my desktop on the same files, without having to think too much about it. The client software should support Linux, because that’s what I mostly use. Preferably, the tool should support delta-syncing (owncloud does not do this, for example), deduplication and LAN sync, because I live an a bandwidth-starved part of the world. Importantly, it should be relatively easy to roll-back inadvertent changes and deletions.
Finally, an important use case for me is syncing git repos that I’m working on. This means that I want to work on some source code in a git repo, then go home without having to commit and push just because I’m going home, and continue working at home on a different computer, perhaps committing and pushing from there when I’m good and ready with my changes.
This peer-to-peer personal syncing solution, also called btsync, is often touted as a great dropbox replacement, and is also the one I started using right after I dropped Dropbox. I only have experience with pre-2.0 btsync.
btsync is really fantastically fast. Multiple gigabytes of files can be spread really really quickly through your mesh network.
However, I after upgrading to 1.4 I soon ran into the btsync-simply-refuses-to-sync-and-there’s-nothing-you-can-do-about-it problem also described in this forum topic. This was quite frustrating, to say the least.
I downgraded to 1.3.109, and all was fantastic. It managed to bring my sync-1 repository in sync between 4 nodes in no time at all.
I made sure that all nodes were time-synced using ntp because I had had some snafus with git repositories getting corrupted using btsync.
All was well in the world of syncing, until my Synology, part of my btsync mesh, was down for a few days, whilst all other nodes remained regularly active. When the Synology was switched on again, btsync happily overwrote a newer subdirectory of files with older versions across all nodes.
This last incident, together with the elephant in the room, namely the fact that btsync is closed source and hence quite hard to have audited, or just to be able to fix by myself, led to yet another breakup after some months.
Bye-bye btsync, it’s you, not me.
Syncthing is more or less an open source version of btsync.
The author is passionate about this project, it’s run very well, the tool itself has a gorgeous web-interface and the project’s goals are laudable. This is the peer-to-peer syncing tool you really want to succeed!
However, besides the fact that the binary took up a little too much RAM on my Synology (this is forgivable, my synology only has 512MB), there were two issues preventing me from using this as my primary sync tool:
- syncthing does not have filesystem monitoring integrated. This means it needs to scan your whole sync repository every so often for changes. This scan was eating a significant amount of CPU cycles on my laptop.
- My work laptop very often had issues connecting to my synology at home, even after futzing with port-forwarding on my firewall at home.
I’ll probably check in again after some time to see how my favourite peer-to-peer syncing tool is doing!
Seafile is a fantastic project. In short, it’s sort of a DropBox clone, but both the server and the client are completely open source. As if that wasn’t enough, it supports, optionally, end-to-end client-side encryption.
Because I was not yet ready to invest time and money to setup my own server, and I wanted to evaluate its performance first, I bought a three month subscription to their commercial service, seacloud, for $30.
Uploading my sync-1 repository (15G and 150 thousand files at the time) took about 45 hours.
Soon after this, I started running into issues.
Firstly, every time I moved with my laptop to a new access point (for example between home and work), it would just sit there reporting that it was “syncing”, when in fact it had merely gotten really confused by the changed network connection. Only stopping and starting the client would get it going again.
Secondly, and this was the deal-breaker, even a minute change to a small text file, would result in the seafile client (I tested up to version 4.0.4) using up a whole Core i7 core on my laptop for a few minutes. I logged this as a bug on the seafile github project.
Searching around the net, and reading the comments on the bug I reported, you’ll see that more people are experiencing this error.
Unfortunately, this meant that I had to leave seafile on my journey to a suitable syncing solution. I will keep an eye on it, because it does have amazing potential.
If you’re interested in seafile, you will definitely have to read the blog posts of Pat Regan. They are filled with in-depth information and experiences, very much unlike most other syncing tool reviews on the interwebs.
The Synology NAS is a great product. They’re basically efficient little Linux machines with a whole bunch of Synology-written and packaged software to boost your home network’s utility. I have the DS213j.
CloudStation is Synology’s answer to Dropbox.
The out-of-the-box experience is really smooth. Configuring it on the Synology server is well done, whilst the client installs easily on your Linux (or Windows or MacOS) machines.
Uploading my 15G, 150 thousand file (at that point) sync-1 repository took about 40 hours in total. This is with me sitting on the same LAN as the Synology. The poor little thing is just very under-powered.
On my main Linux laptop, I could not get the file manager sync status icon overlays working, and together with Synology support I was not able to get this problem sorted.
I could live with no icon overlays, convenient as they may be, but the deal-breaking was a day or so later when the CloudStation client started acting up.
As I was editing a Python source code file, the client kept deleting the file, as I was working on it, and making it appear as
filename_hostname_date_Conflict.ext. The first time I simply renamed this file back, thinking some once-off issue with the sync, but CloudStation stubbornly kept on deleting my file and creating the conflict-named one.
Synology, I love you, but CloudStation is OUT!
This is probably the most well-known Dropbox alternative that supports end-to-end encryption. It’s even been name-dropped by Edward Snowden, which is high praise in these circles.
Although SpiderOak as a company has a solid reputation, and has released a number of open source encryption-related software packages, the SpiderOak client itself is closed-source. This means we have nothing more than SpiderOak’s word that their client is indeed performing the end-to-end encryption in a secure way, and is not adding some backdoor key to every packet passing from your computer to their servers. I do think that their service has a higher chance of not beeing snoopable than, say, Dropbox.
In any case, I signed up for the new (at the start of 2015) $12 / month 1 terabyte package and again started uploading my sync-1 repo. SpiderOak was the slowest of the bunch: It took about 60 hours to upload the complete repository.
SpiderOak is significantly more flexible than Dropbox. You can configure it to backup any number of directory trees on your computers. Once a tree has been backed up, it becomes part of your SpiderOak cluster and is visible from all other nodes. To sync, directories on multiple computers, you first have to setup backup on all computers for the tree in question, and only once all backups are complete, you can configure a sync involving any number of those backed-up directories.
SO also supperts lan sync, and has been designed from the start to support per-user de-duplication of file blocks, even although file blocks are encrypted. Furthermore, they’ve also managed an rsync-like efficient file transfer with encrypted blocks. Pretty nifty!
After having completed the backup and sync setup of sync-1 on two laptops and two workstations, I was quite happy with SpiderOak’s flexibility, and the apparent security of the software, so I decided to add sync-2 to the mix, but only for two of the computers.
After the long process of getting everything synced up and starting to think that SO was going be The One, I started running into issues. Firstly, SO loves your RAM. At one point, the client was taking up just under 1GB of RSS on my main Linux laptop. Ouch.
I could probably have learned to live with that, were it not for the fact that the client is just extremely slow to pick up new changes on the filesystem and especially so when adding new computers to your sync network (that syndication process, anyone?). At the end of the work day, I would often have to wait quite a whie for SO to sync all the last changes I had made so I could go home.
Perhaps more insidious than that, was the behaviour that SO would have huge but silent problems with starting to sync again after laptop resume. I would get home, resume my laptop, and after 30 minutes SO would still show no activity whatsoever, in picking up new changes from the SO servers, or picking up changes on my laptop. I would have to kill the client and start it up again.
All in all, SO is probably the service that I wanted the most to succeed, but in the end, these opaque anomalous behaviours and the continuous waiting killed it for me. Sorry SO, I really want to see this work…
After the SpiderOak adventure, I briefly tried the other end-to-end encrypted file syncing service, this one from Switzerland. It’s called Wuala, and it belongs to La Cie, which belongs to Seagate, a US company, so we can unfortunately not count Wuala’s Swiss origins as a privacy plus.
In any case, Wuala does not have a free tier anymore, so I ponied up a few euros for the 20G package. I was planning to start by testing just my sync-1 repository.
When this started to upload my files, it surprised me by maxing out my admittedly puny almost 1 Mbit/s upstream. At work, where we are fortunate enough to have 20 Mbit/s symmetric optic fibre, it managed to get up to 3 to 4 Mbit/s, which is quite good considering that that fibre is shared by a number of engineers.
I was also quite impressed by the Java client application. It had similar flexibility to SpiderOak, but was much more user-friendly. Also the real-time per file sync feedback (only in the app itself) was significantly more useful than SpiderOak’s log-style reporting of syncinc activity.
As Wuala was uploading from my laptop, I thought: Hey, let’s install Wuala on my workstation as well. The sync-1 file tree there is identical to that on my laptop. Any self-respecting file sync service should be able to handle this in its stride, and perhaps with two computers encrypting and uploading I could get slightly higher throughput.
Well, it turns out that Wuala is quite lacking in this regard. As soon as I started up the client on my workstation, on the already identical file repos, the clients on both laptop and workstation started throwing up numerous error dialogs about file conflicts.
I was quite surprised by this. I expected a mature syncing service like Wuala to handle this in its stride. It has all of the file block hashes, so at the very least it should realise that a bunch of blocks already exist on its servers, skip the already present blocks, and just continue with its merry life.
At this point, I was not in the mood to continue with Wuala, and so I said good bye.
After Wuala, and almost two years of searching, I was ready to give up. Finding a tool or system that would satisfy all my requirements had already cost me too much time. I was ready to tell the NSA: “Oh go ahead, run your algorithms over all my data. Read my files. Do that thing that you do, I just want to SYNC!”
I signed up for the Dropbox 1 TB package and started uploading sync-1, now whittled down to about 100 thousand files and 12G.
Wow, it was just as fast and as pretty as I remembered it!
However, I soon came across files that simply refused to synchronise. Dropbox would get stuck at “Uploading N files…” for hours, and N would remain constant. Finally I had to resort to uploading these files through the web interface. This cost me time (that thing I was trying to save) but got the job done.
After one and a half weeks, my Linux workstation which had been inactive for a few days, was switched on again for the first time. All of a sudden, massive file activity on my laptop.
Hey, a mass deletion of all my source code!
I know that source code was not deleted on the Linux workstation. I don’t know how this deletion could have been triggered, but it was.
No problem right? Let’s just go to the dropbox events interface, and undelete those files. Wait what? The events interface refuses to work, instead just reporting “There was a problem completing this request”:
I can undelete the files via the file browsing interface, but they’re spread out and that would take ages. Fortunately I had a good backup of the whole sync-1 repo so I could restore from there.
Dropbox support has been really helpful and said they would undo the mass deletion events. However, it’s now four days later; my files have not been restored, and the dropbox events interface is still broken.
If this had been my primary file store and I didn’t have the great backups I did, this would have been a complete nightmare. As it stands, I’ve unlinked and uninstalled dropbox from everywhere, but I would still like to have my files restored, and the mystery of the events interface solved.
My confidence in dropbox (it NEVER disappointed me in the years before 2013) has taken a severe knock. Added to its security issues, we’ll have to see what happens to our relationship in the coming time.
In the meantime, I’ve returned to my very old friend unison. I used this in the years before dropbox to keep my stuff synchronised.
Furthermore, the unison boys and girls have been busy in the meantime. In version 2.48.3, there’s even a neat filesystem monitor with which you can setup lightning fast automatic bi-directional file syncing. It’s all very nerd-DIY, but for some of us that’s a plus.
unison is terribly efficient at transferring changes to and fro. It makes use of the rsync-algorithm, with added niceties like duplicate detection shortcuts. I’ve now set it up in a star topology with my synology. I’ve setup daily incremental backups on the synology so that I have at least some form of roll-back and deletion recovery.
At the moment, because I’m still a little shaken by the mass deletion event, I have the unison GUI running. I simply press
A to have it scan and sync any specific file tree that I’ve setup. It takes about a second for sync-1, and I get a list of the changes it makes so I can check that it’s not nuking half of my files.
unison makes me work harder, but of all of the solutions discussed in this post, it gives me the most control.
This is for sync systems that are still on my list to try, or that I used very briefly.
Git-based open source syncing system. By design, unable to synchronise git repositories. The author does not feel that this is an issue.
After a healthy bout of laughing, I nuked it from all of my computers.
There you have it: A whole bunch of words about a whole bunch of personal syncing solutions. I have not yet found The One, although at this moment unison is doing the job quite well, albeit with some klunkiness.
I would love to hear your thoughts on and your motivation of your favourite syncing tool in the comments below, or over at Hacker News!