Posted by rrreese 20 hours ago
Technically speaking, imagine you're iterating over a million files and some of them are 1000x slower than the others. It's not Backblaze's fault that things have gone this way. Skipping files under well-known network mount points is likely necessary for them to be reliable at what they do for local files.
It's important to recognize that these new OS-level filesystem hooks are slow and inefficient for bulk access. They are designed for opening one file, not 10,000, and this means that things you might want to do (like a recursive grep) are now unworkably slow unless the files already sit in some warmed-up cache on your device.
To fix it, Backblaze would need a "cloud to cloud" backup that is optimized for that access pattern, or a checkbox (or detection system) for people who manage to keep a full local mirror in a place where regular files are fast. This is rapidly becoming a less common situation. I do, however, think that they should have informed people about the change.
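To make the "skip the well-known mount points" idea concrete, here is a minimal sketch of a file walk that never descends into cloud-sync folders. It is not how Backblaze's client works; the folder names in `CLOUD_DIR_NAMES` and the function name `iter_local_files` are hypothetical, and a real tool would need per-OS detection rather than a name list.

```python
import os
from typing import Iterable, Iterator

# Hypothetical names for illustration only; real cloud-backed folders
# would have to be detected per OS and per provider.
CLOUD_DIR_NAMES = {"OneDrive", "Dropbox", "Google Drive", "iCloud Drive"}


def iter_local_files(
    root: str, skip_names: Iterable[str] = CLOUD_DIR_NAMES
) -> Iterator[str]:
    """Yield file paths under root without descending into skipped directories."""
    skip = set(skip_names)
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk not to recurse into them,
        # so the slow on-demand files are never even listed.
        dirnames[:] = [d for d in dirnames if d not in skip]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

The pruning happens before the walk touches the directory, which is the point: the cost of a slow mount is avoided entirely instead of being paid once per file.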
The technical and performance implications of backing up cloud mount points are real, but that's zero excuse for the way this change was communicated.
This is a royal screw-up in corporate communications and I would not be surprised if it has a huge negative impact on their bottom line and results in a few terminations.
They're really proving lately that they are a company that can't be trusted with your data.
There are 2 components in my mind: the backup "agent" (what runs on your laptop/desktop/server) and the storage provider (which BB is in this context).
What do people recommend for the agent on Linux/macOS/Windows? (I understand some storage providers have their own agents.)
What do people recommend for the storage provider? Let's assume there are 1TB of files to be backed up. 99.9% don't change frequently.
I've also configured encrypted cloud backups to a different geographic region and off-site backups to a friend's NAS (following the 3-2-1 backup rule). It does help having 2.5Gb networking as well, but owning your data is more important in the coming age of sloppy/degrading infrastructure and ransomware attacks.
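For anyone unfamiliar with the 3-2-1 rule (at least three copies, on at least two kinds of media, with at least one off-site), a rough sketch of the check looks like this. The `Copy` type, its field names, and the media labels are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    medium: str    # free-form label, e.g. "local-disk", "nas", "cloud"
    offsite: bool  # True if the copy lives at a different physical location


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """True if there are >= 3 copies, on >= 2 distinct media, with >= 1 off-site."""
    media = {c.medium for c in copies}
    return len(copies) >= 3 and len(media) >= 2 and any(c.offsite for c in copies)
```

A setup like the one described (a primary copy at home, which is assumed here, plus an encrypted cloud backup and a friend's NAS) would pass; dropping either off-site copy would leave only two copies and fail the first condition.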
Trying to audit—let alone change—the finer details is a pain even for power users, and there's a non-zero risk the GUI is simply lying to everybody while undocumented rules override what you specified.
When I finally switched my default boot to Linux, I found many of those offerings didn't support it, so I wrote some systemd services around Restic + Backblaze B2. It's been a real breath of fresh air: I can tell what's going on, I can set my own snapshot retention rules, and it's an order of magnitude cheaper. [2]
____
[1] Along the lines of "We have your My Documents. Oh, you didn't manually add My Videos or My Music for every user? Too bad." Or in some cases, certain big-file extensions are on the ignore list by default for no discernible reason.
[2] Currently a dollar or two a month for ~200 GB. It doesn't change very much, and data verification jobs redownload the total amount once a month. I don't back up anything I could get from elsewhere, like Steam games. Family videos are in the care of different relatives, but I'm looking into changing that.
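For the curious, the Restic + B2 jobs described above boil down to roughly three commands: back up, apply retention, verify. Below is a sketch of how a wrapper might build them; it is not the actual setup from the comment. The helper names, bucket, paths, and retention counts are placeholders, and it assumes `B2_ACCOUNT_ID` and `B2_ACCOUNT_KEY` are already present in the environment (for example, set by the systemd unit).

```python
import os
import subprocess


def restic_env(bucket: str, prefix: str, password_file: str) -> dict[str, str]:
    """Environment for restic's B2 backend (B2 credentials assumed already set)."""
    env = dict(os.environ)
    env["RESTIC_REPOSITORY"] = f"b2:{bucket}:{prefix}"
    env["RESTIC_PASSWORD_FILE"] = password_file
    return env


def backup_cmd(paths, excludes=()):
    """Command to snapshot the given paths, skipping the exclude patterns."""
    cmd = ["restic", "backup", *paths]
    for pattern in excludes:
        cmd += ["--exclude", pattern]
    return cmd


def forget_cmd(daily=7, weekly=4, monthly=12):
    """Command to apply snapshot retention rules and prune unreferenced data."""
    return [
        "restic", "forget", "--prune",
        "--keep-daily", str(daily),
        "--keep-weekly", str(weekly),
        "--keep-monthly", str(monthly),
    ]


def verify_cmd():
    """Command to verify the repository; --read-data re-downloads all pack files."""
    return ["restic", "check", "--read-data"]


def run_all(env, paths, excludes=()):
    """Run backup, retention, and verification in order, stopping on failure."""
    for cmd in (backup_cmd(paths, excludes), forget_cmd(), verify_cmd()):
        subprocess.run(cmd, env=env, check=True)
```

In practice the backup and the full `--read-data` verification would likely run on different timers, since the latter downloads the whole repository each time.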
As for GUIs in general... Well, like I said, I just finished several years of bad experiences with some proprietary ones, and I wanted to see and choose what was really going on.
At this point, I don't think I'd ever want a GUI beyond a basic status-reporting widget. It's not like I need to regularly micromanage the folder-set, especially when nobody else is going to tweak it by surprise.
_____
[1] The downside to the dumb-store is a ransomware scenario, where the malware is smart enough to go delete my old snapshots using the same connection/credentials. Enforcing retention policies on the server side necessarily needs a smarter server. B2 might actually have something useful there, but I haven't dug into it.
Feel free to reach out to me if you have any questions about setting up Duplicati.