backup | Technology Tales

Avoiding permissions, times or ownership failure messages when using rsync

22^nd April 2023

The rsync command is one that I use heavily for doing backups and web publishing. The latter means that it is part of how I update websites built using Hugo because new and/or updated files need uploading. The command also sees usage when uploading files onto other websites as well. During one of these operations, and I am unsure now as to which type is relevant, I encountered errors about being unable to set permissions.

The cause was the encompassing -a option. This is a shorthand for -rltpgoD, and the individual options perform the following:

-r: recursive transfer, copying all contents within a directory hierarchy

-l: symbolic links copied as symbolic links

-t: preserve times

-p: preserve permissions

-g: preserve groups

-o: preserve owners

-D: preserve device and special files

The solution is to some of the options if they are inappropriate. The minimum is to omit the option for permissions preservation, but others may not apply between different servers either, especially when operating systems differ. Removing the options for preserving permissions, groups and owners results in something like this:

rsync -rltD [rest of command]

While it can be good to have a more powerful command with the setting of a single option, it can mean trying to do too much. Another way to avoid permissions and similar errors is to have consistency between source and destination files systems, but that is not always possible.

Limiting Google Drive upload & synchronisation speeds using Trickle

9^th October 2021

Having had a mishap that lost me some photos in the early days of my dalliance with digital photography, I have been far more careful since then and that now applies to other files as well. Doing regular backups is a must that you find reiterated by many different authors, and the current computing climate makes doing that more vital than it ever was.

So, as well as having various local backups, I also have remote ones in the form of OneDrive, Dropbox and Google Drive. While these more correctly are file synchronisation services, disciplined use can make them useful as additional storage facilities in the interests of maintaining added resilience. There also are dedicated backup services that I have seen reviewed in the likes of PC Pro magazine, but I have to make use of those.

Insync

Part of my process for dealing with new digital photo files is to back them up to Google Drive, and I did that with a Windows client in the early days but then moved to Insync running on Linux Mint. One drawback to the approach is that this hogs the upload bandwidth of an internet connection that has yet to move to fibre from copper cabling. While having fibre connections to a local cabinet helps, a 100 KiB/s upload speed is easily overwhelmed and digital photo file sizes keep increasing. It does not help that I insist on using more flexible raw formats like DNG, CR2 or CR3 either.

While making fewer images could help to cut the load, I still come away from an excursion with many files because I get so besotted with my surroundings. This means that upload sessions take numerous hours and can extend across calendar days. Ultimately, this makes my internet connection far less usable; hence I want to throttle upload speed, much like what is possible in the Transmission BitTorrent client or in the Dropbox client. Since this is not available in Insync, I have tried using the trickle command instead, and an example is below:

trickle -d 2000 -u 50 insync

Here, the upload speed is limited to 50 KiB/s while the download speed is limited to 2000 KiB/s. In my case, the latter of these hardly matters, while the former leaves me with acceptable internet usability. Insync does not work smoothly with this, though, so occasional restarts are needed to keep file uploads progressing and CPU load also is higher. As rough as the user experience feels, uploads can continue in parallel with other work.

gdrive

One other option that I am exploring is the use of the command-line tool gdrive and this appears to work well with trickle. After downloading and installing the tool, getting going is a matter of issuing the following command and following the instructions:

gdrive about

On web servers, I even have the tool backing up things to Google Drive on a scheduled basis. Because of a Google Drive limitation that I have encountered not only with gdrive but also with Insync and Google's own Windows Google Drive client, synchronisation only happens with two new folders, one local and the other remote. Handily, gdrive supports the usual bash style commands for working with remote directories, so something like the following will create a directory on Google Drive:

gdrive mkdir ttdc [ID for parent folder]

Here, the ID for the parent folder may be omitted, though it can be obtained by going to Google Drive online and getting a link location by right-clicking on a folder and choosing the appropriate context menu item. This gets you something like the following and the required identifier is found between the last slash and the first question mark in the address string (so as not to share any real links, I made the address more general below):

https://drive.google.com/drive/folders/[remote folder ID]?usp=sharing

Then, synchronisation uses a command like the following:

gdrive sync upload [local folder or file path] [remote folder ID]

There also is the option to do a one-way upload, and this is the form of the command used:

gdrive upload [local folder or file path] -p [remote folder ID]

Because every file or folder object has its own ID on Google Drive, it is possible to create two objects on there that appear to have the same name, though that is sure to cause confusion even if you know what is happening. It is possible in each of the above to throttle them using trickle as well:

trickle -d 2000 -u 50 gdrive sync upload [local folder or file path] [remote folder ID] trickle -d 2000 -u 50 gdrive upload [local folder or file path] -p [remote folder ID]

Handily, this works without the added drama seen with Insync and lends itself to scripting as well, so it could be something that I will incorporate into my current workflow. One thing that needs to be watched is file upload failures, but there may be ways to catch those and retry them, which would be another thing that needs doing. This is built into Insync, and it would be a learning opportunity if I were to stick with gdrive instead.

Copying only updated new or updated files by command line in Linux or Windows

2^nd August 2014

With a growing collection of photographic images, I often find myself making backups of files using copy commands and the data volumes are such that I don't want to keep copying the same files over and over again, so incremental file transfers are what I need. So commands like the following often get issued from a Linux command line:

cp -pruv [source] [destination]

Because this is on Linux, it is the bash shell that I use, so the switches may not apply with others like ssh, fish or ksh. For my case, p preserves file properties such as its time and date and the cp command does not do this always, so it needs adding. The r switch is useful because the copy then in recursive, so only a directory needs to be specified as the source and the destination needs to be one level up from a folder with the same name there to avoid file duplication. It is the u switch that makes the file copy incremental, and the v one issues messages to the shell that show how the copying is going. Seeing a file name issued by the latter does tell you how much more needs to be copied and that the files are going where they should.

What inspired this post though is my need to do the same in a Windows session, and issuing xcopy commands will achieve the same end. Here are two that will do the needful:

xcopy [source] [destination] /d /s

xcopy [source] [destination] /d /e

In both cases, it is the d switch that ensures that the copy is incremental, and you can add a date too, with a colon between it and the /d, if you see fit. The s switch copies only directories that contain files, while the e one copies even empty directories. Using the d switch without either of those did not trigger any copying action when I tried, so I reckon that you cannot do without either of them. By default, both of these commands issue output to the command line so you can keep an eye on what is happening, and this especially is useful when ensuring that files are going to the right destination because the behaviour differs from that of the bash shell on Linux.

Pondering storage options

1^st June 2011

The combination of curiosity and a little spare time had me browsing online computing technology stores recently. A spot of CD and DVD burning brought on by a flurry of Linux distribution testing reminded me of the possibility. Because I have built up a sizeable library of digital photos, ensuring that I have backups of them is something that needs doing. While a 2 GB Samsung external hard drive is brought to life every now and again for that purpose, the prospect of using Blu-ray discs has appealed to me. After all, capacities of 25 GB for single-layer discs and 50 GB for dual-layer ones sound not inappropriate for my purposes. However, they aren't a cheap option at the time of writing, with each disc costing in the region of £3-4 at one place where I was looking. The cost of BD writers themselves seems not to be so bad though, with a few in the £60-100 bracket; any lower than this and you could end up with a combo drive that reads Blu-ray discs and writes to DVD's and CD's, so a modicum of concentration is needed. As attractive as the idea might be, the cost of BD media means that I'll wait a little while before deciding to take the plunge. The price premium at the moment is a reminder of the way that things used to be when CD and DVD writers first came on the market. It is very telling when discs come packaged in jewel cases, something that you won't see too often with CD's or DVD's.

Another piece of storage excitement that hasn't escaped me is the advent of SSD hard drives. With no moving parts like in conventional hard drives, they bring a speed boost. Concerns about their lifetimes and the numbers of read/write events per drive would stall me when it comes to storing personal data on them but using them for the likes of operating system files sounds attractive, especially with my partiality to Linux perhaps not hammering drives so much. As with any new technology, there is a price premium, even though a drive big enough for hosting an operating system can be acquired for less than £100. As with many of my hardware purchase brainwaves, there's no rush, but this is an option that I'll keep at the back of my mind.

Another appealing notion is the idea of getting a NAS so that files can be shared between a few computers. While I have seen prices starting at just above £70 for single disk enclosures, these generally are a more expensive option than external drives, and that's before you consider the cost of any hard drives. Nevertheless, the advantages of a unit containing more than a single hard drive while operating as a print server for any compatible printer, too. When you get to 4 or 5 hard drive trays, then the cost has mounted, but that could be when they pay their way too. What reminded me of these was a bookazine on home networking that I recently found at a branch of WHSmith's and their attractions are subject to the networking side of things being made to work without a drama. Once that's out of the way, then their usefulness really does appeal.

Mulling over all these brainwaves is one thing, but it doesn't mean that the purse strings will become too loose in this age of economic constraint. In fact, pondering them may serve to staunch any impulse purchases. Sometimes, a spot of virtual shopping serves to control things rather than losing the run of oneself.