Why Are My Backups So Slow?

I recently tested two portable disk drives to use for data backups but was appalled by how badly they performed. The drives would alternate between writing quickly (50 – 100MB/s) and stuttering to a halt (250 – 500KB/s), and I could not understand why.

Here’s what that looked like:

Task Manager showing super terrible disk transfer rates

The two drives were:

  • Seagate Backup Plus Slim 2TB portable disk drive (2.5″ size)
  • Toshiba Canvio Basics 2TB portable disk drive (2.5″ size)

Because only a small handful of hard disk companies remain, it’s not clear to me what incentive they have to ensure that the drives actually perform well. And I don’t believe that they test these drives in very many configurations; or, at least, they don’t talk about them. Which is dumb, if it leads to bad Amazon reviews.

So now I’m stuck trying to figure out where the problem lies by examining all of the possible variable combinations.

Fixed Parameters

  • 3rd-generation Intel computer used throughout testing (Intel Q77 Express Chipset)
  • Windows 10 Pro, Version 1903 fully-patched
  • Toshiba Canvio Basics 2TB portable disk drive (HDTB420MK3AA), the Seagate was returned
  • ~500 GB of photo and video data, large amounts of contiguous data which should transfer very fast.

Independent Variables

  • Sector Sizes: Advanced Format (512e / 4Kn) vs Old-School (512n)
  • Partition Alignment: 4KB-aligned or not
  • Write Caching: Enable / Disable
  • Copy Mechanism: Robocopy / TeraCopy
  • Encryption: TrueCrypt / VeraCrypt / Bitlocker To Go / Unencrypted

Thing-to-Check #1: Sector Sizes

The question to answer here is: Does the drive use 512-byte or 4096-byte physical sectors?

Older drives (under ~1TB) use 512-byte sectors and came from the earlier part of the 21st century. Most modern drives (1TB and up) use 4096-byte sectors, but may report themselves to the operating system in different ways to increase compatibility with older operating systems.

Let’s check this on an unencrypted partition with the Command Prompt:

C:\WINDOWS\system32>fsutil fsinfo ntfsinfo e:
 NTFS Volume Serial Number :        0x1654711c5470ffb3
 NTFS Version      :                3.1
 LFS Version       :                2.0
 Total Sectors     :                33,554,431  (16.0 GB)
 Total Clusters    :                 4,194,303  (16.0 GB)
 Free Clusters     :                 4,182,263  (16.0 GB)
 Total Reserved Clusters :               1,024  ( 4.0 MB)
 Reserved For Storage Reserve :              0  ( 0.0 KB)
 Bytes Per Sector  :                512
 Bytes Per Physical Sector :        4096
 Bytes Per Cluster :                4096
 Bytes Per FileRecord Segment    :  1024
 Clusters Per FileRecord Segment :  0
 Mft Valid Data Length :            256.00 KB
 Mft Start Lcn  :                   0x00000000000c0000
 Mft2 Start Lcn :                   0x0000000000000002
 Mft Zone Start :                   0x00000000000c0040
 Mft Zone End   :                   0x00000000000cc840
 MFT Zone Size  :                   200.00 MB
 Max Device Trim Extent Count :     0
 Max Device Trim Byte Count :       0
 Max Volume Trim Extent Count :     62
 Max Volume Trim Byte Count :       0x40000000
 Resource Manager Identifier :      45F123C2-FF16-11E9-A62D-F8B156DDDD2F

So we can see that the drive is an Advanced Format (512e) type, i.e. it reports 512-byte logical sectors to the OS, but on disk it stores those sectors inside 4096-byte physical sectors.
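To make the reading of those two fields explicit, here is a minimal sketch that pulls the logical and physical sector sizes out of fsutil-style text and names the combination. The function names `sector_sizes` and `classify` are my own, and the sample text is just the three relevant lines from the output above.

```python
import re

def sector_sizes(fsutil_output):
    """Return (logical, physical) bytes per sector from `fsutil fsinfo ntfsinfo` text."""
    logical = re.search(r"Bytes Per Sector\s*:\s*(\d+)", fsutil_output)
    physical = re.search(r"Bytes Per Physical Sector\s*:\s*(\d+)", fsutil_output)
    if logical is None or physical is None:
        raise ValueError("sector size fields not found")
    return int(logical.group(1)), int(physical.group(1))

def classify(logical, physical):
    """Map a (logical, physical) sector size pair to its common name."""
    names = {
        (512, 512): "512n",    # old-school native 512-byte sectors
        (512, 4096): "512e",   # Advanced Format, 512-byte emulation
        (4096, 4096): "4Kn",   # Advanced Format, native 4K
    }
    return names.get((logical, physical), "unknown")

SAMPLE = """\
 Bytes Per Sector  :                512
 Bytes Per Physical Sector :        4096
 Bytes Per Cluster :                4096
"""

if __name__ == "__main__":
    print(classify(*sector_sizes(SAMPLE)))
```

Keep in mind that this only classifies what the volume reports. A volume that reports 512 / 512 would come out as "512n" here, whatever the disk underneath really uses.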

(There are explainers from Microsoft and Seagate about this.)

Ok, so we have an Advanced Format drive, what does that affect? Read on.

Thing-to-Check #2: Partition Alignment

For best performance, partitions on Advanced Format drives need to start on a boundary that is a multiple of the physical sector size. If they don’t, a single 4KB cluster straddles two physical sectors and every write turns into a read-modify-write of both. That write amplification is bad on magnetic hard disks and catastrophically bad on solid-state disks.

(The diagram here explains it well.)
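As a rough sketch of the same idea in code, the function below counts how many 4096-byte physical sectors a write only partly covers. Each of those has to be read, patched, and written back by the drive. The name `partial_sectors` is hypothetical, and the misaligned example (a partition starting at logical sector 63, the old pre-Vista convention) is my own illustration, not something measured on these drives.

```python
PHYSICAL = 4096  # physical sector size in bytes on a 512e drive

def partial_sectors(offset, length, physical=PHYSICAL):
    """Count physical sectors that a write at byte `offset` only partly covers."""
    if length <= 0:
        return 0
    end = offset + length
    first = offset // physical
    last = (end - 1) // physical
    starts_clean = offset % physical == 0
    ends_clean = end % physical == 0
    if first == last:
        # The write fits in one physical sector: partial unless it covers all of it.
        return 0 if (starts_clean and ends_clean) else 1
    # Otherwise only the first and last sectors can be partly covered.
    return (0 if starts_clean else 1) + (0 if ends_clean else 1)

if __name__ == "__main__":
    # One 4KB cluster at the start of a partition aligned at byte 1048576.
    print(partial_sectors(1048576, 4096))
    # The same cluster on a partition that starts at logical sector 63.
    print(partial_sectors(63 * 512, 4096))
```

In the aligned case the cluster maps onto exactly one physical sector. In the misaligned case it touches two, and both are only partly covered.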

Let’s check this:

C:\WINDOWS\system32>wmic partition get blocksize, startingoffset, name, index, size
BlockSize  Index  Name                   Size           StartingOffset
 512        0      Disk #1, Partition #0  1000186314752  16777216
 512        0      Disk #0, Partition #0  523239424      1048576
 512        1      Disk #0, Partition #1  104857600      524288000
 512        2      Disk #0, Partition #2  499461914624   645922816
 512        0      Disk #2, Partition #0  17179869184    1048576
 512        1      Disk #2, Partition #1  1983215828992  17180917760

Looking specifically at “Disk #2, Partition #0”, we see that it has a starting offset at byte 1048576 of the drive, which is located precisely at the boundary of physical sector 256. So, the partition is aligned and a 4KB write should land squarely within a 4KB physical sector.

This is good, no speed impediments here.
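The same check can be run over every row of the wmic output above. This is a small sketch that assumes a 4096-byte physical sector on all three disks, which I have only confirmed for Disk #2. The `is_aligned` helper is my own name.

```python
PHYSICAL_SECTOR = 4096  # assumed physical sector size for every disk listed

# (Name, StartingOffset in bytes), copied from the wmic output above.
PARTITIONS = [
    ("Disk #1, Partition #0", 16777216),
    ("Disk #0, Partition #0", 1048576),
    ("Disk #0, Partition #1", 524288000),
    ("Disk #0, Partition #2", 645922816),
    ("Disk #2, Partition #0", 1048576),
    ("Disk #2, Partition #1", 17180917760),
]

def is_aligned(offset, physical=PHYSICAL_SECTOR):
    """True if a partition starting at byte `offset` begins on a physical sector boundary."""
    return offset % physical == 0

if __name__ == "__main__":
    for name, offset in PARTITIONS:
        status = "aligned" if is_aligned(offset) else "MISALIGNED"
        print(f"{name}: physical sector {offset // PHYSICAL_SECTOR}, {status}")
```

Every starting offset in the table is a multiple of 4096, so none of these partitions is misaligned under that assumption.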

Thing-to-Check #3: Write Caching

As it turns out, Windows 10 by default sets portable drives to allow immediate removal without needing to “eject” the drive first. This meant that write caching was disabled on the drives and probably that the operating system would constantly be sending filesystem commit and sync operations to the drive.

This is bad. It’s not clear how well the drive firmware handles this, even if most drives should expect to be handled this way.

It would also mean that any changes to the master file table (MFT) and filesystem journal might not overlap as operations in cache memory before being committed to the drive, which would happen if the write cache was used.

With the Write Cache disabled, from what I could tell, the drive would bounce between really quick writes (50 – 100MB/s) and then stall at 250 – 500KB/s, seemingly forever. This would force me to reboot my computer to stop the underlying operation completely, because it was impossible to eject the drive. The irony is that the drive was supposed to be in “quick removal” mode.

USB portable drive write cache policies

Enabling the Write Cache mitigated, but did not eliminate, the transfer rate stalling.

The drives would still have moments where they would slow down drastically, but these periods of time would not go on forever.

Since this did not fix the problem, I started to suspect something else.

Thing-to-Check #4: TrueCrypt

When you back things up, it’s a good idea to encrypt the data at rest, so that if you lose the portable disk, whoever finds it doesn’t immediately have all of your vacation photos and what not.

One option for doing this is TrueCrypt, which can encrypt an entire filesystem partition.

Unfortunately, when examining the Sector Sizes that TrueCrypt presents to the operating system, there might be a problem:

C:\WINDOWS\system32>fsutil fsinfo ntfsinfo m:
 NTFS Volume Serial Number :        0x966ea2d86ea2b103
 NTFS Version      :                3.1
 LFS Version       :                2.0
 Total Sectors     :                3,873,467,903  (1.8 TB)
 Total Clusters    :                  484,183,487  (1.8 TB)
 Free Clusters     :                  422,777,543  (1.6 TB)
 Total Reserved Clusters :                  1,024  (4.0 MB)
 Reserved For Storage Reserve :                 0  (0.0 KB)
 Bytes Per Sector  :                512
 Bytes Per Physical Sector :        512
 Bytes Per Cluster :                4096
 Bytes Per FileRecord Segment    :  1024
 Clusters Per FileRecord Segment :  0
 Mft Valid Data Length :            71.00 MB
 Mft Start Lcn  :                   0x00000000000c0000
 Mft2 Start Lcn :                   0x0000000000000002
 Mft Zone Start :                   0x0000000000000000
 Mft Zone End   :                   0x0000000000000000
 MFT Zone Size  :                   0.00 KB
 Max Device Trim Extent Count :     0
 Max Device Trim Byte Count :       0
 Max Volume Trim Extent Count :     62
 Max Volume Trim Byte Count :       0x40000000
 Resource Manager Identifier :      45F12414-FF16-11E9-A62D-F8B156DDDD2F

TrueCrypt’s partition-based encryption pretends to be an Old-School drive with 512-byte sectors, possibly leading to weird performance issues during use when the emulated device driver has to pass data down to the native device driver.

Thing-to-Check #5: TeraCopy

My go-to tool for backing up data is the built-in version of Robocopy on Windows, but I wonder if the way that it copies data is causing a problem.

Since I’ve already seen identical problems using TrueCrypt + Robocopy on two different hard drives, let’s see if TeraCopy on Windows has similar issues with a freshly-formatted filesystem using TrueCrypt as the underlying encryption mechanism.

TeraCopy copying files over to the portable drive
Windows Task Manager showing I/O stats on the portable drive

In fact, TeraCopy doesn’t suffer as much data transfer stuttering as Robocopy. The moments when the transfer rate plummets do not last nearly as long. (Spoiler alert: this is later proven false during the VeraCrypt tests…)

But the drive performance is still pretty abominable, especially for lots of large, contiguous photo files. Note that this is an I/O-bound issue, since the CPU is mostly idle and could easily handle all of the necessary encryption.

I still don’t understand why the “Active time” and “Average response time” fields always peg out at 100% and ~500ms respectively. I deeply, deeply suspect that something in the drive’s firmware or the underlying drivers is causing it to have problems keeping the data transfer rate up.

Since it’s clear that TeraCopy has a smoother, somehow friendlier data transfer mechanism, with less stuttering, I’ll use it for the remaining tests and stop using Robocopy.

Things-to-Try #1: Unencrypted Partition + TeraCopy

Let’s reformat the partition as an unencrypted NTFS partition, and see how it fares.

Newly formatted partition

If we check the fsutil information, we see that the drive is properly exposed now as an Advanced Format (512e) type:

C:\WINDOWS\system32>fsutil fsinfo ntfsinfo n:
 NTFS Volume Serial Number :        0xf4a8bd64a8bd2650
 NTFS Version      :                3.1
 LFS Version       :                2.0
 Total Sectors     :                3,873,468,415  (1.8 TB)
 Total Clusters    :                  484,183,551  (1.8 TB)
 Free Clusters     :                  484,143,229  (1.8 TB)
 Total Reserved Clusters :                  1,024  (4.0 MB)
 Reserved For Storage Reserve :                 0  (0.0 KB)
 Bytes Per Sector  :                512
 Bytes Per Physical Sector :        4096
 Bytes Per Cluster :                4096
 Bytes Per FileRecord Segment    :  1024
 Clusters Per FileRecord Segment :  0
 Mft Valid Data Length :            256.00 KB
 Mft Start Lcn  :                   0x00000000000c0000
 Mft2 Start Lcn :                   0x0000000000000002
 Mft Zone Start :                   0x00000000000c0000
 Mft Zone End   :                   0x00000000000cc820
 MFT Zone Size  :                   200.13 MB
 Max Device Trim Extent Count :     0
 Max Device Trim Byte Count :       0
 Max Volume Trim Extent Count :     62
 Max Volume Trim Byte Count :       0x40000000
 Resource Manager Identifier :      B5D714F5-FFD9-11E9-A62E-F8B156DDDD2F

Using TeraCopy to copy files to the drive now shows no signs of the data transfer stuttering, and the average transfer rate is much higher. However, the “Average response time” measurements go way, way up.

I begin to suspect that TrueCrypt’s inability to support Advanced Format drives correctly is one of the culprits here.

Things-to-Try #2: VeraCrypt + TeraCopy

Let’s see how VeraCrypt performs.

Resetting the partition as a RAW, unformatted one

After resetting the partition and formatting it with VeraCrypt, double-checking the NTFS filesystem information shows that VeraCrypt also does not support Advanced Format drives correctly.

I deeply suspect that there will be data transfer stuttering.

C:\WINDOWS\system32>fsutil fsinfo ntfsinfo n:
 NTFS Volume Serial Number :        0x3008a6ad08a67190
 NTFS Version      :                3.1
 LFS Version       :                2.0
 Total Sectors     :                3,873,467,903  (1.8 TB)
 Total Clusters    :                  484,183,487  (1.8 TB)
 Free Clusters     :                  484,143,165  (1.8 TB)
 Total Reserved Clusters :                  1,024  (4.0 MB)
 Reserved For Storage Reserve :                 0  (0.0 KB)
 Bytes Per Sector  :                512
 Bytes Per Physical Sector :        512
 Bytes Per Cluster :                4096
 Bytes Per FileRecord Segment    :  1024
 Clusters Per FileRecord Segment :  0
 Mft Valid Data Length :            256.00 KB
 Mft Start Lcn  :                   0x00000000000c0000
 Mft2 Start Lcn :                   0x0000000000000002
 Mft Zone Start :                   0x00000000000c0040
 Mft Zone End   :                   0x00000000000cc840
 MFT Zone Size  :                   200.00 MB
 Max Device Trim Extent Count :     0
 Max Device Trim Byte Count :       0
 Max Volume Trim Extent Count :     62
 Max Volume Trim Byte Count :       0x40000000
 Resource Manager Identifier :      B5D715CC-FFD9-11E9-A62E-F8B156DDDD2F

After letting TeraCopy run for a while, I run into the same issues that I had with Robocopy, where the drive simply stutters to a halt with a minutes-long data transfer slowdown.

Data transfer rate totally collapses on Toshiba Canvio Basics drive

After a few minutes, the drive finally finishes committing its operations.

Ok, so that doesn’t work. I strongly suspect that talking to an Advanced Format drive as though it is an Old-School drive is causing this problem.

There’s one last thing to try.

Things-to-Try #3: BitLocker To Go + TeraCopy

So far, both Robocopy and TeraCopy have sent the drive into a transfer rate meltdown. TrueCrypt and VeraCrypt (both using an emulated 512-byte sector size) seem to suffer from this, while Unencrypted partitions (using the Advanced Format native sector sizes) don’t. That leaves one last option: BitLocker To Go.

The question is: Can I haz BitLocker To Go support Advanced Format drives correctly?

Let’s find out.

Enabling BitLocker To Go in the Control Panel settings

Once the partition is encrypted, the Disk Management tool will show this.

BitLocker partition as shown in Computer Management.

After enabling BitLocker To Go on the partition, I check the fsutil output again and see that it does indeed correctly identify the Advanced Format sector sizes.

C:\WINDOWS\system32>fsutil fsinfo ntfsinfo n:
 NTFS Volume Serial Number :        0xf0900ca5900c73fe
 NTFS Version      :                3.1
 LFS Version       :                2.0
 Total Sectors     :                3,873,468,415  (1.8 TB)
 Total Clusters    :                  484,183,551  (1.8 TB)
 Free Clusters     :                  484,143,159  (1.8 TB)
 Total Reserved Clusters :                  1,024  (4.0 MB)
 Reserved For Storage Reserve :                 0  (0.0 KB)
 Bytes Per Sector  :                512
 Bytes Per Physical Sector :        4096
 Bytes Per Cluster :                4096
 Bytes Per FileRecord Segment    :  1024
 Clusters Per FileRecord Segment :  0
 Mft Valid Data Length :            256.00 KB
 Mft Start Lcn  :                   0x00000000000c0000
 Mft2 Start Lcn :                   0x0000000000000002
 Mft Zone Start :                   0x00000000000c0000
 Mft Zone End   :                   0x00000000000cc820
 MFT Zone Size  :                   200.13 MB
 Max Device Trim Extent Count :     0
 Max Device Trim Byte Count :       0
 Max Volume Trim Extent Count :     62
 Max Volume Trim Byte Count :       0x40000000
 Resource Manager Identifier :      B5D7167F-FFD9-11E9-A62E-F8B156DDDD2F

I suspect that this will work as speedily as the Unencrypted partition type, without data transfer stutters, so let’s fire up TeraCopy one last time.

I cross my fingers.

In this case, I see the transfer rates look good, but the Average response time also creeps up even further than in the Unencrypted partition case.

After a couple of hours copying data, I don’t see any of the catastrophic transfer rate slowdowns, although I do see the rate dip every now and then as the drive seems to be flushing data to the platters. It’s not easy to see where the bottlenecks are, but generally the transfer rate still swings between 40 – 130MB/s. I suspect that the 130MB/s number is a fluke, because that rate doesn’t hold steady even when transferring ~4GB contiguous video files.
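To put those rates in perspective, here is a back-of-the-envelope sketch of how long the ~500 GB test set would take at a few sustained rates taken from this post. It assumes decimal units (1 GB = 1000 MB) and a perfectly constant rate, which the drive clearly does not deliver. `copy_hours` is a hypothetical helper.

```python
def copy_hours(size_gb, rate_mb_s):
    """Hours needed to copy `size_gb` at a sustained `rate_mb_s` (decimal units)."""
    return size_gb * 1000 / rate_mb_s / 3600

if __name__ == "__main__":
    # Stalled rates (0.25 and 0.5 MB/s) versus the BitLocker To Go range (40 and 130 MB/s).
    for rate in (0.25, 0.5, 40, 130):
        print(f"{rate:>6} MB/s -> {copy_hours(500, rate):7.1f} hours")
```

At 40MB/s the whole set takes roughly three and a half hours, and at 130MB/s just over one hour. At the stalled 250 – 500KB/s rates the same copy would take hundreds of hours, which is why the stalls matter far more than the peak rate.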

WTF, Toshiba?

I think I might return this drive, for another reason.

Device Manager device connection tree

Looking at the device in the Windows Device Manager, it is connected to the system as a standard USB Mass Storage Device, i.e. it uses USB Bulk-Only Transport (BOT) to send data back and forth.

It’s 2019. Modern portable disk drives should be using the USB Attached SCSI Protocol (UASP) to transfer data. Toshiba is skimping on developing a more efficient drive controller, and can’t be bothered to source one in from elsewhere. I have a handful of other portable drive enclosures that all support UASP, even from a few years back, so this is a letdown.

Things-to-Try #4: BitLocker To Go + Robocopy

It turns out that TeraCopy has a pretty bad average transfer rate with a graph of the data transfer showing a lot of peaks and valleys. I’m not sure what the difference is in programming here, but the dips really undercut the backup speed.

When using Robocopy, I see a smooth, continuous transfer rate. The “average response time” figures are also quite a bit lower, and transfers of large, contiguous files keep the drive writing at its maximum write speed.

Update #1: Drive 3 + TrueCrypt + Robocopy

November 2019: Testing out another Advanced Format (512e) drive, this time from Western Digital (but one of the older ones), I see none of the issues that I see in the other two drives when using TrueCrypt + Robocopy.

Even though TrueCrypt reports a logical / physical sector size of 512 bytes, as in the previous test cases, there is no transfer rate stuttering or stalling.

The average response time is also much lower. I’m wondering if the drive possibly has a faster processor or better signal processing. This drive also has a lower areal density and more disk platters than more modern off-the-shelf offerings.

Update #2: Compress Your Files

One other way to make sure your backups proceed quickly is to move large numbers of small files into single, larger compressed archives.

This also gives you the opportunity to use something like QuickPar to create parity archives, which can help to recover data from bit rot and random multiple bit errors in the magnetic media. Parity archives can recover up to a specified percentage of corrupted data.

Eliminating large numbers of folders and small files speeds up the disk transfers because the operating system does not have to spend as much time updating the filesystem structures.
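One way to do the bundling is with Python’s standard tarfile module. This is a minimal sketch, not the tool I used: `bundle` and `count_files` are hypothetical helpers, and gzip is just one choice of compression. Photos and videos are already compressed, so the win there is fewer files rather than fewer bytes.

```python
import tarfile
from pathlib import Path

def bundle(source_dir, archive_path):
    """Pack everything under source_dir into one gzip-compressed tar archive."""
    source = Path(source_dir)
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the folder name.
        tar.add(str(source), arcname=source.name)
    return Path(archive_path)

def count_files(archive_path):
    """Return how many regular files the archive contains."""
    with tarfile.open(str(archive_path), "r:gz") as tar:
        return sum(1 for member in tar.getmembers() if member.isfile())
```

Calling `bundle("photos-2019", "photos-2019.tar.gz")` (both names are placeholders) would leave a single file to copy to the backup drive, and `count_files` gives a quick sanity check that nothing was left out.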

Summary

Use BitLocker To Go and Robocopy together to back up data.

BitLocker To Go is the only way to get reasonably speedy encrypted backups without unexpected hardware hiccups on a standard Windows 10 Pro system, at least with this generation of portable disk drives.

VeraCrypt and TrueCrypt both exhibited issues in the emulated block device driver causing data transfer stutters that made both of them effectively unusable on my system.

TeraCopy transfers data in a bursty fashion, even with large files, which leads to a massive slowdown in the average rate and leads to much longer backup times.

Robocopy seems to have a much better average transfer rate, because it maintains a steady and fast transfer rate continuously, leading to far better backup times.
