Know your cables and connections - data wrangling fundamentals

Firstly a bit about me. I’ve been in Film & TV for over 20 years, primarily as a DIT and in post-production. Prior to that I worked in IT for 4 years after graduating from the Royal Melbourne Institute of Technology (RMIT).

This is a follow-up article to a previous piece about understanding hard drives and solid state drives: https://www.thetalentmanager.com/hub/post/33969

In this new article I will go into depth about the cables and connections these drives use, and how to ensure you have configured your data wrangling cabling for maximum efficiency.


Introduction

Please note that laptops can behave differently running on battery or mains power. In an effort to save power some computers will cripple or even disable certain ports on your computer, so I highly recommend always data wrangling using mains power, ideally on a UPS. If you must run off the battery, know what the implications will be for your specific computer. Try and do a dummy run with your data wrangling setup before working with actual rushes to prevent catastrophe.

The best cable to use for any drive is the cable it came with at purchase, even if a different cable might be technologically superior. Unfortunately, though, drives and cables often get separated. There is a lot more to cables than just the type of connectors they have at each end, and just because a plug fits doesn’t mean the cable is correct.

In my previous article I used a highway analogy to convey the difference between ‘drive speed’ and ‘interface speed’. In this analogy the ‘drive’ (hard drive or SSD) is represented by a car, and like cars all drives have different top speeds. The ‘interface’ (USB 3.0, Thunderbolt etc) is represented by a speed limit sign. If the speed limit is faster than the top speed of a car, it doesn’t mean the car can magically drive faster. However, if the speed limit is slower than the top speed of a car, then the car is not able to drive at its top speed. So the interface you use is important to facilitate the top speed of a drive, but a faster interface won’t improve a drive’s top speed.

In this analogy cables would be represented as roads. Driving on a bumpy road will reduce the top speed of your car and also prevent it from driving at the speed limit. But having a pristine road doesn’t improve a car's engine, nor increase the legal speed limit. Using the wrong cable can prevent a drive from running at its top speed, lower an interface's potential speed limit, or even prevent a drive from running at all.

Below is a cheat sheet which outlines commonly used interfaces (speed limit signs) and their theoretical maximum speeds. There are too many to list individually so I have grouped them by transfer speed, which is the only metric we are interested in. Interface speed is usually advertised in megabits per second rather than megabytes per second, which is very different: there are 8 bits in a byte, so a figure in megabits is eight times larger than the same speed in megabytes. I’ve standardised all measurements in this article to megabytes per second to avoid confusion.

USB 3.0 / USB 3.1 Gen 1 / USB 3.2 Gen 1 - 625 Megabytes per second
USB 3.1 Gen 2 / USB 3.2 Gen 2 - 1250 Megabytes per second
USB 3.2 Gen 2x2 (referred to as USB 3.2x2 in this article) - 2500 Megabytes per second
USB4 - 5000 Megabytes per second
Thunderbolt 1 - 1250 Megabytes per second
Thunderbolt 2 - 2500 Megabytes per second
Thunderbolt 3/ Thunderbolt 4 - 5000 Megabytes per second
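
The conversion behind the cheat sheet can be sketched in a few lines of Python. This is a minimal illustration that assumes the simple rule of 8 bits per byte and ignores protocol overhead, so the results are theoretical ceilings rather than speeds you will see in practice. The function name is made up for this example.

```python
def gbps_to_megabytes_per_second(gigabits_per_second: float) -> float:
    """Convert an advertised interface speed in gigabits per second
    to megabytes per second (1 gigabit = 1000 megabits, 8 bits = 1 byte)."""
    megabits_per_second = gigabits_per_second * 1000
    return megabits_per_second / 8


# Advertised interface speeds in gigabits per second
advertised = {
    "USB 3.0": 5,
    "USB 3.1 Gen 2": 10,
    "Thunderbolt 2": 20,
    "Thunderbolt 3": 40,
}

for name, gbps in advertised.items():
    print(f"{name}: {gbps_to_megabytes_per_second(gbps):.0f} MB/s")
```

Running it prints 625, 1250, 2500 and 5000 MB/s for the four interfaces, matching the list above.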

In order for a drive to run at full speed:
- The device (in our case a hard drive or SSD) must be compatible with the interface
- The cable must be compatible with the interface
- The port on the computer the cable is plugged into must be compatible with the interface

Compatible is an important word here, since this isn’t just a matter of having the device, cable, and computer port all match in terms of interface.

For example, you can have a USB 3.0 device, using a USB 3.0 cable, plugged into a Thunderbolt port on a computer, because Thunderbolt ports are compatible with USB 3.0. You don’t need to specifically use a USB 3.0 port for a USB 3.0 drive to work at full speed. I will explain which interfaces are compatible with each other in more detail below.
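
The highway analogy and the three requirements above boil down to a "slowest link wins" rule. Here is a minimal Python sketch of that rule. The function name is hypothetical, and the 60MB/s figure for USB 2.0 is its theoretical maximum (480 megabits per second), which is not from the cheat sheet.

```python
def effective_speed(drive_mb_s: float, cable_mb_s: float, port_mb_s: float) -> float:
    """The slowest of the drive, the cable and the port sets the ceiling."""
    return min(drive_mb_s, cable_mb_s, port_mb_s)


# A 540MB/s SSD with a USB 3.0 cable and port (625MB/s): the drive is the limit.
print(effective_speed(540, 625, 625))  # 540

# The same SSD with a USB 2.0 cable (about 60MB/s in theory): the cable is the limit.
print(effective_speed(540, 60, 625))  # 60
```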


USB

USB stands for Universal Serial Bus, which is ironic given how many different types of USB there are. The different flavours are determined by a number suffix and a letter suffix. The number refers to the generation of USB (2.0, 3.0 etc) and the letter refers to the shape of the connector. The generation determines its speed, not the connector. So looking at the above chart, all the cables designated as USB 2.0 are far too slow for data wrangling. Be aware, some USB 2.0 cables may have the same connector as a USB 3.0 cable, so using one will limit you to USB 2.0 speeds which are much slower.


Colour Coding

You can see on the above chart all of the USB 3.0 connectors are blue, which is the standard for USB 3.0. However this colour coding is not mandatory for hardware manufacturers. Some cables are rated for USB 3.0 but are black or even orange. Many USB 3.0 cables are black but have blue plastic only inside the metal shroud of the connector. The colour coding for the different flavours of USB 3.1 and 3.2 is supposed to be teal and red, but again this is not always followed. In short, colour coding does not guarantee the rating of a cable but it can be a good place to start when selecting something from a box of spaghetti’d cables.

The same can be said for ports on a computer. Some will be colour coded or have the interface type written next to the port. Neither should be trusted at face value, so test everything you can in advance.


USB 3.0 (USB 3.1 Gen 1 is the same speed so we will refer to both as just USB 3.0 for simplicity)

USB 3.0 is the standard interface for hard drives and older generation SSDs because its bandwidth is more than adequate for those drives’ top speeds. The USB 3.0 Type B and USB 3.0 Micro Type B connectors in the above chart are exclusive to USB 3.0. So if your storage device uses those, any cable with that connector should work fine. Be wary of USB 2.0 Type B connectors (often used for printers and scanners), which do fit in a USB 3.0 Type B socket but are still a different shape. If you use one of those the drive will be limited to USB 2.0 speeds.


USB 3.1 Gen 2 and USB 3.2x2

USB 3.1 Gen 2 and 3.2x2 are more modern and faster iterations of USB which benefit recent generations of solid state drives (SSDs) and card readers which are faster than the top speed of USB 3.0 (625MB/s).

USB 3.1 Gen 2 and 3.2x2 do not require specialised cables, however I’ve only ever seen them with USB Type A or Type C connectors. Any USB 3.0 Type A or C cable should be capable of maintaining the increased bandwidths of 3.1 Gen 2 and USB 3.2x2.


USB-C

USB-C is not linked to any particular generation of USB. It was supposed to be a standard which unified cables, however in reality there are a myriad of cables with different uses and speed ratings using USB-C connectors. It’s often misunderstood as being faster than USB-A, but it is not. It’s just uncommon to see newer generations of USB with a Type A connector, though they do exist. USB-C is also associated with interfaces like USB4 and Thunderbolt 3&4 which are very fast, but not because of the USB-C connector. Be aware which interfaces your computer supports so you can develop a data wrangling plan to maximise bandwidth.

One of the biggest issues with USB-C is it’s also used for battery charging, so many USB-C cables are designed only for charging and have no capacity for transferring data. If you use a USB-C charging cable with a hard drive or SSD, it may power up but it will likely not mount. Also sometimes a USB-C charging cable has data transfer capabilities but is limited to USB 2.0.

USB-C doesn’t have widely used colour coding or insignia to label what a cable is rated for, so without cutting them open to see what’s inside the only way to know if they’re appropriate for your device is to try them out.


USB-C PD

The PD in USB-C PD stands for Power Delivery. It has no impact on transfer speed and references the amount of power output (usually 100W) a USB-C PD port on a computer has for charging devices. These ports sometimes have an icon next to them which looks similar to a Thunderbolt icon, which can trick you into thinking a computer is Thunderbolt capable when it isn’t.


Thunderbolt

Thunderbolt 1 and Thunderbolt 2 utilise a connector called ‘Mini-Displayport’. However not all Mini-Displayport cables are compatible with Thunderbolt even if they have the exact same plug. All generations of Thunderbolt usually have the Thunderbolt icon printed on each end of the cable, often with a number next to it specifying the generation of Thunderbolt. There are also Thunderbolt 2 to Thunderbolt 3 adaptors which allow you to plug a Thunderbolt 3 device into a Thunderbolt 2 port or vice versa. This is absolutely fine to do since Thunderbolt 2 is plenty fast for any drive on the market. However some card readers will saturate the bandwidth of Thunderbolt 2.

Both Thunderbolt 3 and Thunderbolt 4 utilise a USB-C connector. Thunderbolt 4 does not offer an advantage over Thunderbolt 3 for data wrangling so we can refer to them interchangeably and use either generation of cable or port with no impact. In most cases a Thunderbolt device will only work with a Thunderbolt cable.

Thunderbolt 3 & 4 ports are ‘backwards compatible’, meaning you can plug a USB device of any generation into a Thunderbolt port and it will run at top speed. But (with the exception of USB4) you can’t do the opposite and plug a Thunderbolt device into a USB port of any generation, as the computer won’t recognise it at all.

One of the benefits of Thunderbolt is ‘daisy chaining’. Many Thunderbolt devices have two Thunderbolt ports, meaning one port will be plugged into your computer and the second port can be used to plug in another device. This allows up to six devices to be chained together to a single Thunderbolt port. If you have a Thunderbolt drive with only one port it can be included but must be the last device in the chain. You can also include a USB device but again it must be the last device in the chain. In rare cases you might saturate the bandwidth of a Thunderbolt port by daisy chaining too many drives. Simply add up the total write speed of all the drives in a chain which you’d like to write to simultaneously, and then compare that with the bandwidth of the version of Thunderbolt you’re using. In some cases it may be better to perform some backups in a separate batch, especially if your card reader is one of the devices in the chain.
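
The "add up the write speeds and compare" check described above can be sketched as follows. It is a rough illustration only: the bandwidth figures are the theoretical ones from the cheat sheet, the function name is hypothetical, and the three 1000MB/s drives are invented for the example.

```python
# Theoretical Thunderbolt bandwidth in MB/s, keyed by generation
THUNDERBOLT_MB_S = {1: 1250, 2: 2500, 3: 5000, 4: 5000}


def chain_is_saturated(write_speeds_mb_s, thunderbolt_version):
    """Return True if the drives being written to simultaneously
    ask for more bandwidth than the Thunderbolt port can provide."""
    total = sum(write_speeds_mb_s)
    return total > THUNDERBOLT_MB_S[thunderbolt_version]


# Three hypothetical 1000MB/s drives in one chain
drives = [1000, 1000, 1000]
print(chain_is_saturated(drives, 2))  # True: 3000MB/s is more than 2500MB/s
print(chain_is_saturated(drives, 3))  # False: 3000MB/s fits within 5000MB/s
```

If the check comes back True, that is the cue to split the backups into separate batches or move a device to another port.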


USB4

USB4 is backwards compatible with all flavours of USB as well as Thunderbolt 3&4. USB4 and Thunderbolt 4 are often referred to interchangeably even though they’re different. For our use case there is no benefit of one over the other.


Adaptors

There are many adaptors on the market which convert a USB-C connector to a USB-A connector and vice versa. Be aware many dishonest aftermarket manufacturers will advertise an adaptor as being USB 3.1 Gen 2 or 3.2x2 compliant when it isn’t, because most people won’t notice the difference. Make sure you test an adaptor to ensure it’s hitting advertised speeds. Personally I recommend the brand Graugear to be sure you’re getting what’s written on the tin. Just like with cables, some adaptors are only designed for charging batteries and will not mount drives.


Fussiness

Over the years I’ve encountered drives which are just plain fussy about the cable you use. It’s not because the cables are faulty, and honestly I don’t know what is the cause sometimes. But having a cable of the correct interface and connector doesn’t guarantee a drive will play nice with it. If a seemingly correct cable doesn’t work, try other cables before assuming a drive is dead.


Drives

I go into this in a lot more detail in my previous article but in summary, USB 3.0 is more than adequate for all external hard drives and older SSDs. Some newer SSDs will benefit from the higher bandwidth of USB 3.1 Gen 2, USB 3.2x2, and Thunderbolt. As a rule of thumb you don’t need anything faster than USB 3.0 unless your drive speed is over 625MB/s. This should be advertised on the drive’s packaging, or you can Google your model number.

For example a Samsung T5 runs at 540MB/s so will not benefit from an interface faster than USB 3.0. However the Samsung T7 runs at 1000MB/s so you will need at least a USB 3.1 Gen 2 port to run at full speed. And a Samsung T9 runs at 2000MB/s so you will need at least a USB 3.2x2 port to run it at full speed.

Remember Thunderbolt 3&4 and USB4 include compatibility with all previous USB interfaces, so don’t be worried if your computer has a Thunderbolt port but not a USB 3.2x2 port for example.
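
The rule of thumb for matching a drive to an interface can be written as a small lookup. This is a sketch using the theoretical speeds from the cheat sheet and the drive speeds quoted above. The function name and the grouping of the fastest interfaces into one entry are choices made for this example.

```python
# Interfaces ordered from slowest to fastest, with theoretical MB/s limits
INTERFACES = [
    ("USB 3.0", 625),
    ("USB 3.1 Gen 2", 1250),
    ("USB 3.2x2", 2500),
    ("Thunderbolt 3/4 or USB4", 5000),
]


def slowest_sufficient_interface(drive_mb_s):
    """Return the slowest interface whose limit is at or above the drive's speed,
    or None if the drive is faster than everything in the list."""
    for name, limit_mb_s in INTERFACES:
        if drive_mb_s <= limit_mb_s:
            return name
    return None


for drive, speed in [("Samsung T5", 540), ("Samsung T7", 1000), ("Samsung T9", 2000)]:
    print(f"{drive} ({speed}MB/s): {slowest_sufficient_interface(speed)}")
```

Any faster compatible interface will also work. The lookup only reports the minimum needed for the drive to reach its top speed.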


Card Readers

Generally speaking camera media are much faster than hard drives and SSDs. Sometimes card readers are exclusively Thunderbolt meaning you will need a Thunderbolt or USB4 port on your computer. If you don’t yet have your own card readers they are often supplied by the camera department so it’s a good idea to determine their model ahead of time to be sure your computer can accommodate them. Sometimes a reader will support multiple interfaces. Always prioritise giving the card reader one of your fastest ports because they need to have maximum bandwidth for Simultaneous Writing.


Simultaneous Writing

The most efficient way to data wrangle is to offload camera media to multiple backup drives simultaneously whilst maintaining each drive’s maximum write speed.

For example, if a camera is recording to a Sandisk Extreme Pro CF Express Type B card (1700MB/s Read) and you have a Thunderbolt 3 card reader:

Using Samsung T5 SSDs (540MB/s Write) you could theoretically offload to three T5 drives simultaneously and all three drives would be writing at their top speed. (3x 540MB/s = 1620MB/s = less than 1700MB/s). This setup would require a computer with at least one Thunderbolt 3 port and 3x USB 3.0 compatible ports.

Using Samsung T7 SSDs (1000MB/s Write) you won’t be able to write to two simultaneously without a drop in write speed. (2x 1000MB/s = 2000MB/s = more than 1700MB/s). However simultaneous writing would still be faster than offloading to one drive at a time in this case. This setup would require a computer with at least one Thunderbolt 3 port and two USB 3.1 Gen 2 compatible ports.
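
The two examples above can be reproduced with a short calculation. This sketch follows the same model as the examples, where the card's read speed is shared equally between the drives being written to. That is an assumption: your offload software may behave differently. The function names and the 512GB card size are hypothetical.

```python
def per_drive_write_speed(card_read_mb_s, drive_write_mb_s, drive_count):
    """Each drive gets an equal share of the card's read speed,
    capped by the drive's own top write speed."""
    share_mb_s = card_read_mb_s / drive_count
    return min(drive_write_mb_s, share_mb_s)


def offload_seconds(card_size_gb, speed_mb_s):
    """Time to copy a full card at a given speed (1GB = 1000MB)."""
    return card_size_gb * 1000 / speed_mb_s


CARD_READ_MB_S = 1700
CARD_SIZE_GB = 512  # hypothetical card size for illustration

# Three T5 drives: each still writes at its top speed of 540MB/s
print(per_drive_write_speed(CARD_READ_MB_S, 540, 3))

# Two T7 drives: each drops from 1000MB/s to 850MB/s
t7_speed = per_drive_write_speed(CARD_READ_MB_S, 1000, 2)
print(t7_speed)

# Simultaneous offload to two T7s versus one T7 after the other
simultaneous = offload_seconds(CARD_SIZE_GB, t7_speed)
one_at_a_time = 2 * offload_seconds(CARD_SIZE_GB, 1000)
print(f"simultaneous: {simultaneous:.0f}s, one at a time: {one_at_a_time:.0f}s")
```

With these numbers the simultaneous offload takes roughly 602 seconds against 1024 seconds for two separate passes, which is why writing to both drives at once still wins even with the drop in write speed.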

If you don’t have enough ports on your computer for the setup required, a hub can be used to provide additional ports. However this can introduce new issues if not configured correctly.


Hubs

Be very careful to check the rating of all the ports on a hub, as some might only be rated for USB 2.0. Also some Thunderbolt hubs have additional USB-C ports which are actually just USB 3.0 Type C ports rather than Thunderbolt. Unfortunately some less honest brands will advertise ports incorrectly so test everything.

Another big consideration is power delivery. Desktop hard drives are powered from a wall outlet, however many hard drives and SSDs are what is called ‘bus powered’ meaning they are powered by the USB or Thunderbolt port itself. Some hubs use USB-C PD to provide sufficient power to all their ports, however some hubs are just bus powered themselves meaning every drive which is plugged into the hub will be drawing power from a single port on the computer. In many cases a single port is not able to power multiple drives. SSDs require relatively low power compared with hard drives but I wouldn’t recommend attaching more than two to a bus powered hub even if it has more ports. You can use multiple hubs provided they’re plugged into separate ports on the computer. If you have mains power available to you on set, it’s a good idea to use a mains powered hub like something from CalDigit. These will give you a number of new ports to use without power delivery concerns.

Using a hub can also introduce bandwidth limitations. Each port on a computer has its own maximum bandwidth, so adding a hub requires multiple devices to share that bandwidth. For example if you plug two T5 SSDs into a USB 3.0 hub which is plugged into a USB 3.0 port, they will each get half of that port’s bandwidth (625MB/s / 2 = 312.5MB/s), which is much slower than a T5’s write speed (540MB/s).

Be aware that some computers are constructed with internal hubs that you can’t see. So if you aren’t getting the speeds you expect whilst using multiple ports simultaneously this could be the cause. For example many MacBook Pros have four Thunderbolt ports, but they only have two Thunderbolt interfaces inside, so each pair of ports shares a single Thunderbolt interface.


How to check if a setup is working correctly

There is very little risk in using the wrong cable to test a drive. In some rare cases a drive can fail if it’s not getting enough power. This usually indicates a drive which was probably going to fail soon anyway. In any case it’s best practice to test a cable and drive combination before it has any important data on it.

The first test is the mounting test. This is to see if a drive shows up in Disk Utility (macOS) or Disk Management (Windows). If the drive doesn’t show up, it’s either not plugged into a compatible port, or the cable is incompatible or faulty. Sometimes a drive will repeatedly connect and then disconnect, which also indicates a faulty or incompatible cable.

Be aware though of File System Formatting. For example if a drive was previously formatted on a Mac, the volume might not be accessible on a Windows PC. In this case you will either need some specialised software to read it or you’ll need to reformat the drive prior to use. This will erase all the data on it so be sure to check with production before formatting any drives.

Once you have a drive mounted it’s then important to test its speed. There are many programs available but an industry standard is ‘Blackmagic Disk Speed Test’. It’s free and available on both Windows and Mac. It’s not 100% accurate but it will give you a good idea of how a drive is performing. Note both read and write speeds, though write speeds are most important. It’s best to test a drive with no other applications running, and certainly don’t test a drive while you’re transferring data to or from it.
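
If you want a rough sanity check without installing anything, a write test can also be sketched in Python. This is a simplified illustration, not a replacement for a dedicated speed test tool: it only measures sequential writes of one temporary file, the function name and default sizes are made up for this example, and the path you pass in would be wherever your drive is mounted.

```python
import os
import tempfile
import time


def measure_write_mb_s(directory, size_mb=256, chunk_mb=8):
    """Write roughly size_mb of random data to a temporary file in
    directory and return the average write speed in MB/s."""
    chunk = os.urandom(chunk_mb * 1000 * 1000)
    chunk_count = size_mb // chunk_mb
    with tempfile.NamedTemporaryFile(dir=directory, delete=False) as f:
        path = f.name
        try:
            start = time.perf_counter()
            for _ in range(chunk_count):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to flush its cache to the drive
            elapsed = time.perf_counter() - start
        finally:
            f.close()
            os.remove(path)  # always clean up the test file
    return (chunk_count * chunk_mb) / elapsed


# Example: test the system's temporary folder with a small 32MB file.
# To test an external drive, pass the folder where it is mounted instead.
speed = measure_write_mb_s(tempfile.gettempdir(), size_mb=32, chunk_mb=8)
print(f"{speed:.0f} MB/s")
```

A small test file like the one above mostly measures caches, so use a much larger size_mb when testing a real drive.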

Be aware that speed test programs will test one drive at a time, so they won’t tell you if there are potential bottlenecks in your system which only occur during simultaneous writing. The best way to truly test a setup is to do a dummy run using your offload software and look at the numbers. If you’re doing a test from a card reader, you’ll need to make sure the reader also has the correct cable and port. If the card reader is running slowly, then the drives will appear slow even if they have the right cable and port.

Don’t be concerned if a drive is not quite reaching the speed it’s advertised to run at. Those numbers are a best case scenario and rarely reached in real world practice. But it will be evident in the numbers if a drive is severely underperforming. Refer to my previous article for information about the types of speeds you can expect for different types of drives. If you’re getting anything under 40MB/s then your drive or card reader is most likely running at USB 2.0 speeds.
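
The way to read a speed test result can be sketched as a simple check. The 40MB/s threshold comes from the paragraph above. The "less than half the advertised speed" threshold for severe underperformance is an assumption chosen for illustration, not a figure from this article, and the function name is hypothetical.

```python
USB2_THRESHOLD_MB_S = 40  # from the article: below this, suspect USB 2.0
SEVERE_FRACTION = 0.5     # assumed cut-off for "severely underperforming"


def diagnose(measured_mb_s, advertised_mb_s):
    """Give a rough reading of a measured speed against the advertised one."""
    if measured_mb_s < USB2_THRESHOLD_MB_S:
        return "probably USB 2.0: check the cable and port"
    if measured_mb_s < SEVERE_FRACTION * advertised_mb_s:
        return "severely underperforming: try another cable or port"
    return "within a normal range"


print(diagnose(35, 540))   # probably USB 2.0: check the cable and port
print(diagnose(200, 540))  # severely underperforming: try another cable or port
print(diagnose(480, 540))  # within a normal range
```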


Write Caching

I go into depth about write caching in my previous article, but in short you want to have it enabled as it will improve your offload times. It’s already enabled by default in macOS, but will need to be enabled manually for each drive in Windows. It’s easy to find out how to do this with a Google search.


Transcoding

Another common responsibility of data wranglers is rushes transcoding. This can severely slow down your rushes offload speed if you’re transcoding at the same time. Personally I use two separate computers when data wrangling to circumvent this, one for offloading, one for transcoding. If this is not possible for you there are other things you can do to mitigate bottlenecks.

Don’t write your proxy files to the same drive you’re reading them from. If you read files from one drive and write them to another, transcoding will be much faster as it’s sharing the load between two drives. The process of transcoding is typically reading very large video files and then writing much smaller ones, so the demands on read speed will be much higher than write speed. If one of the drives in your setup has faster read speeds, use that drive to read from and a slower drive to write to.

The best option, if you have access to mains power and are not running and gunning, is to have a high speed RAID at your disposal which is fast enough to handle transcoding reads and writes and offloading all at the same time.

In cases when you’ve already offloaded camera media, you can even use the camera media itself as a read source for transcoding since the read speeds are very high. This is sometimes frowned upon but perfectly safe in my opinion if the rushes have already been backed up. Keep in mind transcoding uses a lot of processing power and, depending on how fast your computer is, can slow down offloading simply because your computer is maxed out. Always prioritise offloading over transcoding. If transcoding is getting in the way of offloading, pause the transcoding and tell production you may need more time.


Wrap

I’ve listed my drive recommendations in my previous article, though keep in mind this is an ever changing landscape and recommendations quickly become obsolete. I’ve also decided not to include any specific computer recommendations in this article since that is a very lengthy topic and something best saved for a future article.

Well done if you made it all the way through, I hope this information has been helpful. Feel free to reach out to me here if you have any questions, I visit this site every day. And disclaimer, I am not affiliated with any hardware or software manufacturer and have not been paid by anyone to write this article.