Photo recovery challenges are often related to small flash based drives
Most cases that land on my desk arrive as photo or video repair tasks. As I have stated before, many of those are the result of botched recovery jobs, or of corruption caused by a file system or physical issue with the drive. Also, in a majority of cases the photos are stored on small flash based devices such as memory cards or USB flash drives, or it can be determined that the photos were probably corrupted while they were still on the flash based drive. For example, I get many corrupt photos that a client copied from a flash drive in already corrupted form, after which the flash drive was returned to service. If this happens, the corrupt photos are really the only thing we can work with.
So the emphasis of this blog post is on those devices and the problems associated with them in relation to lost and corrupt photos and video. I will try to avoid jargon as much as possible, or explain it in layman's terms.
Logical Recovery Challenges
The most commonly used file systems on small NAND flash based devices are all a flavor of FAT. FAT stands for File Allocation Table and whether it’s FAT16, FAT32 or exFAT (or FAT64), this table is the only structure that keeps track of which clusters are allocated to any specific file. In short, a file starts with a directory entry which stores the filename, other metadata (creation date, attributes etc.) and the first cluster of the file.
The rest of the clusters need to be looked up in the FAT. Typically, in the entry for cluster 100 we find the value for the next cluster, there we find the value for the next, and so on, until a special value signifies the end of the cluster chain. So, ‘101 – 102 – 103 – 117 – 118 – 119 – 144 – END’ could be a chain in the FAT for a specific file. From the chain it becomes obvious that our file is not stored in a contiguous run of clusters. And then we immediately see the biggest challenge in recovering photos and videos from a flash drive: as soon as we lose the information in the FAT, fragmented files are almost impossible to recover.
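To make the chain lookup concrete, here is a minimal Python sketch. It assumes the FAT is modelled as a plain dictionary and that the file's directory entry points at cluster 101; the `END` value and the function names are made up for illustration, as the real end-of-chain marker differs per FAT variant.

```python
END = -1  # stand-in for the end-of-chain marker (hypothetical value)

def follow_chain(fat, first_cluster):
    """Walk the table from the first cluster until the end marker."""
    chain = []
    seen = set()
    cluster = first_cluster
    while cluster != END:
        # A loop or a missing entry means the table itself is damaged.
        if cluster in seen or cluster not in fat:
            raise ValueError(f"broken chain at cluster {cluster}")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]
    return chain

def count_fragments(chain):
    """Count runs of consecutive clusters in a chain."""
    if not chain:
        return 0
    return 1 + sum(1 for a, b in zip(chain, chain[1:]) if b != a + 1)

# The example chain from the text, written as table entries.
fat = {101: 102, 102: 103, 103: 117, 117: 118, 118: 119, 119: 144, 144: END}

chain = follow_chain(fat, 101)
fragments = count_fragments(chain)
print(chain)      # the clusters in file order
print(fragments)  # 3 separate runs: 101-103, 117-119 and 144
```

Without the `fat` dictionary all that is left is the first cluster, and nothing tells us that the file continues at 117 and then at 144.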
File fragmentation on small NAND based devices
The following is especially true for memory cards as used in cameras. Many people review photos on the on-screen display and decide whether to keep them or not. Also, people tend to switch between still mode and video mode. All that combined ‘encourages’ file fragmentation. Following are stats from a real world recovery:
We see that of a total of 306 photos (15 MP each), 67 were stored in 2 or more fragments! Videos tend to be even more fragmented. Virtually no end user file recovery software will be able to recover those files if the FAT is corrupt or wiped (due to a format for example). It is even fair to say that many data recovery labs will have trouble recovering those, or will consider it too much trouble!
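Why fragmentation trips up recovery without a FAT can be shown with a toy version of a raw scan. The sketch below is not how any particular product works; it simply assumes a scanner that looks for the JPEG start marker and takes everything up to the next end marker. The function name and the sample data are invented for the example.

```python
SOI = b"\xff\xd8\xff"  # JPEG start-of-image marker plus the first byte of the next marker
EOI = b"\xff\xd9"      # JPEG end-of-image marker

def carve_first_jpeg(image):
    """Return the bytes from the first start marker up to the next end marker."""
    start = image.find(SOI)
    if start < 0:
        return None
    end = image.find(EOI, start + len(SOI))
    if end < 0:
        return None
    return image[start:end + len(EOI)]

# A pretend photo and some data belonging to another file.
first_half = SOI + b"first half of photo "
second_half = b"second half of photo" + EOI
photo = first_half + second_half
other = b"data from another file"

contiguous = b"\x00" * 4 + photo + other          # photo stored in one piece
fragmented = first_half + other + second_half     # photo stored in two fragments

print(carve_first_jpeg(contiguous) == photo)  # True: the photo comes back intact
print(carve_first_jpeg(fragmented) == photo)  # False: foreign data sits in the middle
```

The second result is a file that starts and ends like a JPEG but contains someone else's clusters halfway through, which is exactly the kind of corrupt photo described above. Real JPEGs are messier still, since embedded thumbnails carry their own end markers.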
Challenges related to physical properties of NAND memory
As with conventional drives, physical issues and damage can prevent you from accessing your data. You may notice this by the drive not being detected by the operating system, the operating system ‘hanging’ or freezing as soon as the drive is connected, the drive being detected but with an incorrect capacity, etc. These are all things that can happen with a malfunctioning conventional drive too. There are however important differences that make data recovery from a flash based drive a more complex matter than conventional hard drive recovery.
Physical issues specific to NAND Flash Memory
Fake or counterfeit memory cards
This is a very common issue which I need to mention, although recovery from it is impossible. The key point is that the device (mostly a memory card) has an advertised capacity that’s bigger than its actual memory capacity. Manufacturers of memory cards program the controller specifically for the NAND memory on the card. One of the parameters that’s programmed is the capacity. Because every bit of NAND is being used, even partially failing chips, the manufacturer may for example limit the capacity of a NAND chip and sell a 64 GB chip as 32 GB.
Using the same tool however, a criminal can make the controller believe it’s dealing with 64 GB of NAND while the true capacity is only 16 GB, and then sell the card with an advertised capacity of 64 GB. The metadata of FAT file systems is at the start of the drive. As long as the card reports no errors when saving data, even to addresses beyond 16 GB, the FAT can be updated and a camera believes it is actually saving photos even when they’re written to non-existent memory.
The photographer is in for a big surprise when he reviews the photos later on a computer. All files written to non-existent memory seem to exist as all metadata was correctly written, however many of his photos are corrupt. In fact, they’re empty, and even now the counterfeit controller does its job, as we can actually treat these phantom photos like any other file! We can copy, move and delete them, but if we’d look at their actual content we’d see they’re empty.
Such photos (or videos) can neither be repaired nor recovered.
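The only way to catch such a card is to write something to every address and read it back. Below is a small simulation of that idea in Python. The `SimulatedCard` class is entirely made up and assumes the behavior described above (writes beyond the real memory succeed silently, reads from there return empty data); real counterfeit cards may behave differently, and on a real card such a test overwrites everything on it.

```python
BLOCK = 512  # bytes per block in this simulation

class SimulatedCard:
    """Toy model of a card whose controller lies about its capacity."""

    def __init__(self, advertised_blocks, real_blocks):
        self.advertised_blocks = advertised_blocks
        self.real_blocks = real_blocks
        self._store = {}

    def write_block(self, n, data):
        if not 0 <= n < self.advertised_blocks:
            raise IndexError("address beyond advertised capacity")
        if n < self.real_blocks:
            self._store[n] = data
        # Writes beyond the real memory report success but store nothing.

    def read_block(self, n):
        if not 0 <= n < self.advertised_blocks:
            raise IndexError("address beyond advertised capacity")
        return self._store.get(n, b"\x00" * BLOCK)

def pattern(n):
    """A block-sized pattern that is unique per block and never all zeros."""
    return (n + 1).to_bytes(8, "little") * (BLOCK // 8)

def verified_blocks(card):
    """Write a unique pattern to every block, then count blocks that read back intact."""
    for n in range(card.advertised_blocks):
        card.write_block(n, pattern(n))
    return sum(1 for n in range(card.advertised_blocks)
               if card.read_block(n) == pattern(n))

card = SimulatedCard(advertised_blocks=64, real_blocks=16)
good = verified_blocks(card)
print(f"{good} of {card.advertised_blocks} blocks hold data")
```

Note that no write ever fails here, which is why the camera never notices anything wrong.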
NAND loses data much like a battery loses charge. Another analogy with batteries is that the more it is used (charged and discharged), the more it loses the ability to hold a charge. To a degree a flash drive can cope with that using error correction (ECC) and by adjusting the threshold values used to decide if the value of a specific cell is zero or one. If the damage is too large to be compensated for, then depending on the location this can lead to corruption in individual files or in file system metadata.
A lab may be able to play with the threshold values I already mentioned, but then we’re already at a level where special equipment must be used to bypass the drive’s controller. We are already talking about a very complex recovery then, more about that later…
Another option is to take the file in its current form and try repairing it. It is one of the things I frequently do for my photo repair service. The photo on the index page is a typical example of damage due to ‘bit errors’.
Internal data structures
On a conventional drive there is a direct relation between a sector address and the location on the drive. On flash based memory this is not the case. Due to the nature of flash memory it is desired that all memory is used with equal intensity to avoid the battery effect I mentioned earlier. If data is written constantly to the FAT area, which needs updating every time files are created, changed and deleted, associated cells will quickly wear out.
To counter this effect the controller is constantly wear-leveling by moving data to different ‘pages’. The ‘host’ is unaware of this: it can simply request the contents of, for example, LBA address 44, and the flash drive controller will look up the page currently assigned to that LBA address in an address table it keeps on the flash memory itself and deliver the data to the host.
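The lookup table idea can be sketched in a few lines of Python. This is a heavily simplified toy, not how any real controller works: the class name, the "pick the least used free page" rule and the per-page counters are all assumptions made for illustration, and real NAND also has to erase whole blocks before pages can be reused.

```python
class TinyFTL:
    """Toy translation layer: the host sees LBAs, the data moves between pages."""

    def __init__(self, page_count):
        self.pages = [None] * page_count
        self.write_counts = [0] * page_count  # how often each physical page was written
        self.lba_to_page = {}                 # the address table the controller maintains

    def write(self, lba, data):
        in_use = set(self.lba_to_page.values())
        free = [p for p in range(len(self.pages)) if p not in in_use]
        if not free:
            raise RuntimeError("no free pages left")
        # Wear leveling: always take the least used free page.
        target = min(free, key=lambda p: self.write_counts[p])
        self.pages[target] = data
        self.write_counts[target] += 1
        self.lba_to_page[lba] = target  # the page previously used for this LBA is now free

    def read(self, lba):
        return self.pages[self.lba_to_page[lba]]

ftl = TinyFTL(page_count=4)
pages_used = []
for version in range(8):
    # The host keeps rewriting the same address, as happens with the FAT area.
    ftl.write(44, f"FAT version {version}".encode())
    pages_used.append(ftl.lba_to_page[44])

print(ftl.read(44))      # the host always gets the latest data for LBA 44
print(pages_used)        # but physically the data keeps moving around
print(ftl.write_counts)  # and the wear is spread over all pages
```

From the host's point of view nothing changes, yet LBA 44 has lived on every physical page. If `lba_to_page` is lost, the pages still hold data, including stale copies, but nothing says which page belongs to which address.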
Problems arise when the controller fails or if this lookup table becomes corrupt. The ‘layout’ the controller keeps track of is totally non standard and even varies between controller revisions. On top of this, data is scrambled before being written (again to counter wear), and the algorithm used to scramble isn’t standard either. A single chip often consists of two parts, and one flash drive may contain multiple chips. Data is often distributed between these parts very much in a RAID like manner. So ultimately one file can end up being fragmented on the file system level, the flash translation table level, and across different chips.
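To give an idea of what scrambling and RAID like distribution do to the data, here is a small Python sketch. Both techniques are stand-ins chosen for simplicity: a repeating XOR key for the scrambler and plain round-robin striping over the chips. The key, the page contents and the function names are made up, and real controllers use their own, undocumented schemes.

```python
from itertools import cycle

def xor_scramble(data, key):
    """XOR the data with a repeating key; applying it twice restores the original."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

def stripe(pages, chip_count):
    """Distribute pages over the chips round-robin, like a simple RAID 0."""
    chips = [[] for _ in range(chip_count)]
    for i, page in enumerate(pages):
        chips[i % chip_count].append(page)
    return chips

def unstripe(chips):
    """Put round-robin striped pages back in their original order."""
    total = sum(len(chip) for chip in chips)
    n = len(chips)
    return [chips[i % n][i // n] for i in range(total)]

key = b"\x5a\xa5\x3c"  # hypothetical scrambler key
pages = [f"page {i} of a photo".encode() for i in range(5)]

# What the controller does on the way in.
chips = stripe([xor_scramble(p, key) for p in pages], chip_count=2)

# What a recovery tech has to redo by hand when the controller is dead.
restored = [xor_scramble(p, key) for p in unstripe(chips)]

print(chips[0][0])        # unreadable on the chip itself
print(restored == pages)  # True, but only because key and stripe order are known
```

The round trip only works because the key and the striping order are known here. With a failed controller both have to be figured out from the raw data.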
Small flash drives basically come in two flavors today: a PCB with components such as the controller, NAND chips, oscillator, fuses etc., or a monolith. A monolith is one chip, one package with all components integrated. The PCB form allows for some degree of repair, which is highly desirable given the complexity of the data structures on the NAND chip(s).
Under a microscope it may be possible to detect fractures in the traces on the board. Solder joints may have fractured or broken, fuses may have blown, etc. Some of these issues can be repaired using micro soldering techniques. The main challenge in this is that flash drives and their components are so tiny. It’s not advised to try it yourself just because you happen to have a soldering gun.
Swapping components can be done and is very common in conventional hard drive recovery, but the trouble will be finding a donor. Although a USB flash drive may be sold as ‘SuperFlashDrive 64’ for a few years, it is unlikely that internally these drives are identical. Controllers will vary, firmware revisions will vary, NAND chips will vary, etc. So, while possible in theory, it will often be impossible to locate a donor.
On the so called monoliths repair is not possible at all.
Data Recovery from small flash drives
Logical data recovery
As discussed, the main challenge will be recovery of fragmented files (in case the FAT can not be relied on). End user file recovery software or even professional logical software will often fail at this task. It is the area DiskTuna specializes in. The only solution is ‘brute force’. This brute force can be applied by means of human labor, by patiently puzzling files together, or by running algorithms. These algorithms usually require lots of computing power and time. A human is often more precise, but also slower, and automation is cheaper than human brain power.
Partially failed, unstable drives
Just like with conventional drives it is sometimes possible to read the data from a flash drive using specialized tools. Using such (hardware based) tools it is sometimes possible to get a drive to ID again. Drives that cause the OS to freeze can be isolated from the OS, as the data recovery hardware handles the instability. Sectors that take an unreasonable amount of time to read can be skipped after a set number of microseconds, where software would waste seconds.
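The ‘skip and move on’ principle can be illustrated in software, even though software can not match the timing control of hardware tools. The Python sketch below only shows the idea: read block by block, and when a block fails, note it, leave zeros in the image and continue. The `FlakySource` class simulates an unstable drive and, like the function name, is invented for this example.

```python
import io

def image_with_skips(source, size, block_size=4096):
    """Copy `size` bytes from `source`, skipping blocks that fail to read."""
    image = bytearray(size)  # unread areas stay zero-filled
    bad_ranges = []          # (offset, length) of everything that was skipped
    for offset in range(0, size, block_size):
        length = min(block_size, size - offset)
        try:
            source.seek(offset)
            data = source.read(length)
        except OSError:
            bad_ranges.append((offset, length))
            continue
        image[offset:offset + len(data)] = data
        if len(data) < length:
            # A short read: record the missing tail as unread too.
            bad_ranges.append((offset + len(data), length - len(data)))
    return bytes(image), bad_ranges

class FlakySource(io.BytesIO):
    """Simulated drive that raises an error when reading at certain offsets."""

    def __init__(self, data, bad_offsets):
        super().__init__(data)
        self.bad_offsets = set(bad_offsets)

    def read(self, size=-1):
        if self.tell() in self.bad_offsets:
            raise OSError("simulated read error")
        return super().read(size)

data = bytes(range(256)) * 64  # 16384 bytes of sample data
image, bad = image_with_skips(FlakySource(data, bad_offsets=[4096]), len(data))
print(bad)  # the one block that could not be read
```

Keeping a list of skipped ranges matters: it tells you afterwards which photos in the image can be trusted and which contain holes.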
These are typically expensive, professional tools, however sometimes with some ingenuity cheaper though less effective options are possible. My own tool JpegDigger can be used with a cheap USB power hub allowing you to image drives that frequently ‘disconnect’.
As already addressed, repair of failed components can be attempted. As you may notice, we make every effort to avoid the final solution in the arsenal: dumping the NAND memory… One thing I’d like to add is that, compared to working with conventional hard drives, working with flash drives requires no so called clean room. Physical repairs on flash drives require precision instruments but can be done in a normal environment.
Dumping NAND memory
Again the form factor comes into play here. In general, NAND memory used on PCBs is a TSOP or some BGA type chip or chips. These often have a standard and known pin arrangement, which means special readers are available that match those. So all the data recovery tech has to do is de-solder the chips from the board and put them in a reader.
Monolith chips do not offer these standard pin layouts, or pinouts as techs call them. Even worse, these pins often aren’t even visible! So for starters the ‘casing’ of the chip needs to be removed. This can be done by hand using sand paper or a fiber pencil, but lasers are also used to remove material from the chip layer by layer. Once the pins are revealed the pinout needs to be determined: What pin does what? What pin can be used to send a command to, and what pin do we read the data from? Tiny wires are then soldered between these pins and an adapter board to connect the chip to the reader.
Dumping the NAND can be a slow and finicky process. To get a good dump the data recovery tech may have to experiment with power settings and so called ‘read-retries’. These read retries aren’t simply re-reading the NAND, but experimenting with values in command registers. My impression is that the more modern the NAND chip, the more problematic it is to read/dump it.
Once the NAND is dumped the biggest challenge awaits. The raw dump needs to be transformed into a ‘logical image’. As discussed data can be anywhere on the NAND, spread over different chips and in a scrambled form. Also, everything is dumped, including bad blocks and system areas which need to be stripped. Since we can not rely on the controller that normally takes care of this, the algorithms used by the controller need to be reverse engineered on the spot.
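One small part of that transformation, stripping system areas and bad blocks, can be sketched in Python. The layout used here is a toy assumption: tiny pages with a data part followed by a ‘spare’ part, and a block counting as bad when the first spare byte of its first page is not 0xFF. Actual page sizes, spare layouts and bad block markers differ per chip and controller.

```python
DATA_SIZE = 16        # user data bytes per page (toy value, real pages are far larger)
SPARE_SIZE = 4        # spare bytes per page holding ECC and controller bookkeeping
PAGES_PER_BLOCK = 2   # toy value

def strip_dump(raw):
    """Keep only the data part of every page, skipping blocks marked as bad."""
    page_size = DATA_SIZE + SPARE_SIZE
    block_size = page_size * PAGES_PER_BLOCK
    out = bytearray()
    for b in range(0, len(raw), block_size):
        block = raw[b:b + block_size]
        first_spare = block[DATA_SIZE:DATA_SIZE + SPARE_SIZE]
        if first_spare[:1] != b"\xff":
            continue  # assumed bad block marker: drop the whole block
        for p in range(0, len(block), page_size):
            out += block[p:p + DATA_SIZE]
    return bytes(out)

def make_page(fill, good=True):
    """Build one raw page for the demo: data followed by its spare area."""
    marker = b"\xff" if good else b"\x00"
    return fill * DATA_SIZE + marker + b"\xff" * (SPARE_SIZE - 1)

raw_dump = (
    make_page(b"A") + make_page(b"B")                 # good block
    + make_page(b"?", good=False) + make_page(b"?")   # block marked bad
    + make_page(b"C") + make_page(b"D")               # good block
)

clean = strip_dump(raw_dump)
print(clean)
```

Even after this step the pages are still in physical order, so the descrambling and reordering described above are still needed before anything resembling a file system appears.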
Fortunately, manufacturers of the NAND recovery tools keep databases of known chips, however for each chip/controller cracked, two new algorithms are introduced on the market. Some manufacturers concentrate on supporting as many algorithms as possible, others take a different approach and try making reverse engineering the algorithm ‘easy’.
Ideally a complete logical image or file system is reconstructed. In this case files can be recovered with original names and folder structure, and also the FAT can be used to reconstruct fragmented files. This is however often not possible and raw recovery is the best option which brings us back to the problem of file fragmentation we discussed earlier: without the file system it is impossible to connect specific clusters (at file system level) to specific files.
In conclusion
In my line of work, photo/video recovery and flash based drives are often interconnected. The corruption in photos I deal with is often directly related to the file system associated with small NAND based drives (USB flash drives, memory cards) being FAT, or to the physical medium itself. A single ‘bit flip’ due to degrading NAND can result in either file system corruption or damage to individual files.
Logical file system corruption or the inability to reconstruct a perfect file system from a NAND dump can result in corrupt photos due to file fragmentation.
For many end users the willingness to pay for such recoveries is small. How can data recovery from such a cheap device be this expensive? The complexity of these recoveries however exceeds that of complex RAID recoveries. Much of the work associated with these types of recoveries is labor intensive, which determines much of the cost involved. The hardware and software needed often require updates, which need to be purchased, to keep up with the rapid changes in the world of NAND. Rather than weighing the cost of the flash media, one should evaluate the cost of the data. What is the lost data worth paying for?
With regards to DIY data recovery you’re pretty much limited to software. When it comes to free software, people often end up using software like PhotoRec or Recuva. For Recuva a largely intact file system is a requirement. To work around this, an often heard piece of advice is to format the corrupt drive. Formatting a drive to work around the limitations of a recovery tool is stupid. PhotoRec, as a raw file scanner, runs into the issues described before: it can’t cope with file fragmentation.
DIY projects dealing with the hardware side of things often hit a dead end or turn into endeavors spanning months or even years of tinkering. DIY soldering jobs more often than not turn PCBs into total disaster areas.