JPEG and repairing JPEG using JPEG-Repair
To a PC a file is nothing but a bunch of binary data. The most accessible form for humans to ‘read’ that data is in HEX code. Below you see a hex dump of the JPEG file we’ll use as an example on this page.
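For readers who want to reproduce such a view without a hex editor, here is a minimal sketch of a hex dump in Python. The function name `hex_dump` and the sample bytes are illustrative assumptions, not part of JPEG-Repair; the sample is the start of a typical JFIF file (SOI marker followed by an APP0 segment).

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes as: offset, hex columns, printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Non-printable bytes are shown as dots, as most hex editors do.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08X}  {hex_part:<{width * 3 - 1}}  {ascii_part}")
    return "\n".join(lines)


# Hypothetical sample: SOI (FF D8) followed by a JFIF APP0 segment.
sample = bytes.fromhex("FFD8FFE000104A46494600010100000100010000")
print(hex_dump(sample))
```

To dump a real file, read it in binary mode first, for example `open(path, "rb").read()`, and pass the result to `hex_dump`.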
When I started my attempts to repair corrupt JPEG files it was this hex editor I used. I modified, added or removed bytes. This is called ‘patching’: directly altering bytes within a binary file. I then used a photo viewer to see the effects.
Essentially JPEG-Repair uses the exact same method I used back then. The difference is that it hides the HEX code and instead provides you with visual feedback when adding / removing bytes to / from the JPEG data stream. So you do not have to edit or remove bytes using a hex editor, instead you tell JPEG-Repair to remove bytes from a certain offset in the JPEG file.
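The patch operations themselves are simple. The sketch below shows removing, inserting and overwriting bytes at an offset on an in-memory copy of a file. The function names are hypothetical and this is not JPEG-Repair's code, only an illustration of what ‘patching’ means.

```python
def remove_bytes(data: bytes, offset: int, count: int) -> bytes:
    """Drop `count` bytes starting at `offset`; everything after shifts left."""
    return data[:offset] + data[offset + count:]


def insert_bytes(data: bytes, offset: int, new: bytes) -> bytes:
    """Insert `new` at `offset`; everything after shifts right."""
    return data[:offset] + new + data[offset:]


def overwrite_bytes(data: bytes, offset: int, new: bytes) -> bytes:
    """Replace bytes in place; the total length stays the same."""
    if offset < 0 or offset + len(new) > len(data):
        raise ValueError("patch does not fit inside the data")
    return data[:offset] + new + data[offset + len(new):]


original = bytes.fromhex("00112233445566")
print(remove_bytes(original, 2, 3).hex())             # bytes 22 33 44 removed
print(insert_bytes(original, 2, b"\xAA\xBB").hex())   # AA BB inserted at offset 2
print(overwrite_bytes(original, 2, b"\xAA\xBB").hex())  # 22 33 replaced by AA BB
```

Working on a copy in memory and writing the result to a new file keeps the damaged original untouched, which matters when a patch makes things worse.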
JPEG-Repair is not a photo editor!
So, JPEG-Repair offers tools to alter the hex code inside a JPEG file. This is fundamentally different from what a normal photo editor does; JPEG-Repair is more like a hex editor than a photo editor. A normal editor decodes the JPEG file, lets you edit it (apply filters, crop, flip etc.) and then encodes it again. Once a photo editor has decoded a JPEG it can forget about the raw data: from that point on it only deals with the pixels the raw data was decoded into. When you are done editing, it encodes all that pixel data in JPEG format again. In effect you end up with a completely new file with completely different hex code. In a photo editor you edit the image, and saving it generates new hex code. In JPEG-Repair you edit the hex code, which then results in a new image.
The image JPEG-Repair shows is rendered by the standard component of Microsoft Visual Studio. It is, however, ‘fed’ the modified data so that we can evaluate the effects of these modifications.
JPEG-Repair allows you to patch intact headers onto files with corrupt headers. You can also edit the raw data that makes up the actual image. A JPEG file is made up of several sections. Each section is preceded by a JPEG Marker which defines what the following section is all about. Most markers are followed by the size of their section, so that the next marker can be found. If any of those markers is corrupt, the file is corrupt. A number of (tiny) sections contain data that the decoder (which is built into your photo viewer or editor, but also into web browsers) needs to decode the image data. The largest section is the actual encoded and compressed image data. If this ‘raw’ data that makes up the actual image (JPEG encoded data) becomes corrupt, your file may only partially ‘render’: part of the picture shows, while the rest is distorted or absent (grey block, colored block, striped block).
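The marker-plus-length layout can be walked with a few lines of code. The following is a minimal sketch, assuming a well-formed file without fill bytes between sections. The name `list_segments` and the sample bytes are made up for illustration. It stops at the Start of Scan (SOS) marker, because the compressed image data that follows carries no length field.

```python
import struct


def list_segments(data: bytes):
    """Return (offset, marker, length) tuples up to and including SOS."""
    if data[:2] != b"\xFF\xD8":
        raise ValueError("missing SOI marker")
    segments = [(0, 0xD8, 0)]  # SOI is a bare marker, it has no length field
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((pos, marker, length))
        if marker == 0xDA:  # SOS: entropy-coded data follows its header
            break
        pos += 2 + length
    return segments


# Hypothetical miniature file: SOI, APP0, SOS header, 2 data bytes, EOI.
sample = (
    b"\xFF\xD8"
    + bytes.fromhex("FFE000104A46494600010100000100010000")
    + bytes.fromhex("FFDA0008010100003F00")
    + b"\x12\x34"
    + b"\xFF\xD9"
)
for offset, marker, length in list_segments(sample):
    print(f"offset {offset:6d}  marker FF{marker:02X}  length {length}")
```

If a length field or marker byte is damaged, the walk lands in the middle of unrelated data and raises the `ValueError`, which illustrates why a single corrupt marker makes the whole file unreadable.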
JPEG-Repair allows you to repair the section preceding the raw image data (sometimes called the header) using an intact JPEG file. It can also repair some errors in the raw JPEG data and allows you to patch (alter) this raw data. The editor allows you to remove, change and add data to the JPEG bitstream.
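In its simplest form, borrowing a header from an intact file could look like the sketch below. This is not how JPEG-Repair necessarily does it. It rests on several assumptions: both files are baseline JPEGs with a single scan, the donor was produced with the same camera and settings (same resolution and tables), and the SOS marker in the damaged file is still present and is the last `FF DA` in the file. The function names are hypothetical.

```python
import struct

SOS = b"\xFF\xDA"


def scan_start(data: bytes) -> int:
    """Offset of the first entropy-coded byte, located via the last SOS marker."""
    pos = data.rfind(SOS)
    if pos < 0 or pos + 4 > len(data):
        raise ValueError("no SOS marker found")
    (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
    return pos + 2 + length  # skip the marker and the SOS header


def transplant_header(donor: bytes, corrupt: bytes) -> bytes:
    """Join the donor's header to the damaged file's image data."""
    return donor[:scan_start(donor)] + corrupt[scan_start(corrupt):]


# Hypothetical miniature files for demonstration only.
sos_header = bytes.fromhex("FFDA0008010100003F00")
donor = (
    b"\xFF\xD8"
    + bytes.fromhex("FFE000104A46494600010100000100010000")
    + sos_header
    + b"\x11\x22\xFF\xD9"
)
corrupt = b"\x00" * 20 + sos_header + b"\x33\x44\xFF\xD9"  # header zeroed out
repaired = transplant_header(donor, corrupt)
print(repaired.hex())
```

Searching from the end with `rfind` is a deliberate choice: an embedded thumbnail inside the header has its own SOS marker, and it appears before the one belonging to the main image.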
The actual encoded data is organized in blocks called minimum coded units (MCU). These blocks are typically 8×8, 16×8 or 16×16 pixels in size. Unless the image was coded with Restart Markers there are no points of reference within the image bitstream. So, there is no table, index or recognizable structure that indicates the start and size of individual MCUs.
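To get a feeling for how many MCUs a photo contains, the sketch below computes the MCU grid for an interleaved scan from the image size and the largest horizontal and vertical sampling factors. The function name and the 4000×3000 example size are assumptions chosen for illustration.

```python
import math


def mcu_grid(width: int, height: int, h_max: int, v_max: int):
    """Return (mcu_width, mcu_height, mcus_per_row, mcu_rows)."""
    mcu_w, mcu_h = 8 * h_max, 8 * v_max
    # Partial MCUs at the right and bottom edges are still coded in full.
    return mcu_w, mcu_h, math.ceil(width / mcu_w), math.ceil(height / mcu_h)


# Sampling factors 1x1, 2x1 and 2x2 give 8x8, 16x8 and 16x16 MCUs.
for h_max, v_max in [(1, 1), (2, 1), (2, 2)]:
    mcu_w, mcu_h, per_row, rows = mcu_grid(4000, 3000, h_max, v_max)
    print(f"{mcu_w}x{mcu_h} MCU: {per_row} x {rows} = {per_row * rows} MCUs")
```

Even with 16×16 MCUs such a photo holds tens of thousands of units, none of which has a known start offset in the file.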
A decoder simply keeps decoding until a JPEG Marker is encountered (either a Restart Marker or the EOI Marker). The size of a single MCU depends on its content and its compressibility. One MCU may occupy x bytes, the next y bytes or n bytes. It is therefore quite complex to link any specific byte offset within a JPEG to the actual image. To mitigate this, JPEG-Repair ‘pseudo-decodes’ the image bitstream. By doing so it can map each MCU to specific bytes within the stream. It also allows JPEG-Repair to detect certain errors that occur during decoding. The decoded data, however, is at no point used to render an image, which is why I refer to it as the pseudo-decoder.
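The marker-hunting half of this can be shown in a short sketch. The code below walks the compressed data byte by byte, skips stuffed `FF 00` pairs (a literal FF in the data), records Restart Markers, and stops at the first other marker. It does not Huffman-decode anything, so it cannot locate individual MCUs; that is the harder part a pseudo-decoder has to solve. The function name and sample bytes are hypothetical.

```python
def scan_markers(data: bytes, start: int = 0):
    """Return ([(offset, rst_number), ...], offset_of_terminating_marker)."""
    restarts = []
    pos = start
    while pos + 1 < len(data):
        if data[pos] != 0xFF:
            pos += 1
            continue
        nxt = data[pos + 1]
        if nxt == 0x00:              # stuffed byte: FF is data, not a marker
            pos += 2
        elif nxt == 0xFF:            # fill byte before a marker, keep looking
            pos += 1
        elif 0xD0 <= nxt <= 0xD7:    # RST0..RST7
            restarts.append((pos, nxt - 0xD0))
            pos += 2
        else:                        # EOI or any other marker ends the scan
            return restarts, pos
    return restarts, None            # data ran out without a closing marker


# Hypothetical stream: data, stuffed FF, RST0, data, RST1, data, EOI.
stream = bytes.fromhex("12FF0034FFD056FFD178FFD9")
print(scan_markers(stream))
```

A result of `None` for the terminating marker means the data ended without an EOI, which is one of the simple signs of a truncated file.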
JPEG Repair Software
Existing JPEG file repair software mainly addresses damage and corruption in the header section (see image above). The header is a vital part, because without it the actual image data can’t be decoded. If we look at the size of the different sections, however, we see that this header makes up only a tiny fraction of the total data in the file. The chance that corruption is limited to this tiny area is small. If damage extends beyond the header, most automatic repairs will fail.