Pixel Loss from Saving Images in JPEG format

Michael D. Sullivan

The JPEG file interchange format[1] is frequently used both as a digital camera output medium and as a final form for saving images. The JPEG format is a "lossy" format, meaning that it does not save precise pixel values, but instead saves data that allows reconstruction of a close approximation of the original.[2] It's a very good approximation for many types of images, having been developed to compress and reconstruct photographic images. JPEG compression algorithms provide for variable compression levels. In practice, this means that devices and applications creating JPEG files can give the user the ability to select greater image quality and larger file size or lower quality and smaller file size.

Because JPEG is a lossy format, it does not permit the exact reproduction of the original image even when the highest quality setting is used. This means that "pixels are lost" every time an image is saved in JPEG format; more precisely, the original RGB values of pixels are discarded and not fully recovered. Every pixel is an approximation of the original. At any given compression level, the change in pixel values from one generation to the next is greatest in areas of high contrast, particularly at sharp edges and lines. The approximations are least noticeable in areas of smooth color or luminosity change.
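To make the source of the loss concrete, here is a minimal Python sketch of the core lossy step in JPEG: transform a run of pixel values with a discrete cosine transform, round the resulting coefficients to multiples of a step size, and transform back. It is a deliberate simplification, not a JPEG encoder. It works on a single row of 8 samples rather than 8x8 blocks, uses one uniform step instead of a per-frequency quantization table, and skips color conversion and entropy coding. The function names and the step values are illustrative assumptions.

```python
import math

N = 8  # JPEG works on 8x8 blocks; this sketch uses one 8-sample row


def dct(samples):
    """Orthonormal 1-D DCT-II of N samples."""
    out = []
    for u in range(N):
        scale = math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)
        total = sum(s * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                    for x, s in enumerate(samples))
        out.append(scale * total)
    return out


def idct(coeffs):
    """Inverse of dct(): rebuild N samples from N coefficients."""
    out = []
    for x in range(N):
        total = 0.0
        for u, c in enumerate(coeffs):
            scale = math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)
            total += scale * c * math.cos((2 * x + 1) * u * math.pi / (2 * N))
        out.append(total)
    return out


def round_trip(samples, step):
    """Round DCT coefficients to multiples of `step`, then rebuild 0-255 values."""
    quantized = [round(c / step) for c in dct(samples)]  # the lossy step
    rebuilt = idct([q * step for q in quantized])
    return [min(255, max(0, round(v))) for v in rebuilt]


edge = [0, 0, 0, 0, 255, 255, 255, 255]  # a sharp edge
print(round_trip(edge, 16))     # coarse step: close to, but not equal to, the input
print(round_trip(edge, 0.001))  # very fine step: the input values come back
```

A larger step plays the role of a lower quality setting: more information is rounded away, and the rebuilt values drift further from the originals.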

Because of this aspect of JPEG, it is not a file format that is well suited for incremental saves of images being edited.  Every time a JPEG file is saved and then reopened, it becomes a slightly less accurate reproduction of what has gone before.  

One question that is frequently asked is how significant the loss of pixel accuracy is, particularly with repeated saves.  I conducted an experiment to analyze this.

Original Image
I created the above "original" image from a photograph of a marine iguana taken in the Galapagos, resized to 600x900 pixels; it's available here in PNG-24 format (which isn't lossy).  I then saved JPEG copies of this image from Photoshop CS2's Save for Web at 100 (maximum), 80 (very high), and 60 (high) quality levels.[3]

Next, I opened the JPEGs in Photoshop as new layers atop the original image. Then, for each JPEG, I created a new layer depicting the difference between the original and the JPEG. I did this by turning off the visibility of all other layers, setting the JPEG's blending mode to "difference", using a threshold adjustment layer to turn all pixels that had changed by at least a specified threshold amount[4] solid white, and then saving the result as a new layer. The thresholds used were 1, 2, 5, and 10 (where appropriate).
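The difference-and-threshold procedure described above can be approximated outside Photoshop. The following Python sketch is not the method actually used for this experiment. It assumes each image is already loaded as a flat list of (R, G, B) triples, and it assumes the Rec. 601 luminosity weights, since Photoshop does not document the formula it applies. The function names and the four-pixel sample data are hypothetical.

```python
def luminosity(rgb):
    """Weighted brightness of an (R, G, B) triple; Rec. 601 weights are an assumption."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def difference(original, copy):
    """Per-pixel absolute channel difference, like the "difference" blending mode."""
    return [tuple(abs(a - b) for a, b in zip(p, q)) for p, q in zip(original, copy)]


def percent_changed(original, copy, threshold):
    """Percentage of pixels whose difference luminosity reaches the threshold."""
    diff = difference(original, copy)
    white = sum(1 for d in diff if luminosity(d) >= threshold)
    return 100.0 * white / len(diff)


# Hypothetical four-pixel "images" as flat lists of RGB triples.
original = [(10, 20, 30), (200, 200, 200), (0, 0, 0), (255, 128, 64)]
jpeg_copy = [(10, 20, 30), (202, 201, 203), (0, 0, 0), (250, 120, 70)]

for t in (1, 2, 5, 10):
    print(t, percent_changed(original, jpeg_copy, t))
```

With real files, the two lists would come from decoding the PNG original and the JPEG copy with an imaging library; that step is left out here to keep the sketch self-contained.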

The results are fairly astounding.  When the image above is saved at maximum quality (JPEG100), all of the pixels appearing white below represent pixels that changed their RGB value by 1 unit or more:

Pixels changed by 1 unit or more when saving to JPEG 100 (maximum quality)

The histogram palette, set to Luminosity, provides a count of the pixels that are white, representing the pixels that had changed by at least the specified threshold.  Here are the results in tabular form (click on a given result to see the full-sized depiction of the changed pixels):

Percentage of pixels changing value by ≥1, 2, 5, and 10

JPEG Quality Level   Threshold=1   Threshold=2   Threshold=5   Threshold=10
100                  66.89%        5.59%         0%            0%
80                   93.95%        60.55%        2.93%         0.004%
60                   97.04%        77.65%        17.50%        0.69%

As expected, the higher quality levels had much smaller changes in pixel value than the lower quality levels. What is more surprising is the level of further change when the JPEG files are resaved at the same quality level or one level higher or lower, and then reopened.  Again, click on a given result to see the full-sized depiction of the changed pixels; no image files are included for the cumulative percentages of pixels changed from the original.

Percentage of pixels changing value by ≥1 when resaved at the same or a different JPEG quality level, measured from the starting JPEG image and cumulatively from the original image

Starting JPEG
Quality Level   Measured from        Resaved as 100   Resaved as 80   Resaved as 60
100             Starting JPEG        17.39%           94.91%          N/A
                Cum. from original   70.15%           94.63%          N/A
80              Starting JPEG        22.42%           3.59%           97.01%
                Cum. from original   94.23%           94.07%          97.09%
60              Starting JPEG        18.09%           56.31%          19.05%
                Cum. from original   97.13%           97.25%          97.11%

As the table indicates, the percentage of pixels changing in a second-generation save to the same JPEG quality level is much lower than the percentage that changed when the file was first saved to JPEG. Resaving the file at the next lower or the next higher quality level, however, changes a much higher percentage of pixels from one generation to the next than resaving at the same quality level.

Despite the relatively high number of pixels changing value when a JPEG is resaved, it is surprising that the number of pixels cumulatively changed from the original image does not significantly increase.  This suggests that the vast majority of the pixel loss in a second-generation save at the same quality level affects the same pixels that had already changed in the first save as a JPEG file.  The relatively few portions of the image that did not undergo any significant change when first saved as a JPEG largely remain unchanged when resaved. The pixels that are undergoing the repeated alterations upon resaving may be drifting farther and farther from the original image, but it is beyond the scope of this experiment to evaluate this.  It is noteworthy, however, that the pixel change patterns appear to be "clumpier" upon a resave, which is evidence of "JPEG artifacts"; this is particularly evident when the JPEG60 image is resaved at the same quality.
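The distinction between generation-to-generation change and cumulative change can be expressed as set arithmetic on the indices of changed pixels. The Python sketch below is only an illustration with made-up five-pixel data. It counts any difference in RGB values as a change rather than applying Photoshop's luminosity threshold, and the function names are hypothetical.

```python
def changed(a, b):
    """Set of pixel indices whose RGB triple differs between two same-sized images."""
    return {i for i, (p, q) in enumerate(zip(a, b)) if p != q}


def resave_summary(original, first_save, second_save):
    """Percentages of pixels changed at each stage of a two-generation save."""
    total = len(original)

    def pct(indices):
        return 100.0 * len(indices) / total

    first = changed(original, first_save)        # changed by the first save
    step = changed(first_save, second_save)      # changed from generation 1 to 2
    cumulative = changed(original, second_save)  # changed from the original overall
    return {
        "first save": pct(first),
        "generation to generation": pct(step),
        "cumulative": pct(cumulative),
        "newly affected": pct(cumulative - first),  # untouched by the first save
    }


# Hypothetical data: the second save only alters pixels the first save had already altered.
original = [(0, 0, 0), (50, 50, 50), (100, 100, 100), (150, 150, 150), (200, 200, 200)]
first_save = [(0, 0, 0), (51, 50, 50), (99, 100, 100), (150, 150, 150), (203, 200, 200)]
second_save = [(0, 0, 0), (52, 50, 50), (99, 100, 100), (150, 150, 150), (201, 200, 200)]

print(resave_summary(original, first_save, second_save))
```

In this toy case 40% of the pixels change between generations, yet the cumulative figure stays at 60% and no new pixels are affected, which is the pattern the tables suggest.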

Finally, the following table shows the relative file sizes[5] of the first- and second-generation JPEG files containing this image, including the amount of compression relative to the original 1,700,000 bytes of image data (measured as the reduction in size):

           Original JPEG       Resaved 100         Resaved 80          Resaved 60
           Bytes     Comp.     Bytes     Comp.     Bytes     Comp.     Bytes     Comp.
JPEG100    641,324   62.28%    642,178   62.22%    296,362   82.57%    N/A       N/A
JPEG80     295,996   82.59%    405,120   76.17%    295,954   82.59%    161,026   90.53%
JPEG60     166,812   90.19%    293,430   82.74%    204,175   87.99%    166,810   90.19%
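
The "Comp." figures are the reduction in size relative to the 1,700,000 bytes of image data quoted above. A short Python check of that arithmetic, using the first-generation file sizes from the table (the function name is illustrative):

```python
UNCOMPRESSED_BYTES = 1_700_000  # figure quoted in the text


def reduction_percent(file_bytes, original_bytes=UNCOMPRESSED_BYTES):
    """Size reduction relative to the uncompressed image data, in percent."""
    return 100.0 * (1 - file_bytes / original_bytes)


for name, size in [("JPEG100", 641_324), ("JPEG80", 295_996), ("JPEG60", 166_812)]:
    print(f"{name}: {reduction_percent(size):.2f}%")
# JPEG100: 62.28%
# JPEG80: 82.59%
# JPEG60: 90.19%
```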



For persons wishing to examine the data in more detail, a RAR compressed copy of the Photoshop file, containing all layers, is available by emailing me, using my initials (mds) at this domain as the email address.



Notes


1.  Strictly speaking, JPEG is not a file format but a compression algorithm developed by the Joint Photographic Experts Group for the ISO.  It was approved in 1992 as ITU-T Recommendation T.81 and published in 1994 as the international standard ISO/IEC 10918-1; it's available in Adobe Acrobat (PDF) format from the ITU.  JPEG compression is used in several file formats.  The most common file format using JPEG compression is the JPEG File Interchange Format (JFIF), which was written by Eric Hamilton of C-Cube Microsystems for storing single images that have been JPEG compressed.  The current version of the JFIF specification is available in PDF format from the W3C.  JFIF files are commonly known as JPEG images or JPEG files and typically have a .jpg or .JPG extension.  For further information about the JPEG file image format, see the W3C page on JPEG FIF, the JPEG FAQ, and Wikipedia on JPEG.

2.  In the simplest terms, JPEG compression stores parameters for formulas that allow the construction of images that look much like the original to the human eye, instead of storing pixel values.  The image is broken down into "tiles" that are represented by formulas indicating the relative brightness and color of different regions of each tile.  Wikipedia provides a good summary of the techniques used.

3.  There are no standardized names or categories for the various JPEG quality/compression/file size settings.  As a result, every application or device uses its own names for its own proprietary settings. Photoshop even uses different terms in the main program and in Save for Web.  I settled on the terms used in Save for Web in Photoshop CS2.

4.  Photoshop does not specify what units are used in the determination of thresholds; it appears to be luminosity, which is determined by combining R, G, and B values in accordance with a weighted formula.  Thus, the use of the threshold adjustment layer with a threshold value of 1 will not necessarily detect all changed pixels, but only those changed pixels whose difference has a weighted luminosity greater than or equal to 1; if a pixel's R, G, and B levels change in such a way that the weighted luminosity of the difference is less than 1, the pixel will remain black even though the pixel has changed.  A more accurate determination of pixel change would require creating separate threshold adjustment layers for the red, green, and blue channels, and then combining them. The method employed here is sufficient to illustrate the magnitude of the pixel changes, but does not produce a mathematically rigorous result.

5.  The JPEG files were saved without profiles or metadata.  Inclusion of profiles and metadata would make the files significantly larger.
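
The limitation described in note 4 can be shown with a single pixel. This Python sketch compares a luminosity-based test with a per-channel test; the Rec. 601 weights are an assumption, since Photoshop's actual formula is not documented, and both function names are hypothetical.

```python
def luminosity_hit(diff, threshold=1):
    """True if the weighted luminosity of the difference reaches the threshold."""
    r, g, b = diff
    return 0.299 * r + 0.587 * g + 0.114 * b >= threshold  # Rec. 601 weights, assumed


def channel_hit(diff, threshold=1):
    """True if any single channel of the difference reaches the threshold."""
    return max(diff) >= threshold


diff = (1, 0, 0)  # red changed by one unit, green and blue unchanged
print(luminosity_hit(diff), channel_hit(diff))  # False True
```

A pixel like this one stays black under the luminosity test even though it has changed, so the percentages at threshold 1 are best read as lower bounds on the number of altered pixels.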

–30–