JPEG-Clear

What is JPEG-Clear?

JPEG-Clear is a fast and efficient antialiased mipmapping algorithm that transforms any image file into a set of new image files which, together, take up around the same amount of space as the original file but arrange the image data much more efficiently for fast access.

The C# code implementing JPEG-Clear is completely free to use.

Three sample applications are included.

How could JPEG-Clear speed up browsing?

For simplicity, imagine we wanted to view a page containing the following image:

mattjack.jpg (972 x 648 pixels, 70.9 KiB):
mattjack

Imagine that we were using an extremely slow network connection—say, 1.4 KB/s, equivalent to an old 14.4 kbps dialup modem. (The same argument holds for higher bandwidths with higher-resolution images; this is just an example.) This image would take about 51 seconds to load through such a connection.
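
(To check the arithmetic: a serial line carries roughly ten bits per byte once framing overhead is included, so 14.4 kbps delivers about 1,440 bytes per second, and 70.9 KiB is about 72,600 bytes, giving just over 50 seconds.)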

Using JPEG-Clear, the following image would appear after just one second:

approx1.png (972 x 648 pixels):
approx1

The result is blurry, but is not a bad effort, given that it is based on only a 1.4 KiB image file.

After another two seconds, another 2.6 KiB file will have been downloaded, and JPEG-Clear replaces the image with this:

approx2.png (972 x 648 pixels):
approx2

and, after another five seconds, with

approx3.png (972 x 648 pixels):
approx3

Even with this painfully slow connection, someone browsing this page would have a good idea of what the image showed within seconds, without having to wait almost a minute for the entire image file to “crawl down the page”.

This example is contrived, but think of the original ten-megapixel photograph from which this image was derived. There is no need to store different versions of the image at different resolutions: it can be downloaded progressively, “top-down”, and rendered at any desired size.

For completeness, I show what would be rendered after another fifteen seconds:

approx4.png (972 x 648 pixels):
approx4

Finally, after another forty seconds, the original image at full detail would be shown.

How does JPEG-Clear work?

Let’s go back to the original image above:

mattjack.jpg (972 x 648 pixels, 70.9 KiB):
mattjack

The JPEG-Clear algorithm starts by repeatedly downsampling this image by a factor of 2 in each direction using the magic kernel, which is lightning fast, yet produces nicely antialiased results:

downsample1.png (486 x 324 pixels):
downsample1

downsample2.png (243 x 162 pixels):
downsample2

downsample3.png (122 x 81 pixels):
downsample3

downsample4.png (61 x 41 pixels):
downsample4

(Although shown here as PNG files, these downsamples are actually kept in memory, and do not need to be written to disk.)

It stops downsampling when the larger dimension of the downsampled image is between 32 and 63 pixels.
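
To make this concrete, here is a minimal C# sketch of the downsampling pass, under some simplifying assumptions: it works on a grayscale byte[,] of intensities rather than a full RGB bitmap, and all helper names are mine, not the library's. For a factor-of-2 downsample, the 1-d magic kernel weights are 1/8, 3/8, 3/8, 1/8, applied in each direction; the outer loop keeps halving until the larger dimension lands in the 32 to 63 range just described.

// Sketch only (assumes using System; using System.Collections.Generic;).
static byte[,] DownsampleByTwo(byte[,] src)
{
    int w = src.GetLength(0), h = src.GetLength(1);
    int dw = (w + 1) / 2, dh = (h + 1) / 2;
    var dst = new byte[dw, dh];
    int[] k = { 1, 3, 3, 1 }; // magic kernel weights (each divided by 8)
    for (int x = 0; x < dw; x++)
    for (int y = 0; y < dh; y++)
    {
        double sum = 0.0, norm = 0.0;
        for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
        {
            int sx = 2 * x - 1 + i, sy = 2 * y - 1 + j;
            if (sx < 0 || sx >= w || sy < 0 || sy >= h) continue;
            double weight = k[i] * k[j];
            sum += weight * src[sx, sy];
            norm += weight; // renormalize at the edges
        }
        dst[x, y] = (byte)Math.Round(sum / norm);
    }
    return dst;
}

// Keep halving until the larger dimension falls into [32, 63],
// retaining every level in memory (cf. the parenthetical note above).
static List<byte[,]> BuildPyramid(byte[,] original)
{
    var levels = new List<byte[,]> { original };
    while (Math.Max(levels[^1].GetLength(0), levels[^1].GetLength(1)) >= 64)
        levels.Add(DownsampleByTwo(levels[^1]));
    return levels;
}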

This smallest downsample, downsample4.png above, is then saved as a standard JPEG image:

mattjack.jpg.jpc.a.jpg (61 x 41 pixels, 1.4 KiB):
mattjack.jpg.jpc.a.jpg

This is the “base” file for the JPEG-Clear file set corresponding to mattjack.jpg. The filename looks complicated, but its structure allows easy downloading of JPEG-Clear file sets, simply as a set of standard JPEG images. It is best understood by building up the extensions:

mattjack.jpg   The original image filename
mattjack.jpg.jpc   Identifies a JPEG-Clear file set
mattjack.jpg.jpc.a   The first file to load (a, b, c, ...)
mattjack.jpg.jpc.a.jpg   Saved as a standard JPEG image
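
In code, generating these names is a one-liner; this helper is purely illustrative, not part of the library's API:

// e.g. SetFileName("mattjack.jpg", 'a') yields "mattjack.jpg.jpc.a.jpg".
static string SetFileName(string original, char suffix) =>
    $"{original}.jpc.{suffix}.jpg";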

The JPEG-Clear algorithm then reads mattjack.jpg.jpc.a.jpg back in (which will be slightly different from downsample4.png, because JPEG is a lossy compression format), and then upsamples once using the magic kernel:

upsample1.png (122 x 81 pixels):
upsample1

If all you had was the file mattjack.jpg.jpc.a.jpg, then this would be your best estimate of downsample3.png above; it is blurry, but recognizable.
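
Here, correspondingly, is a sketch of the upsampling pass, with the same caveats (grayscale only, hypothetical helper names). For a factor-of-2 upsample, the magic kernel blends the two nearest input samples in each direction in the ratio 3:1; the trimming of odd dimensions (82 rows to 81 here) is elided for clarity.

// Sketch only: one 2x magic-kernel upsampling pass.
static byte[,] UpsampleByTwo(byte[,] src)
{
    int w = src.GetLength(0), h = src.GetLength(1);
    var dst = new byte[2 * w, 2 * h];
    for (int x = 0; x < 2 * w; x++)
    for (int y = 0; y < 2 * h; y++)
    {
        // Nearest source sample, and next-nearest (clamped at the edges).
        int cx = x / 2, cy = y / 2;
        int nx = Math.Min(Math.Max(cx + (x % 2 == 0 ? -1 : 1), 0), w - 1);
        int ny = Math.Min(Math.Max(cy + (y % 2 == 0 ? -1 : 1), 0), h - 1);
        double value = 0.5625 * src[cx, cy]  // (3/4)(3/4)
                     + 0.1875 * src[nx, cy]  // (1/4)(3/4)
                     + 0.1875 * src[cx, ny]  // (3/4)(1/4)
                     + 0.0625 * src[nx, ny]; // (1/4)(1/4)
        dst[x, y] = (byte)Math.Round(value);
    }
    return dst;
}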

The algorithm then computes the difference between the original downsample3.png and the estimate upsample1.png, and encodes this “diff” as a standard JPEG image file:

mattjack.jpg.jpc.b.jpg (122 x 81 pixels, 2.6 KiB):
mattjack.jpg.jpc.b.jpg

Note that this “diff” is an approximation for two reasons: firstly, because it has been saved as a JPEG image file (which is lossy); and secondly, because the differences in intensity in each channel have been encoded nonlinearly (to fit the difference range from –255 to +255 for each channel into the 0 to 255 intensity scale of each channel of a JPEG image), which has the effect of quantizing large positive and negative differences.
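
The exact nonlinear mapping is not spelled out above, so the following sketch uses a simple square-root companding purely as an illustration. It squeezes the range -255 to +255 into 0 to 255 and, as described, quantizes large differences more coarsely than small ones; small differences round-trip exactly.

// Illustrative companding only; the actual JPEG-Clear mapping may differ.
static byte EncodeDiff(int diff) // diff in [-255, +255]
{
    double scaled = Math.Sqrt(Math.Abs(diff) / 255.0) * 127.0;
    return (byte)(128 + Math.Sign(diff) * (int)Math.Round(scaled));
}

static int DecodeDiff(byte encoded)
{
    int signed = encoded - 128;
    double magnitude = signed * signed / (127.0 * 127.0) * 255.0;
    return Math.Sign(signed) * (int)Math.Round(magnitude);
}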

The algorithm then loads the file mattjack.jpg.jpc.b.jpg back in, adds the resulting image to upsample1.png (undoing the nonlinear transformation, as best it can), and again upsamples the result once using the magic kernel:

upsample2.png (243 x 162 pixels):
upsample2

Again, if all you had were the files mattjack.jpg.jpc.a.jpg and mattjack.jpg.jpc.b.jpg, then this would be your best estimate of downsample2.png above.

The algorithm again computes the difference between the original downsample2.png and the estimate upsample2.png, and encodes this “diff” as a JPEG image:

mattjack.jpg.jpc.c.jpg (243 x 162 pixels, 6.9 KiB):
mattjack.jpg.jpc.c.jpg

The process should now be clear: the algorithm continues to generate these “diffs” until it gets back up to the size of the original image:

mattjack.jpg.jpc.d.jpg (486 x 324 pixels, 19.6 KiB):
mattjack.jpg.jpc.d.jpg

mattjack.jpg.jpc.e.jpg (972 x 648 pixels, 56.2 KiB):
mattjack.jpg.jpc.e.jpg

If you have the five files mattjack.jpg.jpc.a.jpg, mattjack.jpg.jpc.b.jpg, mattjack.jpg.jpc.c.jpg, mattjack.jpg.jpc.d.jpg, and mattjack.jpg.jpc.e.jpg, then the JPEG-Clear library can reconstruct an image essentially identical to the original mattjack.jpg:

reconstructed.png (972 x 648 pixels):
reconstructed
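
Putting the pieces together, the reconstruction loop might look like the following sketch, reusing the hypothetical UpsampleByTwo and DecodeDiff helpers above; LoadJpegPixels is a stand-in for decoding a JPEG file into a grayscale array, and none of this is the library's actual internal code.

// Sketch only: rebuild the image from its JPEG-Clear file set.
static byte[,] Reconstruct(string original, char lastSuffix)
{
    // Start from the "base" file, e.g. mattjack.jpg.jpc.a.jpg.
    byte[,] image = LoadJpegPixels(original + ".jpc.a.jpg");
    for (char suffix = 'b'; suffix <= lastSuffix; suffix++)
    {
        image = UpsampleByTwo(image);
        byte[,] diff = LoadJpegPixels(original + ".jpc." + suffix + ".jpg");
        for (int x = 0; x < image.GetLength(0); x++)
        for (int y = 0; y < image.GetLength(1); y++)
        {
            // Undo the nonlinear encoding and apply the correction.
            int value = image[x, y] + DecodeDiff(diff[x, y]);
            image[x, y] = (byte)Math.Min(255, Math.Max(0, value));
        }
    }
    return image;
}

Calling Reconstruct("mattjack.jpg", 'e') would then correspond to the full five-file reconstruction shown above.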

These five files together constitute the “JPEG-Clear file set” corresponding to the original image mattjack.jpg. Their combined size is 86.7 KiB, somewhat larger than the original file (70.9 KiB); in general, a JPEG-Clear file set is sometimes a little larger than the original image, sometimes a little smaller.

Using the JPEG-Clear transformation does not compress the image data; rather, it rearranges it for more efficient access. For example, imagine that you just wanted a thumbnail of mattjack.jpg that fits in a square frame of side length 100 pixels. The code for doing this in the JPEG-Clear library is:

var jpc = new Jpc("mattjack.jpg");
var bitmap = jpc.Render(100, 100);

That's it! For the first line of code, the JPEG-Clear library checks that at least the “base” file, mattjack.jpg.jpc.a.jpg, exists—but that's all it does. For the second line, it keeps loading up “diffs”—in this case, just mattjack.jpg.jpc.a.jpg and mattjack.jpg.jpc.b.jpg—until one of the dimensions is larger than the specified frame size. It then uses a standard library function to shrink the image smoothly to be 100 pixels wide instead of 122 pixels.

If you were to add a third line of code,

bitmap.Save("render100x100.png", ImageFormat.Png);

then you would get the following image:

render100x100.png (100 x 66 pixels):
render100x100

In other words, the JPEG-Clear library has supplied you with a thumbnail of mattjack.jpg after only reading in 4.0 KiB of image files; it has not needed to read in the entire original image, nor is it necessary to precompute these thumbnails and cache them. Exactly the same would be true if we had started with the real (ten-megapixel) photograph that this image was taken from: JPEG-Clear uses a “top-down” philosophy when storing your image data.
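
As a hypothetical sketch of the level-selection logic just described (again, not the library's actual internals, and with ApplyDiff, LoadJpegPixels, and ShrinkToFit as illustrative stand-ins):

// Sketch only (assumes using System.IO for File.Exists).
static byte[,] RenderSketch(string name, int frameWidth, int frameHeight)
{
    byte[,] image = LoadJpegPixels(name + ".jpc.a.jpg");
    // Apply diffs until one dimension reaches the requested frame size.
    for (char suffix = 'b';
         image.GetLength(0) < frameWidth && image.GetLength(1) < frameHeight
             && File.Exists(name + ".jpc." + suffix + ".jpg");
         suffix++)
    {
        image = ApplyDiff(image, name, suffix); // upsample + add decoded diff
    }
    return ShrinkToFit(image, frameWidth, frameHeight); // smooth final resize
}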

Could JPEG-Clear be applied to video?

Absolutely. Any video stream could be broken into a “base” video stream, plus “diff” video streams, in exactly the same fashion as shown above for still images.

A playback application need only subscribe to the “diff” streams necessary for rendering at the desired resolution. Alternatively, over a connection with unreliable bandwidth (such as the Internet), a playback application can “drop” its subscription to higher-detail diff streams when bandwidth degrades, and “resubscribe” to them if and when bandwidth improves again.

What about lossless image formats?

Although not its primary focus, the JPEG-Clear library caters for images stored losslessly. It does so by storing an extra two “diffs” which, when combined with the JPEG approximations, allow the original image to be reconstituted losslessly.

For example, imagine that we started with the following lossless image:

sally.png (214 x 320 pixels, 173 KiB):
sally

The JPEG-Clear library converts this to the following file set:

sally.png.jpc.a.jpg (27 x 40 pixels, 982 B):
sally.png.jpc.a.jpg

sally.png.jpc.b.jpg (54 x 80 pixels, 1.4 KiB):
sally.png.jpc.b.jpg

sally.png.jpc.c.jpg (107 x 160 pixels, 3.0 KiB):
sally.png.jpc.c.jpg

sally.png.jpc.d.jpg (214 x 320 pixels, 8.1 KiB):
sally.png.jpc.d.jpg

sally.png.jpc.y.png (214 x 320 pixels, 82.5 KiB):
sally.png.jpc.y.png

sally.png.jpc.z.png (214 x 320 pixels, 1.3 KiB):
sally.png.jpc.z.png

The first four (JPEG) files are exactly what would have been produced if the original image had been a JPEG. The last two files, on the other hand, are stored in the lossless PNG format, and encode the difference between the approximation obtained by combining the four JPEG files and the original image. Using all six files together, the JPEG-Clear library can reconstitute sally.png, exactly.

It may seem strange that sally.png.jpc.y.png is by far the largest of the six files, yet appears to be nothing but gray. The reason is that the differences it encodes are small, but necessary if one wishes to have an absolutely lossless reconstruction. Enhancing the image using GIMP shows what it is storing:

enhanced.y.png (214 x 320 pixels):
enhanced.y

In other words, it is simply supplying the “JPEG noise”, not discernible to the naked eye, but necessary for a lossless reconstruction.

The final image file, sally.png.jpc.z.png, is needed because the nonlinear transformation of deltas can lead to quantization errors in intensity of up to plus or minus two units; this file supplies those quantization errors, to ensure an absolutely lossless reconstruction of the original image.
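
In sketch form, the final lossless step combines three ingredients per pixel. How the “z” corrections are packed is not specified here, so storing (error + 2) in the range 0 to 4 is purely an assumption of this illustration; DecodeDiff is the hypothetical helper from earlier.

// Sketch only: exact reconstruction of one pixel.
static byte LosslessPixel(byte approx, byte yEncoded, byte zEncoded)
{
    int value = approx
              + DecodeDiff(yEncoded) // main correction: the "JPEG noise"
              + (zEncoded - 2);      // residual quantization error, -2..+2
    return (byte)Math.Min(255, Math.Max(0, value));
}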

It may seem, from this example, that the JPEG-Clear transformation allows extra compression of lossless images: the original PNG file was 173 KiB in size, whereas the six JPEG-Clear files total only 97.3 KiB. (All files, including the original, were saved from C# using default compression parameters, to ensure that this sort of comparison makes sense.) However, this is only true in this case because the original image was a continuous-tone photograph—for which lossless compression is rarely, if ever, used (except, perhaps, for medical imaging purposes, where a lossless reconstruction of the original image may be needed on legal grounds). Consider, as a contrary example, the following screendump:

bing.png (717 x 395 pixels, 83.8 KiB):
bing

The JPEG-Clear file set corresponding to this lossless PNG image consists of the following:

bing.png.jpc.a.jpg (45 x 25 pixels, 950 B):
bing.png.jpc.a.jpg

bing.png.jpc.b.jpg (90 x 50 pixels, 1.6 KiB):
bing.png.jpc.b.jpg

bing.png.jpc.c.jpg (180 x 99 pixels, 3.8 KiB):
bing.png.jpc.c.jpg

bing.png.jpc.d.jpg (359 x 198 pixels, 10.6 KiB):
bing.png.jpc.d.jpg

bing.png.jpc.e.jpg (717 x 395 pixels, 35.2 KiB):
bing.png.jpc.e.jpg

bing.png.jpc.y.png (717 x 395 pixels, 375.1 KiB):
bing.png.jpc.y.png

bing.png.jpc.z.png (717 x 395 pixels, 8.0 KiB):
bing.png.jpc.z.png

In this case, the main “lossless diff”, bing.png.jpc.y.png, is more than four times larger than the original file! The reason is that a screendump like this typically consists of large areas of constant intensity plus sharp edges (in both luminance and chrominance), for which a JPEG approximation provides a poor representation (from a lossless point of view): encoding the differences from the approximation—including the JPEG noise caused by those sharp edges—is much more costly than encoding the original image. This is not to say that the JPEG approximation is poor, from a visual perspective: using just the five JPEG files (and not the two PNG files), the JPEG-Clear reconstruction of the image is as follows:

bing.approx.png (717 x 395 pixels):
bing.approx

It would also be possible to implement the JPEG-Clear algorithm using lossless image storage throughout. However, each “diff” image would need one extra bit of information, to encode the sign of the difference. This would be feasible in specialized imaging fields such as medical imaging, but is not practical for general-purpose use.

Could the JPEG-Clear method be used on 3D voxel data?

Yes: see here.