Sunday, October 10, 2010

Comparison: WebP, JPEG and JPEG XR

Google recently released a new image format called WebP, which is based on the intra-frame compression from the WebM (VP8) video codec. I'd seen a few comparisons of WebP floating about, but I wasn't really happy with any of them. Google has their own analysis, which I critiqued here (executive summary: bad comparison metric--PSNR--and questionable methods).

Dark Shikari's comparison was limited to a single quality setting and a single image. There's also this comparison and this comparison, but neither of them is particularly rigorous. The first is, again, a single image at one arbitrary compression level, although it does clearly show WebP producing superior results. The second at least uses multiple images, but the methods are a little fuzzy and there's no variation in compression level. It's also a bloodbath (WebP wins), so I'm a bit wary of its conclusions.

So, I coded up a quick and dirty app to compress an image to WebP, convert it back to PNG and then compute the MS-SSIM metric between the original image and the final PNG. I used Chris Lomont's MS-SSIM implementation. I also ran the same process for JPEG, using the built-in JPEG compression in .NET. Then, for the hell of it, I decided it'd also be fun to add JPEG XR into the mix, since it's probably the only other format that has a snow-cone's chance in hell of becoming viable at some future date (possibly when IE9 hits the market). I eventually got JPEG XR working, although it was a pain in the butt due to a horrible API (see below).

The input image is encoded at every possible quality setting, so the final output is a set of SSIM scores and file sizes. I also computed dB for the SSIM scores the same way x264 does, since SSIM output is not linear (i.e. 0.99 is about twice as good as 0.98, not 1% better):

SSIM_DB = -10.0 * log10(1.0 - SSIM)
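
In code, that's just the following (a Python sketch of the conversion, not the original C# tool):

```python
import math

def ssim_to_db(ssim: float) -> float:
    """Convert a raw SSIM score to decibels, as x264 does.

    This maps the non-linear SSIM scale onto something additive:
    every extra ~3 dB halves the residual (1 - SSIM) error.
    """
    return -10.0 * math.log10(1.0 - ssim)

# 0.99 -> 20.0 dB, 0.98 -> ~16.99 dB: a ~3 dB gap, i.e. half the
# residual error, even though the raw scores differ by only 0.01.
print(ssim_to_db(0.99))  # 20.0
print(ssim_to_db(0.98))
```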

For starters, I decided to test on the image Dark Shikari used, since it's a clean source and I wanted to see if I could replicate his findings. Here are the results of the parkrun test:


Quick explanation: higher SSIM (dB) is better, and smaller file size is better. So in a perfect world, you'd have codecs outputting in the upper-left corner of the graph. But when images are heavily compressed, quality drops, so you end up with graphs like the above: high compression equals bad quality, low compression equals large file size.

In parkrun, WebP performs decently at very low bitrates, but around ~15 dB JPEG starts to outperform it. Even at low bitrates, it never has a substantial advantage over JPEG. This does seem to confirm Dark Shikari's findings: WebP really doesn't perform well here. When I examined the actual output, it also followed this trend quite clearly. JPEG XR performs so poorly that it is almost not worth mentioning.

Next, I tested an existing JPEG image because I wanted to see the effect of recompressing JPEG artifacts. Google's WebP analysis does this very thing, and I had qualms about JPEG's block-splitting algorithm being fed into itself repeatedly. Here's the image I used:


Results:


Again, absolutely dismal JPEG XR performance. Notice the weirdness in the JPEG line--I believe this is due to recompressing JPEG. I got similarly odd results with other JPEG source material, which supports my hypothesis that recycling JPEG-compressed material in a comparison will skew the results due to JPEG's block-splitting algorithm. I'll try to post a few more with JPEG source material if I can.

As far as performance goes, WebP does a little better here--it's fairly competitive up to about 23 dB, at which point JPEG overtakes it. At lower bitrates it is resoundingly better than JPEG. In fact, it musters some halfway-decent-looking files around ~15 dB/34 KB (example), while JPEG at the same file size looks horrible (example). However, to really match the original--including the grainy artifacts--JPEG eventually outperforms WebP. So for this test, I think the "winner" is really decided by your needs. If you want small and legible, WebP is preferable. If you're looking for small(ish) and higher fidelity, JPEG is a much better choice.

Next test is an anime image:


Results:


WebP completely dominates this particular comparison. If you've ever wondered what "13 dB better" looks like, compare JPEG output to WebP output at ~17 KB file sizes. JPEG doesn't reach a comparable quality to the ~17 KB WebP file until it's around ~34 KB in size. By the time JPEG surpasses WebP around 26 dB, the extra bits are largely irrelevant to overall quality. JPEG XR unsurprisingly fails.

Since a lot of web graphics aren't still-captures from movies, I decided a fun test would be re-encoding the Google logo, which is a 25 KB PNG consisting mostly of white space:

This is a better graphic for PNG compression, but I wanted to see what I could get with lossy compression. Results:

WebP clearly dominates, which confirms another hypothesis: WebP is much better at dealing with gradients and low-detail portions than JPEG. JPEG XR is, again, comically bad.

Incidentally, Google could net some decent savings by using WebP to compress their logo (even at 30 dB, which is damn good looking, it's less than 10 KB). And I think this may be a place WebP could really shine: boring, tiny web graphics where photographic fidelity is less of a concern than minimizing file size. Typically PNG does a good job with these sorts of graphics, but it's pretty clear there are gains to be had by using WebP.

My conclusions after all this:
  1. JPEG XR performs so poorly that my only possible explanation is that I must be using the API incorrectly. Let's hope this is the case; otherwise, it's an embarrassingly bad result for Microsoft.
  2. WebP is consistently better looking at low bitrates than JPEG, even when I visually inspect the results.
  3. WebP does particularly well with smooth gradients and low-detail areas, whereas JPEG tends to visually suffer with harsh banding and blocking artifacts.
  4. WebP tends to underperform on very detailed images that lack gradients and low-detail areas, like the parkrun sample.
  5. JPEG tends to surpass the quality of WebP at medium to high bitrates; where this threshold occurs depends largely on the image itself.
  6. WebP, in general, is okay, but I don't feel like the improvements are enough. I'd expect a next-gen format to outperform JPEG across the board--not just at very low compression levels or in specific images.
A few more asides:
  • I'd like to test with a broader spectrum of images. I also need better source material that has not been tainted by any compression.
  • No Windows binary for WebP... seriously? Does Google really expect widespread adoption while snubbing 90% of the non-nerd world? I worked around this by using a CodePlex project that ported the WebP command-line tools to Windows. I like Linux, but seriously--this needs to be fixed if they want market penetration.
  • If you ask the WebP encoder to go too low, it tends to crash. Well... maybe. The aforementioned CodePlex project blows up, but I'm assuming it's due to some error in the underlying WebP encoder. That said, by the time it blows up, it's generating images so ugly that it's no longer relevant.
  • JPEG/JPEG-XR go so low that your images look like bad expressionist art. It's a little silly.
  • I absolutely loathe the new classes in .NET required to work with JPEG XR. It is a complete disaster, to put it lightly, especially if all your code is using System.Drawing.Bitmap, and suddenly it all has to get converted to System.Windows.Media.Imaging.BitmapFrame. Why, MSFT, why? If you want this format to succeed, stop screwing around and make it work with System.Drawing.Bitmap like all the other classes already do.
  • WmpBitmapEncoder.QualityLevel appears to do absolutely nothing, so I ended up using ImageQualityLevel, which is a floating point type. Annoying, not sure why QualityLevel doesn't work.
  • The JPEG XR API leaks memory like nuts when used heavily from .NET; not even GC.Collect() statements reclaim the memory. Not sure why, didn't dwell on it too much given that JPEG XR seems to not be worth anyone's time.
  • People who think H.264 intra-frame encoding is a viable image format forget it's licensed by MPEG-LA. So while it would be unsurprising if x264 did a better job with still image compression than WebP and JPEG, this is ignoring the serious legal wrangling that still needs to occur. This is not to say WebM/WebP are without legal considerations (MPEG-LA could certainly attempt to put together a patent pool for either), but at least it's within the realm of reason.
Let me know if you're interested in my code and I'll post it. It's crude (very "git'er done"), but I'm more than happy to share.

EDIT: Just to make sure .NET's JPEG encoder didn't blow, I also tried a few sanity checks using libjpeg instead, and I manually tested in GIMP on Ubuntu. The results are almost identical. If anyone knows of a "good" JPEG encoder that can outdo the anime samples I posted, please let me know which encoder and post a screenshot (of course, make sure your screenshot is ~17 KB in size).

28 comments:

Anonymous said...

I just used PSP v5 (1998) to encode the anime image and got a vastly better quality image at 16.8 KB than the one you show; I would recommend you use a better JPEG encoder. GIMP's encoder is also much better--both still not as good as the WebP encoding, though.

kidjan said...

@ Anonymous

I did consider that the JPEG encoder I was using wasn't optimal. Gimp uses libjpeg, and it wouldn't surprise me if PSP v5 did as well. I've used libjpeg for some projects recently, I'll see if I can get substantially different results using that encoder.

kidjan said...

@ Anonymous - I verified using libjpeg, the results are not substantially different from using the MSFT encoder. I also verified by encoding in GIMP. The image produced by GIMP looks almost exactly like the one produced by the .NET runtime: bad.

I'm skeptical of your result--could you please post your JPEG at 16.8 KB that shows "vastly better quality?"

Anonymous said...

I'm trying to compress this source image into WebP format, and I'm getting very bad results:
http://img408.imageshack.us/img408/6503/santaorig.png
Can you try this too?

Anonymous said...

Thanks for an informative post.

However, it is quite easy to improve on the JPEG image quality you have shown as examples - while still keeping the same bitrate. Here is an example, using the anime picture as input:

http://img217.imageshack.us/img217/8403/anime16dctre.jpg

As you can see, the quality is quite a lot better than your JPEG version. The file size here is 16838 bytes, vs. 16476 bytes in your example--a negligible difference IMO.

WebP still looks better, though, but not as much as in your example.

I used cjpeg (from libjpeg 8b) and the tool jpegrescan to achieve this result:

cjpeg -quality 16 -progressive -dct float anime.tga > anime_16_dct.jpg
jpegrescan.pl anime_16_dct.jpg anime_16_dct_re.jpg

Anonymous said...

Same conclusion regarding the "ownbottle.jpg" example:

Here is a significantly better JPEG image, at 33706 bytes:

http://img259.imageshack.us/img259/5226/own17dctpre.jpg

(Please note, the JPEG example you link to is 116kbytes, not ~33kb)

Same utils as before, here are the parameters:

cjpeg -quality 17 -progressive -dct float ownbottle.tga > own_17_dct_p.jpg
jpegrescan.pl own_17_dct_p.jpg own_17_dct_p_re.jpg

(In both this and the anime case, I converted - losslessly - the sources to tga for easy input into cjpeg, hence the tga input filename.)

kidjan said...

@anonymous,

Thanks for the info. I actually came to the same conclusion earlier this week--I need to take into account the JPEG encoder being used. I did use libjpeg, although a lot of the settings were turned off (floating point IDCT, optimized entropy encode, 16-bit quantization tables, etc.), and I didn't use jpegrescan.

I'll be addressing this in a subsequent posting, but it'll take me some time.

kidjan said...

Oh, and thanks for catching the JPEG image being the wrong size--stupid blogger recompressed it (I doubt it affects the results much, since going from a 16 KB image to a 100+ KB image isn't going to result in any meaningful degradation). But I'll fix it when I get a chance.

Anonymous said...

You're welcome :)

I came by your blog after reading Dark Shikari's post here:

http://x264dev.multimedia.cx/archives/541

...and decided to do some tests. Have learnt a lot during my experiments, fun to find out that good old JPEG actually is very good - and that the output from normal encoders can even be compressed (losslessly) a bit further.

kidjan said...

Well, I'm starting to wonder if it could be improved considerably during the quantization/scaling phase. Most encoders use static quantization tables and linear scaling methods, which means there's a lot of room for improvement (I'd say at least 10%, possibly 20%) simply by generating image/bitrate-specific quantization tables.

Been doing research into that, I may take a stab at coding up a few ideas I've seen floating about.
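
To make the "linear scaling" point concrete, here's roughly what libjpeg does--a Python sketch of its jpeg_quality_scaling() logic plus the Annex K base luma table. Treat it as an illustration, not a drop-in for any particular encoder; the point is that every image gets the same base table, merely scaled up or down:

```python
# libjpeg's default luminance quantization table (Annex K of the JPEG
# spec), stored in zigzag-free row order for clarity.
BASE_LUMA = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]

def scale_quant_table(base, quality):
    """Linearly scale a base quantization table for a 1-100 quality
    setting, following libjpeg's piecewise scale factor."""
    quality = max(1, min(100, quality))
    scale = 5000 // quality if quality < 50 else 200 - quality * 2
    # Scale each entry, rounding, and clamp to the baseline 1..255 range.
    return [min(max((q * scale + 50) // 100, 1), 255) for q in base]

# Same table shape regardless of content; only the magnitude changes.
print(scale_quant_table(BASE_LUMA, 25)[:4])  # [32, 22, 20, 32]
```

Image-specific tables would replace BASE_LUMA itself, which is exactly what almost no mainstream encoder does.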

telematix said...

I think your comparison here is a little problematic.

First off your original images clearly come from a 4:2:0 source, except the Google logo. Of course WebP will do better here since it also uses 4:2:0 internally. As a reminder 4:2:0 is completely inappropriate for photos which come from a true RGB source.

JPEG uses 4:4:4 internally and JPEG-XR supports both 4:4:4 and 4:2:0. I would try to recompress your JPEG-XR file again and use the 4:2:0 internal color space.

Better, pick some uncompressed images which come directly from a camera, not from video.

Secondly, I would suggest you use the ISO command-line version of the compressor for JPEG-XR (you can get this from the ITU web site). In my testing JPEG-XR beats JPEG hands down. So yes, probably an issue with the API you were using. Maybe it is picking a 565 or 10_10_10 format for some reason, which JPEG-XR supports? It would be great if you could post the resulting JPEG-XR image you got to see what format it is. JPEG-XR's strength is that it supports 4:4:4, 4:2:2, 4:2:0, floating point, lossless (which makes it _much_ better than PNG), 565, and most importantly an alpha channel (both lossless and lossy).

One big issue with WebP is that the color space it uses is BT.601. Ask any professional photographer out there if this makes any sense. IMO it is completely nuts. You have to do an expensive and lossy color conversion to sRGB space to display this properly on a computer screen. These guys at Google clearly have no clue about color...

And no, I do not work for Microsoft. I am all for open and better formats but WebP is not the one.

kidjan said...

@ telematix,

JPEG typically does not use 4:4:4 internally; that depends on the encoder implementation (the JPEG standard itself allows 4:4:4, 4:2:2 and 4:2:0). The vast, vast majority of JPEG encoders downsample to 4:2:0 by default. For example, both libjpeg and Microsoft's .NET JPEG encoder convert to 4:2:0 during the subsampling phase of the JPEG encode, and there are no options to use different downsampling. See http://en.wikipedia.org/wiki/JPEG#Downsampling

No, 4:2:0 is not "completely inappropriate" for photos. Were this true, basically every consumer-grade digital camera would be "completely inappropriate," since most use JPEG, and again, the vast majority of encoder implementations downsample to 4:2:0. Furthermore, I see no indication that this colorspace is "completely inappropriate," particularly when you compare 4:2:0 and 4:4:4 JPEG-compressed files of the same size--the extra compression in the 4:2:0 file can be spent in more productive ways, like less quantization of the luma channel, which is visually more important.

Also, JPEG adheres to CCIR 601 for the RGB->YCbCr conversion, which is the exact same thing as BT.601. So it's hard to fault WebP here when what people have been using for decades does the exact same thing.
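
For concreteness, this is the CCIR/BT.601 conversion in question--a small Python sketch using the standard full-range JFIF coefficients (illustration only; real encoders do this with fixed-point arithmetic):

```python
def rgb_to_ycbcr_601(r, g, b):
    """Full-range RGB -> YCbCr using the CCIR/BT.601 luma coefficients,
    as specified by JFIF for JPEG."""
    y  =  0.299    * r + 0.587    * g + 0.114    * b
    cb = -0.168736 * r - 0.331264 * g + 0.5      * b + 128.0
    cr =  0.5      * r - 0.418688 * g - 0.081312 * b + 128.0
    return y, cb, cr

# Neutral colors stay achromatic: white maps to luma 255 with both
# chroma channels at the 128 midpoint.
print([round(v, 3) for v in rgb_to_ycbcr_601(255, 255, 255)])  # [255.0, 128.0, 128.0]
```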

I don't agree that using 4:2:0 source material would seriously alter the results. All of the encoders got the exact same input and are scored by comparing their encoded output to that input, so it's not clear to me why it would be a factor. Furthermore, SSIM only operates on the luma channel, so you can disregard my findings with respect to color accuracy anyway. But I also find this somewhat irrelevant, since decades of research have shown that chroma information isn't nearly as important as luma (hence the common downsampling for some cheap and easy compression).

Thanks for the heads up on the ISO command line version for JPEG-XR. I'll retest with that.

Thanks again for the comments.

Anonymous said...

I recommend trying this again using the PicTools SDK; unfortunately the JPEG-XR encoder built into Windows blows for some reason. I did the same test comparing images by eye using the .NET encoders. I'm a little puzzled because JPEG-XR looked better to me almost all the time, though there were other images it completely failed on.

http://www.accusoft.com/picphoto.htm (i used the evaluation version)

In my experience comparing same-size images by eye--encoded in JPEG, WebP, JPEG 2000 and JPEG-XR, without knowing which image was encoded with which codec, and comparing all of them with the original uncompressed image--I found that JPEG was almost always significantly the worst. JPEG-XR and JPEG 2000 were almost always better than JPEG, and WebP was almost always better than all of them, by a significant margin.

But I compared them by eye. I don't know if you'd agree with my assessment, but I'd be interested in seeing if JPEG-XR performed much better on the SSIM graph with the PicTools codec.

Anonymous said...

http://img84.imageshack.us/img84/6741/86742173.png

Adding on to my last comment, this image is from my own tests with the PicTools SDK and the Google WebP encoder. To me, JPEG-XR looks flat-out better than JPEG, and WebP looks flat-out better than everything else. I know this is a specific example, but the vast majority of the images and bitrates I used turned out looking like this.

phthoruth said...

If you retest, could you do a couple more images?
Something typical of images on the web.
To get high quality source, find a larger image and resize it down, which will reduce artifacts caused by compression and subsampling.

SSIM db is a good metric. Hopefully it becomes more commonly used.

4:2:0 is a current limitation in WebP. It's a good choice at low bitrates (the web), but 4:4:4 would be nice for high bitrates, such as photography.

webpconv has a binary posted.
http://code.google.com/p/webp/downloads/list

Anonymous said...

There's a new, improved encoder released now, along with version 0.1.2 of the sources.

check out:
http://code.google.com/speed/webp/download.html

Patrick said...

Regarding the following images mentioned in comments on this article...

http://img217.imageshack.us/img217/8403/anime16dctre.jpg

http://img259.imageshack.us/img259/5226/own17dctpre.jpg

... I think it's worth noting that they won't display in Opera 11.10 and Safari 5.0.5 on Windows 7. I'm not sure why.

kidjan said...

@patrick

Those images underwent additional compression using a tool called jpegcrush, which abuses JPEG's progressive coding to squeeze out a bit more compression. It's possible that the decoder implementation in Opera/Safari has some defect preventing it from working correctly with such progressive files, or I suppose it's even possible the files aren't "technically" conformant. Not sure which is the case.

Worth note is there are additional features in the JPEG standard (approved after the initial standard) that can greatly increase compression, like arithmetic encoding--but most decoders don't support those features. Too bad, really.

Anonymous said...

Your tests are basically nonsense. You might like to see some meaningful results at http://qualvisual.net .

kidjan said...

@Anonymous,

The link you provided uses the exact same metric I do (SSIM--see http://qualvisual.net/Moretest.php), so on what grounds my tests are "nonsense" is unclear to me. Or to anyone else, since apparently your standard of proof is a one-liner and a hyperlink to a site peddling seriously dubious stats.

Thomas Richter said...

Folks,

a lot of tests like this have been made within the JPEG committee, actually, and unless you are very careful, the results are near to useless. Just to give you a couple of hints at what went wrong here:

JPEG by default uses a quantizer table that is optimized for visual performance, not for PSNR performance, which means that indices like MS-SSIM rate JPEG better. JPEG 2000 implementations--at least the free ones you'll find--are PSNR-optimal, not visually optimal. It is not hard to optimize JPEG 2000 for visual performance, the result of which is that its SSIM becomes much better than JPEG's.

Similar effects go for XR: You can also tune it towards better visual performance rather than better peak-error (which is where it is usually better).

Then, on the index itself: MS-SSIM is certainly better than PSNR, but surely not the last answer to the question of objective quality indices. I personally prefer VDP if I have the time, which however takes longer to compute. It correlates better to human performance.

IOW, the measurements here are pretty much "apples to oranges".

kidjan said...

@Thomas,

I disagree with your assessment.

First, nobody (certainly not me) is claiming MS-SSIM is some sort of perfect substitution for mean opinion scores. I'm not even claiming it's better than VDP, although it is known to be better than PSNR. That said, it does correlate well with mean opinion scores, making it a good general purpose tool for assessing image quality regardless of encoder implementation.

Second, your claim that "JPEG is optimized for visual performance" is patently wrong for most encoder implementations, and I know this because A) the quantization tables are typically hard-coded in the encoder, when in reality they should be generated based on the image content, and B) the linear scaling methods used for quantization tables basically preclude them from ever being tuned for "visual performance." Most JPEG implementations aren't tuned worth beans (and if they are tuned, inevitably it's to PSNR), especially if you start scaling the matrix to change compression levels. The encoders used here--both .NET and libjpeg--use the default quantization tables, which are _not_ tuned for "visual performance" beyond a decent guess at what good values are.

Third, you can say the results are "apples to oranges," but a visual inspection of the output renders this observation incorrect. Granted, I didn't publish the output--I should--but the trends inferred from MS-SSIM were painfully obvious when visually inspecting the output. I don't care what the encoder is tuned for; when there's a 5+ dB difference in MS-SSIM scores for a given file size, it's pretty obvious which encoder is better.

I agree--somewhat--with your points about encoder authors optimizing for specific full-reference metrics, but in this particular scenario the observation simply isn't relevant given the results--again, a 5+ dB difference is _obvious_, and you'd be a fool to disregard it.

Charles said...

Do your original images have an alpha channel? JPEG XR may be storing the alpha channel losslessly, whereas JPEG will simply discard it. This may make a huge difference in your results.

kidjan said...

@Charles Not sure, but very good question. At some point I'd really like to retest with a better JPEG XR compressor, because whatever MSFT provided with the .NET framework was clearly just a crap piece of software. I almost regret publishing these results, since it's entirely possible JPEG XR is a decent standard being maligned by somebody's horrible code.

Unknown said...

@kidjan, just a heads-up that Matt Uyttendaele from Microsoft recently posted a JPEG XR encoder/decoder for Photoshop, as well as source code, on CodePlex.

Blog post: http://hdview.wordpress.com/2013/04/11/jpegxr-photoshop-plugin-and-source-code/

Plugins:
http://research.microsoft.com/en-us/um/redmond/groups/ivm/JPEGXR/

Source Code:
https://jxrlib.codeplex.com/

Although I'm out of my depth on your technical analysis, common sense tells me that if your JPEG XR results are coming out with larger file sizes or lower quality than the JPEG counterparts, then something is most definitely wrong with the testing process that produced these results.

Anonymous said...

It's unfortunate that you ran into API issues with JPEG XR. The analysis done here, on the open jxrlib library, seems to be much more favorable to JXR: http://hdview.wordpress.com/2013/05/30/jpegxr-updates/

They found the same thing you did with high quality settings in WebP. JXR seems to do well at both low and high quality settings.
