EXR: Lossless Compression

Posted on Aug 4, 2021

#rendering #performance

One thing led to another, and I happened to be looking at the various lossless compression options available in the OpenEXR image file format.

EXR has several lossless compression options, and most of the available material (e.g. “Technical Introduction to OpenEXR” and others) basically ends up saying: Zip compression is slow to write, but fast to read; whereas PIZ compression is faster to write, but slower to read than Zip. PIZ is the default one used by the library/API.

How “slow” is Zip to write, and how much “faster” is PIZ? I decided to figure that out :)

Test setup

Hardware: MacBookPro 16" (2019, Core i9 9980HK, 8 cores / 16 threads). I used the latest OpenEXR version (3.1.1), compiled with Apple Clang 12.0 in RelWithDebInfo configuration.

Everything was tested on a bunch of EXR images of various types: rendered frames, HDRI skyboxes, lightmaps, reflection probes, etc. All of them tend to be “not too small” – 18 files totaling 1057 MB of raw uncompressed (RGBA, 16-bit float) data.

What are we looking for?

As with any lossless compression, there are at least three factors to consider:

Compression ratio. The larger, the better (e.g. “4.0” ratio means it produces 4x smaller data).
Compression performance. How fast does it compress the data?
Decompression performance. How fast can the data be decompressed?

Which ones are more important than others depends, as always, on a lot of factors. For example:

If you’re going to write an EXR image once and read it many times (typical case: HDRI textures), then compression performance does not matter that much. On the other hand, if each written EXR image gets read just once or a few times (typical case: capturing rendered frames for later encoding into a movie file), then you would appreciate faster compression.
The slower your storage or transmission medium is, the more you care about compression ratio. Or to phrase it differently: the slower I/O is, the more CPU time you are willing to spend to reduce I/O data size.
Compression ratio can also matter when data size is costly. For example, modern SSDs might be fast, but their capacity might still be a limiting factor. Or a network transmission of files might be fast, but you’re paying for the bandwidth used.

There are other things to keep in mind about compression: memory usage, technical complexity of the compressor/decompressor, ability to randomly access parts of an image without decompressing everything else, and so on, but let’s not concern ourselves with those right now :)

Initial (bad) result

What do we have here?

These are two plots: compression ratio vs. compression performance, and compression ratio vs. decompression performance. In both cases, the best place on the chart is top right – the largest compression ratio, and the best performance.

For performance, I’m measuring it in MB/s, in terms of uncompressed data size. That is, if we have 1GB worth of raw image pixel data and processing it took half a second, that’s 2GB/s throughput (even if compressed data size might be different).
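The two chart axes can be written down as a pair of small functions. This is just a sketch of the definitions above: the function names are mine, and treating 1 MB as 1024*1024 bytes is an assumption (the post does not say which convention it uses).

```cpp
#include <cassert>
#include <cstdint>

// Compression ratio: raw size divided by compressed size; larger is better.
double compressionRatio(std::uint64_t rawBytes, std::uint64_t compressedBytes)
{
    assert(compressedBytes > 0);
    return static_cast<double>(rawBytes) / static_cast<double>(compressedBytes);
}

// Throughput in MB/s. It is always computed from the *uncompressed* size,
// no matter how small the compressed file ended up being.
// Assumption: 1 MB = 1024 * 1024 bytes.
double throughputMBps(std::uint64_t rawBytes, double seconds)
{
    assert(seconds > 0.0);
    const double megabytes = static_cast<double>(rawBytes) / (1024.0 * 1024.0);
    return megabytes / seconds;
}
```

Using the uncompressed size for both compression and decompression keeps the numbers comparable across compressors that produce different file sizes.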

The time it takes to write or read the file itself is included in the measurement. This does mean that the results are not only CPU dependent, but also storage (disk speed, filesystem speed) dependent. My test is on a 2019 MacBookPro, which has a “quite fast” SSD for today, and an average (not too fast, not too slow) filesystem. I’m flushing the OS file cache between writing and reading the file (via system("purge")) so that EXR file reading is closer to a “read a new file” scenario.

What we can see from the above is that:

Writing an uncompressed EXR goes at about 400 MB/s, reading at 1400 MB/s,
Zip and PIZ compression ratio is roughly the same (2.4x),
Compression and decompression performance is quite terrible. Why?

Turns out, the OpenEXR library is single-threaded by default. The file format itself is much better than the image formats of yore (e.g. PNG, which is completely single threaded, fully, always) – the EXR format in most cases splits up the whole image into smaller chunks that can be compressed and decompressed independently. For example, Zip compression does it on chunks of 16 pixel rows – this loses some of the compression ratio, but each 16-row image slice could be compressed & decompressed in parallel.

If you tell the library to use multiple threads, that is. By default it does not. So, one call to Imf::setGlobalThreadCount() later…

Threaded result

There, much better! (16 threads on this machine)

Compression ratio: Zip and PIZ EXR compression types both have very similar compression ratio, making the data 2.4x smaller.
Writing: If you want to write EXR files fast, you want PIZ. It’s faster than writing them uncompressed (400 -> 600 MB/s), and about 3x faster to write than Zip (200 -> 600 MB/s). Zip is about 2x slower to write than uncompressed.
Reading: However, if you mostly care about reading files, you want Zip instead – it’s about the same performance as uncompressed (~1600 MB/s), whereas PIZ reads at a lower 1200 MB/s.
RLE compression is fast both at writing and reading, but its compression ratio is much lower at 1.7x.
Zips compression is very similar to Zip; it’s slightly faster, but has a lower compression ratio. Internally, instead of compressing 16-pixel-row image chunks, it compresses each pixel row independently.
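The chunking difference between Zip and Zips can be illustrated with a small sketch. This is not OpenEXR's actual code, only the row arithmetic; RowChunk and splitIntoRowChunks are hypothetical names.

```cpp
#include <algorithm>
#include <vector>

// One independently compressible slice of the image (hypothetical type).
struct RowChunk
{
    int firstRow;
    int rowCount;
};

// Split an image of `imageHeight` rows into chunks of `rowsPerChunk` rows.
// Going by the description above, rowsPerChunk would be 16 for Zip and
// 1 for Zips. The last chunk may be shorter than the rest.
std::vector<RowChunk> splitIntoRowChunks(int imageHeight, int rowsPerChunk)
{
    std::vector<RowChunk> chunks;
    if (imageHeight <= 0 || rowsPerChunk <= 0)
        return chunks;
    for (int y = 0; y < imageHeight; y += rowsPerChunk)
        chunks.push_back({y, std::min(rowsPerChunk, imageHeight - y)});
    return chunks;
}
```

Each chunk can be handed to a different worker thread. Smaller chunks mean more independent pieces of work, but the compressor sees less data at a time, which fits with Zips having the lower compression ratio.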

Next up?

So that was with the OpenEXR library and format as-is. In the next post I’ll look at what could be done if, hypothetically, one were free to extend or modify the format just a tiny bit. Until then!