SPIR-V Compression: SMOL vs MARK

Two years ago I did a small utility to help with Vulkan (SPIR-V) shader compression: SMOL-V (see blog post or github repo).

It is used by Unity, and it looks like some non-Unity projects use it as well (if you use it, let me know! It’s always interesting to see where it ends up).

Then I remembered the github issue where SPIR-V compression was discussed. It mentioned that SPIRV-Tools was getting some sort of “compression codec” (see comments) and got closed as “done”, so I decided to check it out.

SPIRV-Tools compression: MARK-V

SPIRV-Tools repository, which is a collection of libraries and tools for processing SPIR-V shaders (validation, stripping, optimization, etc.) has a compressor/decompressor in there too, but it’s not advertised much. It’s not built by default; and requires passing a SPIRV_BUILD_COMPRESSION=ON option to CMake build.

The sources related to it are under source/comp and tools/comp folders; and compression is not part of the main interfaces under include/spirv-tools headers; you’d have to manually include source/comp/markv.h. The build also produces a command line executable spirv-markv that can do encoding or decoding.

The code is well commented in terms of “here’s what this small function does”, but I didn’t find any high level description of “the algorithm” or properties of the compression. I see that it does something with shader instructions; there’s some Huffman related things in there, and large tables that are seemingly auto-generated somehow.

Let’s give it a go!

Getting MARK-V to compile

In SMOL-V repository I have a little test application (see testmain.cpp) that takes a bunch of shaders, runs either SMOL-V or Spirv-Remapper on them, additionally compresses result with zlib/lz4/zstd and so on. “Let’s add MARK-V in there too” sounded like a natural thing to do. And since I refuse to deal with CMake in my hobby projects :), I thought I’d just add relevant MARK-V source files…

First “uh oh” sign: while the number of files under compression related folders (source/comp, tools/comp) is not high, that is 500 kilobytes of source code. Half a meg of source, Carl!

And then of course it needs a whole bunch of surrounding code from SPIRV-Tools to compile. So I copied everything that it needed to work. In total, 1.8MB of source code across 146 files.

After finding all the source files and setting up include paths for them, it compiled easily on both Windows (VS2017) and Mac (Xcode 9.4).

Pet peeve: I never understood why people don’t use file-relative include paths (like #include "../foo/bar/baz.h"), instead requiring the users of your library to set up additional include path compiler flags. As far as I can tell, relative include paths have no downsides, and require way less fiddling to both compile your library and use it.

Side issue: STL vector for input data

The main entry point for MARK-V decoding (this is what would happen on the device when loading shaders – so this is the performance critical part) is:

spv_result_t MarkvToSpirv(
    spv_const_context context, const std::vector<uint8_t>& markv,
    const MarkvCodecOptions& options, const MarkvModel& markv_model,
    MessageConsumer message_consumer, MarkvLogConsumer log_consumer,
    MarkvDebugConsumer debug_consumer, std::vector<uint32_t>* spirv);

Ok, I kind of get the need (or at least convenience) of using std::vector for output data; after all you are decompressing and writing out an expanding array. Not ideal, but at least there is some explanation.

But for input data – why?! One of const uint8_t* markv, size_t markv_size or a const uint8_t* markv_begin, const uint8_t* markv_end is just as convenient, and allows way more flexibility for the user at where the data is coming from. I might have loaded my data as memory-mapped files, which then literally is just a pointer to memory. Why would I have to copy that data into an additional STL vector just to use your library?
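To make the point concrete, here is a minimal sketch of a pointer + size interface. DecodeBytes is a hypothetical name (it is not part of MARK-V or SPIRV-Tools), and its body only checksums the input as a stand-in for real decoding; the point is just that one entry point can serve raw memory and std::vector callers alike, without copies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical entry point (not the actual MARK-V API): the input is just
// "pointer + size", so the caller decides where the bytes live.
// The body is a stand-in for real decoding: it only checksums the input.
uint32_t DecodeBytes(const uint8_t* data, size_t size)
{
    assert(data != nullptr || size == 0);
    uint32_t sum = 0;
    for (size_t i = 0; i < size; ++i)
        sum += data[i];
    return sum;
}

// Callers that already hold a std::vector lose nothing: a thin inline
// overload forwards to the pointer version without copying.
inline uint32_t DecodeBytes(const std::vector<uint8_t>& bytes)
{
    return DecodeBytes(bytes.data(), bytes.size());
}
```

With this shape, a memory-mapped file, a slice of a bigger buffer (DecodeBytes(ptr + offset, length)) or a plain vector all go through the same code path.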

Side issue: found bugs in “Max” compression

MARK-V has three compression models - “Lite”, “Mid” and “Max”. On some test shaders I had the “Max” one could not decompress successfully after compression, so I guess “some bugs are there somewhere”. Filed a bug report and excluded the “Max” model from further comparison :(

MARK-V vs SMOL-V

Size evaluation

Compression      No filter          SMOL-V             MARK-V Lite        MARK-V Mid
                 Size KB   Ratio    Size KB   Ratio    Size KB   Ratio    Size KB   Ratio
Uncompressed        4870  100.0%       1630   33.5%       1369   28.1%       1085   22.3%
zlib default        1213   24.9%        602   12.4%        411    8.5%        336    6.9%
LZ4HC default       1343   27.6%        606   12.5%        410    8.4%        334    6.9%
Zstd default         899   18.5%        446    9.1%        394    8.1%        329    6.8%
Zstd level 20        590   12.1%        348    7.1%        293    6.0%        257    5.3%

Two learnings from this:

  • MARK-V without additional compression on top (“Uncompressed” row) is not really competitive (~25%); just compressing shader data with Zstandard produces smaller result; or running through SMOL-V coupled with any other compression.
  • This suggests that MARK-V acts more like a “filter” (similar to SMOL-V or spirv-remap), that makes the data smaller, but also makes it more compressible. Coupled with additional compression, MARK-V produces pretty good results, e.g. the “Mid” model ends up compressing data to ~7% of original size. Nice!

Decompression performance

I checked how much time it takes to decode/decompress shaders (4870KB uncompressed size):

               Windows, AMD TR 1950X 3.4GHz    Mac, i9-8950HK 2.9GHz
MARK-V Lite    536.7ms     9.1MB/s             492.7ms     9.9MB/s
MARK-V Mid     759.1ms     6.4MB/s             691.1ms     7.0MB/s
SMOL-V           8.8ms   553.4MB/s              11.1ms   438.7MB/s

Now, I haven’t seriously looked at my SMOL-V decompression performance (e.g. Zstandard general decompression algorithm does ~1GB/s), but at ~500MB/s it’s perhaps “not terrible”.

I can’t quite say the same about MARK-V though; it gets under 10MB/s of decompression performance. That, I think, is “pretty bad”. I don’t know what it does there, but this low decompression speed is in “maybe I wouldn’t want to use this” territory.

Decompressor size

There is only one case where the decompressor code size does not matter: it’s if it comes pre-installed on the end hardware (as part of OS, runtimes, drivers, etc.). In all other cases, you have to ship decompressor inside your own application, i.e. statically or dynamically link to that code – so that, well, you can decompress the data you have compressed.

I evaluated decompressor code size by making a dynamic/shared library on a Mac (.dylib) with a single exported function that does a “decode these bytes please” work. I used -O2 -fvisibility=hidden -std=c++11 -fno-exceptions -fno-rtti compiler flags, and -shared -fPIC -lstdc++ -dead_strip -fvisibility=hidden linker flags.

  • SMOL-V decompressor .dylib size: 8.2 kilobytes.
  • MARK-V decompressor .dylib size (only with “Mid” model): 1853.2 kilobytes.

That’s right. 1.8 megabytes! At first I thought I did something wrong!

I looked at the size report via Bloaty, and yeah, in MARK-V decompressor it’s like: 570KB GetIdDescriptorHuffmanCodecs, 137KB GetOpcodeAndNumOperandsMarkovHuffmanCodec, 64KB GetNonIdWordHuffmanCodecs, 44KB kOpcodeTableEntries and then piles and piles of template instantiations that are smaller, but there’s lots of them.

In SMOL-V by comparison, it’s 2KB smolv::Decode, 1.3KB kSpirvOpData and the rest is misc stuff and/or dylib overhead.

Library compilation time

While this is not that important an aspect, it’s relevant to my current work role as a build engineer :)

Compiling MARK-V libraries with optimizations on (-O2) takes 102 seconds on my Mac (single threaded; obviously multi-threaded would be faster). It is close to two megabytes of source code after all; and there is one file (tools/comp/markv_model_shader.cpp) that takes 16 seconds to compile alone. I think that got CI agents into timeouts in SPIRV-Tools project, and that was the reason why MARK-V is not enabled by default in the builds :)

Compiling SMOL-V library takes 0.4 seconds in comparison.

Conclusion

Looking at compression ratio in isolation, MARK-V coupled with additional lossless compression looks good, but I don’t think I would recommend it due to other issues.

The decompressor executable size alone (almost 2MB!) means that in order for MARK-V to start to “make sense” compared to say SMOL-V, your total shader data size needs to be over 100 megabytes; only then additional compression from MARK-V offsets the massive decompressor size.

Sure, there are games with shaders that large, but then MARK-V is also quite slow at decompression – it would take over 10 seconds to decompress 100MB worth of shader data :(
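The break-even and decode-time figures above can be sanity-checked with a bit of arithmetic. This is a sketch, not the exact calculation behind the numbers in the post: the function names are made up, and pairing the “Zstd level 20” ratios (7.1% for SMOL-V, 5.3% for MARK-V Mid) with the .dylib sizes is an assumption on my part, picked because it lands near the 100 megabyte figure.

```cpp
#include <cassert>

// Data size (KB) above which the stronger codec's savings outweigh its
// larger decompressor. Ratios are compressed/original size, in 0..1.
double BreakEvenDataKB(double extraDecoderKB, double baselineRatio, double candidateRatio)
{
    assert(baselineRatio > candidateRatio); // candidate must compress better
    return extraDecoderKB / (baselineRatio - candidateRatio);
}

// Time (seconds) to decode a given amount of data at a given throughput.
double DecodeSeconds(double dataMB, double throughputMBps)
{
    assert(throughputMBps > 0.0);
    return dataMB / throughputMBps;
}
```

BreakEvenDataKB(1853.2 - 8.2, 0.071, 0.053) comes out at roughly 102500 KB, i.e. about 100 MB of shader data. DecodeSeconds(100.0, 9.9) is a bit over 10 seconds for the fastest MARK-V measurement, and DecodeSeconds(100.0, 6.4) is about 15.6 seconds for the slowest one.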

All my evaluation code is on mark-v branch in SMOL-V repository. At this point I’m not sure I’ll merge it to the main branch.

This is all.


Pathtracer 16: Burst SIMD Optimization

Introduction and index of this series is here.

When I originally played with the Unity Burst compiler in “Part 3: C#, Unity, Burst”, I just did the simplest possible “get C# working, get it working on Burst” thing and left it there. Later on in “Part 10: Update C#” I updated it to use Structure-of-Arrays data layout for scene objects, and that was about it. Let’s do something about this.

Meanwhile, I have switched from late-2013 MacBookPro to mid-2018 one, so the performance numbers on a “Mac” will be different from the ones in previous posts.

Update to latest Unity + Burst + Mathematics versions

First of all, let’s update the Unity version we use from some random 2018.1 beta to the latest stable 2018.2.13, and update Burst (to 0.2.4-preview.34) & Mathematics (to 0.0.12-preview.19) packages along the way. Mathematics renamed lengthSquared to lengthsq, and introduced a PI constant that clashed with our own one :) These trivial updates are in this commit.

Just that got performance on PC from 81.4 to 84.3 Mray/s, and on Mac from 31.5 to 36.5 Mray/s. I guess either Burst or Mathematics (or both) got some optimizations during this half a year, nice!

Add some “manual SIMD” to sphere intersection

Very similar to how in Part 8: SSE HitSpheres I made the C++ HitSpheres function do intersection testing of one ray against 4 spheres at once, we’ll do the same in our Unity C# Burst code.

The thought process and work done is extremely similar to the C++ side done in Part 8 and Part 9; basically:

  • Since data for our spheres is laid out nicely in SoA style arrays, we can easily load data for 4 of them at once.
  • Do all ray intersection math on these 4 spheres,
  • If any are hit, pick the closest one and calculate final hit position & normal.

HitSpheres function code gets to be extremely similar between C++ version and C# version. In fact the C# one is cleaner since float4, int4 and bool4 types in Mathematics package are way more complete SIMD wrappers than my toy manual implementations in the C++ version.
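For readers who have not seen Part 8, the “one ray against 4 spheres at once” idea can be sketched in portable C++ like this. It is not the code from the commit: SpheresSoA, Hit and HitSpheres4 are hypothetical names, the 4 lanes are written as plain loops rather than real SIMD types, the sphere count is assumed to be padded to a multiple of 4, and computing the final hit position & normal is left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical SoA sphere storage; count is assumed padded to a multiple of 4.
struct SpheresSoA
{
    const float* centerX;
    const float* centerY;
    const float* centerZ;
    const float* sqRadius; // radius squared
    size_t count;
};

struct Hit { int id; float t; }; // id == -1 means "nothing hit"

// Ray is origin (ox,oy,oz) + t * direction (dx,dy,dz); direction must be unit length.
Hit HitSpheres4(const SpheresSoA& s,
                float ox, float oy, float oz,
                float dx, float dy, float dz,
                float tMin, float tMax)
{
    assert(s.count % 4 == 0);
    Hit best = { -1, tMax };
    for (size_t base = 0; base < s.count; base += 4)
    {
        float t[4];
        bool hit[4];
        // Same math on 4 lanes; with float4/SSE each statement becomes one vector op.
        for (int lane = 0; lane < 4; ++lane)
        {
            const size_t i = base + lane;
            const float cox = s.centerX[i] - ox;
            const float coy = s.centerY[i] - oy;
            const float coz = s.centerZ[i] - oz;
            const float nb = cox * dx + coy * dy + coz * dz;
            const float c = cox * cox + coy * coy + coz * coz - s.sqRadius[i];
            const float discr = nb * nb - c;
            const float root = std::sqrt(discr > 0.0f ? discr : 0.0f);
            const float t0 = nb - root;
            const float t1 = nb + root;
            t[lane] = t0 > tMin ? t0 : t1; // prefer the nearer root
            hit[lane] = discr > 0.0f && t[lane] > tMin && t[lane] < best.t;
        }
        // Reduce the 4 lanes to the closest hit.
        for (int lane = 0; lane < 4; ++lane)
        {
            if (hit[lane] && t[lane] < best.t)
            {
                best.id = static_cast<int>(base + lane);
                best.t = t[lane];
            }
        }
    }
    return best;
}
```

The inner lane loop has no branches that depend on other lanes, which is what lets a SIMD version replace it with straight-line vector math and only do the “which lane is closest” work when something was actually hit.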

The full change commit is here.

Performance: PC from 84.3 to 133 Mray/s, and Mac from 35.5 to 60.0 Mray/s. Not bad!

Updated numbers for new Mac hardware

Implementation PC Mac
GPU 1854 246
C++, SSE+SoA HitSpheres 187 74
C#, Unity Burst, 4-wide HitSpheres 133 60
C++, SoA HitSpheres 100 36
C#, Unity Burst 82 36
C#, .NET Core 53.0 23.6
C#, mono -O=float32 --llvm w/ MONO_INLINELIMIT=100 22.0
C#, mono -O=float32 --llvm 18.9
C#, mono -O=float32 11.0
C#, mono 6.1
  • PC is AMD ThreadRipper 1950X (3.4GHz, 16c/16t - SMT disabled) with GeForce GTX 1080 Ti.
  • Mac is mid-2018 MacBookPro (Core i9-8950HK 2.9GHz, 6c/12t) with AMD Radeon Pro 560X.
  • Unity version 2018.2.13 with Burst 0.2.4-preview.34 and Mathematics 0.0.12-preview.19.
  • Mono version 5.12.
  • .NET Core version 2.1.302.

All code is on github at 16-burst-simd tag.


Random list of Demoscene Demos

I just did a “hey kids, let me tell you about demoscene” event at work, where I talked about and showed some demos I think were influential over the years, roughly sorted chronologically.

Here’s that list, in case you also want to see some demoscene things. There’s a whole bunch of excellent demo productions I did not show (due to time constraints); and I mostly focused on Windows/PC demos. A decent way of finding others is searching through “all time top” list at pouët.net.

I’m giving links to youtube, because let’s be realistic, no one’s gonna actually download and run the executables. Or if you would, then you most likely have already seen them anyway :)

Future Crew “Second Reality”, 1993, demo

Tim Clarke “Mars”, 1993, 6 kilobytes

Exceed “Heaven Seven”, 2000, 64 kilobytes

farbrausch “fr-08: .the .product”, 2000, 64 kilobytes

Alex Evans “Tom Thumb”, 2002, wild demo

TBC & Mainloop “Micropolis”, 2004, 4 kilobytes

mfx “Aether”, 2005, demo

Kewlers & mfx “1995”, 2006, demo

mfx “Deities”, 2006, demo

farbrausch “fr-041: debris”, 2007, 144 kilobytes

Fairlight & CNCD “Agenda Circling Forth”, 2010, demo

Fairlight & CNCD “Ziphead”, 2015, demo

Eos “Oscar’s Chair”, 2018, 4 kilobytes

Conspiracy “When Silence Dims The Stars Above”, 2018, 64 kilobytes


Pathtracer 15: Pause & Links

Sailing out to sea | A tale of woe is me
I forgot the name | Of where we’re heading
– Versus Them “Don’t Eat the Captain”

So! This whole series on pathtracing adventures started out without a clear goal or purpose. “I’ll just play around and see what happens” was pretty much it. Looks like I ran out of steam and will pause doing further work on it. Maybe sometime later I’ll pick it up again, who knows!

One nice thing about 2018 is that there’s a lot of interest in ray/path tracing again, and other people have been writing about various aspects of it. So here’s a collection of links I saved on the topic over past few months:

Thanks for the adventure so far, everyone!

Put the fork away | It’s not a sailor’s way
We are gentlemen | Don’t eat the captain


Iceland Vacation 2018

Hello! End of June & start of July we were traveling in Iceland, so here’s some photos and stuff.

I’ve heard that some folks somehow don’t know that Iceland is absolutely beautiful. How?! Here’s my attempt at helping the situation by dumping a whole bunch of photos into the series of tubes.

Planning

We’ve been to Iceland before; what we did differently this time was:

  • Almost 2x longer trip (11 days),
  • Our kids are 5 years older (15 and 9yo), which makes it easier! We are five years older too though :/
  • Six people in total, since now we also took my parents. This meant renting two cars.

Similar to last time, I used internets and google maps to scout for locations and do rough planning. It was basically “go around the whole country” (on the main Route 1), cutting in one place via the highland Route F35, and then a detour into Snæfellsnes peninsula.

Total driving distance ended up ~2600km (200-300km per day). That does not sound like a lot, but we did not end up having “lazy days”; there is a lot to see in Iceland, and every stop along the way is basically an hour or two. For example you might want to hike up the waterfall, or get down to some cliffs in the water, etc. The map on the right shows all the places we did end up stopping at. I had a dozen more marked up, but we skipped some.

I booked everything well in advance (4 months), either via Booking.com or Airbnb. Since we were a party of six, in some more remote places there were not that many choices actually. Having a camper or tents might be much cheaper and allow more freedom, at the expense of comfort.

Cost wise, some things (like housing) have visibly increased since 2013 when we were last there. Makes sense, since the number of tourists has increased as well; capitalism gonna capital. Total cost breakdown for us was: 33% housing, 23% flights, 20% car rent, 24% everything else (food, eating out, gas, guided trips, …).

Late June is basically “early summer” in Iceland. Most/all of the highland roads are already open. There can be quite a lot of rain; I was looking at the forecasts and it did not look very good. Luckily enough, we only got serious rain for like 3 days; most other days there was relatively little rain. Temperature was mostly in +8..+15°C range, often with a really cold wind. There were moments when I wished I’d taken gloves :)

Photo Impressions

Most of the photos are taken by my wife. Equipment: Canon EOS 70D with Canon 24-70mm f/2.8 L II and Sigma 8-16mm f/4.5-5.6. Some taken with iPhone SE.

Day 1, South (Selfoss to Kirkjubæjarklaustur)

The southern part is quite crowded with tourists; going up to Dyrhólaey/Vik there are plenty of sights, and it makes a good trip for a day. We also started the first day with “ok there’s a million things to see today!”.

First up, mostly waterfalls. Urriðafoss, Seljalandsfoss, a view into the infamous Eyjafjallajökull, and Skógafoss.

Fun fact! Unity codebase has a text = "Eyjafjallajökull-Pranckevičius"; line in one of the tests, that checks whether some thing deals with non-English characters. I think @lucasmeijer added that.

End of June is blooming time of Nootka Lupin; there are vast fields full of them. People go to take wedding photos and whatnot in there.

Next up, we can go to the tongue of Sólheimajökull glacier (this is a bit redundant; “jökull” already means “glacier”). I’ve never seen a glacier before, and the photos of course don’t do it justice. This is a tiny piece at the end of the glacier. Very impressive.

Dyrhólaey peninsula:

Dverghamrar basalt column formations, with Foss á Síðu waterfall in the distance (redundancy again, “foss” already means “waterfall”):

Day 2, South/East (Kirkjubæjarklaustur to Höfn)

Driving up to another glacier, Svínafellsjökull. Again, the scale is hard to comprehend; many glaciers in Iceland are 500 meters high, some going up to a kilometer. A kilometer of ice!

A short (but very bumpy) road to the side, and we are close to it:

Next up, Jökulsárlón glacial lake. Was a setting for a bunch of movies! The lake is just over a hundred years old, and is growing very fast, largely due to melting glaciers.

Right next to it there is so called “Diamond Beach”, where icebergs, after being flushed out into the sea and eroded by salt, come ashore as tiny pieces of ice. The sand is black of course, since it was originally pumice and volcanic ash.

Day 3, East (Höfn to Egilsstaðir)

Eastern side of Iceland is where there’s no tourist crowds, and no big-name attractions either. Even the main highway road becomes gravel for a dozen kilometers in one place :) Most of the Route 1 goes along the coastline that is full of fjords, which makes for a fairly long drive. There is a shortcut (route 939 aka Öxi) that lets you cut some 80km, but it’s gravel and very steep (here’s random youtube video showing it). I thought “let’s do the coastline instead, we’ll watch plenty of sea and cliffs”. Not so fast! Turns out, coastline can mean that there’s a literal cloud right on the road, and you basically don’t see anything. Oh well :)

There were some lighthouses (barely visible due to mist/fog/clouds), a nice waterfall (Sveinsstekksfoss), and also here’s a photo of our typical lunch:

We stayed in a lovely horse ranch, and also found an old car dump nearby.

Day 4, North/East (Egilsstaðir to Mývatn)

Most of the day was driving on Route 1 through Norður-Múlasýsla region. First you see towns and villages disappear, then farms disappear, and then even sheep disappear (whereas normally sheep are everywhere in Iceland). What’s left is a volcanic desert with basically a single road cutting through it.

There was a waterfall (Rjúkandi) near start of that trip, and lava fields towards the end, close to Dettifoss.

Here’s Dettifoss, which is 100m wide, 44m deep and other measurements as well (ref).

Nearby, the Krafla area with the Víti crater, Krafla power station and Hverir geothermal area with fumaroles and mudpots.

Lake Mývatn nearby has a flying mountain (not really, just low fog) and a lot of birds.

Day 5, North (Mývatn to Akureyri)

Mývatn to Akureyri is a very short drive, so we did a detour through Husavik towards Ásbyrgi canyon. Last time we were in Iceland, Husavik was lovely and Ásbyrgi was quite impressive. However this time, pretty much the whole day was heavy rain. Not much visibility, and not too pleasant to hike around and enjoy the sights. Oh well! Here’s Ásbyrgi and Goðafoss:

Akureyri has an excellent botanical garden; more photos from it at my wife’s blog.

Day 6, Highlands (Akureyri to Kerlingarfjöll)

This was where we took off the main highway and into the F35/Kjalvegur gravel road. I heard from a bunch of people the suggestion along the lines of “OMG you have to go along one of the highland roads”, and so that’s why we did it. F35 is the easiest of those; legally it requires a 4x4/AWD car but I think technically any car should be able to do it. Most other highland roads actually have river crossings; whereas F35 only has one or two small streams to cross. Most of the road is actually in very good condition (at least at start of July), with only a couple dozen kilometers that have enough stones and pits to make you go at 20-30km/h.

There is Hveravellir geothermal area near Langjökull:

We stayed at a place near Kerlingarfjöll:

And decided to hike towards a nearby rhyolite mountain area (Hveradalir). Apparently I must have misread something somewhere, since what I thought was 3km turned out to be 5km one way (mixup of miles vs kilometers in my head?), the path was steep, with blobs of snow along the way, really strong wind and a descending cloud. At some point we decided to declare ourselves losers and just turn back. Oh well :/

Turns out, you can just drive up to the same area via some mountain road. It’s steep and bumpy, and there was still tons of snow on the side, but the views up there were amazing. The wind almost blew us away though; maybe it’s good that we did not hike all the way.

Day 7, Part of Golden Circle (Kerlingarfjöll to Reykjavik)

“Golden Circle” is a marketing term for probably the most touristy route in Iceland. But parts of it did happen to be on our way, so we went straight from the highlands where there’s no one around, into “all the tourists in one spot” types of places like Gullfoss.

Next up, Strokkur geyser, again with a ton of tourists:

And we spent the evening just strolling around Reykjavik.

Day 8, Part of Golden Circle (around Reykjavik)

Þingvellir national park, most famous for being a place where you can actually see the rift between Eurasian and North American tectonic plates, and also for being a place of Alþingi, one of the oldest parliaments in the world.

Next up, Kerið crater. Similar to Krafla’s Víti, except with more tourists and you can get down to the lake itself.

Then we went to the Raufarhólshellir lava cave. Things I learned: “skylight” is not just a computer graphics term (also means places where underground caves have openings towards the ground); lava flow produces really intricate “bathtub ring” patterns; and complete darkness feels eerie.

Day 9, West (Reykjavik to Snæfellsnes)

Driving up to Snæfellsnes takes a good chunk of time, with generally nothing to see along the way (in relative terms of course; in many other countries these valleys and horizons would be amazing… but Iceland has too many more impressive sights). There are Gerðuberg basalt columns midway:

…but apart from that, not much. I was starting to think “ohh maybe this will be a low point of the trip”, and then! Rauðfeldsgjá gorge was very fun; you try to find your way across a water stream in a very narrow gorge, with huge chunks of snow right above you.

Just a couple minutes from there, Arnarstapi village has really nice cliffs at the water.

Five minutes from that, Hellnar village has even more impressive cliffs. I mean look at them! That layout and flow of the rocks should not exist! :)

And then! Djúpalónssandur beach with black sand and rock formations.

Near our sleeping place there’s Kirkjufell, which is featured in a ton of photos showing off wide-angle lenses :)

Day 10, West/South (Snæfellsnes to Keflavík)

Stykkishólmur town and random sights on the way back. Was an easy day without sensory overload :)

Day 11, Reykjanes Peninsula (around Keflavík)

Our flight back was in the evening, so we visited some places in Reykjanes near the airport. Gunnuhver mud pool:

Krísuvíkurberg cliffs and Dollan lava caves:

Krýsuvík geothermal area:

Kleifarvatn lake:

And the famous Bláa Lónið (Blue Lagoon), but we decided not to go inside (too many people, and didn’t feel the need either). There’s a power station right next to it, and some tractors doing cleaning. Much romance, wow :)

Next time?

I have no doubt that we’ll go to Iceland again (seriously, it’s amazing). One obvious thing would be going in the winter. So maybe that!