Tiled Forward Shading links

The main idea of my previous post was roughly this: in forward rendering, there’s no reason why we still have to use per-object light lists. We can apply roughly the same ideas as in tiled deferred shading.

It’s really nice to see that other people have thought about this before, or at about the same time; here are some links:

As Andrew Lauritzen points out in the comments of my previous post, claiming “but deferred will need super-fat G-buffers!” is an oversimplification. You could just as well store material indices plus the data needed to sample textures (UVs + derivatives); and going “deferred” gives you more choices in how you schedule your computations.

There’s no fundamental difference between “forward” and “deferred” these days. As soon as you have a Z-prepass you are already caching/deferring something, and from there it’s a whole spectrum of options for what to cache or “defer” for later computation, and how.

Ultimately, of course, the best approach depends on a million factors. The only lesson to take from this post is that “forward rendering does not have to use per-object light lists”.


2012 Theory for Forward Rendering

Good question in a tweet by @ivanassen:

So what is the 2012 theory on lights in a forward renderer?

Hard to answer that in 140 characters, so here goes a raw brain dump (warning: not checked in practice!).

Short answer

A modern forward renderer for DX11-class hardware would probably be something like AMD’s Leo demo.

They seem to be doing light culling in a compute shader, and the result is per-pixel / per-tile linked lists of lights. Then the scene is rendered normally in forward rendering, fetching the light lists and computing shading. The advantages are many: arbitrary shading models with many parameters that would be hard to store in a G-buffer; semitransparent objects; hardware MSAA support; much smaller memory requirements compared to a fat G-buffer layout.

The disadvantage, I guess, would be storing the linked lists: memory usage is potentially unbounded, though various schemes similar to Adaptive Transparency could probably be used to cap the maximum number of lights per pixel/tile.
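
To make that concrete, here’s a minimal C# sketch of the “capped list” idea: instead of per-pixel linked lists, each tile gets a fixed-size array of light indices plus a count, and the forward shader just loops over it. Everything here (the struct layout, the cap of 64, the Lambert-only shading, all the names) is my own illustration, not anything taken from the Leo demo; in a real renderer this data would live in structured buffers and the loop would run in a pixel shader.

```csharp
using System;

// Hypothetical CPU-side mirror of the data a culling pass would produce;
// on the GPU these would live in structured buffers / UAVs.
public struct PointLight
{
    public float X, Y, Z;   // view-space position
    public float Range;     // influence radius
    public float R, G, B;   // color
}

public class TileLightList
{
    public const int MaxLightsPerTile = 64;           // hard cap instead of an unbounded linked list
    public int Count;                                  // how many lights actually affect this tile
    public int[] Indices = new int[MaxLightsPerTile];  // indices into the global light buffer

    public void Add(int lightIndex)
    {
        // Past the cap we simply drop lights; a smarter scheme (a la Adaptive
        // Transparency) would merge or keep only the most important ones.
        if (Count < MaxLightsPerTile)
            Indices[Count++] = lightIndex;
    }
}

public static class ForwardShading
{
    // What the forward pixel shader would do: fetch the tile's list and
    // accumulate lighting. Simple Lambert diffuse, purely for illustration.
    public static float[] ShadePixel(TileLightList tile, PointLight[] allLights,
                                     float px, float py, float pz,   // surface position
                                     float nx, float ny, float nz)   // surface normal
    {
        float r = 0, g = 0, b = 0;
        for (int i = 0; i < tile.Count; ++i)
        {
            PointLight l = allLights[tile.Indices[i]];
            float lx = l.X - px, ly = l.Y - py, lz = l.Z - pz;
            float dist = (float)Math.Sqrt(lx * lx + ly * ly + lz * lz);
            if (dist > l.Range)
                continue;
            float ndotl = Math.Max(0.0f, (nx * lx + ny * ly + nz * lz) / Math.Max(dist, 1e-5f));
            float atten = 1.0f - dist / l.Range;  // crude falloff, for illustration only
            r += l.R * ndotl * atten;
            g += l.G * ndotl * atten;
            b += l.B * ndotl * atten;
        }
        return new[] { r, g, b };
    }
}
```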

Deferred == Caching

All the deferred lighting/shading approaches are essentially caching schemes. We cache some amount of surface information in screen space, in order to avoid fetching or computing the same information over and over again while applying lights one by one, as traditional forward rendering does.

Now, caching in screen space leads to disadvantages like “it’s really hard to do transparencies”, since with transparencies you no longer have one point in space mapping to one pixel on screen. There’s no reason why caching has to be done in screen space, however; lighting could just as well be computed in texture space (like some skin rendering techniques do, though for a different reason), in world space (voxels?), etc.

Does “modern” forward rendering still need caching?

Caching information was important because in DX9 / Shader Model 3 times it was hard to do forward rendering that could apply an almost arbitrary, variable number of lights, with good efficiency, in one pass. That led to shader combination explosion, inefficient multipass rendering, or both. But now we have DX11, compute shaders, structured buffers and unordered access views, so maybe we can actually do better?

Because at some point we will want BRDFs with more parameters than is viable to store in a G-buffer (side image: this is half of the parameters for one material). We will want many semitransparent objects. And then we’re back to square one: we cannot do this efficiently in a traditional “deferred” way where we cache N numbers per pixel.

AMD’s Leo demo goes in that direction. It seems to take the tiled deferred approach to light culling and apply it to forward rendering.

I imagine it doing something like:

  1. Z-prepass:

    1. Render a Z-prepass of opaque objects to fill the depth buffer.
    2. Store that away (copy into another depth buffer).
    3. Continue the Z-prepass with transparent objects, writing to depth.
    4. Now we have two Z buffers, and for any pixel we know the Z-extents of anything interesting in it (from the closest transparent object up to the closest opaque surface).
  2. Shadowmaps, as usual. Would need to keep all shadowmaps for all lights in memory, which can be a problem!

  3. Light culling, very similar to what you’d do in the tiled deferred case!

    1. Have all lights stored in a buffer: light types, positions/directions/ranges/angles, colors, etc.
    2. From the two depth buffers above, we can compute Z ranges per pixel/tile in order to do better light culling.
    3. Run a compute shader that does the light culling. This could be done per pixel or per small tile (e.g. 8x8). The result is a buffer (or lists) per pixel or tile, containing the lights that affect said pixel or tile; see the sketch after this list.
  4. Render objects in forward rendering:

    1. The Z-buffer is already pre-filled from step 1.1.
    2. Each shader would have to do an “apply all lights that affect this pixel/tile” computation, which involves fetching that arbitrary light information, looping over the lights, etc.
    3. Otherwise, each object is free to use as many shader parameters as it wants, or any BRDF it wants.
    4. Rendering order is like usual forward rendering: batch-friendly order for opaque objects (since Z is prefilled already), per-object or per-triangle back-to-front order for semitransparent objects.
  5. Profit!
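
To make step 3 a bit more concrete, here is a rough CPU-side C# reference of what the per-tile culling compute shader would compute, the way I imagine it: take the tile’s min/max view-space depth, build a conservative view-space box for the tile, and test each light’s sphere of influence against it. The tile size, the sphere-vs-box test and all names are my own assumptions for illustration (reusing the PointLight / TileLightList types from the earlier sketch); a real implementation would run one thread group per tile and might test against the tile’s frustum planes instead.

```csharp
using System;

// CPU-side reference of per-tile light culling (what the compute shader would do).
// Assumes a symmetric perspective projection and view-space light positions.
public static class TileCuller
{
    public const int TileSize = 8; // e.g. 8x8 pixels per tile

    public static TileLightList CullTile(
        int tileX, int tileY, int screenW, int screenH,
        float tanHalfFovY, float aspect,
        float tileMinZ, float tileMaxZ,        // per-tile Z range from the two depth buffers
        PointLight[] viewLights)
    {
        // Tile extents in normalized [-1, 1] screen coordinates.
        float x0 = 2.0f * (tileX * TileSize) / screenW - 1.0f;
        float x1 = 2.0f * Math.Min((tileX + 1) * TileSize, screenW) / screenW - 1.0f;
        float y0 = 2.0f * (tileY * TileSize) / screenH - 1.0f;
        float y1 = 2.0f * Math.Min((tileY + 1) * TileSize, screenH) / screenH - 1.0f;

        // Conservative view-space AABB of the tile over [tileMinZ, tileMaxZ].
        float sx = tanHalfFovY * aspect, sy = tanHalfFovY;
        float minX = Math.Min(x0 * sx * tileMinZ, x0 * sx * tileMaxZ);
        float maxX = Math.Max(x1 * sx * tileMinZ, x1 * sx * tileMaxZ);
        float minY = Math.Min(y0 * sy * tileMinZ, y0 * sy * tileMaxZ);
        float maxY = Math.Max(y1 * sy * tileMinZ, y1 * sy * tileMaxZ);

        var result = new TileLightList();
        for (int i = 0; i < viewLights.Length; ++i)
        {
            PointLight l = viewLights[i];
            // Sphere vs AABB: distance from the light to the closest point of the box.
            float dx = Math.Clamp(l.X, minX, maxX) - l.X;
            float dy = Math.Clamp(l.Y, minY, maxY) - l.Y;
            float dz = Math.Clamp(l.Z, tileMinZ, tileMaxZ) - l.Z;
            if (dx * dx + dy * dy + dz * dz <= l.Range * l.Range)
                result.Add(i); // light i affects this tile
        }
        return result;
    }
}
```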

Now, I have hand-waved over some potentially problematic details.

For example, “two depth buffers” is not robust in areas where there are no opaque objects; we’d need to track minimum and maximum depths of the semitransparent stuff, or accept worse light culling for those tiles. Likewise, copying the depth buffer might lose some hardware Hi-Z information, so in practice it could be better to track semitransparent depths using another approach (min/max blending of a float texture, etc.).
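
For what it’s worth, the “Z-extents per tile” input to the culling sketch above could be computed with a simple per-tile min/max reduction over the two depth buffers, something like the sketch below. Depths are assumed to be already linearized to view space, and all names are made up; note how a tile with no opaque surfaces ends up with a useless far bound, which is exactly the robustness issue mentioned.

```csharp
using System;

public static class TileDepthRange
{
    // Per-tile depth range: min over the "opaque + transparent" depth buffer,
    // max over the "opaque only" one. Assumes the tile lies fully on screen.
    public static void Compute(float[] opaqueDepth, float[] allDepth,
                               int width, int tileX, int tileY, int tileSize,
                               out float minZ, out float maxZ)
    {
        minZ = float.MaxValue;
        maxZ = 0.0f;
        for (int y = tileY * tileSize; y < (tileY + 1) * tileSize; ++y)
            for (int x = tileX * tileSize; x < (tileX + 1) * tileSize; ++x)
            {
                int i = y * width + x;
                minZ = Math.Min(minZ, allDepth[i]);    // closest surface of any kind
                maxZ = Math.Max(maxZ, opaqueDepth[i]); // farthest opaque surface that needs lighting
            }
        // If there are no opaque surfaces in the tile, maxZ is just the far plane
        // (or garbage), so culling degrades; see the caveat above.
    }
}
```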

The step 4.2 bit about “apply all lights” assumes there is some way to do that efficiently while supporting complicated things like each light having a different cookie/gobo texture, a different shadowmap, etc. Texture arrays could almost certainly be used here, but since this is just a brain dump without verification in practice, it’s hard to say how this would work.

Update: other papers came out describing almost the same idea, with actual implementations & measurements. Check them out here!


Prophets and duct-tapers or: useful programmer traits

I liked Pierre’s The Prophet Programmer post. Go read it now.

Now of course that post is a rant. It exaggerates. It paints everything in one-bit grayscale. There’s never one person completely like this “prophet programmer” and another like the idolized “best programmer… not afraid of anything!!1”.

But it does highlight at least this: some aspects of a programmer’s behavior are useful, and others are not.

Obsessing over the latest hype, “the proper ways”, or following books to the letter is not, by itself, useful. Sure, sometimes a dash of “proper ways” or recommendations is good, but the benefits are really, really tiny. Hence it’s not worth thinking or arguing much about.

Here are some actually useful programmer traits instead. I’m thinking of real, actual people I work with here, even if I’m not naming names.

He feels what needs to be done to get the solution, in the big picture. Sometimes these are unusual ideas that probably no one else is doing - because everyone has always seen the problem in the standard way. The solutions seem obvious once you see them, but require some sort of step function in thinking to get there. The zero-iteration way of hooking up touchscreen device input to test the game is to play the game on the PC, stream images to the device and stream inputs back. The most hassle-free asset pipeline is the one with no “export/import asset” step. Or, for a more famous outside example, tablets before and after the iPad. You can rarely, if ever, get to things like that by doing user surveys or improving on existing solutions; you need someone who can see through it all and find the actual problem you want to solve. This guy is worth gold.

She can cut things. “Perfection is achieved, not when there is nothing more to add, but when there is nothing left to cut away”, quoth Saint-Exupéry. To be good at doing anything you (both you and your team) need to focus, which means cutting things. Let go of bad ideas and blind alleys. If your justification for doing it is “but we already spent so much time on it”, just don’t - it will only get worse. Cut features that aren’t quite ready by the deadlines. Remove old things that aren’t useful anymore. Doing that can and will make some people upset; it’s really, really hard to postpone or even completely abandon a thing that someone put a lot of effort into. But it needs to be done; and you need her on the team to make these hard decisions.

That other guy is freaking fast. And not in the sense of “types tons of code real fast and then sometimes it works, and two weeks later someone else has to clean it up”. No - he’s cranking out good, solid, tested, working code at incredible speed. Got ten bugs? They are fixed by the next day. Got a new feature to do? Commits with everything implemented (and working!) are pushed in a few days. When he goes on vacation your burndown chart changes slope. How does he do it? I don’t know. But by all means, hold onto him!

The other girl can figure out any complex problem real fast. Be it a tricky bug, unexpected behavior, really weird interaction with other systems - others could be spending hours, if not days, trying to figure out what’s going on. She, on the other hand, checks just a handful of things and goes “ha! the problem’s right there”. As if applying binary search to the whole problem space, except to everyone else the space seems unsorted and they don’t even know what they’re looking for!

This dude can keep a ton of context in his head while doing anything. How will this feature interact with dozens or even hundreds of other features? He’s able to think about all of them, and the majority of corner cases, and get everything right in one go - something that would take someone else dozens of roundtrips between coding & QA. When estimating effort for new things, he can immediately list all the tricky work that will need to be done, whereas others would go “sounds easy” only to find out it’s a month of work.

She’s not satisfied with the status quo. No, this isn’t good enough, she says; let me show you where and how spectacularly it breaks. And it does not matter if everyone else is doing it this way; here’s why putting that stuff into a uniform grid isn’t good. A lot of the time you need this extra bump to snap out of your own “this is good enough, no one will care” thoughts.

He’s doing a lot of boring work to get others more productive. There’s a ton of boring work on even the most exciting projects, and someone has to do it. He’s often the unsung hero, quietly working on infrastructure, build times, fixing annoyances in the tools, processes and workflows; all just so that others can be better at doing exciting things. You could call him a janitor or a plumber if you wish, but any place gets rotten and broken real fast without those people.

…and the list could go on. Unlike obsessing over irrelevant details, these things make a difference. They make your team run circles around others; they help you solve hard problems, invent things, and move forward at enormous velocity.

You need people with those traits and attitudes.


Fast Mobile Shaders or, I did a talk at SIGGRAPH!

Finally, after many years of dreaming, I made it to SIGGRAPH! And not only that, I also did a 1.5-hour talk/course with ReJ. This was the first time Unity had a real presence at SIGGRAPH, and I hope we’ll be more active & visible next time around.

Here it is, 100+ slides with notes: Fast Mobile Shaders (17MB pdf). This isn’t strictly about shaders; there’s info about mobile GPU architectures, general performance, hidden surface removal and so on. Also, graphs with logarithmic scales; can’t go wrong with that!


Testing Graphics Code, 4 years later

Almost four years ago I wrote about how we test rendering code at Unity. Did it stand the test of time, and more importantly, the growth of the company from fewer than 10 people to more than 100?

I’m happy to say it did! That’s it, move on to read the rest of the internets.

The earlier post focused more on the hardware compatibility area (differences between platforms, GPUs, driver versions, driver bugs and their workarounds, etc.). In addition to that, we run regression tests on a bunch of actual Unity-made games. All that is good and works; instead, let’s talk about the tests the rendering team at Unity uses in its daily life.

Graphics Feature & Regression Testing

In the daily life of a graphics programmer, you care about two things related to testing:

1. Whether a new feature you are adding, more or less, works.
2. Whether something new you added or something you refactored broke or changed any existing features.

Now, “works” is a vague term. Definitions can range from equally vague

Works For Me!

to something like

It has been battle tested on thousands of use cases, hundreds of shipped games, dozens of platforms, thousands of platform configurations and within each and every one of them there’s not a single wrong pixel, not a single wasted memory byte and not a single wasted nanosecond! No kittehs were harmed either!

In an ideal world we’d only consider the latter as “works”; however, that’s quite hard to achieve.

So instead we settle for small “functional tests”, where each feature has a small scene set up to exercise said feature (very much like what was talked about in the previous post). It’s the graphics programmer’s responsibility to add tests like that for their stuff.

For example, Fog handling might be tested by a couple scenes like this:

Another example, tests for various corner cases of Deferred Lighting:

So that’s basic testing for “it works” that the graphics programmers themselves do. Beyond that, features are tested by QA and a large beta testing group, tried, profiled and optimized on real actual game projects and so on.

The good thing is, doing these basic tests also provides you with point 2 (did I break or change something?) automatically. If after your changes, all the graphics tests still pass, there’s a pretty good chance you did not break anything. Of course this testing is not exhaustive, but any time a regression is spotted by QA, beta testers or reported by users, you can add a new graphics test to check for that situation.

How do we actually do it?

We use TeamCity for the build/test farm. It has several build machines set up as graphics test agents (unlike most other build machines, they need an actual GPU, or an iOS device connected to them, or a console devkit, etc.) that run graphics test configurations for all branches automatically. Each branch has its graphics tests run daily, and branches with “high graphics code activity” (i.e. branches that the rendering team is actually working on) have them run more often. You can always initiate the tests manually by clicking a button, of course. What you want to see at any time is this:

The basic approach is the same as 4 years ago: a “game level” (“scene” in Unity speak) for each test runs for a defined number of frames, everything runs at a fixed timestep, and a screenshot is taken at the end of each frame. Each screenshot is compared with the “known good” image for that platform; any difference equals “FAIL”. On many platforms you have to allow a couple of wrong pixels, because many consumer GPUs are not fully deterministic, it seems.
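
The comparison itself is about as simple as it sounds; a minimal sketch of it (the names and the exact tolerance are made up, not our actual code) could be:

```csharp
using System.Drawing; // Bitmap / GetPixel: slow, but plenty fast for test-sized screenshots

public static class ScreenshotCompare
{
    // Returns true if the screenshot matches the known-good image, allowing a few
    // differing pixels for GPUs that are not fully deterministic.
    public static bool Matches(Bitmap expected, Bitmap actual, int maxWrongPixels = 4)
    {
        if (expected.Width != actual.Width || expected.Height != actual.Height)
            return false; // wrong size is always a failure

        int wrong = 0;
        for (int y = 0; y < expected.Height; ++y)
            for (int x = 0; x < expected.Width; ++x)
                if (expected.GetPixel(x, y) != actual.GetPixel(x, y) && ++wrong > maxWrongPixels)
                    return false;
        return true;
    }
}
```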

So you have this bunch of “this is the golden truth” images for all the tests:

And each platform automatically tested on TeamCity has its own set:

Since the “test controller” can run on a different device than the actual tests (the case for iOS, Xbox 360, etc.), the test executable opens a socket connection to transfer the screenshots. The test controller is a relatively simple C# application that listens on a socket, fetches the screenshots and compares them with the template ones. Its result is output that TeamCity can understand, along with “build artifacts” consisting of the failed tests (for each failed test: the expected image, the failed image, and a difference image with increased contrast).
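
A skeleton of that controller loop might look like the sketch below. The wire format (test count, then a name + PNG bytes per test) and every name in it are invented for illustration, and it reuses the ScreenshotCompare sketch from above; the ##teamcity service messages printed to stdout are the standard way for a process to report test results to TeamCity.

```csharp
using System;
using System.Drawing;
using System.IO;
using System.Net;
using System.Net.Sockets;

// Skeleton of the controller: accept a connection from the device running the tests,
// read (test name, screenshot) pairs, compare each against its template image, and
// report results to TeamCity via service messages on stdout.
public static class TestController
{
    public static void Run(int port, string templateDir)
    {
        var listener = new TcpListener(IPAddress.Any, port);
        listener.Start();
        using var client = listener.AcceptTcpClient();
        using var reader = new BinaryReader(client.GetStream());

        int testCount = reader.ReadInt32();
        for (int i = 0; i < testCount; ++i)
        {
            string testName = reader.ReadString();
            byte[] png = reader.ReadBytes(reader.ReadInt32());

            Console.WriteLine($"##teamcity[testStarted name='{testName}']");
            string templatePath = Path.Combine(templateDir, testName + ".png");
            bool ok = File.Exists(templatePath) &&
                      ScreenshotCompare.Matches(new Bitmap(templatePath),
                                                new Bitmap(new MemoryStream(png)));
            if (!ok)
            {
                // Keep the failed screenshot around as a build artifact.
                File.WriteAllBytes(testName + ".failed.png", png);
                Console.WriteLine($"##teamcity[testFailed name='{testName}' message='image mismatch']");
            }
            Console.WriteLine($"##teamcity[testFinished name='{testName}']");
        }
        listener.Stop();
    }
}
```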

That’s pretty much it! And of course, automated tests are nice and all, but that should not get too much in the way of actual programming manifesto.