Archive for 'work'

Implementing fixed function T&L in vertex shaders

Almost half a year ago I was wondering how to implement T&L in vertex shaders.

Well, finally I implemented it for upcoming Unity 2.6. I wrote some sort of a technical report here.

In short, I’m combining assembly fragments and doing simple temporary register allocation, which seems to work quite well. Performance is very similar to using fixed function (I know it’s implemented as vertex shaders internally by the runtime/driver) on several different cards I tried (Radeon HD 3xxx, GeForce 8xxx, Intel GMA 950).

What was unexpected: the most complex piece is not the vertex lighting! Most complexity is in how to route/generate texture coordinates and transform them. Huge combination explosion there.

Otherwise – I like! Here’s a link to the article again.

Unity 2.5 is out

Unity 2.5 is finally released. In summary:
Unity 2.5

Here’s what’s new. Here’s the download page.

My 11th Unity release since I joined 3+ years ago. This is quite a crazy release that involved an almost complete rewrite of the editor tools and lots of other juggling. It was not exactly a walk in the park, but it’s done now. Meet me at GDC in San Francisco next week and I’ll tell you the war stories (Unity booth is 5110 NH).

Here’s the obligatory source code commits graph:
2.5 svn commits
18 people involved in source code, 5315 commits, 18501 file changes. Of course, svn commits do not mean anything… I’m just fascinated by graphs and numbers.

Another Vista review (after 6 months of usage)

Ok, I don’t exactly like Windows Vista. But I just spent 6 months using Vista as my primary OS at work… because everyone else was using XP, and someone had to make sure everything works on Vista as well. So it was me.

In summary, Vista is not that bad.

Once you get used to changes in Explorer, different skin and so on – it’s actually usable. I think they have made some real improvements in the underlying technology; too bad they managed to “compensate” for all of that with inconsistencies and lack of polish in the user interface.

At this point it’s minor quirks in UI that annoy me, but apart from that, Vista is okay. Look:

Icon overlay blending
Who implemented blending of icon overlays and do they still have a job? No sir, that shield icon is not properly blended here!

Burn icon
Who thought it’s a good idea to make the Burn icon bright red? In 6 months, I never used it. Why is it the brightest thing in the whole Explorer window?

Up one folder
Try going one folder up without resorting to this drop down menu. Utilities is the current folder here. And no, there’s no keyboard shortcut for “go up” either (there was in XP, which was perfect).

Shutdown awesome
And of course, the awesome shutdown menu. The two buttons – never used them. What I always use is “Shut Down” from the menu. And let’s not even talk about all the choices in the menu (no, more choices is not always better).

So yeah. It’s not stellar, it has tons of small annoyances (and some large ones – try developing web plugins with UAC on…), but it’s usable. I might have gotten used to it by now, actually.

Fixed function lighting in vertex shader – how?

Sometime soon I’ll have to implement the fixed function lighting pipeline in vertex shaders. Why? Because mixing fixed function and vertex shaders in multiple passes does not guarantee identical transformation results, thus requiring depth bias or projection matrix tweaks, which leads to various artifacts that annoy people to hell.

I don’t really know why that happens, because it seems that most modern cards don’t have fixed function units, so internally they are running shaders anyway. DX9 runtime on Vista’s WDDM also seems to be handing only shaders to the driver internally. Still, for some reason somewhere the precision does not match…

How should such a task be approached?

My requirements are:

  • Should handle any possible state combination in D3D fixed function T&L.
  • D3D 9.0c, using vertex shader 2.0 is ok. For now I don’t care about OpenGL.
  • No HLSL at runtime. I don’t want to add a megabyte or more to Unity web player just for HLSL. DX9 shader assembly is ok, because we already have the assembler code.
  • Should run as fast as (or close to) the regular fixed function pipeline.

I looked at ATI’s FixedFuncShader sample. It’s an ubershader approach: one large (230 instructions or so) shader with static VS2.0 branching. It had some obvious places to optimize; I could get it down to 190 or so instructions, kill some rcp’s and halve the amount of constant storage.

Still, it did not handle some things in the D3D T&L or had some issues:

  • It assumes one input UV, one output UV and no texture matrices. This place in T&L gets quite convoluted – any input UVs or a texgen mode can be transformed by matrices of various sizes, and routed into any output UVs.
  • It was not using the full T&L lighting model. No biggie here.
  • I haven’t checked with NVShaderPerf or AMD ShaderAnalyzer yet, but last time I checked the static branch instruction was taking two clocks on some NV architecture. So ubershader approach does not come for free.

Another thing I’m considering is to combine the final shader(s) from assembly fragments, with some simple register allocation.

In T&L shader code, there’s only a limited set of could-be-redundant computations, mostly computing world space position, camera space normal, view vector and so on (those could be used by lighting, texgen or fog). Those computations can be explicitly put into separate fragments, and later fragments could just use their results.
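To make the "shared fragments" idea concrete, here is a minimal sketch of how the ordering could work. Everything in it is an assumption for illustration: the `Fragment` struct, the fragment names and the placeholder code strings are made up, and this is not the code that ended up in Unity. Each fragment lists the fragments it needs, and a requested fragment pulls in its prerequisites exactly once.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical fragment record; names and fields are illustrative only.
struct Fragment {
    std::vector<std::string> needs; // fragments whose results this one reads
    std::string code;               // placeholder for the assembly text
};

using FragmentLibrary = std::map<std::string, Fragment>;

// Schedules `name` after everything it needs; each fragment is emitted once.
void Schedule(const FragmentLibrary& lib, const std::string& name,
              std::set<std::string>& seen, std::vector<std::string>& order)
{
    assert(lib.count(name) == 1); // unknown fragment names are a bug
    if (!seen.insert(name).second)
        return; // already scheduled by an earlier request
    for (const std::string& dep : lib.at(name).needs)
        Schedule(lib, dep, seen, order);
    order.push_back(name);
}

std::vector<std::string> BuildOrder(const FragmentLibrary& lib,
                                    const std::vector<std::string>& wanted)
{
    std::set<std::string> seen;
    std::vector<std::string> order;
    for (const std::string& name : wanted)
        Schedule(lib, name, seen, order);
    return order;
}

// Made-up library: three shared computations and three users of them.
FragmentLibrary SampleLibrary()
{
    FragmentLibrary lib;
    lib["WorldPos"]   = Fragment{{}, "<world space position>"};
    lib["CamNormal"]  = Fragment{{}, "<camera space normal>"};
    lib["ViewVector"] = Fragment{{"WorldPos"}, "<view vector>"};
    lib["Lighting"]   = Fragment{{"CamNormal", "ViewVector"}, "<vertex lighting>"};
    lib["TexGen"]     = Fragment{{"CamNormal", "ViewVector"}, "<texgen>"};
    lib["Fog"]        = Fragment{{"WorldPos"}, "<fog>"};
    return lib;
}
```

Asking for Lighting, TexGen and Fog together yields six fragments: the three shared computations appear once each, ahead of the fragments that read them.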

What is left then is some register allocation. A shader assembly fragment could want some temporary registers for internal use (this is simple, just give it a bunch of unused registers), also want some registers as input (from previous fragments), and save some output in registers.

Again, I haven’t checked with shader performance tools, but I think, guess and hope that the drivers do additional register allocation, liveness analysis etc. when converting D3D shader bytecode into hardware format. This would mean that I can be quite sloppy with it, i.e. don’t have to implement some super smart allocation scheme.
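A rough sketch of the "sloppy" register allocation described above, under stated assumptions: the `$I`/`$O`/`$T` placeholder syntax, the `FragmentTemplate` struct and the sample instruction strings (including the input and constant register numbers) are all hypothetical and only there to show the bookkeeping. Scratch registers go back to the pool when a fragment ends; output registers are never released, which leans on the hope that the driver does the real liveness analysis.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Pool of temporary registers r0..r(count-1).
class TempPool {
public:
    explicit TempPool(int count) : used_(count, false) {}

    // Hands out the lowest free register; throws when none is left.
    int Alloc() {
        for (std::size_t i = 0; i < used_.size(); ++i) {
            if (!used_[i]) { used_[i] = true; return static_cast<int>(i); }
        }
        throw std::runtime_error("out of temporary registers");
    }

    void Free(int index) {
        assert(index >= 0 && static_cast<std::size_t>(index) < used_.size());
        used_[index] = false;
    }

private:
    std::vector<bool> used_;
};

// Hypothetical fragment: `code` names registers through placeholders
// $I0, $I1, ... (inputs), $O (output) and $T0, $T1, ... (scratch).
struct FragmentTemplate {
    std::vector<std::string> inputs; // values made by earlier fragments
    std::string output;              // value this fragment makes
    int scratchCount;                // temporaries used only inside it
    std::string code;
};

std::string ReplaceAll(std::string text, const std::string& from,
                       const std::string& to)
{
    for (std::size_t pos = text.find(from); pos != std::string::npos;
         pos = text.find(from, pos + to.size()))
        text.replace(pos, from.size(), to);
    return text;
}

std::string Reg(int index) { return "r" + std::to_string(index); }

// Expands one fragment. `live` maps value names to the registers holding them.
std::string Emit(const FragmentTemplate& f, TempPool& pool,
                 std::map<std::string, int>& live)
{
    std::string code = f.code;
    // Highest index first, so "$I1" never clobbers part of "$I10".
    for (std::size_t i = f.inputs.size(); i-- > 0;)
        code = ReplaceAll(code, "$I" + std::to_string(i), Reg(live.at(f.inputs[i])));

    const int out = pool.Alloc(); // stays allocated for later fragments
    live[f.output] = out;
    code = ReplaceAll(code, "$O", Reg(out));

    std::vector<int> scratch;
    for (int i = 0; i < f.scratchCount; ++i)
        scratch.push_back(pool.Alloc());
    for (std::size_t i = scratch.size(); i-- > 0;)
        code = ReplaceAll(code, "$T" + std::to_string(i), Reg(scratch[i]));
    for (int r : scratch)
        pool.Free(r); // scratch is reusable as soon as the fragment ends
    return code;
}

// Made-up sample fragments; the instruction text is a placeholder.
FragmentTemplate NormalFragment() {
    return FragmentTemplate{{}, "camNormal", 1,
                            "m3x3 $T0.xyz, v1, c4\nnrm $O.xyz, $T0\n"};
}
FragmentTemplate DiffuseFragment() {
    return FragmentTemplate{{"camNormal"}, "diffuse", 0, "dp3 $O.x, $I0, c8\n"};
}
```

Emitting `NormalFragment()` and then `DiffuseFragment()` from one pool puts the normal in r0 and lets the second fragment reuse r1, which the first one only needed as scratch.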

I wrote some experimental code for the shader assembly combiner and so far it looks like a reasonable approach (and not too hard either).

Does that make sense? Or did everyone solve those problems eons ago already?

Edit: half a year later, I wrote a technical report on how I implemented all this: http://aras-p.info/texts/VertexShaderTnL.html

Achievement of the week: MakeVistaDWMHappyDance

This was the function that I added:

void GUIView::MakeVistaDWMHappyDance()
{
    // Looks like Vista has some bug in DWM. Whenever we maximize or dock
    // a view, we must do something magic, otherwise
    // white stuff appears in place of the view.
    // See http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=4208117&SiteID=1

    bool earlierThanVista = systeminfo::GetOperatingSystemNumeric() < 600;
    if( earlierThanVista )
        return;

    // What seems to work is drawing one pixel via GDI.
    // We draw it at (1,1) with usual background color.
    const int grayColor = static_cast<int>(0.61f * 255.0f); // explicit float-to-int conversion
    PAINTSTRUCT ps;
    BeginPaint(m_View, &ps);
    SetPixel(ps.hdc, 1, 1, RGB(grayColor,grayColor,grayColor));
    EndPaint(m_View, &ps);
}

I know. Reading from screen when Aero is on is slow, bad and wrong. But then, what do you do? It’s better than users staring at an all-white window just because Vista decided to draw it white, no matter what you think you’re drawing into it.

…still, MakeVistaDWMHappyDance is not nearly as cool as

internal interface ICanHazCustomMenu { … }

that Nicholas added a while ago.

Cool tech vs. boring details

Some of the stuff I’ve been working on last week:

  • Fixed import progress bar for movies with no audio
  • Fixed first context menu click not working on Windows
  • Eye dropper backend on Windows
  • Export Package actually works on Windows
  • Compare Binary works on Windows
  • Add checkbox to project wizard to always open it on startup
  • F1 in bundled text editor goes to scripting docs for current word
  • Fixed q/w/e/r keys in password fields and text areas toggling active Tool on Windows
  • Fixed panes not repainting on Windows after some change is done via context menu on them
  • …and so on.

Boring tiny little details.

This probably best summarizes where the lion’s share of time goes when developing anything. I’m not working on some cool spherical harmonics lightmap compression. Or on cunning ways to encode shadow map information for better filtering. Or on using CUDA to compute something interesting.

In other words, I’m not working on cool technology. Instead I’m adding missing menu items. Fixing obscure corner cases. Fighting inconsistencies in operating system APIs. Spotting misplaced pixels. Adding missing keyboard shortcuts.

Nothing interesting to blog about!

But still, methinks the difference between software that is merely “good” and software that is “great” is in the details. And only in the details.

I’ll just take care of tons more details. Maybe it will result in something good.

Crunchtime!

A few weeks ago it was all calm in the source control. Now it’s crunchtime!

I’m the master of svn deception. I do tons of useless commits just so that the stats look good. Yeah!

…ok, back to work.

The awesome support we do

Yesterday’s experience catching up with Unity forums, as I remember it:

Take a quick look at zillions of new posts.

Answer about five questions with “what’s the value of your camera’s near plane?”.

There should be some way to automate all of this. For every 20th question, reply with “increase your near plane!”, or something.

Unite 2008

Spent last week at our conference, Unite 2008. Lots of people, lots of stuff and goodness, tired as hell, but almost recovered already.

We showed a glimpse of Unity editor for Windows at the keynote, so it is public now – yes, we are working on Windows toolchain. About time! This is the major area I’m spending time on these days – Windows, Windows, Windows. Learning WinAPI as I cruise along :) Before Unity 2.1 I spent months fixing tons of small issues, now I’m spending months doing tons of small Windows related things. Someday I’ll get back to doing tons of small things on the rendering side.

Here’s a couple of random photos that I borrowed from Mantas:


Keynote in front of a Sentinel from The Matrix.


Presenters talking.


People listening!


I don’t know that guy in the center. Probably some stupid outsider. Really!

Implicit to-pointer operators must die!

For the sake of the nation,
this operator must die!

Seriously. Suppose there is some class, let’s say ColorRGBAf, that has four floats inside. Now, someone at some point decided to add these operators to it:

operator float* () { /**/ }
operator const float* () const { /**/ }

Probably because it’s easier to pass color to OpenGL this way, or something like that.

This is evil. Like, really evil. Especially if that class did not have comparison operators defined, and some totally unrelated code four years later does:

if (color != oldColor) { /* … */ }

Ouch! Sounds like someone will spend four hours debugging something that looks like an event routing issue that only happens on Windows and only with optimizations on (yes, I just did that…).

What happens here? The compiler takes pointers to two colors and compares the pointers. If for some reason both colors are temporary objects, then it can even happen that both get folded into the same variable/register/whatnot. The pointers are the same. Ouch!
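Here is a small reproduction of the effect, as a sketch. The `ColorRGBAf` below is a hypothetical stand-in with the same two conversion operators, not the real class, and `LooksChanged` and `PitfallDemo` are made-up names. There is no `operator!=` anywhere, yet the comparison compiles, because both operands quietly convert to pointers.

```cpp
#include <cassert>

// Hypothetical stand-in for the class described above.
struct ColorRGBAf {
    float r, g, b, a;
    operator float* () { return &r; }
    operator const float* () const { return &r; }
};

// No operator!= exists for ColorRGBAf, yet this compiles: both operands
// convert to float* and the built-in pointer comparison is used.
bool LooksChanged(ColorRGBAf& color, ColorRGBAf& oldColor)
{
    return color != oldColor;
}

void PitfallDemo()
{
    ColorRGBAf red  = {1.0f, 0.0f, 0.0f, 1.0f};
    ColorRGBAf same = {1.0f, 0.0f, 0.0f, 1.0f};
    assert(LooksChanged(red, same)); // equal values, different addresses
    assert(!LooksChanged(red, red)); // same object, same address
}
```

With two named objects the result is at least stable: they always look "changed", whatever their values. The temporaries case from the paragraph above is nastier, because there the outcome depends on what the optimizer does with storage.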

Implicit “nice” operators are just disguised evil. Remove that operator, add something like GetPointer() to the class if someone really wants to use that, and better yet, make the comparison operators private and without implementations. Yes. Much better.
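A minimal sketch of what that could look like, with assumptions spelled out: this `ColorRGBAf` is again a made-up stand-in, `= delete` is the C++11 way of spelling "private and without implementation", and `SameComponents` is a hypothetical helper for code that really does want to compare values.

```cpp
#include <cassert>

class ColorRGBAf {
public:
    ColorRGBAf(float r, float g, float b, float a) : v_{r, g, b, a} {}

    // Explicit accessor replaces the implicit conversion operators.
    float* GetPointer() { return v_; }
    const float* GetPointer() const { return v_; }

    // C++11 spelling of "private and without implementation":
    // any accidental comparison is now a compile error.
    bool operator==(const ColorRGBAf&) const = delete;
    bool operator!=(const ColorRGBAf&) const = delete;

private:
    float v_[4];
};

// Hypothetical helper for callers that really want a value comparison.
bool SameComponents(const ColorRGBAf& x, const ColorRGBAf& y)
{
    const float* px = x.GetPointer();
    const float* py = y.GetPointer();
    for (int i = 0; i < 4; ++i)
        if (px[i] != py[i])
            return false;
    return true;
}

void FixDemo()
{
    ColorRGBAf red(1.0f, 0.0f, 0.0f, 1.0f);
    ColorRGBAf alsoRed(1.0f, 0.0f, 0.0f, 1.0f);
    assert(SameComponents(red, alsoRed));
    // if (red != alsoRed) {}  // would no longer compile
}
```

Passing the color to an API that wants a raw float pointer now takes an explicit `color.GetPointer()`, which is a small price for never comparing addresses by accident again.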