Surface Shaders, one year later

Over a year ago I had a thought that “Shaders must die” (part 1, part 2, part 3).

And what do you know - turns out we’re trying to pull this off in the upcoming Unity 3. We call this “Surface Shaders”, because I have a suspicion that “shaders must die” as a feature name wouldn’t have flown very far.

Idea

The main idea is that 90% of the time I just want to declare surface properties. This is what I want to say:

Hey, albedo comes from this texture mixed with this texture, and normal comes from this normal map. Use Blinn-Phong lighting model please, and don’t bother me again!

With the above, I don’t have to care whether this will be used with forward or deferred rendering, how various light types will be handled, how many lights per pass a forward renderer will do, or how indirect illumination SH probes will come in, etc. I’m not interested in all that! These dirty bits are the job of rendering programmers; just make it work, dammit!

This is not a new idea. Most graphical shader editors that make sense do not have “pixel color” as the final output node; instead they have some node that basically describes surface parameters (diffuse, specularity, normal, …), and all the lighting code is usually not expressed in the shader graph itself. OpenShadingLanguage is a similar idea as well (but because it’s targeted at offline rendering for movies, it’s much richer & more complex).

Example

Here’s a simple - but full & complete - Unity 3.0 shader that does diffuse lighting with a texture & a normal map.

Shader "Example/Diffuse Bump" {
  Properties {
    _MainTex ("Texture", 2D) = "white" {}
    _BumpMap ("Bumpmap", 2D) = "bump" {}
  }
  SubShader {
    Tags { "RenderType" = "Opaque" }
    CGPROGRAM
    #pragma surface surf Lambert
    struct Input {
      float2 uv_MainTex;
      float2 uv_BumpMap;
    };
    sampler2D _MainTex;
    sampler2D _BumpMap;
    void surf (Input IN, inout SurfaceOutput o) {
      o.Albedo = tex2D (_MainTex, IN.uv_MainTex).rgb;
      o.Normal = UnpackNormal (tex2D (_BumpMap, IN.uv_BumpMap));
    }
    ENDCG
  } 
  Fallback "Diffuse"
}

Given a pretty model & textures, it can produce pretty pictures! How cool is that?

I grayed out bits that are not really interesting (declaration of serialized shader properties & their UI names, shader fallback for older machines etc.). What’s left is Cg/HLSL code, which is then augmented by tons of auto-generated code that deals with lighting & whatnot.

This surface shader dissected into pieces:

  • #pragma surface surf Lambert: this is a surface shader with the main function “surf” and a Lambert lighting model. Lambert is one of the predefined lighting models, but you can write your own.

  • struct Input: input data for the surface shader. This can have various predefined inputs that will be computed per-vertex & passed into your surface function per-pixel. In this case, it’s two texture coordinates.

  • surf function: the actual surface shader code. It takes Input and writes into SurfaceOutput (a predefined structure). It is possible to write into custom structures, provided you use lighting models that operate on those structures (a sketch of that follows right below). The actual code just writes Albedo and Normal to the output.
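
As a hedged illustration of the custom-structure case - all names below are made up for illustration; only the member list and the Lighting* naming convention mirror the built-in SurfaceOutput:

struct CustomSurfaceOutput {
    half3 Albedo;
    half3 Normal;
    half3 Emission;
    half Specular;
    half Gloss;
    half Alpha;
    half Translucency; // custom term: light bleeding through from behind
};
half4 LightingCustomTranslucent (CustomSurfaceOutput s, half3 lightDir, half atten) {
    half front = max (0, dot (s.Normal, lightDir));
    half back = max (0, dot (-s.Normal, lightDir)) * s.Translucency;
    half4 c;
    c.rgb = s.Albedo * _LightColor0.rgb * ((front + back) * atten * 2);
    c.a = s.Alpha;
    return c;
}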

What is generated

Unity’s “surface shader code generator” would take this, generate actual vertex & pixel shaders, and compile them to various target platforms. With default settings in Unity 3.0, it would make this shader support:

  • Forward renderer and Deferred Lighting (Light Pre-Pass) renderer.

  • Objects with precomputed lightmaps and without.

  • Directional, Point and Spot lights; with projected light cookies or without; with shadowmaps or without. Well ok, this is only for the forward renderer, because in Light Pre-Pass the lighting happens elsewhere.

  • For the Forward renderer, it would compile in support for lights computed per-vertex and spherical harmonics lights computed per-object. It would also generate an extra additive blended pass, if needed, for the case when additional per-pixel lights have to be rendered in separate passes.

  • For the Light Pre-Pass renderer, it would generate a base pass that outputs normals & specular power, and a final pass that combines albedo with lighting, adds in any lightmaps or emissive lighting and so on.

  • It can optionally generate a shadow caster rendering pass (needed if custom vertex position modifiers are used for vertex shader based animation; or some complex alpha-test effects are done).

For example, here’s code that would be compiled for a forward-rendered base pass with one directional light, 4 per-vertex point lights, 3rd order SH lights; optional lightmaps (I suggest just scrolling down):

#pragma vertex vert_surf
#pragma fragment frag_surf
#pragma fragmentoption ARB_fog_exp2
#pragma fragmentoption ARB_precision_hint_fastest
#pragma multi_compile_fwdbase
#include "HLSLSupport.cginc"
#include "UnityCG.cginc"
#include "Lighting.cginc"
#include "AutoLight.cginc"
struct Input {
	float2 uv_MainTex : TEXCOORD0;
};
sampler2D _MainTex;
sampler2D _BumpMap;
void surf (Input IN, inout SurfaceOutput o)
{
	o.Albedo = tex2D (_MainTex, IN.uv_MainTex).rgb;
	o.Normal = UnpackNormal (tex2D (_BumpMap, IN.uv_MainTex));
}
struct v2f_surf {
  V2F_POS_FOG;
  float2 hip_pack0 : TEXCOORD0;
  #ifndef LIGHTMAP_OFF
  float2 hip_lmap : TEXCOORD1;
  #else
  float3 lightDir : TEXCOORD1;
  float3 vlight : TEXCOORD2;
  #endif
  LIGHTING_COORDS(3,4)
};
#ifndef LIGHTMAP_OFF
float4 unity_LightmapST;
#endif
float4 _MainTex_ST;
v2f_surf vert_surf (appdata_full v) {
  v2f_surf o;
  PositionFog( v.vertex, o.pos, o.fog );
  o.hip_pack0.xy = TRANSFORM_TEX(v.texcoord, _MainTex);
  #ifndef LIGHTMAP_OFF
  o.hip_lmap.xy = v.texcoord1.xy * unity_LightmapST.xy + unity_LightmapST.zw;
  #endif
  float3 worldN = mul((float3x3)_Object2World, SCALED_NORMAL);
  TANGENT_SPACE_ROTATION;
  #ifdef LIGHTMAP_OFF
  o.lightDir = mul (rotation, ObjSpaceLightDir(v.vertex));
  #endif
  #ifdef LIGHTMAP_OFF
  float3 shlight = ShadeSH9 (float4(worldN,1.0));
  o.vlight = shlight;
  #ifdef VERTEXLIGHT_ON
  float3 worldPos = mul(_Object2World, v.vertex).xyz;
  o.vlight += Shade4PointLights (
    unity_4LightPosX0, unity_4LightPosY0, unity_4LightPosZ0,
    unity_LightColor0, unity_LightColor1, unity_LightColor2, unity_LightColor3,
    unity_4LightAtten0, worldPos, worldN );
  #endif // VERTEXLIGHT_ON
  #endif // LIGHTMAP_OFF
  TRANSFER_VERTEX_TO_FRAGMENT(o);
  return o;
}
#ifndef LIGHTMAP_OFF
sampler2D unity_Lightmap;
#endif
half4 frag_surf (v2f_surf IN) : COLOR {
  Input surfIN;
  surfIN.uv_MainTex = IN.hip_pack0.xy;
  SurfaceOutput o;
  o.Albedo = 0.0;
  o.Emission = 0.0;
  o.Specular = 0.0;
  o.Alpha = 0.0;
  o.Gloss = 0.0;
  surf (surfIN, o);
  half atten = LIGHT_ATTENUATION(IN);
  half4 c;
  #ifdef LIGHTMAP_OFF
  c = LightingLambert (o, IN.lightDir, atten);
  c.rgb += o.Albedo * IN.vlight;
  #else // LIGHTMAP_OFF
  half3 lmFull = DecodeLightmap (tex2D(unity_Lightmap, IN.hip_lmap.xy));
  #ifdef SHADOWS_SCREEN
  c.rgb = o.Albedo * min(lmFull, atten*2);
  #else
  c.rgb = o.Albedo * lmFull;
  #endif
  c.a = o.Alpha;
  #endif // LIGHTMAP_OFF
  return c;
}

Of those 90 lines of code, 10 are your original surface shader code; the remaining 80 would pretty much have to be written by hand in Unity 2.x days (well ok, somewhat less code, because 2.x had fewer rendering features). But wait, that was only the base pass of the forward renderer! The generator also produces code for the additive pass, the deferred base pass, the deferred final pass, optionally the shadow caster pass and so on.

So this should be an easier way to write lit shaders (it is for me, at least). I hope it will also increase the number of Unity users who can write shaders at least threefold (i.e. to 30, up from 10!). And it should be more future proof, accommodating whatever changes to the lighting pipeline we’ll do in Unity next.

Predefined Input values

The Input structure can contain texture coordinates and some predefined values, for example the view direction, world space position, world space reflection vector and so on. Code to compute them is only generated if they are actually used. For example, if you use the world space reflection vector to do some cubemap reflections (as an emissive term) in your surface shader, then in the Light Pre-Pass base pass the reflection vector will not be computed (that pass does not output emission, so by extension it does not need the reflection vector; there’s a sketch of this right after the rim lighting example below).

As a small example, here’s the shader above extended to do simple rim lighting:

#pragma surface surf Lambert
struct Input {
    float2 uv_MainTex;
    float2 uv_BumpMap;
    float3 viewDir;
};
sampler2D _MainTex;
sampler2D _BumpMap;
float4 _RimColor;
float _RimPower;
void surf (Input IN, inout SurfaceOutput o) {
    o.Albedo = tex2D (_MainTex, IN.uv_MainTex).rgb;
    o.Normal = UnpackNormal (tex2D (_BumpMap, IN.uv_BumpMap));
    half rim =
        1.0 - saturate(dot (normalize(IN.viewDir), o.Normal));
    o.Emission = _RimColor.rgb * pow (rim, _RimPower);
}
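
And here’s a sketch of the cubemap-reflections-as-emission case mentioned above (hedged: _Cube is an assumed cubemap property; worldRefl is one of the predefined Input values):

#pragma surface surf Lambert
struct Input {
    float2 uv_MainTex;
    float3 worldRefl; // world space reflection vector; only computed when actually used
};
sampler2D _MainTex;
samplerCUBE _Cube;
void surf (Input IN, inout SurfaceOutput o) {
    o.Albedo = tex2D (_MainTex, IN.uv_MainTex).rgb;
    // reflection as an emissive term; a pass that does not output emission
    // (like the Light Pre-Pass base pass) will skip computing worldRefl
    o.Emission = texCUBE (_Cube, IN.worldRefl).rgb;
}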

Vertex shader modifiers

It is possible to specify a custom “vertex modifier” function that will be called at the start of the generated vertex shader, to modify (or generate) per-vertex data. You know, vertex shader based tree wind animation, grass billboard extrusion and so on. It can also fill in any non-predefined values of the Input structure.

My favorite vertex modifier? Moving vertices along their normals.
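
A minimal sketch of that, assuming the vertex:vert pragma option (_Amount here is a made-up property):

#pragma surface surf Lambert vertex:vert
struct Input {
    float2 uv_MainTex;
};
float _Amount;
// called at the start of the generated vertex shader;
// moves each vertex along its normal
void vert (inout appdata_full v) {
    v.vertex.xyz += v.normal * _Amount;
}
sampler2D _MainTex;
void surf (Input IN, inout SurfaceOutput o) {
    o.Albedo = tex2D (_MainTex, IN.uv_MainTex).rgb;
}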

Custom Lighting Models

There are a couple of simple lighting models built in, but it’s possible to specify your own. A lighting model is nothing more than a function that will be called with the filled SurfaceOutput structure and per-light parameters (direction, attenuation and so on). Different functions have to be called in the forward and light pre-pass rendering cases, and naturally the light pre-pass one has much less flexibility. So for any fancy effects, it is possible to say “do not compile this shader for light pre-pass” (see the snippet after the example below), in which case it will be rendered via forward rendering.

Example of wrapped-Lambert lighting model:

#pragma surface surf WrapLambert
half4 LightingWrapLambert (SurfaceOutput s, half3 dir, half atten) {
    dir = normalize(dir);
    half NdotL = dot (s.Normal, dir);
    half diff = NdotL * 0.5 + 0.5;
    half4 c;
    c.rgb = s.Albedo * _LightColor0.rgb * (diff * atten * 2);
    c.a = s.Alpha;
    return c;
}
struct Input {
    float2 uv_MainTex;
};
sampler2D _MainTex;
void surf (Input IN, inout SurfaceOutput o) {
    o.Albedo = tex2D (_MainTex, IN.uv_MainTex).rgb;
}
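
And the “do not compile for light pre-pass” opt-out mentioned above is just a pragma option - a sketch, assuming the exclude_path syntax:

// render via forward rendering only; do not generate
// any light pre-pass (deferred lighting) passes
#pragma surface surf WrapLambert exclude_path:prepass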

Behind the scenes

I’m using the HLSL parser from Ryan Gordon’s MojoShader to parse the original surface shader code and infer some things from the AST MojoShader produces. This way I can figure out which members are in which structures, go over function prototypes and so on. At this stage some error checking is done, to tell the user that his surface function has the wrong prototype or that his structures are missing required members - which is much better than failing with dozens of compile errors in the generated code later.

To figure out which surface shader inputs are actually used in the various lighting passes, I generate small dummy pixel shaders, compile them with Cg and use Cg’s API to query the used inputs & outputs. This way I can figure out, for example, that neither the normal map nor its texture coordinate is actually used in the Light Pre-Pass final pass, and save some vertex shader instructions & a texcoord interpolator.
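
Conceptually, such a dummy shader for the Light Pre-Pass final pass might look something like this (purely illustrative; the real generated code is not shown here):

// wrap the user's surf() so only the outputs this pass consumes feed the
// final color; compiling this with Cg and querying the program's inputs
// then reveals that o.Normal (and thus uv_BumpMap) is dead code
half4 frag_dummy (Input IN) : COLOR {
    SurfaceOutput o = (SurfaceOutput)0;
    surf (IN, o);
    // the final pass only needs albedo, emission & alpha
    return half4 (o.Albedo + o.Emission, o.Alpha);
}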

The code that is ultimately generated is compiled with various shader compilers depending on the target platform (Cg for PC/Mac, XDK HLSL for Xbox 360, PS3 Cg for PS3, and my own fork of HLSL2GLSL for iPhone, Android and upcoming NativeClient port of Unity).

So yeah, that’s it. We’ll see where this goes next, and what happens when Unity 3 is released.


Compiling HLSL into GLSL in 2010

Realtime shader languages these days have settled down into two camps: HLSL (or Cg, which for all practical purposes is the same) and GLSL (or GLSL ES, which is sufficiently similar). HLSL/Cg is used by Direct3D and the big consoles (Xbox 360, PS3). GLSL/ES is used by OpenGL and pretty much all modern mobile platforms (iPhone, Android, …).

Since shaders are more or less “assets”, having two different languages to deal with is not very nice. What, I’m supposed to write my shaders twice just to support both (for example) D3D and iPad? You would think that in 2010, almost a decade after high level realtime shader languages appeared, this problem would be solved… but it isn’t!
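
To make the duplication concrete, here’s the same trivial texture-modulate fragment shader twice (a hypothetical example; the GLSL ES version is shown as comments to keep this a single snippet):

// HLSL/Cg version - Direct3D, Xbox 360, PS3:
sampler2D _MainTex;
float4 _Color;
float4 frag (float2 uv : TEXCOORD0) : COLOR {
    return tex2D (_MainTex, uv) * _Color;
}
// ...and the same thing again in GLSL ES - OpenGL ES 2.0:
//   uniform sampler2D _MainTex;
//   uniform mediump vec4 _Color;
//   varying mediump vec2 uv;
//   void main () {
//       gl_FragColor = texture2D (_MainTex, uv) * _Color;
//   }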

In the upcoming Unity 3.0, we’re going to have OpenGL ES 2.0 support for mobile platforms, where GLSL ES is the only option for writing shaders. However, almost all other platforms (Windows, 360, PS3) need HLSL/Cg.

I spent a bit of time trying to make Cg spit out GLSL code. In theory it can, and I read somewhere that id uses it for the OpenGL backend of Rage… but I just couldn’t make it work. What’s possible for John apparently is not possible for mere mortals.

Then I looked at ATI’s HLSL2GLSL. That did produce GLSL shaders that were not absolutely horrible, so I started using it, and (surprise!) quickly ran into small issues here and there. Too bad development of the library stopped around 2006… on the plus side, it’s open source!

So I just forked it. Here it is: http://code.google.com/p/hlsl2glslfork/ (commit log here). There are no prebuilt binaries or source drops right now, just a Mercurial repository. BSD license. Patches welcome.

Note on the codebase: I don’t particularly like the codebase. It seems somewhat over-engineered, probably adapted from the reference GLSL parser that 3DLabs once did to parse HLSL and spit out GLSL. There are pieces of code that are unused, unfinished or duplicated. Judging from the comments, some pieces of code have passed through the hands of 3DLabs, ATI and NVIDIA (what good can come out of that?!). However, it works, and that’s the most important trait any code can have.

Note on the preprocessor: I bumped into some preprocessor issues that couldn’t be easily fixed without first understanding someone else’s ancient code and then changing it significantly. Fortunately, Ryan Gordon’s MojoShader happens to have a preprocessor that very closely emulates HLSL’s (including various quirks). So I’m using that to preprocess any source before passing it down to HLSL2GLSL. Kudos to Ryan!

Side note on MojoShader: Ryan is also working on an HLSL→GLSL cross compiler in MojoShader. I like that codebase much more; I’ll certainly try it out once it’s somewhat ready.

You can never have enough notes: Google’s ANGLE project (running OpenGL ES 2.0 on top of the Direct3D runtime+drivers) seems to be working on the opposite tool. For obvious reasons, they need to take GLSL ES shaders and produce D3D compatible shaders (HLSL or shader assembly/bytecode). The project seems to be moving fast; and if one day we decide to default to GLSL as the shader language in Unity, I’ll know where to look for a translator into HLSL :)


GDC 2010 report

Just returned from an exciting (and exhausting) trip to Game Developers Conference 2010. Random notes:

Unity

It seems that everyone is talking about Unity this year. At GDC 2009 some people had heard about us, others were “where the f*** did this come from?!”, and some had no idea what Unity is. This year it’s hard to find anyone who hasn’t heard about Unity. I was surprised by the number of AAA developers who are playing around with Unity internally (for prototyping, mobile & whatnot) and/or are big fans of Unity. I like!

We had a cool booth that was very busy at all times. As a bonus, the Unity chairs could be used as weapons!

Awesome quote: the CEO of censored (a competing middleware company) said: “yeah, Unity is going up, we are going down”. This is taken completely out of context, of course.

We were busy demoing upcoming Unity 3 which I think will be quite awesome. Three days before the conference were spent crunching on the demos for GDC :)

Cool Stuff

Only managed to go to two sessions :(

Stephen Hill’s “Rendering Tools and Techniques of Splinter Cell: Conviction” had interesting bits & pieces. Nice work on hierarchical Z occlusion and ambient occlusion fields! (probably the first time I’ve seen AO fields used in actual game production)

Mike Acton’s “Three Big Lies: Typical Design Failures in Game Programming” was entertaining. Content-wise I pretty much knew what to expect. If you aren’t following Mike - do it now! The talk slides are at Insomniac’s site.

RAD’s Telemetry profiler looks totally sweet. I think they acquired this one and improved it. Some very good UI ideas in there. On a related note, Scaleform’s new profiler looks… kinda inspired by Unity’s (comparison: Scaleform on the left, Unity on the right).

Fun Stuff

Managed to sneak in some fun (dare I say “social”?) stuff.

Rendering folks dinner (thanks Johan!) was awesome, even if it made me feel kinda small & stupid among those super smart guys & gals. Shadow algorithms on receipts FTW! Middleware Meetup (thanks Dan!) was full of friendly competitors :) #gdcdrink tweetup (thanks Mike!) had lots of war stories, PS3 talk and how to do fluid simulation on 360’s pixel shaders.


Screenspace vs. mip-mapping

Just spent half a day debugging this, so here it is for the future reference of the internets.

In a deferred rendering setup (see Game Angst for a good discussion of deferred shading & lighting), lights are applied using data from screen-space buffers. Position, normal and other things are reconstructed from buffers and lighting is computed “in screen space”.

Because each light is applied to a portion of the screen, the pixels it computes can belong to different objects. If you use mipmapped textures anywhere in the lighting computation, be careful. The most common use of mipmapped light textures is light “cookies” (aka “gobos”).

Let’s say we have a very simple scene with a spot light:

Light’s angular attenuation comes from a texture like this:

If the texture has mipmaps and you sample it the “obvious” way (e.g. tex2Dproj), you can get something like this:

The black stuff around the sphere is no good! It’s not the infamous half-texel offset in D3D9, not a driver bug, not a shader compiler bug and not nature trying to prevent you from writing a deferred renderer.

It’s the mipmapping.

Mipmaps of your cookie texture look like this (128x128, 16x16, 8x8, 4x4 shown):

Now, take two adjacent pixels, where one belongs to the edge of the sphere, and the other belongs to the background object (technically you take a 2x2 block of pixels, but just two are enough to illustrate the point). When the light is applied, cookie texture coordinates for those pixels are computed. It can happen that the coordinates are very different, especially when pixels “belong” to entirely different surfaces that are quite far away from each other.

What does the GPU do when the texture coordinates of adjacent pixels are very different? It chooses a lower mipmap level, so that texel-to-pixel density roughly matches 1:1. On the edges in the “wrong” screenshot, it happens that a very small mipmap level is sampled, which is either black or white (see the 4x4 mip level).

What to do here? You could disable mip-mapping (not good for performance and not good for image quality). You could drop the smallest mip levels, which might be enough and not that bad for performance. Another option is to manually supply the LOD level or derivatives to the sampling instructions, using something other than the cookie texture coordinates - for example, derivatives of the view space position, or something like that. This might not be possible on lower shader models though.
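
A sketch of those last two manual options, assuming shader model 3.0 (_LightTexture0 is a made-up name for the cookie sampler, with the cookie stored in alpha; the 0.01 scale is an arbitrary tuning factor):

sampler2D _LightTexture0; // the spotlight cookie

half CookieFixedLod (float4 uvProj) {
    // force mip level 0 instead of letting the GPU derive a level from
    // cookie UV gradients, which explode across object silhouettes
    return tex2Dlod (_LightTexture0, float4 (uvProj.xy / uvProj.w, 0, 0)).w;
}

half CookieViewDerivs (float2 uv, float3 viewPos) {
    // derive the mip level from view space position instead, since that
    // varies smoothly in screen space
    return tex2Dgrad (_LightTexture0, uv, ddx (viewPos.xy) * 0.01, ddy (viewPos.xy) * 0.01).w;
}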


Four years ago today...

…I took a plane to Copenhagen. Oh, this sounds familiar…

Well ok, it all started a bit before:

I exchanged some emails with David and Joachim and they invited me to a gamejam in their office. Then one thing led to another, I was young and needed the money (oops! wrong topic), and in January 2006 I started working on this thing called “Unity”.

Unity was at version 1.2.1 then. Since then we’ve released about a dozen new versions, added hundreds (or thousands?) of new features and a handful of new platforms, and have grown a lot.

Also, we stopped saying “Sales are INSANE!!!!11” whenever they exceeded a whopping ten thousand euros per week. Seriously, that much money was a big thing in 2006. Our Windows build machine was a single core Celeron with 512MB RAM, because that’s what we could afford! Well ok, we still say “sales are insane!” from time to time; it’s just that the threshold has gone way up.

Occasionally we’d get excited about the strangest things. I think this email is about some car model from ATI that was on the front page of our website in 2006. It’s beyond me why we’d put a car on the Unity website, but somehow it seemed to make sense at the time.

It would take too much space to list all the awesome things that happened in those four years. I got to work on some things too, like Windows Web Player, Direct3D renderer, shadows, editor for Windows and whatnot. But I mostly concentrate on creating trouble, which does not seem to hinder Unity that much. I need to get more efficient!

Seriously though, it has been an amazing ride so far, and I hope it will only become better. Thanks to everyone at Unity Technologies and the community!

Rock on!