Speculation: pipelining geometry shaders

A followup to the older “discussion” about how/why geometry shaders would be okay/slow:

Graphics hardware has been quite successful so far at hiding memory latencies (i.e. when sampling textures). It does so (according to my understanding) by having a looong pixel pipeline, where hundreds (or thousands) of pixels might be at one processing stage or another. ATI talks about this in big letters (R520 dispatch processor), and speculation suggests that GeForceFX had something like that (article). I have no idea about the older cards, but presumably they did something similar as well.
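To make the latency-hiding idea concrete, here is a toy model (my own simplification, not how any real GPU schedules work): each pixel does a few ALU cycles, then waits on a texture fetch, and the hardware switches to another in-flight pixel while it waits. The function name and the cycle counts are made up for illustration.

```python
def alu_utilization(pixels_in_flight, alu_cycles, fetch_latency):
    """Fraction of cycles the ALU is busy in a toy round-robin model.

    Assumption: every pixel alternates alu_cycles of math with one
    fetch of fetch_latency cycles, and switching pixels is free.
    """
    busy = pixels_in_flight * alu_cycles   # ALU work available per period
    period = alu_cycles + fetch_latency    # time one pixel needs per round
    return min(1.0, busy / period)


# Hypothetical numbers: 4 ALU cycles per fetch, 200 cycles of fetch latency.
for n in (1, 8, 64, 256):
    print(n, alu_utilization(n, alu_cycles=4, fetch_latency=200))
```

With these made-up numbers a single pixel keeps the ALU almost idle, and it takes dozens of pixels in flight before the fetch latency is fully covered.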

I am not sure how vertex texture fetches are pipelined - the pretty slow performance on GeForce6/7 suggests that they aren’t :) Probably vertex shaders in current cards operate in a simpler way - just fetch the vertices and run the whole shader on them (in contrast to pixel shaders, which seem to run just several instructions, then go to other pixels, come back, etc.).

With DX10, we have arbitrary memory fetches in any stage of the pipeline. Even the boundary between different fetch types is somewhat blurry (constant buffers vs. arbitrary buffers vs. textures) - perhaps they will differ only in bandwidth/latency (e.g. constant buffers live near the GPU while textures live in video memory).

So, with arbitrary memory fetches anywhere (and some of them being high latency), everything needs to have long pipelines (again, just my guess). This is all great, but the longer the pipeline, the worse it performs in non-friendly scenarios: a pipeline flush is more expensive, drawing just a couple of “things” (primitives, vertices, pixels) is inefficient, etc.
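A back-of-the-envelope sketch of why small batches hurt more on a long pipeline, assuming an idealized pipeline that accepts one item per cycle and has to fill up before the first result comes out (the function name and the depths are hypothetical):

```python
def pipeline_efficiency(items, depth):
    """Useful throughput relative to the ideal one item per cycle.

    Assumption: pushing `items` through a pipeline of `depth` stages
    takes depth + items - 1 cycles (fill time plus one item per cycle).
    """
    total_cycles = depth + items - 1
    return items / total_cycles


# Hypothetical depths: a short pipeline vs. a very long one.
for depth in (10, 1000):
    for items in (4, 100, 100000):
        print(depth, items, round(pipeline_efficiency(items, depth), 3))
```

In this model a big batch is close to ideal regardless of depth, while a handful of items spends nearly all of its time just filling (or, after a flush, refilling) the long pipeline.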

I guess we’ll just learn a new set of performance rules for tomorrow’s hardware!

Back to GS pipelining: I imagine that the “slow” scenarios would be like this: vertices have shaders with dynamic branches or memory fetches that differ vastly in execution length - so the GS has to wait for all vertex shaders of the current primitive (optional: plus topology) to finish; and then each GS has dynamic branches or memory fetches, and outputs a different number of primitives to the rasterizer. If I were hardware, I’d be scared :)
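The "GS has to wait for all vertex shaders" part can be sketched like this. It is only an illustration of the guess above, with made-up cycle counts and a hypothetical helper name: the GS starts when the slowest vertex of the primitive is done, and the already finished vertices just sit there.

```python
def gs_wait(vertex_finish_cycles):
    """Return (cycle the GS can start, total cycles finished vertices sit idle).

    Assumption: the GS for a primitive cannot start until every one of
    its vertex shaders has finished.
    """
    start = max(vertex_finish_cycles)
    idle = sum(start - t for t in vertex_finish_cycles)
    return start, idle


print(gs_wait([40, 42, 41]))    # vertex shaders of similar length
print(gs_wait([40, 400, 41]))   # one vertex takes a long branch or slow fetch
```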


Reading DX10 docs...

I’m reading the DirectX10 preview documentation right now (you know, it was released with the Dec 2005 SDK). It is pretty impressive, I must say! Seems like a huge leap forward. Back to reading!


Pakimono!

(the lack of updates recently is because I have lots of stuff here going on)

A few weeks ago I was visiting OTEE, and over the weekend we were jamming on a small game called Pakimono! The idea of the game was pretty cool - you’re the naked guy and have to ruin tourists’ photos :)

The whole experience was great. It was my first time using a Mac, my first time working with Unity (their game development tool), etc. I coded and tuned most of the bullet-time character controller, where you drag your limbs with the mouse, trying to cover as much of the view as possible.

The coding was a bit unusual - for most of my coding life I have been doing pretty low-level C++ programming. This time it was completely different - I’d set up “the game” directly in the editor, write some short C# scripts and boom! - everything works, without me having to worry about any of the low-level stuff. No recompiling or anything like that. Cool.

Ironically, I have not seen the final Pakimono build yet. I left earlier than the others and do not have a Mac anywhere nearby. But the guys promised me a Windows build!


Lost Garden

Lost Garden is good - it’s about game design and related things, from an (ex) Anark guy (hi Chris!). E.g. this one (a practical definition of innovation in game design):

And if you ever hear an indie game developer talking about level design, either shoot them in the head now to help them avoid their future misery, or direct them towards this essay.


A crazy thought: 64k intro

I was planning to do a demo this year; all by myself (except music). However, after playing with Blender a bit it became clear that I am really not a 3D artist (not Blender’s fault, of course) - what a surprise :)

Now a crazy thought: do a 64 kilobyte intro instead?

Yeah, I know, there already was one pretty unsuccessful try 3+ years ago… But still: I need a musician who can use Farbrausch’s V2 synth, some clever ideas/design/code, and there it is. There’s hope that I have actually learned something in these 3 years, you know :)

By the way, this bzhykt intro produces really abstract visuals on my computer right now. Something was messed up in there…