Archive for 2005
I’m making steady but very slow progress on “my” 64k intro. Over the last week I couldn’t get over 13 kilobytes, so you can see that the progress is really slow. Not because I don’t code anything, but because every increase in code size was cancelled out by data size optimizations.
So far, coding and designing data for small sizes is not much of a pain at all. Just, well, code and, well, keep your data small :) We’re only talking about the size of the initial data here, not the runtime size.
A few obvious or new notes:
- Code to construct a cylinder is more complex than the one to construct a sphere. That’s what I expected. However, code to construct a box with multiple segments per side is the most complex of all!
- Dropping the last byte from floats is usually okay. And an instant 25% saving! For some of the numbers, I plan to switch to half-style floats (2 bytes) if space becomes a concern.
- Storing quaternions in 4 bytes (a byte per component) is good. Actually, now that I think of it, it makes more sense to store three components at 10 bits each, and just store the sign of the 4th component - better precision for the same size.
- This intro literally has the most complex and most automated “art pipeline” of any demo/game I (directly) worked on! I’ve got maxscripts generating C++ sources, custom commandline tools preprocessing C++ sources (mostly floats packing - due to lack of maxscript functionality), lua scripts for batch-compiling HLSL shaders, “development code” generating .obj models for import back into max, etc. It’s wacky, weird and cool!
- Compiling HLSL in two steps (HLSL->asm and asm->bytecode) instead of direct (HLSL->bytecode) gets rid of the constant table, some copyright strings and hence is good. (thanks blackpawn!)
- Getting FFD code to behave remotely similar to how 3dsmax does FFD is hard :)
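The float and quaternion notes in the list above can be sketched roughly as follows. This is only an illustration, not the intro's actual code: the names (`PackFloat24`, `PackQuat` and friends), the bit layout, and the assumption that quaternions are normalized are all made up for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Drop the lowest mantissa byte of a 32-bit IEEE 754 float.
// What remains is sign, 8 exponent bits and the top 15 mantissa bits (3 bytes).
inline std::uint32_t PackFloat24(float value)
{
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits >> 8;
}

// Restore a float from the 3 stored bytes; the dropped byte becomes zero.
inline float UnpackFloat24(std::uint32_t packed)
{
    std::uint32_t bits = packed << 8;
    float value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}

struct Quat { float x, y, z, w; };

// Map a value in [-1, 1] to a 10-bit integer (0..1023).
inline std::uint32_t Quantize10(float v)
{
    assert(v >= -1.0f && v <= 1.0f);
    return static_cast<std::uint32_t>(std::lround((v * 0.5f + 0.5f) * 1023.0f));
}

// Map a 10-bit integer back to [-1, 1].
inline float Dequantize10(std::uint32_t q)
{
    return (static_cast<float>(q) / 1023.0f) * 2.0f - 1.0f;
}

// Assumed layout: x in bits 0-9, y in bits 10-19, z in bits 20-29,
// sign of w in bit 30. The quaternion is assumed to be normalized.
inline std::uint32_t PackQuat(const Quat& q)
{
    std::uint32_t packed = Quantize10(q.x)
                         | (Quantize10(q.y) << 10)
                         | (Quantize10(q.z) << 20);
    if (q.w < 0.0f)
        packed |= 1u << 30;
    return packed;
}

// Rebuild w from the unit-length constraint x^2 + y^2 + z^2 + w^2 = 1.
inline Quat UnpackQuat(std::uint32_t packed)
{
    Quat q;
    q.x = Dequantize10(packed & 1023u);
    q.y = Dequantize10((packed >> 10) & 1023u);
    q.z = Dequantize10((packed >> 20) & 1023u);
    float ww = 1.0f - (q.x * q.x + q.y * q.y + q.z * q.z);
    q.w = std::sqrt(ww > 0.0f ? ww : 0.0f);
    if (packed & (1u << 30))
        q.w = -q.w;
    return q;
}
```

One caveat with this layout: recovering w through a square root loses precision when w is close to zero, so the quantization error of the other three components gets amplified there.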
The best thing so far is that I’ve got the music track from x_dynamics - it’s already done in V2 synth, takes a small amount of space and is really good. Now I “just” have to finish the intro…
Posted on 2005-12-23 18:58 in code, demos | No Comments »
A followup to the older “discussion” about how/why geometry shaders would be okay/slow:
The graphics hardware has been quite successful so far at hiding memory latencies (i.e. when sampling textures). It does so (according to my understanding) by having a looong pixel pipeline, where hundreds (or thousands) of pixels might be at one or another processing stage. ATI talks about this in big letters (R520 dispatch processor) and speculations suggest that GeForceFX had something like that (article). I have no idea about the older cards, but presumably they did something similar as well.
I am not sure how the vertex texture fetches are pipelined - pretty slow performance on GeForce6/7 suggests that they aren’t :) Probably vertex shaders in current cards operate in a simpler way - just fetch the vertices and run whole shaders on them (in contrast to pixel shaders, which seem to run just several instructions, then go to other pixels, return back, etc.).
With DX10, we have arbitrary memory fetches in any stage of the pipeline. Even the boundary between different fetch types is somewhat blurry (constant buffers vs. arbitrary buffers vs. textures) - perhaps they will differ only in bandwidth/latency (e.g. constant buffers live near the GPU while textures live in video memory).
So, with arbitrary memory fetches anywhere (and some of them being high latency), everything needs to have long pipelines (again, just my guess). This is all great, but the longer the pipeline, the worse it performs in non-friendly scenarios: pipeline flush is more expensive, drawing just a couple of “things” (primitives, vertices, pixels) is inefficient, etc.
I guess we’ll just learn a new set of performance rules for tomorrow’s hardware!
Back to GS pipelining: I imagine that the “slow” scenarios would be like this: vertices have shaders with dynamic branches or memory fetches differing vastly in execution lengths - so GS has to wait for all vertex shaders of the current primitive (optional: plus topology) to finish; and then each GS has dynamic branches or memory fetches, and outputs a different number of primitives to the rasterizer. If I were hardware, I’d be scared :)
Posted on 2005-12-22 14:06 in gpu | 2 Comments »
Reading DirectX10 preview documentation right now (you know, it’s released with Dec2005 SDK). It is pretty impressive, I must say! Seems like a huge leap forward. Back to reading!
Posted on 2005-12-16 19:58 in d3d | 7 Comments »
(the lack of updates recently is because I have lots of stuff going on here)
A few weeks ago I was visiting OTEE and over the weekend we were jamming on a small game called Pakimono! The idea of the game was pretty cool - you’re the naked guy and have to ruin tourists’ photos :)
The whole experience was great. It was my first time using a Mac, first time working with Unity (their game development tool) etc. I coded&tuned most of the bullet-time character controller, where you drag your limbs with a mouse, trying to cover as much of the sight as possible.
The coding was a bit unusual - for most of my coding life I have been doing pretty low-level C++ programming. This time it was completely different - I’d set up “the game” directly in the editor, write some short C# scripts and boom! - everything works, without me having to worry about any of the low-level stuff. No recompiling or any of that stuff. Cool.
Ironically, I have not seen the final Pakimono build yet. I left earlier than the others and do not have a Mac anywhere nearby. But the guys promised me a Windows build!
Posted on 2005-12-10 14:01 in games, unity | No Comments »
Lost Garden is good - about game design and related things from an (ex) Anark guy (hi Chris!). E.g. this one (a practical definition of innovation in game design):
And if you ever hear an indie game developer talking about level design, either shoot them in the head now to help them avoid their future misery, or direct them towards this essay.
Posted on 2005-11-08 9:21 in uncategorized | No Comments »
I was planning to do a demo this year; all by myself (except music). However, after playing with Blender a bit it became clear that I am really not a 3D artist (not Blender’s fault, of course) - what a surprise :)
Now a crazy thought: do a 64 kilobyte intro instead?
Yeah, I know, there already was one pretty unsuccessful try 3+ years ago… But still: I need a musician who could use Farbrausch’s V2 synth, some clever ideas/design/code and there it is. There’s hope that I have actually learned something in these 3 years, you know :)
By the way, this bzhykt intro produces really abstract visuals on my computer right now. Something was messed up in there…
Posted on 2005-11-03 19:49 in demos | 7 Comments »
Released this HDR with MSAA demo I was talking about earlier. Here it is: aras-p.info/projHDR.html. I can also add that this St. Anne’s Church is a pretty complex beast :)
Posted on 2005-11-02 13:52 in demos | 1 Comment »
I’m still spending an occasional minute on my HDR demo. Now, everything is fine so far, except one thing: I can’t get MSAA working on some Radeons (and I don’t have a Radeon right now, which makes debugging a lot harder). The main point of my demo is to have MSAA on ordinary hw, so this is bad.
The reason seems to be that on older Radeons MSAA does not resolve the alpha channel, which obviously messes things up in my case. I’m using RGBE8 encoding for the main rendertarget, and if RGB gets MSAA’d and the exponent does not, then oh well, no good anti-aliasing most of the time.
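To make the problem concrete, here is a small CPU-side sketch. It assumes a Ward-style shared-exponent RGBE8 layout (the demo's exact encoding may differ) and models the broken resolve as "average the RGB bytes, copy the exponent from the first sample", which is a guess at the hardware behaviour rather than a documented fact. All names are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

struct Color { float r, g, b; };
struct RGBE8 { std::uint8_t r, g, b, e; };

// Encode a non-negative HDR color: three 8-bit mantissas plus one shared
// exponent, biased by 128. Assumes values well within float range.
inline RGBE8 EncodeRGBE8(const Color& c)
{
    float maxc = std::max(c.r, std::max(c.g, c.b));
    if (maxc < 1e-32f)
        return RGBE8{0, 0, 0, 0};
    int exponent;
    float mantissa = std::frexp(maxc, &exponent); // maxc = mantissa * 2^exponent
    float scale = mantissa * 256.0f / maxc;
    return RGBE8{
        static_cast<std::uint8_t>(c.r * scale),
        static_cast<std::uint8_t>(c.g * scale),
        static_cast<std::uint8_t>(c.b * scale),
        static_cast<std::uint8_t>(exponent + 128)};
}

// Decode back to floating point.
inline Color DecodeRGBE8(const RGBE8& p)
{
    if (p.e == 0)
        return Color{0.0f, 0.0f, 0.0f};
    float scale = std::ldexp(1.0f, static_cast<int>(p.e) - (128 + 8));
    return Color{p.r * scale, p.g * scale, p.b * scale};
}

// Hypothetical model of the broken two-sample resolve: RGB bytes are
// averaged, the exponent is taken unchanged from the first sample.
inline RGBE8 ResolveRGBOnly(const RGBE8& a, const RGBE8& b)
{
    return RGBE8{
        static_cast<std::uint8_t>((a.r + b.r) / 2),
        static_cast<std::uint8_t>((a.g + b.g) / 2),
        static_cast<std::uint8_t>((a.b + b.b) / 2),
        a.e};
}
```

With a bright sample of (8, 8, 8) and a dark one of (0.5, 0.5, 0.5), both encode to the same mantissa bytes and differ only in the exponent, so under this model the "resolved" edge pixel decodes to the bright value and the edge is not smoothed at all.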
Of course I could always manually supersample everything, but this would defeat the whole point of the demo. Or I could render everything in two passes, one for RGB and one for exponent - but this also is not very nice…
Probably I’ll just release the demo as it is now and wait for possible feedback. Or dig up an old Radeon somewhere and debug more - but replacing the video card in my Shuttle XPC is not an easy task :)
Posted on 2005-11-02 11:09 in gpu | 1 Comment »
Yesterday I had a cool debugging session while working on my HDR demo. One of the postprocessing filters produced weird results and I went off to investigate that. The usual tricks: debugging in Visual Studio to make sure the right sample offsets are generated; D3D debug runtime, D3DX debug, reference rasterizer, firing up NVPerfHud and doing frame analysis, doing full capture with PIX and inspecting device state, etc.
Nothing helped.
Then I noticed that in the pixel shader, I wrote
sample = tex2D( s0, uv + vSmpOffsets[i] )
instead of
sample += tex2D( s0, uv + vSmpOffsets[i] )
Aaargh. So much for a plus sign.
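The effect of the missing plus sign is easy to reproduce outside a shader. The sketch below is a plain C++ stand-in for the filter loop, with made-up function names and a vector of already fetched tap values in place of the `tex2D` calls.

```cpp
#include <cstddef>
#include <vector>

// Buggy version: every iteration overwrites the result,
// so only the last tap survives.
inline float FilterOverwrite(const std::vector<float>& taps)
{
    float sample = 0.0f;
    for (std::size_t i = 0; i < taps.size(); ++i)
        sample = taps[i];
    return sample;
}

// Intended version: taps are accumulated.
inline float FilterAccumulate(const std::vector<float>& taps)
{
    float sample = 0.0f;
    for (std::size_t i = 0; i < taps.size(); ++i)
        sample += taps[i];
    return sample;
}
```

Both versions compile cleanly and produce plausible-looking values, which is part of why no debugging tool flags the difference.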
How to deal with such bugs? Why are some bugs trivial to find, and some hard? Why does the time required to find a bug sometimes (often?) not correlate with the bug’s “trickiness”? Why can I sometimes find a tricky bug in a big unknown codebase in a couple of minutes, yet spend two hours on a plus sign in my own small code?
I’ve got no answers to the above.
By the way: PIX is a great tool, but D3D guys should really polish the UI :)
Posted on 2005-10-26 8:30 in code | 4 Comments »
I’m doing a small HDR demo for fun. Nothing fancy - linear gamma, Reinhard’s tone mapping and whatnot - everyone does that. But the thing I made so far does not even look good! :)
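For reference, the simplest global form of Reinhard's operator fits in a couple of lines. This sketch leaves out the exposure ("key") scaling and white point that a full implementation usually has, and the 2.2 display gamma is an assumption; the function names are made up.

```cpp
#include <cmath>

// Simplest global Reinhard operator: maps luminance in [0, inf) to [0, 1).
inline float ReinhardToneMap(float luminance)
{
    return luminance / (1.0f + luminance);
}

// Convert a linear value in [0, 1] to display space with a plain power curve.
inline float LinearToGamma(float linear, float gamma = 2.2f)
{
    return std::pow(linear, 1.0f / gamma);
}
```

Lighting is computed in linear space, tone mapped with `ReinhardToneMap`, and only then passed through `LinearToGamma` for display.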
I’m trying to support both HDR and FSAA at the same time on ordinary DX9 hardware (no Radeons 1k) by using an RGBE8 rendertarget for the main scene. It’s all okay so far.
The most difficult task right now is making it look good. Once I have that I’ll post the results.
Posted on 2005-10-23 19:54 in demos, gpu | 3 Comments »