Argh MFC!

When introductory documentation for something has this, you know it won’t be pretty:

CAsyncMonikerFile is derived from CMonikerFile, which in turn is derived from COleStreamFile. A COleStreamFile object represents a stream of data; a CMonikerFile object uses an IMoniker to obtain the data, and a CAsyncMonikerFile object does so asynchronously.

So yeah, I am dealing with downloading something from the internet inside an ActiveX control that is written in MFC. A seemingly simple task - I give you a URL, you give me back the bytes. But no! That would not be a proper architecture, so instead it has asynchronous monikers which are based on monikers which are based on stream files which use some interfaces and whatnot. And for ActiveX controls the docs suggest using CDataPathProperty or CCachedDataPathProperty, which are abstractions built on top of the above crap. And I don’t even know what “a moniker” is!

Of course all this complexity fails spectacularly in some quite common situations. For example, try downloading something when the web server serves gzip-compressed HTML output. Good luck trying to figure out why everything seemingly works, you are notified of download progress, but you never get the actual downloaded bytes.

Turns out the solution is to change the downloading behaviour of the above pile of abstractions to use the “pull data” model instead of the default “push data” model. The default behaviour just seems to be broken (though it is not broken in that pile of abstractions; it is broken somewhere deeper in Windows code). Is this mentioned anywhere in the docs? Of course not!

This is pretty much what the code comment for this looks like:

We don’t use CCachedDataPathProperty because it’s awfully slow, doing data reallocations for each 1KB received. For 8MB file it’s 8000 reallocations and 32 GB (!) of data copied for no good reason!

While we’re at it, we don’t use CDataPathProperty either, because it’s a useless wrapper over CAsyncMonikerFile.

Oh, and we don’t use CAsyncMonikerFile either, because it has bugs in VS2003’s MFC where it never notifies the container that it is done with the download, making IE display “X items remaining” indefinitely. Some smart coder was converting an information message and returning an “out of memory” error if the result was NULL, even if the input message was NULL (which it often was). So we use our own “fixed” version of CAsyncMonikerFile instead.
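To put numbers on the reallocation complaint in that comment, here is a back-of-the-envelope model. It is not MFC code: bytes_copied_fixed_growth and bytes_copied_doubling are hypothetical helpers, and the assumption is simply that every reallocation copies the whole existing buffer into the new block.

```cpp
#include <cstdint>

// Hypothetical model, not MFC code: the buffer is reallocated to the exact
// new size after every received chunk, so each step copies all bytes
// accumulated so far.
constexpr std::uint64_t bytes_copied_fixed_growth(std::uint64_t total,
                                                  std::uint64_t chunk) {
    if (chunk == 0) return 0;  // nothing sensible to model
    std::uint64_t copied = 0;
    for (std::uint64_t size = 0; size < total; size += chunk) {
        copied += size;  // existing contents moved into the bigger block
    }
    return copied;
}

// Same model, but capacity doubles whenever the next chunk would not fit,
// so copies only happen at sizes chunk, 2*chunk, 4*chunk, ...
constexpr std::uint64_t bytes_copied_doubling(std::uint64_t total,
                                              std::uint64_t chunk) {
    if (chunk == 0) return 0;
    std::uint64_t copied = 0;
    std::uint64_t capacity = chunk;
    for (std::uint64_t size = 0; size < total; size += chunk) {
        if (size + chunk > capacity) {
            copied += size;  // one copy per capacity change
            while (capacity < size + chunk) capacity *= 2;
        }
    }
    return copied;
}
```

Under this model, an 8 MB download arriving in 1 KB chunks copies roughly 32 GB with per-chunk reallocation, and a little under 8 MB with doubling. Collecting the chunks in a list and concatenating once at the end would avoid the intermediate copies altogether.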

Oh MFC, how we love thee.



On job titles and design patterns

I just changed my job title to say “Code Chef”. I like it, and it represents my current understanding of programming pretty well. I cook code. That’s my job.

Some N years ago I would have liked a title with “Architect” or “Analyst” or something like that. I would have called myself “developer” instead of “programmer” because hey, a developer thinks up things, whereas a programmer is a mere “code monkey”. More on code monkeys below.

But wait! Back then I also believed that knowing and using Design Patterns is essential for a programmer! In one place when I was interviewing new hires, design pattern knowledge was something I would look for… how stupid! Nowadays my view of patterns is more along the lines of “yeah, whatever”. I don’t exactly think of them as things from hell, but they could have caused more harm than good already.

Back to job titles. The code monkey is actually the key employee. A software product is largely defined by the code; heck, it is code. Sure, it also has the user interface, the fancy icons, the documentation, the website, the support, the roadmap and whatnot, but the code is the product, whereas everything else is more or less add-ons (possibly excluding UI… UI also defines the product).

Code design? Design patterns? Who cares about that.

It’s the final result that matters. Futurist programming for the win.

On the other hand, Memento Observer is probably very cool.


One-liners: biawesome filtering

Said by Jonathan Czeck of Graveck:

What kind of filtering does Resize() function use? Nearest-neighbor, bilinear, bicubic, biawesome?

Since then “biawesome” has become a local meme at work. Biawesome is awesome on steroids.


Tricky bugs: peculiarities of dynamic linking, and magic divisions

After wasting nearly two days on some really funky animation import crash, I checked in a code change with this log message:

Fix FBX animation import crash once more. When exported symbols are not listed for a dylib, it seems to link back to calling executable (?!), making them share function impls with the same name. And because Keyframe is actually different in editor vs ImportFBX, this is wrong. Apparently this is OS X Leopard only, or something. Argh.

The code change in question was just telling the compiler “here’s the list of the functions that are exported from this dynamic library”. The list was already there; the compiler was just never told that it existed.

The bug manifested itself as a crash when importing animations. But it would not happen when the importer was run from a small unit test application. There were no memory corruptions happening, it was not running out of memory, yet the code was crashing with an access violation, usually because STL’s vector was returning the wrong size (the actual data of the vector was correct; only the reported size was bogus). And it was doing that only on OS X Leopard, and not on OS X Tiger. Huh?

Turns out what did happen - and I’m not sure if that’s a bug in OS X or a feature - is that the calling application contained a class called Keyframe. And the shared library (where the crash was happening) also contained a class called Keyframe. But those classes were slightly different; the first was 20 bytes in size, and the second was 16 bytes.

Now, somehow when the shared library was calling vector::size(), the function from the calling application was used. I have no idea at all how or why this was happening, but it sure was! I could see from tracing the assembly code that it was taking the difference of two pointers, and then doing something that for sure was not a division by 16.

What was the code doing? Turns out it was calculating division by 20 in a cunning way:

 mov  edx,esi   # edx = end()
 sub  edx,eax   # edx -= begin()
 mov  eax,edx   # eax = edx
 sar  eax,0x2   # eax >>= 2
 imul eax,eax,0xcccccccd # eax *= 0xcccccccd

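In other words, the compiler was replacing a division by a constant (as used in vector’s size()) with a shift and a multiplication by a magic number: the shift divides by 4, and 0xcccccccd is the multiplicative inverse of 5 modulo 2^32, so multiplying by it divides an exact multiple of 5 by 5. You can read more about the technique here or here.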

But of course the code above only works if the number is actually divisible by 20; otherwise it returns a totally wrong result. This is perfectly fine for computing the difference between two pointers to structures of known size… Except that inside the shared library the Keyframe structures are 16 bytes, and not 20!
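The failure mode is easy to reproduce outside of any dynamic linking weirdness. Below is a minimal sketch of the same shift-and-multiply sequence; exact_div20 and kInverseOf5 are names made up for illustration, and it uses unsigned arithmetic where the disassembly uses an arithmetic shift (the two agree for non-negative pointer differences).

```cpp
#include <cstdint>

// 0xCCCCCCCD is the multiplicative inverse of 5 modulo 2^32:
// 5 * 0xCCCCCCCD == 2^34 + 1, which wraps around to 1 in 32 bits.
constexpr std::uint32_t kInverseOf5 = 0xCCCCCCCDu;
static_assert(static_cast<std::uint32_t>(5u * kInverseOf5) == 1u,
              "0xCCCCCCCD must invert 5 modulo 2^32");

// Mirrors the sar/imul pair: shift right by 2 (divide by 4), then multiply
// by the inverse of 5. Correct only when byte_diff is a multiple of 20.
constexpr std::uint32_t exact_div20(std::uint32_t byte_diff) {
    return static_cast<std::uint32_t>((byte_diff >> 2) * kInverseOf5);
}

// Five 20-byte elements: the trick gives the right count.
static_assert(exact_div20(5u * 20u) == 5u, "exact multiple of 20 works");

// Three 16-byte elements span 48 bytes, which is not a multiple of 20,
// so the "size" comes out as garbage rather than 3 (or even 2).
static_assert(exact_div20(3u * 16u) != 3u, "16-byte elements give a bogus size");
```

Note that the result is not always garbage: ten 16-byte elements span 160 bytes, which happens to be a multiple of 20, so the reported size is a plausible-looking 8. That kind of occasionally-sane answer is part of what makes a bug like this so confusing to track down.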

So yeah. Watch out for peculiarities of dynamic linking on your platform.