Amazing Optimizers, or Compile Time Tests

I wrote some tests to verify sorting/batching behavior in rendering code, and they were producing different results on Windows (MSVC) vs Mac (clang). The tests were creating a “random fake scene” with a random number generator, and at first it sounded like our “get random normalized float” function was returning slightly different results between platforms (which would be super weird, as in how come no one noticed this before?!).

So I started digging into random number generator, and the unit tests it has. This is what amazed me.

Here’s one of the unit tests (we use a custom native test framework that started years ago on an old version of UnitTest++):

TEST (Random01_WithSeed_RestoredStateGenerateSameNumbers)
{
	Rand r(1234);
	Random01(r);
	RandState oldState = r.GetState();
	float prev = Random01(r);
	r.SetState(oldState);
	float curr = Random01(r);
	CHECK_EQUAL (curr, prev);
}

Makes sense, right?

Here’s what MSVC 2010 compiles this down into:

push        rbx  
sub         rsp,50h  
mov         qword ptr [rsp+20h],0FFFFFFFFFFFFFFFEh  
	Rand r(1234);
	Random01(r);
	RandState oldState = r.GetState();
	float prev = Random01(r);
movss       xmm0,dword ptr [__real@3f47ac02 (01436078A8h)]  
movss       dword ptr [prev],xmm0  
	r.SetState(oldState);
	float curr = Random01(r);
mov         eax,0BC5448DBh  
shl         eax,0Bh  
xor         eax,0BC5448DBh  
mov         ecx,0CE6F4D86h  
shr         ecx,0Bh  
xor         ecx,eax  
shr         ecx,8  
xor         eax,ecx  
xor         eax,0CE6F4D86h  
and         eax,7FFFFFh  
pxor        xmm0,xmm0  
cvtsi2ss    xmm0,rax  
mulss       xmm0,dword ptr [__real@34000001 (01434CA89Ch)]  
movss       dword ptr [curr],xmm0  
	CHECK_EQUAL (curr, prev);
call        UnitTest::CurrentTest::Details (01420722A0h)
; ...

There’s some bit operations going on (the RNG is Xorshift 128), looks fine on the first glance.

But wait a minute; this seems like it only has code to generate a random number once, whereas the test is supposed to call Random01 three times?!

Turns out the compiler is smart enough to see through some of the calls, folds down all these computations and goes, “yep, so the 2nd call to Random01 will produce 0.779968381 (0x3f47ac02)“. And then it kinda partially does actual computation of the 3rd Random01 call, and eventually checks that the result is the same.

Oh-kay!

Now, what does clang (whatever version Xcode 8.1 on Mac has) do on this same test?

pushq  %rbp
movq   %rsp, %rbp
pushq  %rbx
subq   $0x58, %rsp
movl   $0x3f47ac02, -0xc(%rbp)   ; imm = 0x3F47AC02 
movl   $0x3f47ac02, -0x10(%rbp)  ; imm = 0x3F47AC02 
callq  0x1002b2950               ; UnitTest::CurrentTest::Results at CurrentTest.cpp:7
; ...

Whoa. There’s no code left at all! Everything just became an “is 0x3F47AC02 == 0x3F47AC02” test. It became a compile-time test.

w(☆o◎)w

By the way, the original problem I was looking into? Turns out RNG is fine (phew!). What got me was code in my own test that I should have known better about; it was roughly like this:

transform.SetPosition(Vector3f(Random01(), Random01(), Random01()));

See what’s wrong?

.

.

.

The function argument evaluation order in C/C++ is unspecified.

(╯°□°)╯︵ ┻━┻

Newer languages like C# or Java have guarantees that arguments are evaluated from left to right. Yay sanity.