Some Unity codebase stats

I was doing a fresh codebase checkout & build on a new machine, so I gathered some stats along the way. No big insights, move on!

Codebase size

We use Mercurial for source control right now, with the “largefiles” extension for some big binary files (precompiled 3rd party libraries mostly).

Getting only the “trunk” branch (without any other branches that aren’t in trunk yet), which is 97529 commits:

  • Size of whole Mercurial history (.hg folder): 2.5GB, 123k files.
  • Size of large binary files: 2.3GB (almost 200 files).
  • Regular files checked out: 811MB, 36k files.

Now, the build process has a “prepare” step where said large files are extracted for use (they are mostly zip or 7z archives). After extraction, everything you have cloned, updated and prepared so far takes 11.7GB of disk space.

Languages and line counts

Runtime (“the engine”) and platform specific bits, about 5000 files:

  • C++: 360 kLOC code, 29 kLOC comments, 1297 files.
  • C/C++ header: 146 kLOC code, 18 kLOC comments, 1480 files.
  • C#: 20 kLOC code, 6 kLOC comments, 154 files.
  • Others are peanuts: some assembly, Java, Objective C etc.
  • Total about half a million lines of code.

Editor (“the tools”), about 6000 files:

  • C++: 257 kLOC code, 23 kLOC comments, 588 files.
  • C#: 210 kLOC code, 16 kLOC comments, 1168 files.
  • C/C++ header: 51 kLOC code, 6 kLOC comments, 497 files.
  • Others are peanuts: Perl, JavaScript etc.
  • Total, also about half a million lines of code!

Tests, about 7000 files. This is excluding C++ unit tests which are directly in the code. Includes our own internal test frameworks as well as tests themselves.

  • C#: 170 kLOC code, 11 kLOC comments, 2248 files.
  • A whole bunch of other stuff: C++, XML, JavaScript, Perl, Python, Java, shell scripts.
  • Everything sums up to about a quarter million lines of code.
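As a rough sanity check on those totals, here is a minimal sketch in Python that sums the per-language figures listed above. It uses only the code (non-comment) kLOC numbers; the "others" entries are not quantified in this post, so they are left out, and the dictionary layout is just for illustration.

```python
# Code kLOC per language, copied from the lists above (comments excluded).
# "Others" are not quantified in the post, so they are omitted here.
kloc = {
    "runtime": {"C++": 360, "C/C++ header": 146, "C#": 20},
    "editor": {"C++": 257, "C#": 210, "C/C++ header": 51},
    "tests": {"C#": 170},
}

# Sum the listed languages for each part of the codebase.
totals = {part: sum(langs.values()) for part, langs in kloc.items()}

for part, total in totals.items():
    print(f"{part}: {total} kLOC listed")
```

Runtime and editor each land a bit above 500 kLOC from the listed languages alone, which matches "about half a million". For tests, C# accounts for 170 of the roughly 250 kLOC; the rest sits in the other languages.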

Now, all the above does not include 3rd party libraries we use (Mono, PhysX, FMOD, Substance etc.). Also does not include some of our own code that is more or less “external” (see github).

Build times

Building Windows Editor: 2700 files to compile; 4 minutes for Debug build, 5:13 for Release build. This effectively builds “the engine” and “the tools” (main editor and auxiliary tools used by it).

Building Windows Standalone Player: 1400 files to compile; 1:19 for Debug build, 1:48 for Release build. This effectively builds only “the engine” part.

All of this is for a complete build, as timed on a MacBookPro (2013, 15" 2.3GHz Haswell, 16GB RAM, 512GB SSD model) with Visual Studio 2010, Windows 8.1, on battery, and while watching Jon Blow’s talk on youtube. We use the JamPlus build system (“everything about it sucks, but it gets the job done”) with precompiled headers.

Sidenote on developer hardware: this top-spec-2013 MacBookPro is about 3x faster at building code than my previous top-spec-2010 MacBookPro (it had really heavy use and its SSD isn’t as fast as it used to be). And yes, I also have a development desktop PC; most if not all developers at Unity get a desktop & laptop.

However, the difference between a 3 minute build and a 10 minute build is huge, and costs a lot more than these extra 7 minutes. Longer iterations mean more distractions, less will to do big changes (“oh no will have to compile again”), less will to code in general etc.

Do get the best machines for your developers!

Well this is all!


On having an ambitious vision

We just announced upcoming 2D tools for Unity 4.3, and one of the responses I’ve seen is “I am rapidly running out of reasons not to use Unity”. Which reminds me of some stories from a few years back.

Perhaps sometime in 2006, I was eating shawarmas with Joachim and discussing possible futures of Unity. His answer to my question, “so what’s your ultimate goal with Unity” was along the lines of,

I want to make it so that whenever anyone starts making a game, Unity will be their first choice of tech.

Of course that was crazy talk, so my reaction was somewhere between “you know that’s going to be hard” and “good luck with that”. Fast forward to 2013 and the thought is not so crazy anymore. Of course not everyone has to use Unity, but quite many do consider and use it. My slightly pessimistic, pragmatic, probability-weighted thinking proved wrong.

Some time before that, in late 2005, I got an email from some David, asking if I’d want to join their company, which I had never heard of. The company made this engine, “Unity”, that I had never heard of either; and it was Mac-only, and I had only ever seen a Mac before.

The email said:

<…>

We are building a game development suite called Unity. Unity is changing how small-to-medium developers create games. It is a power-tool combining the flexibility of Flash with all of the power of high-end game engines.

It’s on the market right now, and making a dent. Unity thrills people wonderfully, people find they are able to create stuff they only dreamt of before.

Our users are excited by extremely advanced technology combined with an intuitive editor. A flexible shader system, a unique completely automatic asset pipeline, Ageia physX (née Novodex), and publishing standalone Windows and OS X, and OS X web player with one click (and it actually just works).

<…>

Now, that was 2005. Unity was at version 1.1. No one besides Jon Czeck was probably using it; more or less.

Crazy fantasies from someone who’s somewhere between naïve and delusional? Yeah, sounds like it. So of course after a couple of exchanges I said “no” (but then they invited me to a gamejam, and I thought that while most likely nothing big will happen out of that, at least it will be fun…).

And now it’s 2013. And no matter if you like or dislike Unity, there’s no denying it is quite a thing, and perhaps changed the industry for the better. A tiny bit at least. These crazy, ambitious ideas did come through.

Reminder for myself: probabilities and pragmatism do not always win. Gotta have goals that are beyond practical possibilities.


Inter-process Communication: How?

A post of mostly questions, and no answers!

So I needed to do some IPC (Inter-process Communication) lately for shader compilers. There are several reasons why you’d want to move some piece of code into another process; in my case they were:

  • Bit-ness of the process; I want a 64 bit main executable but some of our platforms have only 32 bit shader compiler libraries.
  • Parallelism. For example you can call NVIDIA’s Cg from multiple threads, but it will just lock some internal mutex for most of the shader compilation time. Result is, you’re trying to compile shaders on 16 cores, but they end up just waiting on each other. By running 16 processes instead, they are completely independent, and the shader compiler does not even have to be thread-safe ;)
  • Memory usage and fragmentation. This is less of an issue in 64 bit land, but in 32 bit it helps to put some stuff into separate process with its own address space.
  • Crash resistance. A crash or memory trasher in a shader compiler should not bring down the whole editor.
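To make the parallelism and crash resistance points a bit more concrete, here is a minimal sketch of fanning work out to independent processes. It is in Python rather than C/C++ purely for brevity, and the worker is a made-up stand-in (it just upper-cases its input), not a real shader compiler. `WORKER_CODE` and `compile_all` are hypothetical names.

```python
import subprocess
import sys

# Hypothetical stand-in for a shader compiler executable: it reads source
# text on stdin and writes a "compiled" result to stdout.
WORKER_CODE = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def compile_all(sources):
    """Start one worker process per source, then collect (exit code, output)."""
    workers = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER_CODE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in sources
    ]
    # Feed every worker first, so that they all run at the same time.
    for proc, src in zip(workers, sources):
        proc.stdin.write(src)
        proc.stdin.close()
    results = []
    for proc in workers:
        output = proc.stdout.read()
        proc.stdout.close()
        # A crashed worker only shows up as a non-zero exit code here;
        # it cannot take the parent process down with it.
        results.append((proc.wait(), output))
    return results
```

Each worker has its own address space and its own locks, so nothing here needs to be thread-safe. The price is that every input and output has to travel through a pipe.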

Now of course, all that comes with a downside: IPC is much more cumbersome than just calling a function in some library directly. So I’m wondering - how do people do that in C/C++ codebases?

(I’m getting flashbacks of CORBA from my early enterprisey days… but hey, that was last millennium, and seemed like a good idea at the time…)

Transport layer?

So there’s the question of what medium the processes will communicate over:

  • Files, named pipes, sockets, shared memory?
  • Roll your own code for one of the above?
  • Use some larger libraries like libuv, 0MQ, nanomsg or (shudder) boost::asio?

What I do right now is just some code for named pipes (on Windows) and stdin/stdout (on Unixes). We already had some code for that lying around anyway.
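The stdin/stdout flavor of this can be sketched roughly as below. This is a Python illustration, not our actual C++ code; the worker (which just reverses each line it receives) and the `Worker` class are hypothetical. The parent keeps one long-lived child process and exchanges one line per request and one line per response.

```python
import subprocess
import sys

# Hypothetical worker: answers each request line with one response line.
WORKER_CODE = """
import sys
for line in sys.stdin:
    sys.stdout.write("ok " + line.strip()[::-1] + "\\n")
    sys.stdout.flush()
"""


class Worker:
    """Parent-side handle to a child process driven over stdin/stdout."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_CODE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def request(self, text):
        # One line out, one line back; flush so the child sees it right away.
        self.proc.stdin.write(text + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        # Closing stdin gives the child EOF, which ends its read loop.
        self.proc.stdin.close()
        code = self.proc.wait()
        self.proc.stdout.close()
        return code
```

Usage would be along the lines of `w = Worker(); w.request("abc"); w.close()`. Both sides have to flush after every message, otherwise each one sits waiting for data stuck in the other's buffer.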

Message protocol?

And then there’s a question, how do you define the “communication protocol” between the processes. Ease of development, need (or no need) for backward/forward compatibility, robustness in presence of errors etc. all come into play.

  • Manually written, some binary message format?
  • Manually written, some text/line based protocol?
  • JSON, XML, YAML etc.?
  • Helper tools like protobuf or Cap’n Proto?

Right now I have a manually written, line-based message format. But it’s quite a laborious process to write all that communication code, especially when you also want to do some error handling. It’s not hard, but it is stupid, boring work, with a high chance of accidental bugs due to a bored programmer (me) copy-pasting nonsense.
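For a flavor of what such hand-written code looks like, here is a small sketch of one possible line-based format. The layout (`command key=value key=value`, values percent-encoded) is invented for this example and is not the format we actually use; `encode_message` and `decode_message` are hypothetical names.

```python
from urllib.parse import quote, unquote


def encode_message(command, fields):
    """Serialize a command and its string fields into a single line."""
    if not command.isalnum():
        raise ValueError(f"bad command name: {command!r}")
    parts = [command]
    for key, value in fields.items():
        # Percent-encode so spaces, '=' and newlines cannot break the framing.
        parts.append(f"{quote(key, safe='')}={quote(value, safe='')}")
    return " ".join(parts) + "\n"


def decode_message(line):
    """Parse one line into (command, fields); raise ValueError if malformed."""
    if not line.endswith("\n"):
        raise ValueError("truncated message (no newline)")
    tokens = line[:-1].split(" ")
    command, pairs = tokens[0], tokens[1:]
    if not command.isalnum():
        raise ValueError(f"bad command name: {command!r}")
    fields = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed field: {pair!r}")
        fields[unquote(key)] = unquote(value)
    return command, fields
```

Even this toy version needs escaping, truncation checks and malformed-field checks, and it still has no versioning or typed values. Multiply that by every message type and you get the boring work described above.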

Maybe I should use protobuf? (looked at Cap’n Proto, but can’t afford to use C++11 compilers yet)

Am I missing some easy, turnkey solution for IPC in C/C++?


Iceland Vacation Report

tl;dr: Just spent a week in Iceland and it was awesome!

Some folks have asked for impressions of my Iceland vacation or some advice, so here it goes. Caveats: my first (and only so far) trip there, we went with kids, came up with the itinerary ourselves (with some advice from friends), etc.

Our trip was one week; myself, my wife and two kids (10 and 4yo). If you’re going alone, or for a honeymoon, or with a group of friends, the experience might be very different. Without small kids, I’d go for longer than a week, and try to wander further away from main roads.

Planning

Summary of what we wanted:

  • Go in summer so it’s fairly warm.
  • Stay away from people ;) Well, at least a bit.
  • No serious hiking or climbing; just rent a car and go places.

Asking friends, reading the internets (wikitravel, wikipedia, lonely planet, random blogs), came up with a list of “places I’d like to go”. Used Google Maps Engine Lite to lay out the plan, and google maps to estimate driving times.

The plan was mostly to explore the northern part of Iceland, staying in Akureyri for 3 nights; and last two nights in Reykjavik.

We booked everything in advance. Larger places (Akureyri and Reykjavik) through airbnb – so far all my experiences with airbnb have been very positive, and it’s much nicer to stay in an actual apartment instead of some generic hotel/guesthouse. Towns in Iceland are extremely small though - Akureyri, being the 2nd largest city outside of Reykjavik area, has only 18,000 people. Which means airbnb is only really practical in Reykjavik & Akureyri. We booked some small cottages & guesthouses for several nights elsewhere (through tripadvisor, booking.com etc.).

Driving

Rented a car in advance as well. For the first Iceland trip decided to go “casual driving”. Car rental is expensive. In our case, we paid as much for a simple Renault Megane as we paid for all the housing. Rent a local GPS; neither Apple nor Google maps have very good road coverage, and cell connectivity might be shaky in more remote places.

Paved roads (the 1 “ring road” and most of the two-digit roads) are good quality but not wide. Larger gravel roads are okay. Smaller gravel roads are small and rocky – and we didn’t even go to the more mountainous places. A big chunk of the interior is only accessible by 4x4 vehicles; which we decided not to do this time.

Notes! When a sign says “blindhæd”, it means exactly that - a road goes through a top of some hill and you wouldn’t see a car approaching in front. Gas stations are around towns, but you can easily have 100km without a single station in between. Some clouds literally sit on the ground; and visibility while driving through that is really, really bad - couple dozen meters. Sheep often found on the gravel roads. A lot of bridges that are only wide enough for one car. Driving off-road is illegal to preserve vegetation (hey it takes several thousand years for even moss to start growing on lava fields).

Generally driving conditions are okay (in summer and good weather at least ;)), there’s little traffic going on, and other drivers are very considerate. When two cars have to pass by on a narrow road, one of them often carefully stops several hundred meters away to let the other through. For me, the hardest thing was just that driving 4-5 hours each day is tiring (hey my usual daily dose is 30 minutes! and I don’t like driving to begin with). That, and driving through the clouds - your eyes are used to scanning the road at least several hundred meters ahead, but you can’t quite do that in the cloud.

Next time I’m going there, I want to get a 4x4 and go to more remote places. The beauty and non-Earthiness of the landscapes is too stunning.

Next up: travel log with pictures. SPOILER ALERT!

Day 1: Þingvellir and Deildartunguhver

Landed in Keflavik airport past midnight, got our car and slept over in some guesthouse in Keflavik itself.

Þingvellir park with rift valley - somewhat too many people for my taste ;). Took smaller gravel roads up north. Surprise find - a lake with flat as a mirror surface; I didn’t even notice the lake at first. Sandkluftavatn is the name.

Deildartunguhver hot springs. Fairly impressive to see boiling water coming out of the earth, just like that.

Also, the smell! This is a common theme - Iceland has abundant hot water that’s used for heating & stuff, but most of it has that hard boiled eggs sulfur smell. They somehow do not mention the smell in, for example, Blue Lagoon advertising material ;)

Pathfinding in the GPS led us through some scary road where 7km lasted forever, mostly in 1st gear and trying to avoid damaging the car’s underside or rolling off a hill. A jeep would have been useful. A fence sign that could either be interpreted as “you’ll be shot for going there” or “no shooting here” provided some nice ambiguity! That was the only scary driving experience I had. Moral here: if you’re entering a road and wondering “I wonder if my car is really good for this”, turn around now. The road will not get better!

Rest of the day, highway up to Hvammstangi, slept over in small, simple & nice cottages. “Double story bed, yay!!!” – kids.

Day 2: to Akureyri

The plan was “just get from Hvammstangi to Akureyri”. Took a little detour to Skagafjördur.

Settled down in Akureyri, which we used as our “home base” for 3 nights. Really lovely town! Just small enough to be, well, small; and just large enough to have decent places to eat ;) Kids loved the swimming pool. Due to lots of natural hot water, swimming pools are everywhere in Iceland, and they are extremely cheap.

Day 3: Ásbyrgi, Dettifoss, Mývatn

Just found out now that our trip almost went along the “Diamond Circle” route. Akureyri -> Husavik -> Ásbyrgi -> Dettifoss -> Mývatn.

From Husavik people usually go on whale watching tours, but we only stopped for cupcakes.

Ásbyrgi canyon is impressive; hard to imagine all that being caused by water.

From the internets I imagined Ásbyrgi to be a cube of rock in the middle of nowhere; most of the photos show it like this. It’s not a cube; that’s just one end of a long wall.

Dettifoss is big, but I don’t have photos to do it justice. We went on the east side which is more gravel driving, but supposedly better view.

On our way back, accidental find - Hverarönd which gets you wondering “are we still on Earth?” - a bunch of fumaroles and mudpots.

Next up, Mývatn nature baths which folks say is a less touristy version of Blue Lagoon (we haven’t been to that one). Less crowded = good in my book; even if Mývatn ones are still quite crowded. Water from 36 to 45˚C (97 to 113 F), sulfur smell, oh my!

Drive back to Akureyri and observe sunlight scattering in distant cloud of rain.

Day 4: Godafoss, Dimmuborgir, Viti

Same area around Mývatn. Godafoss waterfall:

Dimmuborgir, which I wanted to check out if only because of Dimmu Borgir. It’s okay. Not metal though ;)

Cloud rolling over a mountain:

There’s also a Hverfjall crater right next to Dimmuborgir, but we decided not to climb it with kids. Next time?

Víti crater near Krafla, and some fumaroles right next to it.

Thermal power plants there look like some alien constructions, with pipes spanning vast distances. Here, Krafla power station:

Day 5: to Reykjavik

Long drive from Akureyri to Reykjavik. Unplanned find, Grábrók crater right next to the highway; in a group of 3 craters.

Nice fBm noise generator for the terrain you’ve got there, Iceland:

Arrive in Reykjavik, check out downtown. It’s full of colors!

Day 6: Geysir, Gullfoss

Geysir, the geyser that named them all, is mostly dormant now. However, Strokkur right next to it goes off every 3-5 minutes. There’s a lot of people there and I initially was wary of that (“them tourists ruin everything!”) but geysers are indeed impressive.

During one of the eruptions, we were standing a bit further away to get a better view. Either the wind blew stronger, or the eruption was higher, or both – but the water just landed onto all of us. Good thing it was not hot. Achievement unlocked: got soaked by the geyser!

Gullfoss:

And finally, friendly sheep joining us for our lunch stop:

Next time?

This time, we’ve mostly been to the north and some major attractions around Reykjavik. Did not see any glaciers up close, nor anything that is in the south or middle. I guess that’s left for the next time(s). Update: “next time” has happened in 2018!

Most of the photos above shot by my wife Aistė. I’ll just end the post with this picture. BAA!


Reviewing ALL THE CODE

I like to review ALL THE THINGS that are happening in our codebase. Currently we have about 70 programmers, mostly committing to a single Mercurial repository (into a ton of different branches), producing about 120 commits per day. I used to review all that using RhodeCode’s “journal” page, but Lucas taught me a much better way. So here it is.

Quick description of our current setup

We use Mercurial for source control, with largefiles extension for versioning big binary files.

Branches (“named branches”, i.e. not “bookmarks”) are used for branching. Joel’s hg init talks about using physically separate repositories to emulate branching, but don’t listen to that. That way lies madness. Mercurial’s branches work perfectly fine and are a much superior workflow (we used to use “separate repos as branches” in the past, back when we used Kiln - not recommended).

We use RhodeCode as a web interface to Mercurial, and to manage repositories, user permissions etc. It’s also used to do “pull requests” and for commenting on the commits.

1. Pull everything

Each day, pull all the branches into your local repository clone. Just hg pull (difference from normal workflow, where you pull only your current branch, hg pull -b .).

Now you have the history of everything on your own machine.

2. Review in SourceTree

Use SourceTree’s Log view and there you have the commits. Look at each and every one of them.

Next, set up a “custom action” in SourceTree to go to a commit in RhodeCode. So whenever I see a commit that I want to comment on, it’s just a right click away:

SourceTree is awesome by the way (and it’s now both on Windows and Mac)!
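One way such a custom action could work is a tiny script that takes the commit hash and opens the matching web page. The sketch below is a guess at that, not our actual setup: the server address and repository name are placeholders, and the `/<repo>/changeset/<sha>` URL layout is an assumption you should check against your own RhodeCode install.

```python
import webbrowser

# Hypothetical values: replace with your own server and repository name.
RHODECODE_BASE = "https://rhodecode.example.com"
REPO_NAME = "myrepo"


def changeset_url(sha, base=RHODECODE_BASE, repo=REPO_NAME):
    """Build the web URL for one commit (the URL layout is an assumption)."""
    sha = sha.strip().lower()
    if not sha or any(c not in "0123456789abcdef" for c in sha):
        raise ValueError(f"not a hex changeset id: {sha!r}")
    return f"{base.rstrip('/')}/{repo}/changeset/{sha}"


def main(argv):
    """Open the commit given as the first command line argument."""
    webbrowser.open(changeset_url(argv[1]))
```

Saved as a script that ends with `main(sys.argv)` (plus `import sys`), this could be registered as the custom action, with the commit hash passed in as the parameter. I believe SourceTree exposes that as `$SHA`, but treat that as an assumption too.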

3. Comment in RhodeCode

Add comments, approve/reject the commit etc.:

And well, that’s about it!

Clarifications

Why not use RhodeCode’s Journal page?

I used to do that for a long time, until I realized I’m wasting my time. The journal is okay to see that “some activity is happening”, but not terribly useful to get any real information:

I can see the commit SHAs, awesome! To see even the commit messages I have to hover over each of them and wait a second for the commit message to load via some AJAX. To see the actual commit, I have to open a new tab. At 100+ commits per day, that’s a massive waste of browser tabs!

Why not use Kiln?

We used to use Kiln indeed. Everything seemed nice and rosy until we hit massive scalability problems (team size grew, build farm size grew etc.). We had problems like build farm agents stalling the checkout for half an hour, just waiting for Kiln to respond (Kiln itself is the only gateway to the underlying Mercurial repository, so even the build farm had to go through it).

After many, many months of trying to find solutions to the scalability problems, we just gave up. No amount of configuration / platform / hardware tweaking seemed to help. That was Kiln 2.5 or so; they might have improved since then. But, once bitten twice shy.

Kiln still has the best code review UI I’ve ever seen though. If only it scaled to our size…

Seriously, you review everything?

Not really. In the areas where I’d have no clue what’s going on anyway (audio, networking, build infrastructure, …), I just glance at the commit messages. Plus, all the code (or most of it?) is reviewed by other people as well; usually folks who have some clue.

I tried tracking review time last week, and it looks like I’m spending about an hour each day reviewing code like this. Is that too low or too high? I don’t know.

There’s a rumor going on that my office is nothing but a giant wall of monitors for watching all the code. That is not true. Really. Don’t look at the wall to your left.

How many issues do you find this way?

3-5 minor issues each day. By far the most common one: accidentally committing some debugging code leftovers or totally unrelated files. More serious issues every few days, and a “stop! this is, like, totally wrong” maybe once a week.

Another side effect of reviewing everything, or at least reading commit messages: I can tell who just started doing what and preemptively prevent others from starting the same thing. Or relate a newly introduced problem (since these slip through code reviews anyway) to something that I remember was changed recently.