Another cool MSVC flag: /d1reportTime
A while ago I found about a fairly useful Visual C++ flag that helps to investigate where the
compiler backend/optimizer spends time – /d2cgsummary
, see
blog post.
Turns out, there is a flag that reports where the compiler frontend spends time – /d1reportTime
.
I’ve been writing about lack of compiler time profiling/investigation tools, starting from that same old post, up to the more recent ones that focused on clang - one and two. All this recent discussion about C++ compilation times (among other issues…) led me to finding this fresh project on github, UECompileTimesVisualizer by @phyronnaz. Ignore everything about UE4 there (unless you work with it of course :)), the new thing for me was that it says:
Add the following arguments to the C/C++ Command Line option in your project settings: /Bt+ /d2cgsummary /d1reportTime
/Bt+
and /d2cgsummary
are known, but /d1reportTime
I haven’t heard about before. What is it?!
The only other place on the internet that talks about it – and it’s fairly impressive for anything to have just two mentions
on the internet – is a small comment in Microsoft
forums thread a couple months ago, and the only thing it says
is: “Can you also share the output from /d1reportTime?”
So what is /d1reportTime?
Passing /d1reportTime
to the MSVC compiler (cl.exe
) will make it print:
- Which header files are included (hierarchically), with time taken for each,
- Which classes are being parsed, with time taken for each,
- Which functions are being parsed, with time taken for each.
Additionally, at end of each list, “top N” are listed too. List of classes/functions seems to contain info about which templates (or nested classes? not sure yet) are being instantiated as well.
I have no idea which version of MSVC the flag has appeared in. VS2015 seems to not have it yet; and VS2017 15.7 (compiler version 19.14.26428) already has it; it might have appeared in earlier VS2017 builds but I haven’t checked.
Running it on anything non-trivial produces tons of output, e.g. I randomly ran it on a (fairly big)
Shader.cpp
file we have in our codebase, and it produces 30 thousand lines of output. Shortened a lot:
Shader.cpp
Include Headers:
Count: 542
c:\trunk\runtime\shaders\shader.h: 0.311655s
c:\trunk\runtime\shaders\shadertags.h: 0.004842s
c:\trunk\runtime\shaders\shaderpropertysheet.h: 0.127829s
c:\trunk\runtime\math\color.h: 0.021522s
c:\trunk\runtime\serialize\serializeutility.h: 0.012031s
c:\trunk\runtime\serialize\serializationmetaflags.h: 0.005427s
c:\trunk\runtime\utilities\enumtraits.h: 0.004738s
c:\trunk\runtime\testing\testingforwarddecls.h: 0.000331s
c:\trunk\runtime\math\vector2.h: 0.003681s
<... a lot more skipped ...>
Top 24 (top-level only):
c:\trunk\runtime\graphics\scriptablerenderloop\scriptablerendercontext.h: 0.619189s
c:\trunk\runtime\graphics\scriptablerenderloop\scriptablebatchrenderer.h: 0.407027s
c:\trunk\runtime\shaders\shader.h: 0.311655s
c:\trunk\runtime\shaders\shadernameregistry.h: 0.263254s
<... a lot more skipped ...>
Total: 2.125453s
Class Definitions:
Count: 10381
ShaderTagID: 0.001146s
EnumTraits::internal::IsReflectableEnum<`template-type-parameter-1'>: 0.000057s
EnumTraits::internal::DefaultTraitsHelper<`template-type-parameter-1',1>: 0.000085s
EnumTraits::internal::DefaultTraitsHelper<`template-type-parameter-1',0>: 0.000064s
<... a lot more skipped ...>
Top 100 (top-level only):
std::_Vector_const_iterator<class std::_Vector_val<struct std::_Simple_types<struct ShaderLab::SerializedProperty> > >: 0.031536s
PersistentManager: 0.029135s
ModuleMetadata: 0.029010s
ShaderLab::SerializedSubProgram: 0.020047s
GlobalCallbacks: 0.019955s
<... a lot more skipped ...>
Total: 1.412664s
Function Definitions:
Count: 20544
ShaderTagID::ShaderTagID: 0.000368s
ShaderTagID::ShaderTagID: 0.000024s
operator ==: 0.000089s
operator !=: 0.000017s
operator <: 0.000021s
<... a lot more skipped ...>
Top 100 (top-level only):
std::operator ==: 0.031783s
ShaderLab::SerializedTagMap::Transfer: 0.014827s
math::matrixToQuat: 0.014715s
math::mul: 0.012467s
Texture::TextureIDMapInsert: 0.012006s
<... a lot more skipped ...>
Total: 2.658951s
At 30k lines it is not the most intuitive for manual use, but a lot of interesting information is there. Particularly the “Top N” lists sound like good things to start with, if looking at the output manually.
Even for something simple like (see in Compiler Explorer):
#include <vector>
int main()
{
std::vector<int> v;
v.push_back(13);
return (int)v.size();
}
The /d1reportTime
produces almost 2000 lines of output. Again, shortened here:
example.cpp
Include Headers:
Count: 74
c:\msvc\v19_16\include\vector: 0.215198s
c:\msvc\v19_16\include\xmemory: 0.186372s
c:\msvc\v19_16\include\xmemory0: 0.184278s
c:\msvc\v19_16\include\cstdint: 0.015689s
c:\msvc\v19_16\include\yvals.h: 0.015103s
c:\msvc\v19_16\include\yvals_core.h: 0.012148s
c:\msvc\v19_16\include\xkeycheck.h: 0.000246s
c:\msvc\v19_16\include\crtdefs.h: 0.011183s
<... skipped remaining 66 headers ...>
Top 1 (top-level only):
c:\msvc\v19_16\include\vector: 0.215198s
Total: 0.215198s
Class Definitions:
Count: 829
_PMD: 0.000260s
_TypeDescriptor: 0.000088s
_s__CatchableType: 0.000113s
_s__CatchableTypeArray: 0.000025s
_s__ThrowInfo: 0.000044s
_s__RTTIBaseClassDescriptor: 0.000056s
_s__RTTIBaseClassArray: 0.000027s
_s__RTTIClassHierarchyDescriptor: 0.000026s
_s__RTTICompleteObjectLocator2: 0.000048s
__s_GUID: 0.000049s
<add_rvalue_reference><`template-type-parameter-1'>: 0.000012s
<remove_reference><`template-type-parameter-1'>: 0.000006s
<remove_reference><class $RBAAB &>: 0.000005s
<... skipped remaining 816 classes ...>
Top 100 (top-level only):
std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >: 0.006266s
std::vector<int,class std::allocator<int> >: 0.003368s
std::pair<`template-type-parameter-1',`template-type-parameter-2'>: 0.002695s
std::basic_string<`template-type-parameter-1',`template-type-parameter-2',`template-type-parameter-3'>: 0.001880s
std::vector<`template-type-parameter-1',`template-type-parameter-2'>: 0.001117s
<... skipped remaining 95 top classes ...>
Total: 0.059558s
Function Definitions:
Count: 773
operator new: 0.000162s
operator delete: 0.000031s
operator new[]: 0.000025s
operator delete[]: 0.000009s
abs: 0.000048s
abs: 0.000021s
div: 0.000059s
div: 0.000043s
fpclassify: 0.000055s
<... skipped remaining 764 functions ...>
Top 100 (top-level only):
std::logic_error::logic_error: 0.006603s
main: 0.003693s
std::_Uninitialized_move: 0.003665s
std::vector<int,class std::allocator<int> >::push_back: 0.001054s
std::numeric_limits<bool>::round_error: 0.000730s
std::vector<int,class std::allocator<int> >::_Emplace_reallocate: 0.000612s
_scwprintf_p: 0.000414s
<... skipped remaining 93 top functions ...>
Total: 0.056901s
Having reports like that is also useful to realize just how much work we’re asking the compiler to do. The C++ snippet above is “as simple as it gets” when using “standard” C++ features. Just make a vector of integers, and add a number to it. The compiler ends up including 74 headers, parsing over 800 classes and almost 800 functions. Just to make a simple “push an integer into a vector” task.
Yes, I know about precompiled headers. And I know about upcoming C++ modules. You can save on sending that email :)
Anyway, having that tooling is really good! Thanks Microsoft Visual C++ team! Also, u/gratilup mentions in this reddit thread that MSVC team is working on ETW/WPA based tooling to visualize compilation times. Great!