    template
It's a fair question but, unfortunately, the answer isn't straightforward. Let's see if we can unravel this mystery.
Before we go any further, I should point out that this article is not aimed at someone who knows nothing about C++ templates, compilers or linkers. You need at least a basic understanding of all three; otherwise, what follows is likely to make little sense. Since these topics are outside the scope of this article, I leave it to the reader to do their own research. At the very least, you should have a basic understanding of C++ templates. If not, please refer to the following tutorials:
CPlusPlus.com:
http://www.cplusplus.com/d
C++ FAQ Lite
http://www.parashift.com/c
One of the things that often confuses an inexperienced C++ programmer, when first using templates, is why they can't put the declarations in the header file and the implementation in the .cpp file, just like they can with normal function or class definitions.
When C++ programs are compiled they are normally made up of a number of .cpp files with additional code included via header files. The generic term for a .cpp file and all of the headers it includes is "translation unit". Roughly speaking, a compiler translates the translation unit directly into an object file, hence the term translation unit.
Once all the translation units have been turned into object files, it is the job of the linker to join these object files together into one executable (or dynamic library). Part of the linking process is to resolve all symbols to ensure, for example, that if an object file requires a function, it is available in one of the object files being linked and is not defined more than once (it should be defined by exactly one object file). If a symbol can't be resolved by the linker, a linking error will result. Up until the point of linking, each translation unit and its resultant object file are completely agnostic, knowing nothing about each other.
So what does this have to do with templates? Well, to answer this we need to know how the template instantiation process works. It turns out that templates are parsed not once, but twice. This process is explicitly defined in the C++ standard; compilers that ignore it are, in effect, non-compliant and may behave differently from what this article describes. This article describes how template instantiation works according to the current C++03 standard. Let's take a look at what each of these passes does:
1. Point of Declaration (PoD)
During the first parse, called the Point of Declaration, the compiler checks the syntax of the template but does not consider the dependent types (the types within the template that depend on its template parameters). It is like checking the grammar of a paragraph without checking the meaning of the words (the semantics). Grammatically, the paragraph can be correct even though the arrangement of words has no useful meaning. During the grammar-checking phase we don't care about the meaning of the words, only that the paragraph is syntactically correct.
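To make the first pass concrete, here is a minimal sketch. The names `Gadget`, `describe` and `undeclared_helper` are hypothetical and chosen only for illustration. It contrasts an expression that depends on the template parameter (left unchecked until instantiation) with a mistake that does not depend on it (which a conforming compiler should reject straight away).

```cpp
#include <cassert>

// Hypothetical type, used only to instantiate the template below.
struct Gadget {
    int size() const { return 3; }
};

template <typename T>
int describe(T const& item)
{
    // Dependent expression: whether T really has size() is not checked
    // until the template is instantiated with a concrete T.
    int n = item.size();

    // A mistake that does not depend on T should be rejected in the
    // first pass by a conforming compiler, even if describe() is never
    // instantiated:
    // int m = undeclared_helper();   // error: name was never declared

    return n;
}

// Small usage check: Gadget has size(), so this instantiation is valid.
void check_describe()
{
    assert(describe(Gadget()) == 3);
}
```

Uncommenting the `undeclared_helper()` line is a quick way to see whether a given compiler performs the first pass as described.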
So consider the following template code...
    template
This is syntactically sound; however, at this point we have no idea what type the dependent type T is, so we simply assume that in every use of T it is correct to call member bar() on it. Of course, if type T doesn't have this member then we have a problem, but until we know what type T is we can't tell, so this code is OK for the first pass.
2. Point of Instantiation (PoI)
This is the point where we actually define a concrete type of our template. So consider these 2 concrete instantiations of the template defined above...
    foo(1); // this will fail the 2nd pass because an int (1 is an int) does not have a member function called bar()
    foo(b); // Assuming b has a member function called bar this instantiation is fine
NB: it is perfectly legal to define a template that won't be correct under all circumstances of instantiation. Since code for a template is not generated unless it is instantiated, the compiler will not complain unless you try to instantiate it with a type for which it is invalid.
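Putting the two passes together, the following is a sketch of how `foo` and its instantiations might look. Only the names `foo`, `T`, `b` and `bar()` come from the text. The by-const-reference signature, the `int` return value and the `HasBar` type are assumptions made so the example can be compiled and checked.

```cpp
#include <cassert>

// Sketch of the template discussed above; the signature and the int
// return value are assumptions for illustration.
template <typename T>
int foo(T const& t)
{
    return t.bar();   // only checked once T is known (second pass)
}

// Hypothetical type that satisfies the template's requirement.
struct HasBar {
    int bar() const { return 42; }
};

int instantiate_with_valid_type()
{
    HasBar b;
    // foo(1);       // would fail the 2nd pass: int has no member bar()
    return foo(b);   // fine: HasBar::bar() exists
}

// Small usage check.
void check_foo()
{
    assert(instantiate_with_valid_type() == 42);
}
```

With the `foo(1)` line left commented out the file compiles cleanly, even though `foo` could never work for `int`: the template is only checked against the types it is actually instantiated with.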
Now both the syntax and the semantics of the template are checked against the known dependent type to make sure that the generated code will be correct. To do this, the compiler must be able to see the full definition of the template. If the template is defined in a different translation unit from the one where it is being instantiated, the compiler has no way to perform this check, so the template will not be instantiated. Remember that each translation unit is agnostic; the compiler can only see and process one at a time. Now, if the template is only used in one translation unit and is defined in that same translation unit, this is not a problem. Of course, the whole point of a template is that it is generic code, so there is a very good chance it will be used in more than one place.
So, let's recap where we are so far. If the template definition is in translation unit A and you try to instantiate it in translation unit B, the compiler will not be able to instantiate the template because it can't see the full definition. Translation unit B still compiles, but it refers to a symbol that no object file defines, so the result is linker errors (undefined symbols). If everything is in one place then it will work, but this is not a good way to write templates. Sooner or later you'll probably end up using the template in other translation units, because it is highly unlikely (although not impossible) that you'd go to all the effort of creating a generic template class/function that you'll only ever use in one place.
So how do we structure our code so that the compiler can see the definition of the template in all translation units where it is instantiated? The solution is really quite simple: put the template's definition somewhere that is visible to all PoIs, and that is, of course, in a header. The header file can be included in both translation unit A and translation unit B, so the definition will be completely visible to the compiler in both.
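As a rough illustration, a header-only function template might look like the sketch below. The header name `max_of.hpp` and the function `max_of` are hypothetical. The header and a piece of client code are shown in a single listing so that it compiles on its own; in a real project the second part would live in a .cpp file that includes the header.

```cpp
#include <cassert>

// ---- max_of.hpp (hypothetical header) ----
#ifndef MAX_OF_HPP
#define MAX_OF_HPP

// Declaration and definition live together in the header, so every
// translation unit that includes it can instantiate the template.
template <typename T>
T const& max_of(T const& a, T const& b)
{
    return (a < b) ? b : a;
}

#endif // MAX_OF_HPP

// ---- any .cpp file that includes max_of.hpp ----
void check_max_of()
{
    assert(max_of(2, 5) == 5);   // instantiates max_of<int> here
}
```

Each translation unit that includes the header generates the instantiations it needs, and the linker is expected to merge identical ones.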
It's interesting to note that the C++ standard does define the "export" keyword to try to resolve this issue. The idea is that you prefix the declaration of the template with the export keyword, which tells the compiler to remember the definition for later reuse. It was introduced as a last-minute addition to the standard and has yet to be adopted by any mainstream compiler.
From a style point of view, if you want to preserve demarcation between declaration and definition with template classes you can still separate the class body and the member functions all in the same header.
    // First class declaration
    template
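A minimal sketch of this layout, using a hypothetical class template `Holder` (the class and its members are assumptions for illustration, not the original listing):

```cpp
#include <cassert>

// First, the class declaration: member functions are declared only.
template <typename T>
class Holder
{
public:
    explicit Holder(T const& value);
    T const& get() const;

private:
    T value_;
};

// Then, the member function definitions, still in the same header.
template <typename T>
Holder<T>::Holder(T const& value) : value_(value) {}

template <typename T>
T const& Holder<T>::get() const
{
    return value_;
}

// Small usage check.
void check_holder()
{
    Holder<int> h(7);
    assert(h.get() == 7);
}
```

The class body stays short and readable, while the definitions remain visible to every translation unit that includes the header.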
On the rare occasion that your template class/function is only going to be used in one translation unit, the declaration and definition should go in there together in an unnamed namespace. This will prevent you from later trying to use the template somewhere else and scratching your head over linker errors about unresolved symbols. Putting the template fully in the translation unit means the code won't even compile if you try to reference it from elsewhere, and the reason for that will be far more obvious.
    // My .cpp with a template I never plan to use elsewhere
    namespace {
    template
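As a sketch of what such a file could contain, here is one possibility; the function names `twice` and `doubled_answer` are hypothetical and stand in for whatever the real template does.

```cpp
#include <cassert>

// A .cpp file with a template that is not meant to be used elsewhere.
namespace {

// Inside an unnamed namespace the template is local to this
// translation unit; other translation units cannot name it.
template <typename T>
T twice(T const& value)
{
    return value + value;
}

} // unnamed namespace

int doubled_answer()
{
    return twice(21);   // used only within this .cpp file
}

// Small usage check.
void check_twice()
{
    assert(doubled_answer() == 42);
}
```

Only `doubled_answer()` would be declared in the corresponding header; the template itself never leaves the .cpp file.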
Now, as usual with C++, things are not as straightforward as they could be, because there is an exception to this rule about putting template code in headers. The exception is full specializations. Since full specializations, unlike templates themselves, are concrete entities (not templates that describe to the compiler how to instantiate a concrete entity implicitly), they have associated linker symbols, so they must go into the .cpp file (or be explicitly declared inline); otherwise they'll breach the "One Definition Rule". Also, specializations of template class member functions must be defined outside the class; they cannot be implicitly inline within the class body. Unfortunately, Visual Studio doesn't enforce this, which is wrong: the C++03 standard clearly states they must go outside the body.
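The sketch below shows the "explicitly declared inline" option for a header, under the assumption that the specializations are meant to stay alongside the primary template. The names `size_in_units` and `Tagger` are hypothetical.

```cpp
#include <cassert>

// Primary function template: safe to define in a header.
template <typename T>
int size_in_units(T const&)
{
    return 1;
}

// Full specialization: a concrete function with its own linker symbol.
// If it stays in a header it needs `inline` to avoid multiple
// definitions across translation units.
template <>
inline int size_in_units<bool>(bool const&)
{
    return 0;
}

// Class template with a member function declared in the class body.
template <typename T>
struct Tagger
{
    int tag() const;
};

template <typename T>
int Tagger<T>::tag() const
{
    return 1;
}

// Specialization of the member function, written outside the class
// body and explicitly marked inline.
template <>
inline int Tagger<char>::tag() const
{
    return 2;
}

// Small usage check.
void check_specializations()
{
    assert(size_in_units(3) == 1);
    assert(size_in_units(true) == 0);
    assert(Tagger<int>().tag() == 1);
    assert(Tagger<char>().tag() == 2);
}
```

The alternative described above is to keep only a declaration of each specialization in the header and move its definition into a single .cpp file.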
As a final note, there are other ways this issue of template declaration/definition separation can be resolved (such as putting the template definition in a .cpp file that you then include when needed); however, none of these are as simple or straightforward as just leaving the definition in the header, which is the generally accepted best practice.
I hope this article has helped demystify why templates are generally defined in header files, contrary to normal good coding practice.
For more on C++ templates I recommend the C++ Templates FAQ