-rw-r--r-- | doc/manual.cli | 399 |
1 files changed, 206 insertions, 193 deletions
diff --git a/doc/manual.cli b/doc/manual.cli index c88b825..cf1d5d2 100644 --- a/doc/manual.cli +++ b/doc/manual.cli @@ -755,7 +755,9 @@ the cross-referencing mechanism for linking a program from multiple, separately-compiled translation units. Not all external names end up becoming symbols and symbols are often \i{decorated} with additional information, for example, a namespace. We often talk about a symbol having to be satisfied by -linking an object file or a library that provides it. +linking an object file or a library that provides it. Similarly, duplicate +symbols issues may arise if more than one object file or library provides +the same symbol. What is a C++ module? It is hard to give a single but intuitive answer to this question. So we will try to answer it from three different perspectives: that @@ -773,14 +775,14 @@ build speedups since importing a module, unlike including a header, should be essentially free. Modules are also the first step to not needing the preprocessor in most translation units. Finally, modules have a chance of bringing to mainstream reliable and easy to setup distributed C++ compilation, -since now build systems can make sure compilers on the local and remote hosts -are provided with identical inputs. +since with modules build systems can make sure compilers on the local and +remote hosts are provided with identical inputs. To refer to a module we use a \i{module name}, a sequence of dot-separated identifiers, for example \c{hello.core}. While the specification does not assign any hierarchical semantics to this sequence, it is customary to refer -to \c{hello.core} as a submodule of \c{hello}. We discuss submodules and the -module naming guidelines below. +to \c{hello.core} as a submodule of \c{hello}. We discuss submodules and +provide the module naming guidelines below. 
From a consumer's perspective, a module is a collection of external names, called \i{module interface}, that become \i{visible} once the module is @@ -792,12 +794,13 @@ import hello.core What exactly does \i{visible} mean? To quote the standard: \i{An import-declaration makes exported declarations [...] visible to name lookup in -the current translation unit, in the same namespaces and contexts [...]}. One -intuitive way to think about this visibility is \i{as-if} there were only a -single translation unit for the entire program that contained all the modules -as well as all their consumers. In such a translation unit all the names would -be visible to everyone in exactly the same way and no entity would be -redeclared. +the current translation unit, in the same namespaces and contexts [...]. [ +Note: The entities are not redeclared in the translation unit containing the +module import declaration. -- end note ]} One intuitive way to think about +this visibility is \i{as-if} there were only a single translation unit for the +entire program that contained all the modules as well as all their +consumers. In such a translation unit all the names would be visible to +everyone in exactly the same way and no entity would be redeclared. This visibility semantics suggests that modules are not a name scoping mechanism and are orthogonal to namespaces. Specifically, a module can export @@ -805,9 +808,9 @@ names from any number of namespaces, including the global namespace. While the module name and its namespace names need not be related, it usually makes sense to have a parallel naming scheme, as discussed below. Finally, the \c{import} declaration does not imply any additional visibility for names -declared inside namespaces and to access such names we must continue using the -standard mechanisms, such as qualification or using declaration/directive. -For example: +declared inside namespaces. 
Specifically, to access such names we must +continue using the standard mechanisms, such as qualification or using +declaration/directive. For example: \ import hello.core; // Exports hello::say(). @@ -820,13 +823,13 @@ say (); // Ok. \ Note also that from the consumer's perspective a module does not provide -any symbols, only C++ entity names. If we use a name from a module, then we -may have to satisfy the corresponding symbol(s) using the usual mechanisms: +any symbols, only C++ entity names. If we use names from a module, then we +may have to satisfy the corresponding symbols using the usual mechanisms: link an object file or a library that provides them. In this respect, modules are similar to headers and as with headers module's use is not limited to libraries; they make perfect sense when structuring programs. Furthermore, a library may also have private or implementation modules that are not -meant to be used by the library's users. +meant to be consumed by the library's users. The producer perspective on modules is predictably more complex. In pre-modules C++ we only had one kind of translation unit (or source @@ -859,9 +862,9 @@ files, we recommend that a different extension be used to distinguish them as such, similar to header files. While the compiler vendors suggest various (and predictably different) extensions, our recommendation is \c{.mxx} for the \c{.hxx/.cxx} source file naming and \c{.mpp} for \c{.hpp/.cpp}. And if you -are using some other naming scheme, perhaps now is a good opportunity to -switch to one of the above. Using the source file extension for module -implementation units appears reasonable and that's what we recommend. +are using some other naming scheme, then perhaps now is a good opportunity to +switch to one of the above. Continuing using the source file extension for +module implementation units appears reasonable and that's what we recommend. 
A module declaration (exporting or non-exporting) starts a \i{module purview} that extends until the end of the module translation unit. Any name declared @@ -934,12 +937,12 @@ were used at all). Non-exported names, on the other hand, have \i{module linkage}: their symbols can be resolved from this module's units but not from other translation units. They also cannot clash with symbols for identical names from other modules (and non-modules). This is usually achieved by -decorating the non-exported symbols with a module name. +decorating the non-exported symbols with the module name. -This ownership model has one important backwards-compatibility implication: a +This ownership model has an important backwards compatibility implication: a library built with modules enabled can be linked to a program that still uses -headers. And vice versa: we can build and use a module for a library that was -built with headers. +headers. And even the other way around: we can build and use a module for a +library that was built with headers. What about the preprocessor? Modules do not export preprocessor macros, only C++ names. A macro defined in the module interface unit cannot affect @@ -981,8 +984,8 @@ confusingly indicate that there is no known conversion from a C string to \"something\" called \c{std::string}. But with the understanding of the difference between \c{import} and \c{#include} the reason should be clear: while the module interface \"sees\" \c{std::string} (because it imported its -module), we do not (since we did not). So the fix is to explicitly import -\c{std.core}: +module), we (the consumer) do not (since we did not). So the fix is to +explicitly import \c{std.core}: \ import std.core; @@ -999,7 +1002,7 @@ A module, however, can choose to re-export a module it imports. In this case, all the names from the imported module will also be visible to the importing module's consumers. 
For example, with this change to the module interface the first version of our consumer will compile without errors (note that whether -this is a good design choice is debatable): +this is a good design choice is debatable, as discussed below): \ export module hello; @@ -1032,7 +1035,7 @@ export \ Besides starting a module purview, a non-exporting module declaration in the -implemenation unit also makes non-internal linkage names declared or made +implementation unit also makes non-internal linkage names declared or made visible in the \i{interface purview} visible in the \i{implementation purview}. In this sense non-exporting module declaration acts as an extended \c{import}. For example: @@ -1048,7 +1051,7 @@ export module hello.extra; // Start of interface purview. import hello.core; // Visible (exports core()). void -extra (); // visible. +extra (); // Visible. static void extra2 (); // Not visible (internal linkage). @@ -1078,12 +1081,12 @@ The final perspective that we consider is that of the build system. From its point of view the central piece of the module infrastructure is the \i{binary module interface}: a binary file that is produced by compiling the module interface unit and that is required when compiling any translation unit that -imports this module (as well as the module's implementation units). +imports this module as well as the module's implementation units. -So, in a nutshell, the main functionality of a build system when it comes to +Then, in a nutshell, the main functionality of a build system when it comes to modules support is figuring out the order in which all the translation units -should be compiled and making sure that every compilation is able to find the -binary module interfaces it needs. +should be compiled and making sure that every compilation process is able to +find the binary module interfaces it needs. Predictably, the details are more complex. 
Compiling a module interface unit produces two outputs: the binary module interface and the object file. The @@ -1097,7 +1100,7 @@ interfaces must be discoverable as well, recursively. In fact, currently, all the imports are handled like this, though a different implementation is at least plausible, if unlikely. -While the details vary between compilers, the contents of the module binary +While the details vary between compilers, the contents of the binary module interface can range from a stream of preprocessed tokens to something fairly close to object code. As a result, binary interfaces can be sensitive to the compiler options and if the options used to produce the binary interface (for @@ -1118,7 +1121,7 @@ compile them, again, on the side. Compiler support for C++ Modules is still experimental. As a result, it is currently only enabled if the C++ standard is set to \c{experimental}. After loading the \c{cxx} module we can check if modules are enabled using the -\c{cxx.features.modules} boolean variable. This is what the corresponding +\c{cxx.features.modules} boolean variable. This is what the relevant \c{root.build} fragment could look like for a modularized project: \ @@ -1126,13 +1129,13 @@ cxx.std = experimental using cxx -assert $cxx.features.modules 'c++ compiler does not support modules' +assert $cxx.features.modules 'compiler does not support modules' mxx{*}: extension = mxx cxx{*}: extension = cxx \ -To support C++ modules the \c{cxx} (build system) module defines several +To support C++ modules the \c{cxx} module (build system) defines several additional target types. The \c{mxx{\}} target is a module interface unit. As you can see from the above \c{root.build} fragment, in this project we are using the \c{.mxx} extension for our module interface files. While @@ -1140,18 +1143,18 @@ you can use the same extension as for \c{cxx{\}} (source files), this is not recommended since some functionality, such as wildcard patterns, will become unusable. 
-The \c{bmi{\}} group and its \c{bmie{\}}, \c{bmia{\}}, and \c{bmis{\}} -members are used for binary module interfaces targets. We normally do -not need to mention them explicitly in our buildfiles except, perhaps, -to specify additional, module interface-specific compile options. We -will see some example of this below. +The \c{bmi{\}} group and its \c{bmie{\}}, \c{bmia{\}}, and \c{bmis{\}} members +are used to represent binary module interfaces targets. We normally do not +need to mention them explicitly in our buildfiles except, perhaps, to specify +additional, module interface-specific compile options. We will see some +examples of this below. To build a modularized executable or library we simply list the module -interfaces as its prerequisites, just as we do source files. As an example, -let's build the \c{hello} program that we have started in the introduction -(you can find the complete project in the \l{https://build2.org/pkg/hello -Hello Repository} under \c{mhello}). Specifically, we assume our project -contains the following files: +interfaces as its prerequisites, just as we do for source files. As an +example, let's build the \c{hello} program that we have started in the +introduction (you can find the complete project in the +\l{https://build2.org/pkg/hello Hello Repository} under +\c{mhello}). Specifically, we assume our project contains the following files: \ // file: hello.mxx (module interface) @@ -1214,9 +1217,10 @@ exe{hello}: cxx{driver} lib{hello} lib{hello}: {mxx cxx}{hello} \ -As you might have surmised from the above, the modules support implementation -automatically resolves imports to module interface units that are specified -either as direct prerequisites or as prerequisites of library prerequisites. +As you might have surmised from this example, the modules support in +\c{build2} automatically resolves imports to module interface units that are +specified either as direct prerequisites or as prerequisites of library +prerequisites. 
To perform this resolution without a significant overhead, the implementation delays the extraction of the actual module name from module interface units @@ -1258,10 +1262,11 @@ they are already in the project directory called \c{hello/}. This will work since our module names can still be guessed correctly and unambiguously. If a guess turns out to be incorrect, the implementation issues diagnostics -and exits with an error. To resolve this situation we can either adjust the -interface file names or we can specify the module name explicitly with the -\c{cc.module_name} variable. The latter approach can be used with interface -file names that have nothing in common with module names, for example: +and exits with an error before attempting to build anything. To resolve this +situation we can either adjust the interface file names or we can specify the +module name explicitly with the \c{cc.module_name} variable. The latter +approach can be used with interface file names that have nothing in common +with module names, for example: \ mxx{foobar}@./: cc.module_name = hello @@ -1301,7 +1306,7 @@ module hello; \ Note, however, that the modules support in \c{build2} provides temporary -\"magic\" that allows us to use the new syntax even with VC. +\"magic\" that allows us to use the new syntax even with VC (don't ask how). \h2#cxx-modules-symexport|Module Symbols Exporting| @@ -1336,9 +1341,9 @@ have a unique per-library name (that \c{LIBHELLO_} prefix) because a header from one library can be included while building another library. We can continue using the same export macro and header with modules and, in -fact, that's the recommended approach when maintaining dual, header/module -arrangements for backwards compatibility (discussed below). However, for -module-only codebases, we have an opportunity to improve the situation in two +fact, that's the recommended approach when maintaining the dual, header/module +arrangement for backwards compatibility (discussed below). 
However, for +modules-only codebases, we have an opportunity to improve the situation in two ways: we can use a single, keyword-like macro instead of a library-specific one and we can make the build system manage it for us thus getting rid of the export header. @@ -1446,25 +1451,25 @@ line continuations. Modules are a physical design mechanism for structuring and organizing our code. Their explicit exportation semantics combined with the way modules are -built makes many aspects of creating and consuming modules significantly +built make many aspects of creating and consuming modules significantly different compared to headers. This section provides basic guidelines for designing modules. We start with the overall considerations such as module granularity and partitioning into translation units then continue with the structure of typical module interface and implementation units. The following -section disscusses practical approaches to modularizing existing code and -providing the dual, header/module interface for backwards-compatibility. +section discusses practical approaches to modularizing existing code and +providing dual, header/module interfaces for backwards-compatibility. Unlike headers, the cost of importing modules should be negligible. As a result, it may be tempting to create \"mega-modules\", for example, one per library. After all, this is how the standard library is modularized with its fairly large \c{std.core} and \c{std.io} modules. -There is, however, a significant drawback to going this route: every time we -make a change, all consumers of such a mega-module will have to be recompiled, +There is, however, a significant drawback to this choice: every time we make a +change, all consumers of such a mega-module will have to be recompiled, whether the change affects them or not. And the bigger the module the higher -the chance that the change does not affect a large portion of the consumers. 
-Note that this is not an issue for the standard library modules since they are -not expected to change often. +the chance that any given change does not affect a large portion of the +module's consumers. Note also that this is not an issue for the standard +library modules since they are not expected to change often. Another, more subtle, issue with mega-modules (which does affect the standard library) is the inability to re-export only specific interfaces, as will be @@ -1475,50 +1480,49 @@ The other extreme in choosing module granularity is a large number of consumers. The sensible approach is then to create modules of conceptually-related and -commonly-used entities possibly together with aggregate modules for ease of -importation. Which also happens to be generally good design. +commonly-used entities possibly complemented with aggregate modules for ease +of importation. This also happens to be generally good design. As an example, let's consider an XML library that provides support for both parsing and serialization. Since it is common for applications to only use one -of the functionalities, it probably makes sense to provide the \c{xml.parser} -and \c{xml.serializer} modules. While it is not too tedious to import both, -for convenience we could also provide the \c{xml} module that re-exports the -other two. +of the functionalities, it makes sense to provide the \c{xml.parser} and +\c{xml.serializer} modules. While it is not too tedious to import both, for +convenience we could also provide the \c{xml} module that re-exports the two. Once we are past selecting an appropriate granularity for our modules, the next question is how to partition them into translation units. A module can consist of just the interface unit and, as discussed above, such a unit can contain anything an implementation unit can, including non-inline function -definitions. Some then view this as an opportunity to get rid of the +definitions. 
Some may then view this as an opportunity to get rid of the header/source separation and have everything in a single file. There are a number of drawbacks with this approach: Every time we change anything in the module interface unit, all its consumers have to be recompiled. If we keep everything in a single file, then every time we change -the implementation we will trigger a recompliations that would have been -avoided had the implementation been factored out into a separate unit. +the implementation we trigger a recompilations that would have been avoided +had the implementation been factored out into a separate unit. Another issues is the readability of the interface which could be significantly reduced if littered with implementation details. We could keep the interface separate by moving the implementation to the bottom of the -interface file but then we might as well move it to a separate file and avoid -unnecessary recompilations. +interface file but then we might as well move it into a separate file and +avoid the unnecessary recompilations. The sensible guideline is then to have a separate module implementation unit -exept perhaps for modules with a simple implementation that is mostly -inline/template. Note that more complex modules may have sevaral +except perhaps for modules with a simple implementation that is mostly +inline/template. Note that more complex modules may have several implementation units, however, based on our granularity guideline, those -should be fairly rare. +should be rare. Once we start writing our first real module the immediate question that -ususally comes up is where to put \c{#include} directives and \c{import} +normally comes up is where to put \c{#include} directives and \c{import} declarations and in what order. 
To recap, a module unit, both interface and implementation, is split into two parts: before the module declaration which obeys the usual or \"old\" translation unit rules and after the module declaration which is the module purview. Inside the module purview all non-exported declarations have module linkage which means their symbols are invisible to any other module (including the global module). With this -understandig, consider the following module interface: +understanding, consider the following module interface: \ export module hello; @@ -1532,16 +1536,16 @@ include, recursively) are now declared as having the \c{hello} module linkage. The result of doing this can range from silent code blot to strange-looking unresolved symbols. -The guideline this leads to should be clear: including a header in module +The guideline this leads to should be clear: including a header in the module purview is almost always a bad idea. There are, however, a few types of headers that may make sense to include in the module purview. The first are headers that only define preprocessor macros, for example, configuration or export headers. There are also cases where we do want the included -declarations to end up in the module purview. The most common example is files -that contain inline/template function implementations that have been factored -out for code organization reasons. As an example, consider the following -module interface that uses an export headers (which sets up symbols exporting -macros) as well as an inline file: +declarations to end up in the module purview. The most common example is +inline/template function implementations that have been factored out into +separate files for code organization reasons. 
As an example, consider the +following module interface that uses an export headers (which presumably sets +up symbols exporting macros) as well as an inline file: \ #include <string> @@ -1560,15 +1564,16 @@ export namespace hello A note on inline/template files: in header-based projects we could include additional headers in those files, for example, if the included declarations -are only needed in the implementation. For the reason just discussed, this -won't work with modules and we have to move all the includes into the +are only needed in the implementation. For the reasons just discussed, this +does not work with modules and we have to move all the includes into the interface file, before the module purview. On the other hand, with modules, it -is safe to use using-directives (for example, \c{using namespace std;}) in -inline/template files (and, with care, even in the interface file). +is safe to use namespace-level using-directives (for example, \c{using +namespace std;}) in inline/template files (and, with care, even in the +interface file). -What about imports, where should we import other modules. Again, to recap, -unlike a header inclusing, an \c{import} declaration only makes exported names -visible without (re)declaring them. As result, in a module implementation +What about imports, where should we import other modules? Again, to recap, +unlike a header inclusion, an \c{import} declaration only makes exported names +visible without redeclaring them. As result, in module implementation units, it doesn't really matter where we place imports, in or out of the module purview. There are, however, two differences when it comes to module interface units: only imports in the purview are visible to implementation @@ -1576,7 +1581,7 @@ units and we can only re-export an imported module from the purview. 
The guideline is then for interface units to import in the module purview unless there is a good reason not to make the import visible to the -implementation units. And for implementation units is to always import in the +implementation units. And for implementation units to always import in the purview for consistency. For example: \ @@ -1596,7 +1601,8 @@ export namespace hello #include <libhello/hello.ixx> \ -Based on these guidelines we can also create a module interface unit template: +By putting all these guidelines together we can then create a module interface +unit template: \ // Module interface unit. @@ -1628,63 +1634,62 @@ module <name>; // Start of module purview. <module implementation> \ -Let's also discuss module naming. Module names are in a separate \"name -plane\" and do not collide with namespace, type, or function names. Also, as -mentioned earlier, the standard does not assign a hierarchical meaning to -module names though it is customary to assume that module \c{hello.core} -is a submodule of \c{hello} and importing the latter also imports the -former. +Let's now discuss module naming. Module names are in a separate \"name plane\" +and do not collide with namespace, type, or function names. Also, as mentioned +earlier, the standard does not assign a hierarchical meaning to module names +though it is customary to assume module \c{hello.core} is a submodule of +\c{hello} and importing the latter also imports the former. It is important to choose good names for public modules (that is, modules packaged into libraries and used by a wide range of consumers) since changing them later can be costly. We have more leeway with naming private modules -(that is, the ones used by programs or internal to the libraries) though it's -worth it to come up with a consistent naming scheme here as well. +(that is, the ones used by programs or internal to libraries) though it's +worth coming up with a consistent naming scheme here as well. 
The general guideline is to start names of public modules with the library's namespace name followed by a name describing the module's functionality. In -particular, if a module is dedicated to housing a single class (or, more -generally, has a single primary entiry), then it makes sense to use its name -as the module name's last component. +particular, if a module is dedicated to a single class (or, more generally, +has a single primary entity), then it makes sense to use its name as the +module name's last component. As a concrete example, consider \c{libbutl} (the \c{build2} utility library): All its components are in the \c{butl} namespace so all its module names start -with \c{butl.}. One of its components is the \c{small_vector} class template +with \c{butl.} One of its components is the \c{small_vector} class template which resides in its own module called \c{butl.small_vector}. Another component is a collection of string parsing utilities that are grouped into -the \c{butl::string_parser} namespace with the corresponding module name -called \c{butl.string_parser}. +the \c{butl::string_parser} namespace with the corresponding module called +\c{butl.string_parser}. -When is it a good idea to re-export a module? The two straightfowards cases +When is it a good idea to re-export a module? The two straightforward cases are when we are building an aggregate module out of submodules, for example, \c{xml} out of \c{xml.parser} and \c{xml.serializer}, or when one module -extends or superceeds another, for example, as \c{std.core} extends +extends or supersedes another, for example, as \c{std.core} extends \c{std.fundamental}. It is also clear that there is no need to re-export a -module that we only use in the implementation of our module. The case when we -use a module in our interface is, however, a lot less clear cut. +module that we only use in the implementation. The case when we use a module +in our interface is, however, a lot less clear cut. 
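As a rough sketch of the aggregate case, the interface unit of the \c{xml} module could consist of nothing but re-exports. The \c{xml.mxx} file name is an assumption made for illustration, and the fragment presumes that the \c{xml.parser} and \c{xml.serializer} interfaces are provided elsewhere in the project, so it will not compile on its own:

```cpp
// file: xml.mxx (hypothetical aggregate module interface)

export module xml;

// Re-export both submodules so that importing xml also makes their exported
// names visible to the consumer.
//
export import xml.parser;
export import xml.serializer;
```

With such an interface, a consumer that needs both functionalities can write a single \c{import xml;} while a consumer that only parses can still import \c{xml.parser} alone.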
But before considering the last case in more detail, let's understand the issue with re-export. In other words, why not simply re-export any module we -import in our interface? In essence, re-export implictly injects another +import in our interface? In essence, re-export implicitly injects another module import anywhere our module is imported. If we re-export \c{std.core} -then any consumer of our module will also automatically \"see\" all the names +then consumers of our module will also automatically \"see\" all the names exported by \c{std.core}. They can then start using names from \c{std} without -explicitly importing \c{std.core} and everthing will compile until one day -they may no longer need to import our module or we no longer need to import +explicitly importing \c{std.core} and everything will compile until one day +they no longer need to import our module or we no longer need to import \c{std.core}. In a sense, re-export becomes part of our interface and it is generally good design to keep interfaces minimal. And so, at the outset, the guideline is then to only re-export the minimum -necessary (and which is the reason why it may make sense to divide +necessary. This, by the way, is the reason why it may make sense to divide \c{std.core} into submodules such as \c{std.core.string}, \c{std.core.vector}, -etc). +etc. -Let's now discuss a few concere examples to get a sense of when re-export +Let's now discuss a few concrete examples to get a sense of when re-export might or might not be appropriate. Unfortunately, there does not seem to be a -hard and fast rule and instead one has to rely on a good sense of design. +hard and fast rule and instead one has to rely on their good sense of design. To start, let's consider a simple module that uses \c{std::string} in its -inteface: +interface: \ export module hello; @@ -1736,8 +1741,8 @@ if (a == b) // Error. \ We don't reference \c{std::vector} directly so presumably we shouldn't need to -import its module. 
However, the comparion won't compile: our \c{small_vector} -implementation re-uses the comparion operators provided by \c{std::vector} +import its module. However, the comparison won't compile: our \c{small_vector} +implementation re-uses the comparison operators provided by \c{std::vector} (via implicit to-base conversion) but they aren't visible. There is palpable difference between the two cases: the first merely uses @@ -1756,7 +1761,7 @@ incur some development overhead compared to the old, headers-only approach. \h2#cxx-modules-existing|Modularizing Existing Code| -The aim of this section is to provide practical guideliness to modularizing +The aim of this section is to provide practical guidelines to modularizing existing codebases as well as supporting the dual, header/module interface for backwards-compatibility. @@ -1781,11 +1786,12 @@ export There are several issue that usually make this unworkable. Firstly, the header we are trying to export most likely includes other headers. For example, our \c{hello.hxx} may include \c{<string>} and we have already discussed why -including it in the module purview is a bad idea. Secondly, the included -header may declare more names than what should be exported, for example, some -implementation details. In fact, it may declare names with local linkage -(uncommon for headers but not impossible) which is illegal to export. Finally, -the header may define macros which will no longer be visible to the consumer. +including it in the module purview, let alone exporting its names, is a bad +idea. Secondly, the included header may declare more names than what should be +exported, for example, some implementation details. In fact, it may declare +names with internal linkage (uncommon for headers but not impossible) which +are illegal to export. Finally, the header may define macros which will no +longer be visible to the consumers. 
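To make these issues more concrete, here is a sketch of what such a header might contain. All the names in it (\c{HELLO_GREETING}, \c{detail::decorate()}, \c{clamp_width()}) are made up for illustration and are not taken from the actual \c{hello} project:

```cpp
// Sketch of what a hypothetical hello.hxx might contain. All the names here
// are made up for illustration.

#include <string> // Would end up included in the module purview.

#define HELLO_GREETING "Hello" // Macro: not visible to module consumers.

namespace hello
{
  namespace detail
  {
    // Implementation detail that was never meant to be exported.
    //
    inline std::string
    decorate (const std::string& name)
    {
      return std::string (HELLO_GREETING) + ", " + name + "!";
    }
  }

  // Internal linkage name: illegal to export.
  //
  static int
  clamp_width (int w)
  {
    return w < 0 ? 0 : w;
  }

  // The only function we actually want in the module interface.
  //
  inline std::string
  say (const std::string& name)
  {
    return detail::decorate (name);
  }
}
```

Wrapping the inclusion of a header like this in an export block would attempt to export everything above, including the declarations from \c{<string>} and the internal linkage \c{clamp_width()}, while \c{HELLO_GREETING} would no longer reach the consumers.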
Sometimes, however, this can be the only approach available (for example, if trying to non-intrusively modularize a third-party library). It is possible to @@ -1794,7 +1800,7 @@ headers that should not be exported. Here we rely on the fact that the second inclusion of the same header will be ignored. For example: \ -#include <string> // Pre-include to suppress inclusion in hello.hxx. +#include <string> // Pre-include to suppress inclusion below. export module hello; @@ -1805,10 +1811,11 @@ export \ Needless to say this approach is very brittle and usually requires that you -place all the inter-related headers into a single module. +place all the inter-related headers into a single module. As a result, its use +is best limited to exploratory modularization and early prototyping. When starting modularization of a codebase there are two decisions we have to -make at the outset: the level of the modules support we can rely upon and the +make at the outset: the level of the C++ modules support we can assume and the level of backwards compatibility we need to provide. The two modules support levels we distinguish are just modules and modules @@ -1828,51 +1835,49 @@ offering the library as headers if we have a large number of existing consumers that cannot possibly be all modularized at once (or even ever). So the situation we may end up in is a mixture of consumers trying to use the same build of our library with some of them using modules and some \- -headers. The situation where we may want to consume a library built with -headers via modules is also not far fetched: the library might have been built -with an older version of the compiler (for example, it was installed from a -distribution's package) while the consumer is being built with a compiler -version that supports modules. Note that as discussed earlier the modules -ownership semantics supports both kinds of \"cross-usage\". +headers. 
The case where we may want to consume a library built with headers +via modules is not as far fetched as it may seem: the library might have been +built with an older version of the compiler (for example, it was installed +from a distribution's package) while the consumer is being built with a +compiler version that supports modules. Note also that as discussed earlier +the modules ownership semantics supports both kinds of such \"cross-usage\". Generally, compiler implementations do not support mixing inclusion and importation of the same entities in the same translation unit. This makes migration tricky if you plan to use the modularized standard library because -of its parvasive use. There are two plausible strategies to handling this +of its pervasive use. There are two plausible strategies to handling this aspect of migration: If you are planning to consume the standard library exclusively as modules, then it may make sense to first change your entire codebase to do that. Simply replace all the standard library header inclusions with importation of the relevant \c{std.*} modules. -The alternative strategy is to first complete the modularization of your -entire project (as discussed next) while continuing consuming the standard -library as headers. Once this is done, we can normally switch to using the -modularized standard library quite easily. The reason for waiting until the -complete modularization is to eliminate header inclusion between components in -our project which would often result in conflicting styles of the standard -library consumption. +The alternative strategy is to first complete the modularization of our entire +project (as discussed next) while continuing consuming the standard library as +headers. Once this is done, we can normally switch to using the modularized +standard library quite easily. 
The reason for waiting until the complete +modularization is to eliminate header inclusions between components which +would often result in conflicting styles of the standard library consumption. Note also that due to the lack of header re-export support discussed earlier, it may make perfect sense to only support the modularized standard library when modules are enabled even when providing backwards compatibility with headers. In fact, if all the compiler/standard library implementations that your project caters to support the modularize standard library, then there is -little sense not to impose such as restriction. +little sense not to impose such a restriction. -The overall strategy for modularizing our own componets is to identify and +The overall strategy for modularizing our own components is to identify and modularize inter-dependent sets of headers one at a time starting from the -lower-level components (so that any newly modularized set only depends on the -already modularized ones). After converting each set we can switch its +lower-level components. This way any newly modularized set will only depend on +the already modularized ones. After converting each set we can switch its consumers to using imports keeping our entire project buildable and usable. -While it would have been even better to be able modularize just a single -component at a time, this does not seem to work in practice because we will -have to continue consuming some of the components as headers. Since such -headers can only be imported out of module purview it becomes hard to reason -(both for us and the compiler) what is imported/included and where. For -example, it's not uncommon to end up importing the module in its -implementation unit which not something that all implementations handle -garcefully. 
+While ideally we would want to be able modularize just a single component at a +time, this does not seem to work in practice because we will have to continue +consuming some of the components as headers. Since such headers can only be +imported out of the module purview, it becomes hard to reason (both for us and +often the compiler) what is imported/included and where. For example, it's not +uncommon to end up importing the module in its implementation unit which not +something that all the compilers can handle gracefully. Let's now explore how we can provide the various levels of backwards compatibility discussed above. Here we rely on two feature test macros to @@ -1885,7 +1890,7 @@ we can use the module interface and implementation unit templates presented earlier and follow the above guidelines. If we continue consuming the standard library as headers, then we don't need to change anything in this area. If we only want to support the modularized standard library, then we simply replace -the standard library header inclusions with the corresponing module +the standard library header inclusions with the corresponding module imports. If we want to support both ways, then we can use the following templates. The module interface unit template: @@ -1893,7 +1898,7 @@ templates. The module interface unit template: // C includes, if any. #ifndef __cpp_lib_modules -<std-includes> +<std includes> #endif // Other includes, if any. @@ -1901,8 +1906,10 @@ templates. The module interface unit template: export module <name>; #ifdef __cpp_lib_modules -<std-imports> +<std imports> #endif + +<module interface> \ The module implementation unit template: @@ -1911,9 +1918,9 @@ The module implementation unit template: // C includes, if any. #ifndef __cpp_lib_modules -<std-includes> +<std includes> -<extra-std-includes> +<extra std includes> #endif // Other includes, if any. 
@@ -1921,8 +1928,10 @@ The module implementation unit template: module <name>; #ifdef __cpp_lib_modules -<extra-std-imports> // Only imports additional to interface. +<extra std imports> // Only additional to interface. #endif + +<module implementation> \ For example: @@ -1996,7 +2005,7 @@ unit template: // C includes, if any. #ifndef __cpp_lib_modules -<std-includes> +<std includes> #endif // Other includes, if any. @@ -2005,24 +2014,26 @@ unit template: export module <name>; #ifdef __cpp_lib_modules -<std-imports> +<std imports> #endif #endif + +<module interface> \ The module implementation unit template: \ #ifndef __cpp_modules -#include <module-interface-file> +#include <module interface file> #endif // C includes, if any. #ifndef __cpp_lib_modules -<std-includes> +<std includes> -<extra-std-includes> +<extra std includes> #endif // Other includes, if any @@ -2031,15 +2042,17 @@ The module implementation unit template: module <name>; #ifdef __cpp_lib_modules -<extra-std-imports> // Only imports additional to interface. +<extra std imports> // Only additional to interface. #endif #endif + +<module implementation> \ Besides these templates we will most likely also need an export header that appropriately defines a module export macro depending on whether modules are used or not. This is also the place where we can handle symbol exporting. For -example, here is what it can look like for our \c{libhello} library: +example, here is what it could look like for our \c{libhello} library: \ // export.hxx (module and symbol export) @@ -2136,25 +2149,25 @@ inclusion or importation depending on the modules support availability, for example: \ -#ifndef __cpp_modules -#include <libhello/hello.mxx> -#else +#ifdef __cpp_modules import hello; +#else +#include <libhello/hello.mxx> #endif \ -Predictably, the final backwards compatibility level (\c{modules-and-headers}) +Predictably, the final backwards compatibility level (\i{modules-and-headers}) is the most onerous to support. 
Here existing consumers have to continue working with the modularized version of our library which means we have to -retain all the existing headers. We also cannot assume that just because +retain all the existing header files. We also cannot assume that just because modules are available they are used (a consumer may still prefer headers), which means we cannot rely on (only) the \c{__cpp_modules} and \c{__cpp_lib_modules} macros to make the decisions. -One way to arrange this is to retain the header and adjust it according to the -previous level template but with one important difference: instead of using -the standard modules macro we use our custom ones (we can also have -unconditional \c{#pragma once}). For example: +One way to arrange this is to retain the headers and adjust them according to +the \i{modules-or-headers} template but with one important difference: instead +of using the standard module macros we use our custom ones (and we can also +have unconditional \c{#pragma once}). For example: \ // hello.hxx (module header) @@ -2182,12 +2195,12 @@ LIBHELLO_MODEXPORT namespace hello \ Now if this header is included (for example, by an existing consumer) then -none of these macros will be defined and the header will act as, well, a plain -old header. Note that we will also need to make the equivalent change in the -export header. +none of the \c{LIBHELLO_*MODULES} macros will be defined and the header will +act as, well, a plain old header. Note that we will also need to make the +equivalent change in the export header. -We also provide the module interface unit which appropriately defines the two -custom macros and then simply includes the header: +We also provide the module interface files which appropriately define the two +custom macros and then simply includes the corresponding headers: \ // hello.mxx (module interface) @@ -2206,8 +2219,8 @@ custom macros and then simply includes the header: The module implementation unit can remain unchanged. 
In particular, we continue including \c{hello.mxx} if modules support is unavailable. However, if you find the use of different macros in the header and source files -confusing, then instead it can be adjusted as follows (note that now we are -including \c{hello.hxx}): +confusing, then instead it can be adjusted as follows (note also that now we +are including \c{hello.hxx}): \ // hello.cxx (module implementation) @@ -2241,13 +2254,13 @@ import std.io; ... \ -In this case it may also make sense to factor the \c{*_MODULES} macro -defintions into a common header. +In this case it may also make sense to factor the \c{LIBHELLO_*MODULES} macro +definitions into a common header. In the \i{modules-and-headers} setup the existing consumers that would like to continue using headers don't require any changes. And for those that would like to use module if available the arrangement is the same as for the -previous compatibility level. +\i{modules-or-headers} compatibility level. If our module needs to \"export\" macros then the recommended approach is to simply provide an additional header that the consumer includes. While it might |