-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
One Generation Step For All Files #8
Open
LittleCoinCoin
wants to merge
11
commits into
jsoysouvanh:dev
Choose a base branch
from
LittleCoinCoin:one-gen-for-all-files
base: dev
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
One Generation Step For All Files #8
LittleCoinCoin
wants to merge
11
commits into
jsoysouvanh:dev
from
LittleCoinCoin:one-gen-for-all-files
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Major: - The toml dependency of Kodgen extensively uses the value of macro *__cplusplus* in order to discriminate which version of the *std* is used. - However, [since Visual Studio 2017 15.7](https://learn.microsoft.com/en-us/cpp/build/reference/zc-cplusplus?view=msvc-170), Microsoft leaves the value of *__cplusplus* equal to 199711L unless the compile option `/Zc:__cplusplus` is used. - Should `/Zc:__cplusplus` be used, the value of *__cplusplus* will be updated depending on the value of `/std:c++XX` (from /std:c++14 onward) - Therefore, if we want Cmake instruction `target_compile_features(${KodgenTargetLibrary} PUBLIC cxx_std_17)` to have the expected reach when using MSVC (after 15.7), we MUST add the option `/Zc:__cplusplus` - Here is the table to check that 1914 is equivalent to VS 2017 15.7 : https://learn.microsoft.com/en-us/cpp/overview/compiler-versions?view=msvc-170#version-macros
Major: - When user is specifying to use msvc as the compiler for the code generated by Kodgen, `wswhere` is used to check that msvc is indeed available on the user's system. - In particular, it is searching for the latest x86/x64 build tools indicated by the ID: `Microsoft.VisualStudio.Component.VC.Tools.x86.x64` - However, in case such component was intalled through _Microsoft BuildTools_ instead of _Visual Studio Community_, _Professional_, or _Entreprise_, the command would not pick up the component as expected. See issue for the reason: microsoft/vswhere#22 - In that case, users must optin by passing the `-products` argument. Considering the goal is to find at least one such MSVC build tools, passing `-products *` seems like the most general method.
Major: - Current version of vswhere in the project was 2.8 - The latest build released is [version 3.1.7](https://github.com/microsoft/vswhere/releases/tag/3.1.7)
Major: - `preGenerateCode` and `postGenerateCode` are now supposed to be run outside of `generateCode` - this makes sense semantically as we also find function names `initialGenerateCode` and `finalGenerateCode` inside `generateCode` which roles are currently no different. - Moreover, the documentation for `preGenerateCode` was saying that "Called just before CodeGenUnit::foreachModuleEntityPair." when this description actually corresponds to `initialGenerateCode` (same for `postGenerateCode`/`finalGenerateCode`) - In addition, this will unlock the possibility to run the said functions independently of everything in `generateCode`. Thus, in future commit, allowing to have stuff run only once per CodeGenUnit before and after the `generateCode` - This has some consequences: - `preGenerateCode` and `postGenerateCode` are now public - `createCodeGenEnv` also needed to be taken outside of `generateCode` so in order for the same environment to be used for all three generation functions. - The override of the corresponding functions in MacroCodeGenUnit were also made public - This was confirmed to induce NO GENERATION DIFFERENCES on the example project. It is, for now, a refactoring that does not impact the generated code and should therefore be backward compatible.
Major: - An enum to set high level flags that will orient the way the generation is handled from a very high level point. - The formating of the enum follows the one from `EEntityType` - The usefullness of this enum will reveal in future commits: - we will use it to decide whether a file should be generated for every parsed file. - or a unique file will be generated from all parsed file. - In the current state, this enum is used to replace the boolean previously called `forceRegenerateAll` that was eventually passed down to `CodeGenManager::identifyFilesToProcess` through `CodeGenManager::run` Remark: - We could add a field `generationStrategies` to the code manager. It could also be part of the settings the user can set in the TOML file
Major: - We are using the information in the `generationStrategies` to indicate whether we want to follow the legacy On-File-Generated-For-Each-Parsed-File - Renamed the function name (`processFiles` --> `oneGenerateForEachParsedFile`) to better match the intent.
Major: - The function which parses all files and runs the generation only once for all - The parsing follows the same multi-threaded approach as before - But the generation is single threaded and `preGenerateCode` and `postGenerateCode` are ran only once per CodeGenUnit as their semantic meaning implies. - Thanks to this, if we keep following the recommendation of the API to use `postGenerateCode` as the place where the files are written (i.e. where the generation is finalized), this will result in generating the files only once per CodeGenUnit.
Major: - `CodeGenUnit::preGenerationCode` was requiring FileParsingResult as an argument for the sole purpose of setting its value in the CodeGenEnv of the generation unit - Given the presence of this argument was posing a problem in how we wanted to use `preGenerateCode` in the new function `CodeGenManager::oneGenerateForAllParsedFiles`, we decided to remove it from there and instead put it at the beginning of the implementation of `CodeGenUnit::generateCode` - This means `preGenerateCode` cannot use information from FileParsingResult anymore, BUT: - As noted in previous commits messages, the semantic usage of `preGenerateCode` was redundant with the function `initialGenerateCode` - meaning that any user that was using FileParsingResult in `preGenerateCode` can "simply" copy-paste what he was doing inside the body of `initialGenerateCode` with the same effect - this is a compromise to also have access to the new function `CodeGenManager::oneGenerateForAllParsedFiles` Minor: - The declaration and definitions of `preGenerateCode` were adapted in `MacroCodeGenUnit` to match the changes Remark: - Following this modification, we could manage to generate code as expected with `CodeGenManager::oneGenerateForAllParsedFiles` on an example project external to this repo. - Example internal to this repo to come in future commits.
Major: - Nothing big, we are simply relocating existing source files of the example in preparation of the files we are going to add to demonstrate the generation scheme of One-Gen-For-All-Files
Major: - The purpose of the example is to reuse the existing `SomeClass` and `SomeOtherClass`, and generate another class `DataState` which take the names of these entities for the purpose of storing them in vectors. - I made it overly complicated for the purpose of examplifying possible code generation - The data vectors can be accessed with a generated enum - The data can be emplaced back and retrieved with function pointers to the getters - The function pointers are in an array that can be accessed with the enum - The getters and emplace functions are templated and `constexpr` tests are used to reduce the code the compiler has to process. - For the generation, we added a custom `CodeGenUnit` with its settings, environment, one module and one property code gen. - The settings include the choice for the namespace, class name, filename and extension. - The environment is used to share when an entity was first to be used for generation as the final code contains `if/else if/else statements`. The source file also contains an enum to indicate locations for the generated code. - The module controls where to put the generated code thanks to the code location enum in the environment source file. This is copying the system in the `MacroCodeGenUnit` - The property code gene is using a new keyword `Data` that we added to mark `SomeClass` and `SomeOtherClass` - The generator's `main.cpp` includes the new code gen unit using the strategy `kodgen::EGenerationStrategies::OneGenerateForAllFiles` - The example `main.cpp` was updated to include the datastate and emplace `SomeClass` and `SomeOtherClass`, and use these.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Motivation
Kodgen is currently powerful to generate (at least) one new file for each parsed file, but is impractical to generate (at least) one new file for all parsed files. The need for such generation feature might arise when one wishes to automatically generate a classes that will tally logic related to many other classes defined in different files.
The use case that motivated this pull request is to be able to have a class containing vectors of objects to store and manipulate them.
Goal
Modify (improve?) Kodgen's API to be able to switch between the current (unique) generation scheme that is One-Generate-For-Each-File and a new generation strategy that would be One-Generate-For-All-Files.
Suggested solution
The solution in this pull request relies on the existing API, but slightly modifies the internal execution flow. The proposed changes are based on the observation that
preGenerateCode
andpostGenerateCode
names may suggest a different behavior from what is actually implemented. Indeed, both methods appear in theCodeGenUnit::generateCode
in the following order:Note that in the original API documentation of
preGenerateCode
andpostGenerateCode
, it says that they are run right before and afterforeachCodeGenEntityPair
when this description actually matchesinitialGenerateCode
andfinalGenerateCode
.In any case, both functions run every time
generateCode
is called, that is for each parsed file (seeCodeGenManager::processFile
). Which meanspreGenerateCode
andpostGenerateCode
are not strictly executed outside a generation step associated to a parsed file, hence are not trulypre
andpost
steps.Trying to use the functions in a way closer to what their names suggest is helpful to easily access to both One-Generate-For-Each-File and One-Generate-For-All-Files. In fact, the solution moves
preGenerateCode
andpostGenerateCode
outside ofgenerateCode
, associated with a new enum calledEGenerationStrategies
to support both the legacy generation scheme One-Generate-For-Each-File, and the new generation scheme One-Generate-For-All-Files.Major changes to Code Generation Process
Modifications to
CodeGenUnit
createCodeGenEnv
,preGenerateCode
, andpostGenerateCode
are public and not protected anymore. See 54dfac8preGenerateCode
is not takingFileParsingResult
anymore. See 3c1c709FileParsingResult
we could be using in thepreGenerateCode
function.FileParsingResult
is set in theCodeGenEnv
at the top ofgenerateCode
.FileParsinResult
frompreGenerateCode
toinitialGenerateCode
.MacroCodeGenUnit
were updated to reflect these changes.New Generation Strategies
EGenerationStrategies
enum to define various generation strategies, includingForceReparseAll
,ForceRegenerateAll
,OneGenerateForEachFile
, andOneGenerateForAllFiles
. (Kodgen/Include/Kodgen/CodeGen/EGenerationStrategies.h
). See 5ea0e22ForceReparseAll
,ForceRegenerateAll
are notably used to assume in finer detail the parsing and generation behavior when going over all detected files inCodeGenManager::identifyFilesToProcess
. This was previously controlled by a unique boolean valueforceRegenerateAll
. See 5ea0e22Modifications to
CodeGenManager
See 3c52973 for all bellow:
processFile
was renamedoneGenerateForEachParsedFile
oneGenerateForAllParsedFiles
to handle the new strategy. It is based upononeGenerateForEachParsedFile
.std::string
before assembling everything in a file), is NOT multi-threaded anymore. Given that we have only one generation step centralized by a uniqueCodeGenUnit
, multi-threading would lead to data race where thestd::string
in the code generation unique might get appended at the same time by different generation tasks. A solution to resolve the data race was not investigated at all.run
method to pass aEGenerationStrategies
argument instead of the original booleanforRegenerateAll
that could only influence the identification of source files on which the generation process should be applied.Demonstration
You can check commit 84caea5
The example project in kodgen was updated to include a demonstration of the usage of the Generation Strategies. In particular, a new set of objects deriving from
CodeGenUnit
,CodeGenUnitSettings
,CodeGenEnvironment
,CodeGenModule
, andPropertyCodeGen
was implemented. These are respectively calledDataStateCGU
,DataStateCGUS
,DataStateCGE
,DataStateCGM
, andDataPCG
. They also serve as a demonstration of how it is possible to build one's own code generation unit and its dependences in addition to the existingMacroCodeGenUnit
& Co.DataStateCGU
& Co are used to generate a classDataState
(which can be renamed in the settings) that stores a variable number of data. In fact, every class (or struct) marked with kodgen's corresponding macro and that is further marked with the property"Data"
will be added to the data state."Data"
was added to the already existing example classesSomeClass
andSomeOtherClass
. As a result, the generated data state contains twostd::vectors
to store instances of these classes (that can be added with a generatedDataState::EmplaceBackData
), and that can, of course, be accessed with a getter.However, the content of the data state is note as simple as it could be, simply to exemplify how to complex we can get with kodgen. In particular, the getter is a function pointer in an array one can access thanks to an enum. In addition, the getter and emplace back functions have
constexpr
if statements to help the compiler.