-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parser refactoring proposal #6701
Comments
Saying the current code is "kind of a huge mess" without a) being specific about what is so bad about the current code and b) what advantages this new structure would bring is not very productive. Other than that the file is (too) big I don't see a huge mess myself. Your "proposed components" simply list functionality that we already have. I presume you want these things to be more "separate", but the how of that is the most interesting. For example the fbs and proto schema parsers share a LOT of functions. Where do these functions go? Some kind of inheritance tree? Given that 95% of the work is about fbs, is fbs parsing well served by having the parsing split up over 2 files dictated by what happens to be shared with proto? etc etc? We can certainly split up
While I welcome the idea of making independent binaries (and codegens in other languages) possible, I don't see why we should require it for all code generators, especially all the mainstream languages we already have. It's not like I would not want to pull in Abseil or some other big library for the convenience of flag parsing. So far, FlatBuffers has been a "no dependencies" library, which makes it convenient to build on many platforms, and we'd need a pretty strong reason to break that. @vglavnyy who has worked a lot with this code as well. |
Sure. I didn't want to dive into all the details immediately, just lay out the high level structure. Basically, the main issue is that Parser does everything. I think it should be refactored into a few classes that have better separation of concerns. However, what do you think of a plugin model + config file where people can attach their own code generators? (and perhaps transpilers, if we can flesh out those details)
I agree that this will be incremental. I think I was unclear in what I meant by 'rewrite' vs 'refactor'. I more or less meant "major overhaul where basically every line will be moved"
I agree, especially with the chicken/egg problem, as discussed elsewhere. Though, we should convert one code generator to test the feature.
I would also like
Yep, these are the details I'd like to discuss in this issue. Here's my first pass on what can be taken out of Parser:
BaseParser can be a parent of the other parsers, but I don't think anything else should be inherited. Maybe holding a BaseParser as a member would be better. I think such details should be determined during implementation.
I think we should move the BaseParser:
FlatbufferParser:
ProtobufferParser:
JsonParser:
weird ones I'd have to think more about
* These parse functions depend on symbol resolution. In my opinion, symbol resolution should happen in a second pass so its easier to reason about the actions of a Parser. The Json Parser is also a transpiler, which is slightly different from FlatbufferParser and ProtobufferParser which essentially fill SymbolTablesHolders. Of course, there are 40 more methods to consider. |
In the period 2020-2021 (18 months) the
Total lines: 3600. Code generators are more unstable. Until now not all fbs-features are supported by languages. For example:
Changes in The best thing that we can do: extract and clarify Parser's public interface ( |
I strongly disagree with the notion that Parser should not be improved because it hasn't been modified much recently.
Im also in favor of moving to C++11 or 17, though I wonder if it a goal of Flatbuffers to be compatible with old C++ for portability reasons. I don't understand why new language features would block a refactoring.
I'm not sure if I understand what you mean here, can you clarify? I think you're suggesting getting the code generators out of holding a The direction I prefer to push in is by providing the flatbuffer type system in a reflection fb, as per #6428, and for that to be the public interface to both code generators and users. |
Which ones are directly connected with the parsers (fbs, json, flex)?
Of course it can't block. It was only my opinion.
I don't mind if a generator uses Parser. There are several kinds of Parser's users:
The current public API is As for me, the |
1, 2, 3, 4, 5, 6, 7, 8, 11, 12, and 15 could be implemented by adding new types or annotations which needs some adjustment to Parser as it populates idl.h. Any checks and adjustments to Conform would be implemented in Parser too. I don't think any of these are directly connected with parsing, per se, certainly not json/flexbuffer parsing, but all the fb schema type checking functionality lives in Parser. There was a request for XML parsing in a different issue, #6033, that's definitely very challenging to implement with the current structure of Parser.
Yep, that's probably how the refactoring would start. I hope it doesn't end there, because "single responsibility principle" and all that. Though I do recognize its a lot of effort.
Yea I agree, I think fixing that should be part of this refactoring effort. Maybe we can agree on "clean up Parser API" as a "P1", and "refactor internals" as a "P2". |
|
Good news! This proposal can potentially benefit from #6704 |
This issue is stale because it has been open 6 months with no activity. Please comment or this will be closed in 14 days. |
I'm still interested but definitely won't have 20% time until at least April :( |
Yep, still interested in these ideas |
This issue is stale because it has been open 6 months with no activity. Please comment or label |
This issue was automatically closed due to no activity for 6 months plus the 14 day notice period. |
As I tend to do, I'll try to keep this top comment updated with the latest state. This is based off of #6428 (comment).
Proposal
Why
flatc
is really complicated and hard to follow. Particularly,idl_parser.cpp
is kind of a huge mess. I actually think the right way out is to rewrite components, rather than trying to refactor it, but this can be decided at implementation time...What
Proposed components
.fbs
or.proto
files and parses them to reflection bfbsWe may want flexbuffers2json and json2flexbuffers, to be part of flatc for legacy purposes but imo, it should be a different binary since it doesn't touch any flatbuffers.
Rougher ideas
std::system
(if that works).flatc
should access some config file, e.g..flatc
, that enables users to add more code generators, e.g.The text was updated successfully, but these errors were encountered: