Consider a third-party compiler/lexer for the ExpressionLanguage #601
@Hywan as the maintainer of the HoaCompiler, WDYT of this choice here, is it something the compiler would be good at? |
Hello and thanks for considering Analysing a language and compiling it into something else is the essence of The grammar might be minimalist if I am correctly reading your examples. The visitor will be simple too. A simple example you might want to look at is the So this is a big yes 😉. That said, A grammar is used to represent any kind of data. Thus, we can use it to validate data (which is the classical usage), or to… generate data. This was a big part of my PhD thesis about Praspel. Long story short, with a grammar expressed in PP and one of three algorithms, you can generate data that matches the grammar. I am copy-pasting the example from the
$sampler = new Hoa\Compiler\Llk\Sampler\Coverage(
    // Grammar.
    Hoa\Compiler\Llk\Llk::load(new Hoa\File\Read('Json.pp')),
    // Token sampler.
    new Hoa\Regex\Visitor\Isotropic(new Hoa\Math\Sampler\Random())
);
foreach ($sampler as $i => $data) {
    echo $i, ' => ', $data, "\n";
}
/**
* Will output:
* 0 => true
* 1 => {" )o?bz " : null , " %3W) " : [false, 130 , " 6" ] }
* 2 => [{" ny " : true } ]
* 3 => {" Ne;[3 " :[ true , true ] , " th: " : true," C[8} " : true }
*/
This approach and these algorithms are used to do what we call Grammar-based Testing. See the research paper here:
Several people (like @jubianchi or @vonglasow) are using this approach to generate test data or to populate a database. They write a grammar, they generate data based on this grammar and boom. The most common example I hear is describing a JSON payload with a grammar and generating data from it. There are 3 algorithms. They are described in the hack book of Considering the goal of your project, these algorithms can be very very… very useful for you. There is one more thing… To be able to generate data from a grammar, we need to be able to generate data for the tokens. Token values are represented by PCRE. So… you guessed it, we are able to generate data based on a regular expression. See the
// 1. Read the grammar.
$grammar = new Hoa\File\Read('hoa://Library/Regex/Grammar.pp');
// 2. Load the compiler.
$compiler = Hoa\Compiler\Llk\Llk::load($grammar);
// 3. Lex, parse and produce the AST.
$ast = $compiler->parse('ab(c|d){2,4}e?');
// 4. Set up the sampler.
$generator = new Hoa\Regex\Visitor\Isotropic(new Hoa\Math\Sampler\Random());
// 5. To infinity and beyond!
echo $generator->visit($ast);
/**
* Could output:
* abdcde
*/
I don't mean to advertise here, but I really think it can provide really cool features. |
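To make the idea of generating data from a grammar concrete, here is a minimal, self-contained sketch. It is not Hoa's API, not the PP grammar format, and not one of the three algorithms mentioned above: the `GRAMMAR` constant, the `generate()` function and the depth budget are all hypothetical names and choices made for illustration. It simply picks a random alternative for each rule of a tiny JSON-like grammar.

```php
<?php
declare(strict_types=1);

// Hypothetical toy grammar for a JSON subset (assumption: not the PP format).
// Each rule maps to a list of alternatives; each alternative is a list of symbols.
// A symbol that is a key of GRAMMAR is a rule; anything else is emitted verbatim.
const GRAMMAR = [
    'value'  => [['true'], ['false'], ['null'], ['number'], ['array'], ['object']],
    'number' => [['0'], ['7'], ['42']],
    'array'  => [['[', ']'], ['[', 'value', ']'], ['[', 'value', ',', 'value', ']']],
    'object' => [['{', '}'], ['{', '"k"', ':', 'value', '}']],
];

function generate(string $symbol, int $depth = 0, int $maxDepth = 4): string
{
    if (!array_key_exists($symbol, GRAMMAR)) {
        return $symbol; // Terminal: emit as-is.
    }

    $alternatives = GRAMMAR[$symbol];

    if ($depth >= $maxDepth) {
        // Past the depth budget, keep only alternatives made of terminals
        // so that the recursion is guaranteed to stop.
        $alternatives = array_values(array_filter(
            $alternatives,
            fn (array $alt): bool => array_filter(
                $alt,
                fn (string $s): bool => array_key_exists($s, GRAMMAR)
            ) === []
        ));
    }

    // Uniform random choice among the remaining alternatives.
    $chosen = $alternatives[random_int(0, count($alternatives) - 1)];

    return implode('', array_map(
        fn (string $s): string => generate($s, $depth + 1, $maxDepth),
        $chosen
    ));
}
```

Calling `generate('value')` returns a random string derived from the `value` rule. Every rule in this toy grammar has at least one alternative made only of terminals, which is what lets the depth budget terminate the recursion; a real sampler would need a more principled strategy.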
Thanks for the detailed answer @Hywan :) I hope I'll have time to look into this soon. To be completely transparent, this part is not exactly my priority now as I still have quite a lot to do for alice, AliceDataFixtures and HautelookAliceBundle, the priority being stabilising the three libraries and easing the migration. I would love, however, to have the time and energy to look into it before the stable release: it would avoid going stable with the whole Expression Language marked as internal. That said, maybe someone else is ready to tackle this RFC :P |
We can if needed. If you play the role of the PO, draft all the issues etc., I am sure we could find time to help :-). |
Hehe, I need to update the doc, but otherwise I think the best doc for a developer is ParserIntegrationTest. Anything about how this result is generated is internal and can be changed completely. There is definitely a scenario or two missing: I tried to be as exhaustive as possible, but I'm not a machine and the sheer number of combinations cannot all be covered either. It gives a good base though, I would say. |
@theofidry Where is the grammar defined? |
That's the thing: there is no proper grammar system. Basically there's a lexer (which has its own share of tests) which transforms expressions into Tokens like:
yield '[Escaped arrow] surrounded' => [
    'foo \< bar \> baz', // input
    [ // expected
        new Token('foo ', new TokenType(TokenType::STRING_TYPE)),
        new Token('\<', new TokenType(TokenType::ESCAPED_VALUE_TYPE)),
        new Token(' bar ', new TokenType(TokenType::STRING_TYPE)),
        new Token('\>', new TokenType(TokenType::ESCAPED_VALUE_TYPE)),
        new Token(' baz', new TokenType(TokenType::STRING_TYPE)),
    ],
];
And then the parser will parse the value according to the type of the token. So as of now, it's pretty manual, hence the desire to change to something more standard :) |
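The two-step flow described above (a lexer that produces typed tokens, then a parser that dispatches on the token type) can be sketched as follows. This is a simplified illustration, not alice's implementation: the `Token` class, the two constants and the `lex()` and `parse()` functions are hypothetical stand-ins, and the assumption that an escaped arrow resolves to the bare character is mine. It requires PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

// Hypothetical, simplified stand-in for alice's Token and TokenType classes.
final class Token
{
    public function __construct(
        public string $value,
        public string $type,
    ) {
    }
}

const STRING_TYPE = 'STRING_TYPE';
const ESCAPED_VALUE_TYPE = 'ESCAPED_VALUE_TYPE';

/** @return Token[] */
function lex(string $input): array
{
    // Split on an escaped arrow (\< or \>) while keeping the delimiters.
    $parts = preg_split(
        '/(\\\\[<>])/',
        $input,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $tokens = [];
    foreach ($parts as $part) {
        $isEscaped = preg_match('/^\\\\[<>]$/', $part) === 1;
        $tokens[] = new Token($part, $isEscaped ? ESCAPED_VALUE_TYPE : STRING_TYPE);
    }

    return $tokens;
}

/** @param Token[] $tokens */
function parse(array $tokens): string
{
    $output = '';
    foreach ($tokens as $token) {
        // Dispatch on the token type, one branch per type.
        $output .= match ($token->type) {
            STRING_TYPE => $token->value,
            // Assumption: an escaped value resolves to the character without its backslash.
            ESCAPED_VALUE_TYPE => substr($token->value, 1),
        };
    }

    return $output;
}
```

With this sketch, `lex('foo \< bar \> baz')` yields the same five tokens as the test case quoted above. Every new token type needs a new branch in both functions, which is the manual upkeep a grammar-driven compiler would remove.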
I see. I guess the users have a documentation with all the possible syntax? |
Yep, #377, which may be slightly outdated right now, and ParserIntegrationTest. Tests are a big part of the doc here, for better or worse :/ |
Great! I don't have time right now but I will try to find some. Maybe some Hoackers could help me. What's your schedule? |
I hope to have finished most of it by the end of the month. Then it will be a few updates or bugfixes here and there and let it live for 2-3 months before a stable release. |
@Hywan I took a glance at the Compiler this weekend; it looks like a good solution to replace the in-house lexer. I still have a few issues with your PP language but I think it's just a matter of getting familiar with it. I'm not sure yet whether I should do it before or after the stable release. A little question though: why do hoa projects not follow semver? |
@theofidry Funny, I opened your issue this weekend too 😛. I can help to write the grammar (in PP) if you need help. Hoa libraries are compatible with semver, but here is the answer: https://hoa-project.net/En/Source.html#Rush_Release. |
Cool :) I'll push a POC soon to be able to discuss on it then :) |
Perfect! Please, ping me. |
@theofidry perhaps there are other out-of-the-box workarounds. Rather than coming up with a special language (which users would have to spend time learning), what if the project adopts the Expression Language?
# before
Is\Bundle\PlanBundle\Entity\Event:
    event_bare (template):
        title: <sentence(3)>
        show: '@show_*'
        rooms[0]: '@room_*'
        startDateTime: '<dateTimeBetween("-1 month", "+4 month")>'
        endDateTime: '<dateTimeInInterval($startDateTime, "+4 hours")>'
        isDraft: false
        version: '10%? @version_*'
        tags 25%?: ['<randomElement(@tag_{0..3})>']
        __calls:
            - setRevenue (25%?): ['<moneyBetween(10000, 300000)>']
            - setVisitorCount (25%?): ['<numberBetween(100, 500)>']
# after
Is\Bundle\PlanBundle\Entity\Event:
    event_bare (template):
        title: faker.sentence(3)
        show: alice.one('show_*')
        rooms: faker.randomElements(alice.some('room_*'), faker.randomNumber(1, 2))
        startDateTime: faker.dateTimeBetween('-1 month', '+4 months')
        endDateTime: faker.dateTimeInInterval(this.startDateTime, '+4 hours')
        isDraft: false
        version: alice.sometimes(0.1, alice.one('version_*'))
        tags: alice.sometimes(0.25, faker.randomElements(alice.some('tag_*')))
        revenue: alice.sometimes(0.25, myown.moneyBetween(10000, 300000))
        visitorCount: alice.sometimes(0.25, faker.numberBetween(100, 500))
In a nutshell I'd propose these changes. These are just some thoughts that came up as I was thinking about this.
What are your thoughts? Surely, this is a breaking change, but I think that this change would let maintainers focus more of their time on features rather than having to wrestle with the idiosyncrasies of the syntax. |
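As an illustration of what the proposed `alice.sometimes(probability, value)` could mean, here is a minimal sketch. The `sometimes()` function below is hypothetical and not part of alice or of the Symfony Expression Language; taking the value as a callable and accepting an injectable random source are my own assumptions, made so that the value is only computed when the draw succeeds and so that the behaviour can be tested. It requires PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

// Hypothetical helper mirroring the proposed alice.sometimes(); not part of alice.
// $value is a callable so that it is only evaluated when the draw succeeds.
function sometimes(float $probability, callable $value, ?callable $random = null): mixed
{
    if ($probability < 0.0 || $probability > 1.0) {
        throw new InvalidArgumentException('Probability must be between 0 and 1.');
    }

    // Injectable random source returning a float in [0, 1), for deterministic tests.
    $random ??= static fn (): float => mt_rand() / (mt_getrandmax() + 1);

    return $random() < $probability ? $value() : null;
}
```

For example, `sometimes(0.25, fn () => 'tagged')` returns `'tagged'` roughly one time in four and `null` otherwise. Whether `null` is the right value for the "not this time" case is itself a design question the proposal would have to settle.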
Hi @kgilden. It's an interesting proposal indeed. A couple of notes however:
from:
version: '10%? @version_*'
tags 25%?: ['<randomElement(@tag_{0..3})>']
to:
version: alice.sometimes(0.1, alice.one('version_*'))
tags: alice.sometimes(0.25, faker.randomElements(alice.some('tag_*')))
I am not sure this is equivalent. Indeed for The same for I however don't think it invalidates your suggestion. I like the idea, but I have mixed feelings since:
|
Thanks @theofidry, cool that you're considering this. And apologies for not being quite rigorous in my proposal. I suppose the gist of my proposal is to replace the current syntax with the expression language.
Agreed that this would be a big BC break and I hate them as much as any other dev. Maybe it would be possible to keep BC by introducing syntax versions (user specifies on top of the file which version of the syntax they prefer to use, i.e.
Could you show an example of what you have in mind? In my opinion one of the nice things about this library is that fixture generation is terse. Sure, I could use plain Doctrine Fixtures, but the end result tends to be complex and difficult to update. So if PHP templates keep the same terseness, I'd be 👍 with that. Anything that would allow me to sometimes use function nesting without any surprises (such as #842) works for me. I'd be interested in what other users of this library think of this as well. |
Sure, sorry I didn't do that yesterday, I had to give it a bit more thought & time:
<?php
use Is\Bundle\PlanBundle\Entity\Event;
use Nelmio\Alice\Alice;

return [
    Event::class => [
        'foo1' => [
            'title' => Alice::faker()->sentence(3),
            'show' => Alice::reference('@show_*'),
            'startDateTime' => Alice::faker()->dateTimeBetween('-1 month', '+4 month'),
            'isDraft' => false,
            'version' => Alice::optional(10, Alice::reference('@version_*')),
            '__calls' => [
                'setRevenue (25%?)' => [Alice::faker()->moneyBetween(10000, 300000)],
            ],
        ],
        'foo2' => new Foo(),
    ],
];
There are three immediate advantages there:
|
Honestly I'd be cool with both directions. As long as it would be possible to add custom extensions that in turn are dependent on other services (i.e. Symfony DI). However, I'm a bit worried that perhaps it becomes more verbose and gives too much "power". I like the fact that the current YAML syntax limits developers from writing long complex code and keeps the focus more on relationships between fixtures. |
I don't think this would be too difficult and I agree it's a requirement: HautelookAliceBundle depends on it as well.
That's a risk, but I think it's ok. Right now the vast majority of the issues are about lexing/parsing problems which can only be solved by this PR, and even so, people feel overburdened by this YAML syntax and by having to learn the alice DSL. Also for the record, in 1.x & 2.x it was also possible to a certain extent (just not as discoverable). |
Please excuse the perhaps not so relevant comment, but does this mean nested functions like
All I found was this related issue hautelook/AliceBundle#327. |
They can and they are to a certain extent. It however relies on regexes, which is extremely flimsy |
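To illustrate why regex-based matching struggles with nested `<...>` expressions, here is a small sketch. The pattern below is illustrative only and is not the regex alice actually uses; `NAIVE_FUNCTION_PATTERN` and `extractBalanced()` are hypothetical names. A lazy match stops at the first closing arrow, whereas a scanner that tracks nesting depth finds the matching one.

```php
<?php
declare(strict_types=1);

// Illustrative pattern only; this is not the regex alice actually uses.
const NAIVE_FUNCTION_PATTERN = '/<(.+?)>/';

// Returns the text between the first '<' and its matching '>', tracking nesting depth.
// Simplification: escaped arrows and arrows inside quoted strings are not handled.
function extractBalanced(string $input): ?string
{
    $start = strpos($input, '<');
    if ($start === false) {
        return null;
    }

    $depth = 0;
    for ($i = $start, $length = strlen($input); $i < $length; $i++) {
        if ($input[$i] === '<') {
            $depth++;
        } elseif ($input[$i] === '>') {
            $depth--;
            if ($depth === 0) {
                return substr($input, $start + 1, $i - $start - 1);
            }
        }
    }

    return null; // Unbalanced input.
}

$expression = '<strtolower(<implode(" ", ["HELLO", "WORLD"])>)>';

preg_match(NAIVE_FUNCTION_PATTERN, $expression, $matches);
// $matches[1] stops at the first '>':  strtolower(<implode(" ", ["HELLO", "WORLD"])
// extractBalanced($expression) gives:  strtolower(<implode(" ", ["HELLO", "WORLD"])>)
```

Even the depth counter is only a partial fix, since it ignores escaping and quoted strings; handling those properly is the kind of work a real lexer and grammar take care of.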
Is there a workaround? Anything with more than one argument seems to break with the same error. I tried escaping the comma and whatnot.
App\Entity\Dummy:
    dummy:
        functionValue: '<strtolower("BAR")>'
        nestedFunctionValue: '<strtolower(<(implode(" ", ["HELLO", "WORLD", \<foo()>]))>)> \<bar()>'
Escaped expressions also fail:
I'm using Alice 3.5.7, by the way. |
The easiest workaround is:
|
Ok, so these really don't work. Thank you for the clarification. I wanted to upgrade from Alice 2.3 and AliceBundle 1.4, and I have a lot of fixtures to change. I was thinking of cramming everything into the providers, but I'll just stick to the old versions for now. |
If your fixtures work, that's fine; just be aware that barely 10% of alice 2.x is actually tested, so it relies a lot on luck... I think however #998 is the real solution that will make everyone happy tbh |
As mentioned in #600, the current lexer/parser of the Expression Language is completely custom. While it does the job, I can't say I'm very proud of the implementation and it's far from my field of expertise. Relying on a third-party library for that task would make sense, maybe:
And possibly others (I didn't take the time to properly look into it).
The goal of this component is to be able to transform values as described in #377. Maybe a more detailed example is the actual integration test of the Expression Language parser: ParserIntegrationTest