-
Notifications
You must be signed in to change notification settings - Fork 132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SVG2 path data coordinates are currently spec'd as integers. #335
Comments
It has been noted, but I don't think we had a dedicated issue. See also: |
If it helps, I think the SVG 1.1 grammar can be written as:
But there are many other ways to do it. IMPORTANT, the first two arguments for
(Again, I'm going by the 1.1 grammar.) Also, you need to decide whether the argument to The presentation of The above syntax makes floats like |
Thanks for that @fuchsia . For the bearing command and for the third argument in the Arc command, any number (signed, with decimal) should be valid. Negative angles refer to counter-clockwise rotation in SVG, and there is no logical reason to restrict it to integer degrees. The relevant SVG 1.1 syntax was as follows:
See #331 for discussion about numbers with trailing decimals, like |
Okay, unless I've screwed something else up, the grammar should be:
I've reverted to I've created You can rename If SVG numbers are brought into line with CSS numbers, then fraction changes to:
i.e. drop the first I'm not really a grammar geek *deletes rant about the grammar using right-recursion over the Kleene star* so it's possible this will produce a horrible parse tree or has transgressed LALR(1) in some way. But it's better than what's there. |
Looks good at a glance, but it really needs more than a glance. Pinging @tshinnic, who has also been a bit of a path-grammar geek who can't believe nobody has fixed this in all this time. PS. We're still waiting to hear whether or not the SVG Working Group will be revived, with spec editors who actually have clear responsibility & maybe paid time to work on this. As it is, nothing's happened because no one has any authority to do anything. |
A paid editor would be great! But if nothing keeps happening, maybe there's enough volunteers to fork it and document the de facto standard. Thanks for your help, anyway. |
I have been somewhat overambitious in wanting to "wrap up" the several grammar issues into one nice package with a bow on top, to present to all and sundry. But I kept finding little nits requiring sanity evaluation: mine vs. draft. e.g. number formats #331 The draft grammar has accumulated quite a number of changes over time, trying to respond to multiple ideas as they occurred. The first major reason to update the draft grammar was to move away from the SVG 1.1 ABNF format in order to agreeably share the common EBNF format seen with CSS and others. Then there were a few bugs to be worked out in smaller changes. As this issue and others have appeared since, we have seen there are more bugs to be fixed. The last major change to the draft grammar was to address the work item called "segment-completing closepath", which fixes a lack not recognized since SVG 1.0 days. Over this series of changes errors have crept in, sometimes merely because they were "thoughts in progress", like the wish to agree with the CSS number formats (which I only now understand is warranted #331) but which resulted in the broken number format mentioned in this issue. And the other issue mentioned above (see also #324) Unfortunately the aggregate result is a grammar that will not parse some currently legal constructs. What I thought to do was introduce a PR in three parts or sets of interrelated changes. Organizing into three parts best fits the three areas mentioned above. Re-deriving an EBNF grammar from the SVG 1.1 grammar would create a "clean slate" that would make the following issue-specific changes much clearer. Then addressing the individual bugs in the SVG grammar with small changes would allow problem description, change discussion, and acceptance of these small, separate issues to proceed smoothly. And then applying the rather large change that supports the addition of "segment-completing closepath" would be done. I wonder how best to proceed? Should there be one PR with a large change (EBNF), then a number of small changes (bugs), then another large change (closepath)? Or should there be three PRs in the same order, or many? Two things are clear to me. Amending the grammar from its current state would be very confusing, and would not help at all the "why this grammar rule?" questions of the future. Second, the 'closepath' Finally, Amelia's comment above shows that tracking all these grammar specific issues is 'dispersed'. Should there be a separate issue collecting all these together as work items or would a new Github label/tag be better? .
|
Given the state of SVG standardization & software implementer commitments, it would probably be very helpful if we can develop two versions of the grammar:
Similarly, for any other syntax fixes/changes relative to SVG 1.1, it would be nice to have a clear commit documenting changes relative to the SVG 1.1-equivalent grammar. But it would also be useful to have tests of what the current software support is like, to demonstrate whether this is really a "new feature" or a "make the spec reflect reality" situation.
I've already told Tom, but I'll repeat here: for things that are obvious errors (like floats vs integers, unintentionally changing the grammar relative to SVG 1.1), we can probably at least merge them in to the Editor's Draft, which is built from the master branch of this repo. However, we'd need to keep careful note of what changes are being made and why, so that eventually it can be merged back in to the official W3C spec process. To eventually merge it back into the W3C process, we'd also need to have licensing agreements from anyone contributing content that would become part of the spec. See the Contributing guidelines for this repo, and the W3C FAQ. |
Step 1: the SVG 1.1 grammer in EBNF
I've tried my hardest to be consistent. (And, trust me, that comes very unnaturally to me; if there are two ways of doing something, I'll manage it in three.) Changes I've made:
I've turned this EBNF into a recursive descent parser on jsfiddle. It's green for valid paths and red for invalid ones. PS why is it wrong to define a comma as one of the space characters and allow them anywhere? (Something deep inside me objects. But I don't know why.) |
From the W3C tests (old) I got the following test strings. These should pass:
And these should fail:
And they really do pass and fail in the major browsers. I know the need for the As for the comma thing, it seems to me that the SVG 1.0 specifiers were fixated on using the comma only as a meaningful separator for numeric operands. Thus not a generalized separator. (This for clarity?) Similarly they were fixated on allowing for minimal path string sizes, and so we inherit the weirdness for I need to gather my notes... And look at your work above, compared to my unambitious straight translate to EBNF form. And one of the subprojects I don't know I can get to, is to collect together the multiple sources/versions of tests for path strings. The W3C tests are old and miss cases. Chrome/FireFox have tests but they're held in those projects somewhere. Good and complete tests would have prevented some of the confusion in the 1.5 decades since. Oh, and give the dozens of implementations of path string parsers something real to test against. (sigh) |
Well my validator accepted your three lime paths and rejected your red ones. I then grabbed all 120 paths from the w3c list (thanks for pointing me at that) and all are accepted, except two occurrences of
which are your three red paths, plus two more. (I also found your lime paths in I haven't looked to see if I'm passing something that should fail, but all those failures look legit. Things that aren't tested include: decimals, any form of exponent, leading plus signs and negative arc radii. (I get that negative arc radii are permitted in practice. The lack of tests is probably why. And our aim, for the time being, is to recreate what the original authors intended.) I would have wanted to see more tests like this:
The particular problem with It's not an issue of character budget, either. Or, at least, it could be fixed without making pathdata longer. But, in defence of the original authors, they weren't the ones who deleted the API that we use to parse paths. The comment about commas was an aside about human psychology. I have exactly the same flaw: I cling to the need for commas even as I see they are pointless. Compare |
I definitely agree on the basic tests. There was one goof reported against a parser - as a security/spoofing problem(!) - where an example 10000 char path string was submitted with no explanation. One was supposed to just eyeball the glitch present somewhere in the string. Depending on the font the problem might not leap out at you.
As a treat the reporter had the following path commands display something quite rude. 😄 The report was suggesting that silently truncating SVG path strings at the first error could allow nefarious data to be introduced unnoticed. That parser was following the rules in 9.5.1. Error handling in path data. Whether that program being too coy about a detected error was wrong I couldn't say. Implicit in the rules is how something like (Still working on notes - while the interest here was in moving from ABNF to EBNF syntax "because CSS" (I thought), it is interesting that the former CSS specs never used a 'BNF' form, but only ref'd Yacc and Flex syntaxes, and now CSS 2.2 uses only prose and railroad diagrams. Maybe the future is "steam punk SVG grammar"?) |
@fuchsia: Thank you. I played with your JSFiddle and then copied the parser into an ad-hoc test program that runs over a (non-representative) set of 400000 SVG 1.x paths. Your parser produces the same true/false results as one I did. Impressive. Did you derive the parser code ad-hoc or via a parser generator? (Asking because wanting to play with the I'm going to try to find test for paths from Firefox/Chrome groups. I really want to be able to say "yes, this is backward compatible." |
You might be amused by this bug report and the helpful explanation at the comment.
vs.
Of course, Mozilla ended up allowing the invalid path anyway by adding a 'fix'. (sigh) (Going through the Mozilla test cases...) |
@tshinnic your analysis of what I did is incorrect. I did not make Mozilla support invalid paths. Firefox rejects M0,0, L100,100 and allows M0,0, 100,100 per the SVG 1.1 BNF. See https://bugzilla.mozilla.org/show_bug.cgi?id=938569#c8 for more details. |
@longsonr As I look now, I indeed misread which comma you were enabling. Not
which everyone agrees isn't in the grammar, but rather you re-enabled
where numbers for the implicit lineto commands follow directly after the moveto command. Commas between numbers are legal, and only legal between numbers. And you added a test, I have to think I misread the fix partly because the fault was so unlikely. But the most amazing part of all this is that this test did not exist originally. With a little looking around I can see the test was copied into the Webkit project. That is certainly a good thing. But I'm not seeing the same or similar test under the Blink project or the web-platform-tests or Github mirror. I can't guess after IE/Edge. Is it really true those projects don't have a test for this after 16 years? Is this area really so disorganized? The meagerness of tests, the lack of rigor, for testing basic SVG things is astonishing to me. When, how do we make up for the oversight back then? |
Starting now would be perfectly fine - tests can be submitted to the WPT repo (it's not actually a mirror you link above - it's the source of truth - PRs welcome.) FWIW, SVG has had a bit of a rough upbringing (not to say that other similar era, and complexity, specs had it easier.) I wasn't around in 2001 when 1ed SVG 1.1 was released, but I'm assuming the testsuite wasn't really at the top of the list of priorities. Even then it was probably one of the larger ones for a spec in those days, clocking in at 300-400 tests or so I think... For spec development beyond 1ed it was probably as easy to get WG members to write tests as it was to get them to edit the actual spec. (2ed and Tiny 1.2 had a few more tests, but not orders of magnitude more.) From an implementer's perspective it usually comes down to resources (people and their time) - you may not have the luxury of spending all the time you want analyzing every nook and cranny of a particular section of the specification, because you need to implement another even more obscure and less well-defined part than what you're currently working on. So occasionally an edge error-case is missed in test coverage (more often than not I'd say even.) So if this particular case stuck out like a sore thumb to you, that's great! That doesn't mean that everyone else would think of it right away - especially those tasked with implementing a 300+-page specification on the clock. I take it you'd be equally astounded to hear about the number of browser tests that come to be via bugs (like in the Gecko bug referenced above.) Also, for a long time this whole "sharing tests" thingamajib (WPT) was really unheard of, which probably ended up hurting interop immensely (for SVG and everything), but things were different back then in some regards. So to loop back to the question at hand: WPT is currently best way of fixing this. Submit tests (the more the merrier.) |
@tshinnic Yeah, it took a while to see the 'O'. Actually, railroad diagrams aren't normative for CSS. (See CSS Syntax S4.1) It's the step-by-step algorithms that are. I'd be happy to follow either path. (Railroad diagrams are isomorphic to an EBNF which has had character ranges and the @fsoder, Yeah, my production code is equally undertested. Same reasons. More so, because I've been spending time on this. ;) I need to create a separate issue for this, but while you're here, how does Firefox handle large numbers? Chrome appears to use single precision with weirdness. It will draw But I've just looked at those in Firefox and I don't see errors or drawing. It would be nice to get this standardised. All One idea I had to simplify testing was to propose a read only property that returned the rendered path, possibly in a normal form. Then you could do CorrectionPreviously, the 1.1 grammar had:
So comma-wsp was entirely optional, even though it's supposed to be mandatory. It's now:
I can't actually see a way to exploit this, because, if it was omitted in the one place's its mandatory (before the first I've also removed some excess brackets (having finally found the precedence rules for EBNF 😳). I have the 2.0 grammar working, but that will have to wait. |
Step 2: new SVG 2 grammar
|
@AmeliaBR Nobody is objecting, so can we have a go at merging this into the Editors draft? If so, what do I do - fork, hack a text file and make a pull request? (Bear in mind I'm noob at this and at git.) |
@fuchsia Yes, please do make a PR with the improved grammar, and link to this issue in your commit message. You would edit paths.html. |
@svgeesus It's done, pending IPR approval. I see the the committee has broken out the path spec into a separate standard, and added Catmull Rom curves. (Both of which I approve of, especially the Catmull Rom.) Do you want me to add Catmull Rom to the grammar and do a pull request on that? |
@fuchsia The Catmull-Rom definition in SVG Paths is definitely still preliminary, and problematic (it creates a gap in the path on either end of the spline curve). So I wouldn't spend too much time on it. That said, if you are able to add it in as currently defined and create a PR updating the rest of the grammar in that spec to match your update to SVG 2, it would make it easier to compare the differences and make corrections in the future. |
PS, you could add that as an additional commit to the same PR #346, instead of making a separate PR. That way, we can merge them both at the same time. |
Yeah, that's an "interesting" way to define Catmull-Rom curves. Still, I've added them, as is, to the PR. (Not that I would know how to make a separate PR, even if I wanted to.) The only differences with the above are the extra productions:
And the addition of a State dump of my brain:
|
I see #385 has removed the bearing command and the magic Z that fills in missing coordinates. Do you want me to remove them from the patch? |
@fuchsia If you're able to do that, it would be great. (Leave them in the edits to the SVG Path module, specs/paths/master/Overview.html). I'd promised folks that I would make those edits & finally review & merge your PR when I next had a chance to devote extra time to SVG work, but I'm not sure when that will be. |
Okay, I've removed the bearing commands and the magic Z from master/paths.html and have run it through my thimbleful of tests. (*cough* There were none for the bearing commands. *cough*) I think the only diff between master/paths.html and SVG 1.1 is that the former allows negative arc radii. As an aside, if I've followed the WG's bread crumb trail correctly, then your(plural) plan seems to be to allow CSS style
If that's the case, then FWIW I think it would be a really good idea. It may seem ugly as sin but |
Not blocking updated 2.0 CR publication - assigning 2.1 WD milestone |
@ericwilligers I think you are still working on optimising the path syntax description. Assigning the issue to you for review. |
Fixed by #697 |
I'm sure you're aware, but just in case you're not: the current version of Path Data Grammar in SVG 2 specifies coordinates as signed integers. Support for decimals and exponents seems to have been lost in the haze of rewrites.
The text was updated successfully, but these errors were encountered: