Ambiguities, or what do those error messages from the typechecker mean?
Augeas does a fair amount of typechecking on lenses to detect ambiguities. Ambiguities happen when Augeas can't decide what lens to use on a specific string. For example, in the lens
    let vert =
      let l1 = [ key /[a-z]+/ ] in
      let l2 = [ label "special" . store /[a-z]+/ ] in
      l1 | l2
Augeas can't decide whether it should use l1 or l2 when it tries to process a string with vert. For example, if vert is applied to the string hello, both l1 and l2 can be applied, though they produce different trees.
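The overlap can be checked by hand on a single string. The following Python sketch is only an illustration and not how Augeas works internally: it copies the two regular expressions from the branches of vert and reports which branches accept a given string. The function name matching_branches is made up for this example.

```python
import re

# Regular expressions copied from the two branches of the lens `vert`.
L1_KEY = re.compile(r"[a-z]+")    # l1: key /[a-z]+/
L2_STORE = re.compile(r"[a-z]+")  # l2: store /[a-z]+/ (label "special" consumes no text)


def matching_branches(text):
    """Return the names of the union branches whose regexp matches all of `text`."""
    branches = []
    if L1_KEY.fullmatch(text):
        branches.append("l1")
    if L2_STORE.fullmatch(text):
        branches.append("l2")
    return branches


# Both branches accept "hello", so the union is ambiguous for this string.
print(matching_branches("hello"))  # ['l1', 'l2']
```

Whenever more than one branch is reported for the same string, the union has a vertical ambiguity in the get direction for that string.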
You should never use a lens that hasn't been typechecked successfully; using a lens with type errors will generally result in very erratic behavior. Since typechecking can be very slow and expensive, it is not performed by default. Augeas comes with a separate command line tool, augparse, that should be used for typechecking lenses (see the augparse man page for usage). Once a module has been successfully typechecked, there's no need to typecheck it again.
Augeas checks for two different kinds of ambiguities, 'horizontal' and 'vertical' ambiguities. The names come from the way a grammar is traditionally written, for example
    S := A B C
       | C D
A vertical ambiguity happens when Augeas has difficulty distinguishing between the two branches of a | (vertical since they are written on separate lines). A horizontal ambiguity happens when Augeas has difficulty distinguishing between two entries in a concatenation, for example, between A and B. Of course, with Augeas you don't write down grammars, but lenses. With Augeas, you might write a lens
    let l = l1 . l2 | l3 . l4
A horizontal ambiguity could happen for example between l1 and l2; such an ambiguity would mean that there's some string for which Augeas can't decide how to split it between l1 and l2. A vertical ambiguity could happen between l1 . l2 and l3 . l4, and would mean that there are strings for which Augeas can't decide which of the two branches of the union to take. When Augeas encounters such an error, it will print an example of a string that is ambiguous together with the location of the lenses involved.
Since Augeas transforms strings to trees and also trees back to strings, ambiguities can happen in either transformation. The string to tree transformation is called the 'get' direction, and error messages about ambiguities concern strings that cause some headaches. The tree to string direction is called the 'put' direction, and error messages in the put direction are caused by trees that cause headaches.
An example of a horizontal ambiguity is the lens
    let horiz = del /s+/ "s" . store /[a-z]+/
which will cause augparse to complain with
    Syntax error in lens definition
    /tmp/t.aug:3.2-.39:Failed to compile horiz
    /tmp/t.aug:3.10-.39:exception: ambiguous concatenation
      First regexp: /s+/
      Second regexp: /[a-z]+/
      'ssa' can be split into
      's|=|sa'
     and
      'ss|=|a'
      First lens: /tmp/t.aug:3.10-.22:
      Second lens: /tmp/t.aug:3.25-.39:
The first thing to note is how Augeas indicates where an error happened: the notation /tmp/t.aug:3.2-.39 says that this is about something found in file /tmp/t.aug from line 3, column 2, to column 39 on the same line. Augeas indicates locations with FILENAME:LINE1.COL1-LINE2.COL2, but will omit LINE2 if it is the same as LINE1.
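When reading many of these messages, it can help to expand the location notation mechanically. The following is a minimal Python sketch, assuming locations always have the form FILENAME:LINE1.COL1-LINE2.COL2 with an optional LINE2 and an optional trailing colon, as in the messages on this page. The names Span and parse_location are hypothetical helpers, not part of Augeas.

```python
import re
from typing import NamedTuple


class Span(NamedTuple):
    filename: str
    line1: int
    col1: int
    line2: int
    col2: int


# FILENAME:LINE1.COL1-[LINE2].COL2, optionally followed by a colon.
_LOCATION = re.compile(
    r"^(?P<file>.+):(?P<l1>\d+)\.(?P<c1>\d+)-(?P<l2>\d*)\.(?P<c2>\d+):?$"
)


def parse_location(text):
    """Expand a location string into a Span, filling in an omitted LINE2."""
    m = _LOCATION.match(text.strip())
    if m is None:
        raise ValueError("not a location: %r" % text)
    line1 = int(m["l1"])
    # LINE2 is left out when it equals LINE1.
    line2 = int(m["l2"]) if m["l2"] else line1
    return Span(m["file"], line1, int(m["c1"]), line2, int(m["c2"]))


print(parse_location("/tmp/t.aug:3.2-.39"))
print(parse_location("/tmp/t.aug:3.2-6.11"))
```

The first call yields a span that starts and ends on line 3; the second one runs from line 3 to line 6.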
In the error caused by the lens horiz in the example above, the message ambiguous concatenation means that this is a horizontal ambiguity, i.e. an ambiguous construct l1 . l2. The locations of the two lenses involved are given after First lens and Second lens at the end of the error message; in the example those are the del and the store.
Those lenses use the regular expressions listed after First regexp and Second regexp, in this case s+ and [a-z]+. The ambiguous string is ssa, and Augeas indicates how it could be split by inserting a marker |=| where it could split it. In the example, it is possible to split ssa into ss and a, and into s and sa.
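The two splits reported for ssa can be reproduced by brute force. The Python sketch below tries every split point of one string and keeps those where the left part matches the first regexp and the right part matches the second. It only illustrates what the message means; Augeas itself reasons about the regular languages as a whole rather than testing individual strings, and the function name concat_splits is made up for this example.

```python
import re


def concat_splits(text, first, second):
    """List every (left, right) split of `text` where `left` fully matches
    the regexp `first` and `right` fully matches the regexp `second`."""
    rx1 = re.compile(first)
    rx2 = re.compile(second)
    return [
        (text[:i], text[i:])
        for i in range(len(text) + 1)
        if rx1.fullmatch(text[:i]) and rx2.fullmatch(text[i:])
    ]


# Regexps and example string taken from the error message above.
for left, right in concat_splits("ssa", r"s+", r"[a-z]+"):
    print(left + "|=|" + right)
# s|=|sa
# ss|=|a
```

A string with more than one valid split point is exactly what the typechecker reports as an ambiguous concatenation.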
We already saw an example of a vertical ambiguity in the lens vert in the first example. The error produced for this lens by augparse is
    Syntax error in lens definition
    /tmp/t.aug:3.2-6.11:Failed to compile vert
    /tmp/t.aug:6.4-.11:exception: overlapping lenses in union.get
      Example matched by both: 'a'
      First lens: /tmp/t.aug:4.13-.29:
      Second lens: /tmp/t.aug:5.13-.49:
The error message states that two lenses in a union are overlapping, i.e. can both be applied to some string. The location of the lenses is again given at the end of the error message, together with an example of a string that is matched by both lenses, in this case the string a. Let's try to fix this by leaving out the store:
    let vert2 =
      let l1 = [ key /[a-z]+/ ] in
      let l2 = [ label "special" ] in
      l1 | l2
Now, augparse complains with a different error:
    Syntax error in lens definition
    /tmp/t.aug:3.2-6.11:Failed to compile vert2
    /tmp/t.aug:6.4-.11:exception: overlapping lenses in tree union.put
      Example matched by both: { "special" }
      First lens: /tmp/t.aug:4.13-.29:
      Second lens: /tmp/t.aug:5.13-.32:
This error now complains about the put, i.e. the tree to string direction of the lens vert2, and shows that the tree consisting of a single node labelled special is ambiguous: it can either be transformed into a string using l1, resulting in the string special, or using l2, resulting in the empty string, since label produces no string output.
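The put-direction conflict can be mimicked with a toy model. In the Python sketch below, each branch of vert2 is reduced to a function that takes the label of a single value-less tree node and returns the string that branch would write, or None if the branch does not accept the node. This is a deliberate simplification for illustration, not the Augeas implementation, and all function names are made up.

```python
import re


def put_l1(label):
    """Toy model of [ key /[a-z]+/ ]: the node label is written out as the key."""
    return label if re.fullmatch(r"[a-z]+", label) else None


def put_l2(label):
    """Toy model of [ label "special" ]: accepts only 'special', writes no text."""
    return "" if label == "special" else None


def put_candidates(label):
    """Map each branch that accepts the node to the string it would produce."""
    results = {"l1": put_l1(label), "l2": put_l2(label)}
    return {name: out for name, out in results.items() if out is not None}


print(put_candidates("special"))  # {'l1': 'special', 'l2': ''}
print(put_candidates("other"))    # {'l1': 'other'}
```

For the node labelled special, both branches apply and produce different strings, which is the situation the union.put error describes.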
The last kind of ambiguity is ambiguous iteration, which is a special case of a horizontal ambiguity, caused by a construct of the form l*. For example, the lens
    let iter =
      let l = [ key /[a-z]+/ ] in
      l*
causes augparse to complain
    Syntax error in lens definition
    /tmp/t.aug:3.2-5.6:Failed to compile iter
    /tmp/t.aug:5.4-.6:exception: ambiguous iteration
      Iterated regexp: /[a-z]+/
      'aa' can be split into
      'a|=|a'
     and
      'aa|=|'
      Iterated lens: /tmp/t.aug:4.12-.28:
The error message again lists the location of the problematic lens, and the regular expression used by that lens, [a-z]+. An example of an ambiguous string is given, the string aa here, which can either be parsed by splitting it into a and a, in which case it would be parsed as l . l, or by not splitting it and parsing it as a single string aa, i.e. parsing it as one instance of l. In the first interpretation, it results in a tree with two nodes, both labelled a; in the second interpretation it produces a tree with a single node, labelled aa.
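The two readings of aa can be enumerated with a small recursive search. The Python sketch below lists every way to cut a string into consecutive pieces that each fully match the iterated regexp. It is a brute-force illustration for short strings only, not the algorithm Augeas uses, and the function name iteration_splits is made up for this example.

```python
import re


def iteration_splits(text, pattern):
    """List every way to cut `text` into consecutive non-empty pieces
    that each fully match the regexp `pattern`."""
    rx = re.compile(pattern)

    def go(rest):
        if not rest:
            return [[]]  # nothing left: one way, with no further pieces
        out = []
        for i in range(1, len(rest) + 1):
            head = rest[:i]
            if rx.fullmatch(head):
                out.extend([head] + tail for tail in go(rest[i:]))
        return out

    return go(text)


# Regexp and example string taken from the error message above.
print(iteration_splits("aa", r"[a-z]+"))     # [['a', 'a'], ['aa']]

# With a terminator in each piece, the cut points are forced.
print(iteration_splits("a;a;", r"[a-z]+;"))  # [['a;', 'a;']]
```

The first call finds both the two-node and the one-node reading. The second call, which uses a hypothetical pattern where every piece ends in a semicolon, finds only one way to cut the string, which suggests why iterated lenses usually need some kind of separator or terminator between entries.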