Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow leading logical operators #5068

Closed

Conversation

helixbass
Copy link
Collaborator

@helixbass helixbass commented May 16, 2018

@GeoffreyBooth I’ve been thinking about Prettier and formatting of long logical conditions (including your expressed opinion that you don’t like breaking lines after and/or)

One of the few things I’ve consistently wished that I could do in Coffeescript was to lead subsequent lines with the logical operator rather than have to break after the logical operator at the end of the line - I find it much easier to digest the “shape” of the logical expression if I don’t have to look to the ends of the lines to see how they relate logically

And in looking at code I’m formatting with the Prettier plugin, what I would like to see as “pretty” formatting would be when breaking long logical expressions, to lead lines with the logical operator. Eg instead of this which Prettier is currently generating (I have it breaking lines at logical operators similarly to how Prettier for JS does it - if you really don’t want to break long lines at logical operators, that’s something that we could easily add an option for):

shouldOmitBracesButNotIndent =
  (parent.type is 'ExpressionStatement' and
    grandparent.type is 'ClassBody') or
  (parent.type is 'ArrayExpression' and parent.elements.length is 1) or
  isObjectPropertyValue path, {stackOffset, last: yes}

I’d like to see it formatted like this:

shouldOmitBracesButNotIndent =
  (parent.type is 'ExpressionStatement'
    and grandparent.type is 'ClassBody')
  or (parent.type is 'ArrayExpression' and parent.elements.length is 1)
  or isObjectPropertyValue path, {stackOffset, last: yes}

To me that’s way easier to follow the logical flow

And I’d never realized it’s a simple change to allow this long-desired (by me) feature, just making the logical operators be LINE_CONTINUERs. It wouldn’t break existing code, since no valid lines could currently begin with and(space)/or(space)/&&/||

PS the code snippet above is from the Prettier plugin printer source, which I’ve been writing in JS but have hacked together a very primitive version of es2coffee (repo, built as a Babel plugin that feeds its transformed AST to the Prettier plugin) which can transform it to Coffeescript pretty well - the decision to make our AST types as similar as possible to Babel AST types has proven beneficial

@GeoffreyBooth
Copy link
Collaborator

I’m still more of the mindset that using line continuers like and to imply a line continuation, rather than making it explicit with the \ operator, is hacky and bad style. Either let your line keep going and going (and use soft word wrap in your editor) or use \. But I’m not sure I’m in the majority on this one.

Since this is possible with and etc. at the end of the line, it seems reasonable to also allow it at the start. That doesn’t mean I think Prettier should automatically format like this; that’s another discussion. One of the other tenets of Prettier is that it’s opinionated, so we shouldn’t have too many options for people to have to wade through.

@Inve1951
Copy link
Contributor

One of the few things I’ve consistently wished that I could do in Coffeescript was to lead subsequent lines with the logical operator rather than have to break after the logical operator at the end of the line.

Same here.

There's an addition I'd love to see here which is indentation sensitive line continuation similar to leading dot property accessor. Consider this:

bGreets =
  /^hello/i.test myString
  and
    /\s+world$/i.test myString
    or /\s+coffee$/.test myString
  or
    /^goodbye$/.test myString

@jashkenas
Copy link
Owner

FWIW, I think it makes sense to allow any binary operator, when placed at the beginning or end of a line, to parse as a continuation of the preceding or following line...

@GeoffreyBooth
Copy link
Collaborator

I would also warn that the leading dot chaining notation, e.g.

foo()
.then fn
.catch handleError

Has been the largest source of bugs in CoffeeScript’s history. The more things that cause line continuation, the more potential trouble that gets caused, e.g.:

foo()
.then fn() and
  otherFn()
  .then yetAnotherFn # Does this chain off of foo() or otherFn()?

The problem is that suddenly whitespace is insignificant here. You can indent or not and it doesn’t matter. It’s hard for users to follow what implied line continuation matches with what, and it goes against the general philosophy that indentation should inform these things.

Since and already implies a line continuation, it’s not terrible to allow it to do so from the start of a line rather than the end. But I would encourage us not to keep adding ways to imply a line continuation, as it’ll keep creating new issues like we’ve seen with leading dot.

@helixbass helixbass force-pushed the allow-leading-logical-operators branch from fe0298c to 2a7d1fe Compare May 19, 2018 23:52
@helixbass
Copy link
Collaborator Author

@GeoffreyBooth you're right LINE_CONTINUER is sort of inherently problematic. So I thought about it a bit more and reimplemented to use a primarily grammar-based solution

Basically it adds rules for Expression TERMINATOR && Expression and Expression TERMINATOR || Expression. This caused a grammar ambiguity so I added a rewriter pass to convert eg TERMINATOR && -> LEADING_&& and the grammar rule is actually Expression LEADING_&& Expression. It's possible there's some grammar cleverness that could avoid the ambiguity, but it's a simple rewriter pass

Then, since the desired behavior (sort of like chaining) is to be able to either align or indent the leading-logical lines (ie both of these should be ok):

a
and b
a
  and b

, instead of making the leading-logical regex be a LINE_CONTINUER, I made it a new INDENT_SUPPRESSOR - basically then a TERMINATOR will be generated rather than an INDENT even if the line is indented

This has proved to work well for the cases I've come up with (added a bunch of tests involving things like if blocks, implicit objects/calls, functions with inline/indented bodies), everything seems to be behaving in accordance with what the indentation implies. It does close open implicit calls/objects in an intuitive way with respect to indentation by virtue of the rewriter pass coming after addImplicitBracesAndParens() (so the TERMINATOR triggers closing them before being rewritten away)

I think it makes sense to allow any binary operator, when placed at the beginning or end of a line, to parse as a continuation of the preceding or following line

@jashkenas it'd be great to be able to lead lines with any binary operator as the same readability benefits apply (you can just scan the left side of the expression to get a sense of its "shape"). But I got scared off thinking about potential tricky cases eg continuing / vs regex, continuing +/- vs unary , as well as knowing that LINE_CONTINUER is kind of a hack. But maybe if this grammar-based approach proves sound we could allow it for some unambiguous binary operators as well. The only other thing I'd worry about is that by allowing a lot of leading-continuer "symbols" we could potentially be blocking the way for some future language constructs that would want to use those symbols as leading tokens

@Inve1951 with this branch you couldn't write your example exactly as written - for one thing you couldn't indent the clause after "standalone" and/or (which I don't hate the look of but it kind of reads as a "block"). Also are there implied parens around the inner or because it's "nested" inside the and? If so, again I don't hate how it looks but that's a rather complex and specialized indentation style/rule that'd have to be supported

@@ -661,6 +662,18 @@ exports.Rewriter = class Rewriter
@detectEnd i + 1, condition, action
return 1

# Convert TERMINATOR followed by && or || into a single LEADING_&& or
# LEADING_|| token to disambiguate grammar.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not LEADING_AND and LEADING_OR? We prefer the English word names as a general rule in the codebase. Unless you need separate tokens for LEADING_AND and LEADING_&&?

Speaking of which, this should work for and as well as &&.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GeoffreyBooth sure, I used LEADING_&& to mimic the existing && token type, but agreed it reads badly, updated

It works equivalently for and and &&, as by the time the rewriter/grammar sees them they're the same token type (and both variants are included in INDENT_SUPPRESSOR). A couple of the tests use &&/||

src/lexer.coffee Outdated
@@ -1294,6 +1302,7 @@ POSSIBLY_DIVISION = /// ^ /=?\s ///
HERECOMMENT_ILLEGAL = /\*\//

LINE_CONTINUER = /// ^ \s* (?: , | \??\.(?![.\d]) | \??:: ) ///
INDENT_SUPPRESSOR = /// ^ \s* (?: and\s+(?!:)\S | or\s+(?!:)\S | && | \|\| ) ///
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code could use a nice detailed comment explaining these two variables. Not the regexes themselves necessarily, though that wouldn’t hurt, but what a line continuer is and what INDENT_SUPPRESSOR is for (in terms of when and why you would suppress indents, of course).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GeoffreyBooth sure, added comments for LINE_CONTINUER and INDENT_SUPPRESSOR - the comments I added explain the token mechanics, but not sure if you think there should also be eg examples of what syntax these regexes allow?

@Inve1951
Copy link
Contributor

@Inve1951 with this branch you couldn't write your example exactly as written - for one thing you couldn't indent the clause after "standalone" and/or (which I don't hate the look of but it kind of reads as a "block"). Also are there implied parens around the inner or because it's "nested" inside the and? If so, again I don't hate how it looks but that's a rather complex and specialized indentation style/rule that'd have to be supported

@helixbass correct. The indented block after standalone and implies parens in my example. I think that's a very straight forward and readable way to write it. I'm aware that in order to support this there's probably a lot of work involved. I brought it up right away to avoid a breaking change when considering this later only.

Speaking of which, that would already be a breaking change. 😕

Copy link
Collaborator Author

@helixbass helixbass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GeoffreyBooth updated per your comments

src/lexer.coffee Outdated
@@ -1294,6 +1302,7 @@ POSSIBLY_DIVISION = /// ^ /=?\s ///
HERECOMMENT_ILLEGAL = /\*\//

LINE_CONTINUER = /// ^ \s* (?: , | \??\.(?![.\d]) | \??:: ) ///
INDENT_SUPPRESSOR = /// ^ \s* (?: and\s+(?!:)\S | or\s+(?!:)\S | && | \|\| ) ///
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GeoffreyBooth sure, added comments for LINE_CONTINUER and INDENT_SUPPRESSOR - the comments I added explain the token mechanics, but not sure if you think there should also be eg examples of what syntax these regexes allow?

@@ -661,6 +662,18 @@ exports.Rewriter = class Rewriter
@detectEnd i + 1, condition, action
return 1

# Convert TERMINATOR followed by && or || into a single LEADING_&& or
# LEADING_|| token to disambiguate grammar.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GeoffreyBooth sure, I used LEADING_&& to mimic the existing && token type, but agreed it reads badly, updated

It works equivalently for and and &&, as by the time the rewriter/grammar sees them they're the same token type (and both variants are included in INDENT_SUPPRESSOR). A couple of the tests use &&/||

@helixbass
Copy link
Collaborator Author

@Inve1951 here's the code you linked to:

bTest = 
  abc and
    d or g

To be clear, this PR isn't a "breaking change" with regards to this example nor should it be a breaking change at all

If you're saying that your proposed "indented or clause implies parens" syntax is breaking then ya I'd say this is a good example of why that syntax, while perhaps convenient, would be at the very least confusing with regards to existing syntax. I don't think there's a great way around the fact that someone can indent in a misleading way like in the above example (which implies abc and (d or g) but in fact is equivalent to (abc and d) or g). Although I'll say that using Prettier helps point it out by both explicitly parenthesizing mixed and/or expressions for you and for longer expressions using indentation to help show the shape of the parsed expression. Eg the Prettier plugin I'm working on formats your example like:

bTest = (abc and d) or g

@@ -649,7 +650,8 @@ exports.Rewriter = class Rewriter
condition = (token, i) ->
[tag] = token
[prevTag] = @tokens[i - 1]
tag is 'TERMINATOR' or (tag is 'INDENT' and prevTag not in SINGLE_LINERS)
[nextTag] = @tokens[i + 1] unless i is @tokens.length - 1
tag is 'TERMINATOR' and nextTag not in LEADING_LOGICAL or (tag is 'INDENT' and prevTag not in SINGLE_LINERS)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GeoffreyBooth made a small update to support leading logical operators inside eg an if condition (added corresponding tests). For example:

if someCondition
    and someOtherCondition
    or somethingElse
  doSomething()

Basically what was happening was since tagPostfixConditionals() runs before tagLeadingLogical(), the IF was seeing a following TERMINATOR (which hadn't yet been converted into eg LEADING_AND) and deciding that it was a POST_IF

Can't just move tagLeadingLogical() to before tagPostfixConditionals(), since there's a dependency: tagLeadingLogical() should run after addImplicitBracesAndParens(), which should run after tagPostfixConditionals()

So basically detect what will become an eg LEADING_AND here to avoid tagging as a postfix conditional

@carlsmith
Copy link
Contributor

carlsmith commented May 21, 2018

This is nice:

if someCondition
    and someOtherCondition
    or somethingElse
  doSomething()

But the indentation is confusing. I would have thought the block should be more indented than the predicate, like:

if someCondition
    and someOtherCondition
    or somethingElse
        doSomething()

Does the block need to be less indented?

The named and and or operators complement if, making the whole thing more readable. Having other operators, like

if someValue
    + foo
    * bar
        doSomething()

Doesn't have the same appeal. There's a case for limiting this to named logical operators, and (a bit weaker case for) only when they're part of a predicate.

@helixbass
Copy link
Collaborator Author

helixbass commented May 21, 2018

@carlsmith nope the indentation of the continuing lines is up to you (as long as they’re not outdented), eg all of these work:

if someCondition
and someOtherCondition
or somethingElse
  doSomething()
if someCondition
  and someOtherCondition
  or somethingElse
    doSomething()
if someCondition
  and someOtherCondition
  or somethingElse
  doSomething()
if someCondition
    and someOtherCondition
      or somethingElse
  doSomething()

@GeoffreyBooth
Copy link
Collaborator

@helixbass The third example shouldn’t work. This is invalid today:

if someCondition and
  someOtherCondition or
  somethingElse
  doSomething()

@helixbass
Copy link
Collaborator Author

@GeoffreyBooth ya I noticed that behavior with respect to chaining. While obviously it's a poor choice to "indent" the continuations the same as the following indented block, it also feels inconsistent that every other indentation is allowed:

if a
.b
  c
if a
 .b
  c
if a
    .b
  c

but just that one (where the indentation "happens to match up") fails:

if a
  .b
  c

I also wasn't sure exactly what was causing the current behavior (for chaining or your example), I'm guessing it's that LINE_CONTINUER does this when it sees what would be an indent:

@indebt = size - @indent unless backslash

which I haven't fully thought through but seemed questionable

So if it's desirable to preserve this behavior (of failing if you try to "indent" continuation lines the same as the following actual indent) then I'll look more closely at how LINE_CONTINUER currently achieves that and think about whether there's a better way to implement it for INDENT_SUPPRESSOR

@GeoffreyBooth
Copy link
Collaborator

I’m not sure what’s desirable. I think the whole idea of indentation being insignificant strikes me as wrong, that we’re going against a core tenet of the language.

@helixbass
Copy link
Collaborator Author

helixbass commented May 21, 2018

@GeoffreyBooth I understand the unease. I do think that for both chaining and this, it makes sense to be flexible as far as how continuation lines are "indented", ie it wouldn't be a great idea syntax-wise to enforce a rigid "you have to align continuation lines with the initial line" or "you have to indent continuation lines consistently with respect to the initial line and a possible following indented block". So I think the mechanism here (INDENT_SUPPRESSOR) is at least an improvement as far as reducing the amount of non-grammatical magic/special-casing

I believe Python works this way, where once you're continuing a line, indentation is not as strict

I think the important questions then are whether there are strange interactions around these features that could be handled better. On a separate branch, I've been trying applying INDENT_SUPPRESSOR to chaining (so basically getting rid of LINE_CONTINUER) and it did basically work (as far as passing existing tests) and simplified code. But it seems clear the biggest edge case is around implicit objects eg

a: b
    .c()

If you see eg #4533, this has already been kind of an issue. But it's still wildly inconsistent, it only seems to apply the chain to the preceding object-property value if it's a non-initial property of a multi-property object, eg the above is currently equivalent to {a: b}.c(). While this:

a: b
c: d
    .e()

is equivalent to {a: b, c: d.e()}. And this:

a: b
    .c()
d: e

is equivalent to {a: b}.c(); {d: e}

So in thinking about how to handle the equivalent case for leading logical operators (which we haven't defined the behavior of yet) and at least in theory how it should work for chaining I'd propose:

  • the better choice between whether the chain/logical operator applies to the whole object or the preceding value is the preceding value, as use cases for chaining or logical operator following an object seem quite small (logical operators don't make sense, objects are consistently truthy)
  • it should be consistent, so all three of the above variants (or their corresponding examples with logical operator continuers rather than chains) should compile into a single object where one of the values was continued
  • I should implement that behavior for leading logical operators as part of this branch
  • then we should consider whether it'd be worth introducing some breakage at some point to at least make chaining consistent with regards to implicit objects and perhaps migrate chaining to also use INDENT_SUPPRESSOR

@vendethiel
Copy link
Collaborator

vendethiel commented May 23, 2018

Detecting such things with regexes has a downside, if very minor:

$ ./bin/coffee -bcs
a
  and ::b

[stdin]:2:1: error: unexpected indentation
  and ::b
^^

Also while supporting more operators here would be – imho – a good thing, although there's no way to support operators that are both unary and binary.

@helixbass
Copy link
Collaborator Author

@vendethiel ah interesting edge case, ya that wouldn't have parsed anyway but it is a slightly unexpected error message. Updated the regex and added a test case

To explain more fully for those interested:

Initially the INDENT_SUPPRESSOR regex just looked for leading and/or followed by whitespace (or &&/||). Then our lexer.coffee happened to contain a case that I hadn't thought of where that is actually a legitimate beginning of a line: and/or as an object key with whitespace before the :, eg and : 1. So extended the regex to check that the first non-whitespace character after and/or isn't :

But now @vendethiel pointed out an edge case to that edge case: a : may not be a : token, it might be part of :: (even though you'd always get parse errors for and and :: as consecutive tokens). So extended the regex to verify that it's a single : (not followed by another :), so that his example, even though still not valid code, will parse as a continuation and fail on unexpected :: rather than unexpected indentation

@carlsmith
Copy link
Contributor

Thanks for the clarification.

@GeoffreyBooth GeoffreyBooth changed the base branch from master to ast June 11, 2018 04:45
@helixbass
Copy link
Collaborator Author

@GeoffreyBooth implemented leading-logical-operator-continues-object-property-value behavior per previous comment and merged ast, so this should be ready for review

@GeoffreyBooth
Copy link
Collaborator

This isn't really required for generating an AST, is it? It's just for making the Prettier output a little prettier?

If so, this feels like something we should add in a second phase, once we have regular AST output working. This change could have other side effects that we should study before merging it in.

@helixbass helixbass force-pushed the allow-leading-logical-operators branch from 57c967e to 99a5b81 Compare March 17, 2019 20:10
@voxsoftware
Copy link

Hi, for all, thanks for effort in make better coffeescript but I really thinks that this proposal is not good or necessary. I prefer use explicit \

shouldOmitBracesButNotIndent =
  (parent.type is 'ExpressionStatement' \
    and grandparent.type is 'ClassBody') \
  or (parent.type is 'ArrayExpression' and parent.elements.length is 1) \
  or isObjectPropertyValue path, {stackOffset, last: yes}

@Inve1951
Copy link
Contributor

Of course it's not necessary but neither are any of the other implicit line continuations; They are nice to have tho.
When a line starts with a logical operator it can really only mean one thing so why not support it.

I really thinks that this proposal is not good or necessary.

Why do you dislike it? I personally crave leading logicals and hope they get added into master. @voxsoftware
Note that your code exampe would still be valid.

@voxsoftware
Copy link

@Inve1951 yes, my example is using explicit, I prefer that syntax and avoid errors can inherited from have a non explicit version

@helixbass
Copy link
Collaborator Author

Re-opened as #5279

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants