Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce TypelessStructLiteralArg and NewConstructorArg interfaces #183

Merged
merged 22 commits into from
Oct 29, 2024

Conversation

makenowjust
Copy link
Collaborator

@makenowjust makenowjust commented Oct 27, 2024

Close #119
Close #177

TypelessStructValue (#177) and NewConstructorValue (#119) share their struct shapes. Then, this PR introduces a new struct named AsExpr for expression AS name-kind values and replaces them with it.

I believe this is not a hasty generalization because these struct shapes are exactly the same, and they are used for common semantics. However, I'm not sure AsExpr is the best name. If you have a better name, please tell me it!

Thank you.


See #183 (comment).

@apstndb
Copy link
Contributor

apstndb commented Oct 27, 2024

TypelessStructValue (#177) and NewConstructorValue (#119) share their struct shapes. Then, this PR introduces a new struct named AsExpr for expression AS name-kind values and replaces them with it.

I believe this is not a hasty generalization because these struct shapes are exactly the same, and they are used for common semantics.

I have concerned about there are undocumented differences in NEW constructors and typeless STRUCT literals.

https://github.com/google/zetasql/blob/194cd32b5d766d60e3ca442651d792c7fe54ea74/zetasql/parser/bison_parser.y#L8927-L8942

    | expression "AS" "(" path_expression ")"
      {
        // Do not parenthesize $4 because it is not really a parenthesized
        // path expression. The parentheses are just part of the syntax here.
        $$ = MAKE_NODE(ASTNewConstructorArg, @$, {$1, $4});
      }

I believe it is a syntax for proto extensions and it is not yet available in Cloud Spanner, but I think these semantics are not same. It may be a reason to distinguish between TypelessStructValue and NewConstructorValue .

spanner> SELECT NEW examples.shipping.`Order`("Japan" AS (shipping_address.country));
ERROR: spanner: code = "InvalidArgument", desc = "NEW constructor does not support proto extensions [at 1:50]\\nSELECT NEW examples.shipping.`Order`(\\\"Japan\\\" AS (shipping_address.country))\\n      

@apstndb
Copy link
Contributor

apstndb commented Oct 27, 2024

This difference is natural because:

  • NewConstructor is a syntax for PROTO type.
  • TypelessStructLiteral is a syntax for STRUCT type.

@makenowjust
Copy link
Collaborator Author

IMHO, when expression AS (path) is needed, we can introduce a new interface NewConstructorArg, and a new struct ExprAsPath, so we can go this way even if STRUCT and NEW syntax are different.

@apstndb
Copy link
Contributor

apstndb commented Oct 27, 2024

Personally I think it is acceptable to generalize them temporarily, even if their semantics are not same,
But I am wondering if this generalization really have benefit.

we can introduce a new interface NewConstructorArg, and a new struct ExprAsPath

I think it can't be a general name because it is expression AS (path_expression), not expression AS path_expression.
It seems that (path_expression) is restricted form of generalized_extension_path.
In ZetaSQL, new_constructor_arg is really special structure with no other similar syntax, because its syntax is specialized to Protocol Buffers.

@apstndb
Copy link
Contributor

apstndb commented Oct 27, 2024

I understood. We can add another interface when it is needed.
NewConstructorArg won't be concrete struct so it is not needed to define.

@apstndb
Copy link
Contributor

apstndb commented Oct 27, 2024

However, I'm not sure AsExpr is the best name. If you have a better name, please tell me it!
Rename AsExpr to ExprAsName, and use AsAlias

IMHO, ExprAsName doesn't imply the name is optional, and there are other expr AS name structure(e.g. in * REPLACE(expr AS name, ...)).
I think it can be confusing so more explicit name, like ExprOptName, may be prefered.

syntax AS ident Rule in ZetaSQL ast node in ZetaSQL Parser func in memefish Name in memefish
AS ident required   opt_as_alias_with_required_as ASTAlias tryParseAsAlias() AsAlias
[AS] ident optional   opt_as_alias ASTAlias tryParseAsAlias() AsAlias
expr none none ASTExpression ExprArg or Expr  
expr AS ident required required (star_replace_item, select_column_expr_with_as_alias, ...) ASTExpressionWithAlias (not used externally?)  
expr [AS ident] required optional expression_with_opt_alias, (expression_using_argument, ...) ASTExpressionWithOptAlias   ExprAsName?
expr [AS] ident optional required      Alias
expr [[AS] ident] optional optional (select_column_expr, ...) concrete nodes    

(There are many variants.)

parser.go Outdated Show resolved Hide resolved
ast/ast.go Outdated
Values []*ExprAsName
}

// ExprAsName is value with optional name in typeless struct literal or new constructor.
Copy link
Contributor

@apstndb apstndb Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is not only for typeless struct literal and new constructor.
We can encourage to use this struct for any expr [AS ident] syntax, without semantic implications.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can update the document anytime.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It means that this struct is added only for NewConstructorArg and TypelessStructValue in this PR context and we need to discuss to change this struct semantics in another PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From this conversation, it sounds like you don't intend to widely recommend the use of this struct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought:

  • TypelessStructValue and NewConstructorArg is not same abstraction, but have common concrete structure.
  • Concrete structure will able to be reused without modification.
  • IMO, currently ExprAsName is no strong reason to be introduced.

@makenowjust
Copy link
Collaborator Author

Finally, I conclude there is no better name for the struct expression [AS name]. That is, we should not introduce such a struct.

Therefore, we introduce new interfaces TypelessStructLiteralArg and NewConstructorArg. Currently, two interfaces have the same implementations and parsers, but they are separated for future extensibility.

@makenowjust makenowjust changed the title Introduce AsExpr instead of TypelessStructValue and NewConstructorValue Introduce TypelessStructLiteralArg and NewConstructorArg interfaces Oct 28, 2024
Copy link
Contributor

@apstndb apstndb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it is natural to introduce these interfaces and use ExprArg and Alias.
I have not noticed their SelectItem implementation can be used to implement them.

I agree their implementations are better than my original PRs.

parser.go Outdated
@@ -479,22 +479,26 @@ func (p *Parser) parseSelectItem() ast.SelectItem {
}
}

func (p *Parser) tryParseAsAlias() *ast.AsAlias {
func (p *Parser) tryParseAsAlias(requiredAs bool) *ast.AsAlias {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like p.tryParseAsAlias( /* requiredAs */ false) is a little difficult to write.
Can't we introduce two method for requiredAs=true and requiredAs=False? (like (tryParseAsAliasAsRequired and tryParseAsAliasAsOptional)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tryParseAsAliasAsRequired seems like a strange name to someone unfamiliar with this project (e.g. twice As looks like typo, and try and required seem contradictory), and it is hard to guess what the method does. By making the explicit argument, it is clear that the method parses AS name as possible, and the comment makes it clear that the argument gives the required AS.

Copy link
Contributor

@apstndb apstndb Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought:

twice As looks like typo

I think that the current AsAlias([AS] alias) may be better to be called as Alias, and the current Alias(expr [AS] alias, alias required) can be called as ExprWithAlias or AliasedExpr.
tryParseAliasAsRequired and tryParseAliasAsOptional are felt better.

the comment makes it clear that the argument gives the required AS.

I think maintaining /* requiredAs */ comment is harder because it can't be assisted by go-lsp gopls or IDEs, and it is also strange to someone(me) unfamiliar with this project. At least, it must be noted in doc comment of tryParseAsAlias().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tryParseAliasAsRequired also seems strange to me because As looks like a usual preposition, and try and required have conflicting meanings.

I think that Alias is the identifier after AS, not the AS name itself. In that sense, I think that the current Alias should be changed to AliasedExpr, but I do not agree with naming AsAlias as Alias.

Copy link
Contributor

@apstndb apstndb Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tryParseAliasAsRequired also seems strange to me because As looks like a usual preposition, and try and required have conflicting meanings.

It means they can be better if the subject of "required" or "optional" is unambiguous.
The subject is AS keyword, so they can be called as like tryParseAsAliasRequireKeyword, tryParseAsAliasOptionalKeyword.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ambiguous enough that there are multiple methods to parse a particular struct, and I don't think it's a good idea to distinguish them by method names. If it really should be parsed by multiple methods, we should introduce multiple structs, but I don't think in this case.

Copy link
Contributor

@apstndb apstndb Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ambiguous enough that there are multiple methods to parse a particular struct

I agree structs and methods should be one by one.
In other languages, it may be a usecase of named parameter which is missing in Go.
In Go, it may be better to introduce typed constant with enum semantics.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, because of code jumps and editor popups, I decided to remove the /* requiredAs */ comment.

Copy link
Contributor

@apstndb apstndb Oct 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have agreed about tryParseAsAlias(), so it is just my thought.

I think that Alias is the identifier after AS, not the AS name itself. In that sense, I think that the current Alias should be changed to AliasedExpr, but I do not agree with naming AsAlias as Alias.

SQL has common structure "alias, which is optionally prefixed by AS", and they are safe to unparse always with AS in GoogleSQL.
I prefer to call it Alias than AsAlias because someones have coding guideline which prefer to omit AS.
As far as I know, AS of table alias is often omitted. (Probably due to Oracle's past not allowing AS.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name Alias ​​has less information than the name AsAlias, so I don't think it should be changed proactively.

Copy link
Contributor

@apstndb apstndb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we have finished discussions and tryParseAsAlias with typed enum resolves our concernings.
Thank you for finishing my PRs.
LGTM!

@makenowjust makenowjust merged commit db40951 into main Oct 29, 2024
4 checks passed
@makenowjust makenowjust deleted the as-expression branch October 29, 2024 13:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants