feat(spanner/spansql): add support for protobuf column types & Proto bundles #10945
Now that Spanner supports protobuf message and enum-typed columns and casts, add support for parsing those types. Since protobuf columns aren't distinguished by a keyword, adjust the parser to treat any unquoted identifier that's not a known type as a possible protobuf type and loop, consuming `.`s and identifiers until it hits a token that is neither (this matches the proto namespace components up through the message or enum name). To track the fully-qualified message/enum type name, add a field to the `Type` struct, (tentatively) called `ProtoRef`, so the message/enum name can be recovered when canonicalizing everything.
Add support for parsing and serializing CREATE, ALTER and DROP PROTO BUNDLE DDL statements.
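The identifier-and-dot loop described above could be sketched roughly as follows. This is an illustration only, not the code in this PR: `typeRef`, `knownTypes`, `tokenize`, and `parseType` are hypothetical names, the tokenizer is deliberately naive, and the sketch assumes identifiers must be separated by single dots.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// typeRef is a hypothetical stand-in for the parser's Type struct.
type typeRef struct {
	Base     string // built-in type name, or "PROTO_OR_ENUM" for a named protobuf type
	ProtoRef string // fully-qualified message/enum name; empty for built-ins
}

// knownTypes is a small illustrative subset of built-in type names.
var knownTypes = map[string]bool{"BOOL": true, "INT64": true, "STRING": true}

// tokenize splits on whitespace and makes every "." its own token.
func tokenize(s string) []string {
	return strings.Fields(strings.ReplaceAll(s, ".", " . "))
}

// isIdent reports whether s looks like an unquoted identifier.
func isIdent(s string) bool {
	if s == "" {
		return false
	}
	for i, r := range s {
		switch {
		case r == '_' || unicode.IsLetter(r):
		case i > 0 && unicode.IsDigit(r):
		default:
			return false
		}
	}
	return true
}

// parseType consumes one type name from the front of toks and returns it
// together with the unconsumed tokens.
func parseType(toks []string) (typeRef, []string, error) {
	if len(toks) == 0 || !isIdent(toks[0]) {
		return typeRef{}, toks, fmt.Errorf("expected a type name")
	}
	if up := strings.ToUpper(toks[0]); knownTypes[up] {
		return typeRef{Base: up}, toks[1:], nil
	}
	// Not a known type: treat it as a protobuf reference and keep consuming
	// ".ident" pairs until the next token is neither.
	parts := []string{toks[0]}
	i := 1
	for i+1 < len(toks) && toks[i] == "." && isIdent(toks[i+1]) {
		parts = append(parts, toks[i+1])
		i += 2
	}
	return typeRef{Base: "PROTO_OR_ENUM", ProtoRef: strings.Join(parts, ".")}, toks[i:], nil
}

func main() {
	t, rest, err := parseType(tokenize("examples.music.Award NOT NULL"))
	fmt.Println(t.ProtoRef, rest, err)
}
```

Requiring a dot before each additional identifier is what keeps trailing tokens such as `NOT NULL` out of the type name in this sketch.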
8e788c3 to 0c841d8
@dfinkel the tests failed; please check and fix them.
This else if block got lost while squashing/reordering commits (resolving a conflict). Bring it back so the tests pass.
Thanks @rahul2393 for the prompt review and approval! Sorry about the test failures! The tests for this package now pass locally.
Friendly ping @rahul2393 (thanks for the prompt approval)
@rahul2393 if you have a few minutes, can you make another pass over this PR?
Thanks @rahul2393!
🤖 I have created a release *beep* *boop*
---
## [1.72.0](https://togithub.com/googleapis/google-cloud-go/compare/spanner/v1.71.0...spanner/v1.72.0) (2024-11-07)
### Features
* **spanner/spansql:** Add support for protobuf column types & Proto bundles ([#10945](https://togithub.com/googleapis/google-cloud-go/issues/10945)) ([91c6f0f](https://togithub.com/googleapis/google-cloud-go/commit/91c6f0fcaadfb7bd983e070e6ceffc8aeba7d5a2)), refs [#10944](https://togithub.com/googleapis/google-cloud-go/issues/10944)
### Bug Fixes
* **spanner:** Skip exporting metrics if attempt or operation is not captured. ([#11095](https://togithub.com/googleapis/google-cloud-go/issues/11095)) ([1d074b5](https://togithub.com/googleapis/google-cloud-go/commit/1d074b520c7a368fb8a7a27574ef56a120665c64))
---
This PR was generated with [Release Please](https://togithub.com/googleapis/release-please). See [documentation](https://togithub.com/googleapis/release-please#release-please).
{`CAST(Bar AS ENUM)`, Func{Name: "CAST", Args: []Expr{TypedExpr{Expr: ID("Bar"), Type: Type{Base: Enum}}}}},
{`CAST(Bar AS PROTO)`, Func{Name: "CAST", Args: []Expr{TypedExpr{Expr: ID("Bar"), Type: Type{Base: Proto}}}}},
I believe these are not valid GoogleSQL queries, because `ENUM` and `PROTO` are keywords.
$ gcloud spanner databases execute-sql ${SPANNER_DATABASE} --sql 'SELECT CAST(Bar AS PROTO)'
ERROR: (gcloud.spanner.databases.execute-sql) INVALID_ARGUMENT: Syntax error: Unexpected keyword PROTO [at 1:20]\nSELECT CAST(Bar AS PROTO)\n ^
- '@type': type.googleapis.com/google.rpc.LocalizedMessage
locale: en-US
message: |-
Syntax error: Unexpected keyword PROTO [at 1:20]
SELECT CAST(Bar AS PROTO)
^
$ gcloud spanner databases execute-sql ${SPANNER_DATABASE} --sql 'SELECT CAST(Bar AS ENUM)'
ERROR: (gcloud.spanner.databases.execute-sql) INVALID_ARGUMENT: Syntax error: Unexpected keyword ENUM [at 1:20]\nSELECT CAST(Bar AS ENUM)\n ^
- '@type': type.googleapis.com/google.rpc.LocalizedMessage
locale: en-US
message: |-
Syntax error: Unexpected keyword ENUM [at 1:20]
SELECT CAST(Bar AS ENUM)
^
`CAST AS ENUM` and `CAST AS PROTO` need the name of a specific PROTO or ENUM type, not the `PROTO` or `ENUM` keywords.
https://cloud.google.com/spanner/docs/reference/standard-sql/data-types#enum_type
You reference an enum type, such as when using CAST, by using its fully qualified name.
https://cloud.google.com/spanner/docs/reference/standard-sql/conversion_functions#cast_as_proto
SELECT CAST( ''' year: 2001 month: 9 type { award_name: 'Best Artist' category: 'Artist' } type { award_name: 'Best Album' category: 'Album' } ''' AS googlesql.examples.music.Award) AS award_col
Therefore, the test cases should use either an escaped (backtick-quoted) type name:
CAST(Bar AS `PROTO`)
CAST(Bar AS `ENUM`)
or other named type names.
CAST(Bar AS ProtoType)
CAST(Bar AS EnumType)
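As a rough illustration of the escaping rule above, a serializer could backtick-quote a single-component type name only when it collides with a keyword. This is a sketch, not spansql's implementation: `castSQL` and `reservedTypeWords` are hypothetical names, and the keyword set contains only the two words discussed here (GoogleSQL reserves many more).

```go
package main

import (
	"fmt"
	"strings"
)

// reservedTypeWords holds only the two keywords from this review thread.
var reservedTypeWords = map[string]bool{"ENUM": true, "PROTO": true}

// castSQL renders CAST(expr AS typeName), backtick-quoting typeName when it
// matches a keyword case-insensitively.
func castSQL(expr, typeName string) string {
	if reservedTypeWords[strings.ToUpper(typeName)] {
		typeName = "`" + typeName + "`"
	}
	return fmt.Sprintf("CAST(%s AS %s)", expr, typeName)
}

func main() {
	fmt.Println(castSQL("Bar", "PROTO"))    // keyword: gets quoted
	fmt.Println(castSQL("Bar", "EnumType")) // ordinary name: left as-is
}
```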
}},
},
},
`SELECT CAST(7 AS ENUM)`,
Same
$ gcloud spanner databases execute-sql ${SPANNER_DATABASE} --sql 'SELECT CAST(7 AS ENUM)'
ERROR: (gcloud.spanner.databases.execute-sql) INVALID_ARGUMENT: Syntax error: Unexpected keyword ENUM [at 1:18]\nSELECT CAST(7 AS ENUM)\n ^
- '@type': type.googleapis.com/google.rpc.LocalizedMessage
locale: en-US
message: |-
Syntax error: Unexpected keyword ENUM [at 1:18]
SELECT CAST(7 AS ENUM)
^
I have reused these tests for another purpose (cloudspannerecosystem/memefish#115), so it would be better if they contained only valid cases.
Thanks @apstndb, I definitely copied those cases from the documentation. I can put together a PR tomorrow to switch those to something that's actually valid. (I think I meant to verify that those were valid before sending this PR out, but as usual other things intervened and that got lost.)
A misreading of the Spanner docs led to tests that indicated that casting `AS ENUM` or `AS PROTO` was valid syntax (despite not specifying _which_ protobuf enum or message type to cast to). Replace these cases with ones that validate casting to specific enum/message types. Thanks to @apstndb for calling this out on googleapis#10945.
I just opened a PR with fixes for some bugs I ran into while using the features in this PR (including fixing the tests that @apstndb pointed out): The most problematic bug is the fact that Thanks in advance!
…11279)
* fix: spansql: fix NOT NULL protobuf column type parsing
  The protobuf type-name parser was so greedy that it failed on NOT NULL columns. In particular, it wasn't aware that unquoted tokens should be separated by dots, and that quoted tokens shouldn't be concatenated with anything else. Fix this by adding a boolean to handle that alternation and only allowing quoted IDs if nothing else has been consumed (then bailing immediately after a quoted ID so we don't try to consume anything else).
* fix: spansql: fix invalid CAST tests
  A misreading of the Spanner docs led to tests that indicated that casting `AS ENUM` or `AS PROTO` was valid syntax (despite not specifying _which_ protobuf enum or message type to cast to). Replace these cases with ones that validate casting to specific enum/message types. Thanks to @apstndb for calling this out on #10945.
* fix spansql: CREATE PROTO BUNDLE SQL with 0 types
  Fix a bug in CreateProtoBundle.SQL() which unintentionally generated the following DDL when there were no types listed:
  ```
  CREATE PROTO BUNDLE (``)
  ```
---------
Co-authored-by: Sri Harsha CH <[email protected]>
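To illustrate the zero-types fix, here is a minimal sketch of a serializer that quotes each type name before joining, so an empty list cannot yield a stray pair of backticks. The `createProtoBundle` type is a hypothetical stand-in rather than the package's actual struct, and the sketch assumes the intended output for an empty list is an empty parenthesized list.

```go
package main

import (
	"fmt"
	"strings"
)

// createProtoBundle is a hypothetical stand-in for the DDL statement type.
type createProtoBundle struct {
	Types []string // fully-qualified message/enum names
}

// SQL quotes every name individually and then joins them. With no names,
// the joined list is empty and no backticks are emitted.
func (c createProtoBundle) SQL() string {
	quoted := make([]string, len(c.Types))
	for i, t := range c.Types {
		quoted[i] = "`" + t + "`"
	}
	return "CREATE PROTO BUNDLE (" + strings.Join(quoted, ", ") + ")"
}

func main() {
	fmt.Println(createProtoBundle{}.SQL())
	fmt.Println(createProtoBundle{Types: []string{"a.b.Msg", "a.b.Kind"}}.SQL())
}
```

One plausible way to end up with the empty-backticks output is to wrap an already-joined list in a single pair of backticks; quoting per element sidesteps that, though the actual cause in the package may have been different.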
feat(spansql): CREATE/ALTER/DROP PROTO BUNDLE
Add support for parsing and serializing CREATE, ALTER and DROP PROTO
BUNDLE DDL statements.
feat(spanner/spansql): support for protobuf types
Now that Spanner supports protobuf message and enum-typed columns and
casts, add support for parsing those types.
Since protobuf columns aren't distinguished by a keyword, adjust the
parser to see any unquoted identifier that's not a known type as a
possible protobuf type and loop, consuming `.`s and identifiers until it
hits a non-ident/`.` token (to match the proto namespace components up
through the message or enum names).
To track the fully-qualified message/enum type-name, add an additional
field to the `Type` struct (tentatively) called `ProtoRef` so we can
recover the message/enum name if canonicalizing everything.
closes: #10944