-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix CAST(JSON as ROW(ARRAY)) #9447
Conversation
This pull request was exported from Phabricator. Differential Revision: D56013293 |
✅ Deploy Preview for meta-velox canceled.
|
Summary: CAST(JSON as ROW(ARRAY()) used to fail with ``` OUT_OF_ORDER_ITERATION: Objects and arrays can only be iterated when they are first encountered. ``` According to simdjson documentation, https://github.com/simdjson/simdjson/blob/master/doc/basics.md, it is not allowed to store object values for later processing. These must be consumed or copied before proceeding. Also, fixed behavior when JSON object contains duplicate keys. Presto throws, but previous implementation used to allow duplicates. Also, fix the test to actually verify JSON objects with mixed case keys. Differential Revision: D56013293
3244566
to
08a94a3
Compare
This pull request was exported from Phabricator. Differential Revision: D56013293 |
Summary: CAST(JSON as ROW(ARRAY()) used to fail with ``` OUT_OF_ORDER_ITERATION: Objects and arrays can only be iterated when they are first encountered. ``` According to simdjson documentation, https://github.com/simdjson/simdjson/blob/master/doc/basics.md, it is not allowed to store object values for later processing. These must be consumed or copied before proceeding. Also, fixed behavior when JSON object contains duplicate keys. Presto throws, but previous implementation used to allow duplicates. Also, fix the test to actually verify JSON objects with mixed case keys. Differential Revision: D56013293
08a94a3
to
829d20c
Compare
This pull request was exported from Phabricator. Differential Revision: D56013293 |
Summary: CAST(JSON as ROW(ARRAY()) used to fail with ``` OUT_OF_ORDER_ITERATION: Objects and arrays can only be iterated when they are first encountered. ``` According to simdjson documentation, https://github.com/simdjson/simdjson/blob/master/doc/basics.md, it is not allowed to store object values for later processing. These must be consumed or copied before proceeding. Also, fixed behavior when JSON object contains duplicate keys. Presto throws, but previous implementation used to allow duplicates. Also, fix the test to actually verify JSON objects with mixed case keys. Differential Revision: D56013293
Summary: CAST(JSON as ROW(ARRAY()) used to fail with ``` OUT_OF_ORDER_ITERATION: Objects and arrays can only be iterated when they are first encountered. ``` According to simdjson documentation, https://github.com/simdjson/simdjson/blob/master/doc/basics.md, it is not allowed to store object values for later processing. These must be consumed or copied before proceeding. Also, fixed behavior when JSON object contains duplicate keys. Presto throws, but previous implementation used to allow duplicates. Also, fix the test to actually verify JSON objects with mixed case keys. Differential Revision: D56013293
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// Mapping from lower-case field names of the target RowType to their | ||
// indices. | ||
folly::F14FastMap<std::string, column_index_t> fieldIndices; | ||
for (auto i = 0; i < rowType.size(); ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: RowType::size
is virtual and may not be able to inline here, so better assign it to a variable before loop
fieldIndices[key] = i; | ||
} | ||
|
||
std::vector<bool> foundIndices(rowType.size(), false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could initialize fieldIndices
value to -1 then we don't need this second memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. Updated.
Summary: CAST(JSON as ROW(ARRAY()) used to fail with ``` OUT_OF_ORDER_ITERATION: Objects and arrays can only be iterated when they are first encountered. ``` According to simdjson documentation, https://github.com/simdjson/simdjson/blob/master/doc/basics.md, it is not allowed to store object values for later processing. These must be consumed or copied before proceeding. Also, fixed behavior when JSON object contains duplicate keys. Presto throws, but previous implementation used to allow duplicates. Also, fix the test to actually verify JSON objects with mixed case keys. Reviewed By: Yuhta Differential Revision: D56013293
829d20c
to
afac6c4
Compare
This pull request was exported from Phabricator. Differential Revision: D56013293 |
@Yuhta Jimmy, thank you for review. Updated to address your comments. |
Summary: CAST(JSON as ROW(ARRAY()) used to fail with ``` OUT_OF_ORDER_ITERATION: Objects and arrays can only be iterated when they are first encountered. ``` According to simdjson documentation, https://github.com/simdjson/simdjson/blob/master/doc/basics.md, it is not allowed to store object values for later processing. These must be consumed or copied before proceeding. Also, fixed behavior when JSON object contains duplicate keys. Presto throws, but previous implementation used to allow duplicates. Also, fix the test to actually verify JSON objects with mixed case keys. Reviewed By: Yuhta Differential Revision: D56013293
afac6c4
to
e84d832
Compare
This pull request was exported from Phabricator. Differential Revision: D56013293 |
Summary: CAST(JSON as ROW(ARRAY()) used to fail with ``` OUT_OF_ORDER_ITERATION: Objects and arrays can only be iterated when they are first encountered. ``` According to simdjson documentation, https://github.com/simdjson/simdjson/blob/master/doc/basics.md, it is not allowed to store object values for later processing. These must be consumed or copied before proceeding. Also, fixed behavior when JSON object contains duplicate keys. Presto throws, but previous implementation used to allow duplicates. Also, fix the test to actually verify JSON objects with mixed case keys. Reviewed By: Yuhta Differential Revision: D56013293
Summary: CAST(JSON as ROW(ARRAY()) used to fail with ``` OUT_OF_ORDER_ITERATION: Objects and arrays can only be iterated when they are first encountered. ``` According to simdjson documentation, https://github.com/simdjson/simdjson/blob/master/doc/basics.md, it is not allowed to store object values for later processing. These must be consumed or copied before proceeding. Also, fixed behavior when JSON object contains duplicate keys. Presto throws, but previous implementation used to allow duplicates. Also, fix the test to actually verify JSON objects with mixed case keys. Reviewed By: Yuhta Differential Revision: D56013293
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mbasmanova thanks!
Summary: CAST(JSON as ROW(ARRAY()) used to fail with ``` OUT_OF_ORDER_ITERATION: Objects and arrays can only be iterated when they are first encountered. ``` According to simdjson documentation, https://github.com/simdjson/simdjson/blob/master/doc/basics.md, it is not allowed to store object values for later processing. These must be consumed or copied before proceeding. Also, fixed behavior when JSON object contains duplicate keys. Presto throws, but previous implementation used to allow duplicates. Also, fix the test to actually verify JSON objects with mixed case keys. Reviewed By: xiaoxmeng, Yuhta Differential Revision: D56013293
Summary: CAST(JSON as ROW(ARRAY()) used to fail with ``` OUT_OF_ORDER_ITERATION: Objects and arrays can only be iterated when they are first encountered. ``` According to simdjson documentation, https://github.com/simdjson/simdjson/blob/master/doc/basics.md, it is not allowed to store object values for later processing. These must be consumed or copied before proceeding. Also, fixed behavior when JSON object contains duplicate keys. Presto throws, but previous implementation used to allow duplicates. Also, fix the test to actually verify JSON objects with mixed case keys. Reviewed By: xiaoxmeng, Yuhta Differential Revision: D56013293
Please rebase to main after #9451 lands. |
Summary: CAST(JSON as ROW(ARRAY()) used to fail with ``` OUT_OF_ORDER_ITERATION: Objects and arrays can only be iterated when they are first encountered. ``` According to simdjson documentation, https://github.com/simdjson/simdjson/blob/master/doc/basics.md, it is not allowed to store object values for later processing. These must be consumed or copied before proceeding. Also, fixed behavior when JSON object contains duplicate keys. Presto throws, but previous implementation used to allow duplicates. Also, fix the test to actually verify JSON objects with mixed case keys. Reviewed By: xiaoxmeng, Yuhta Differential Revision: D56013293
e84d832
to
77fa564
Compare
This pull request was exported from Phabricator. Differential Revision: D56013293 |
Summary: CAST(JSON as ROW(ARRAY()) used to fail with ``` OUT_OF_ORDER_ITERATION: Objects and arrays can only be iterated when they are first encountered. ``` According to simdjson documentation, https://github.com/simdjson/simdjson/blob/master/doc/basics.md, it is not allowed to store object values for later processing. These must be consumed or copied before proceeding. Also, fixed behavior when JSON object contains duplicate keys. Presto throws, but previous implementation used to allow duplicates. Also, fix the test to actually verify JSON objects with mixed case keys. Reviewed By: xiaoxmeng, Yuhta Differential Revision: D56013293
Summary: CAST(JSON as ROW(ARRAY()) used to fail with ``` OUT_OF_ORDER_ITERATION: Objects and arrays can only be iterated when they are first encountered. ``` According to simdjson documentation, https://github.com/simdjson/simdjson/blob/master/doc/basics.md, it is not allowed to store object values for later processing. These must be consumed or copied before proceeding. Also, fixed behavior when JSON object contains duplicate keys. Presto throws, but previous implementation used to allow duplicates. Also, fix the test to actually verify JSON objects with mixed case keys. Reviewed By: xiaoxmeng, Yuhta Differential Revision: D56013293
77fa564
to
efeb0de
Compare
Summary: CAST(JSON as ROW(ARRAY()) used to fail with ``` OUT_OF_ORDER_ITERATION: Objects and arrays can only be iterated when they are first encountered. ``` According to simdjson documentation, https://github.com/simdjson/simdjson/blob/master/doc/basics.md, it is not allowed to store object values for later processing. These must be consumed or copied before proceeding. Also, fixed behavior when JSON object contains duplicate keys. Presto throws, but previous implementation used to allow duplicates. Also, fix the test to actually verify JSON objects with mixed case keys. Reviewed By: xiaoxmeng, Yuhta Differential Revision: D56013293
This pull request was exported from Phabricator. Differential Revision: D56013293 |
This pull request has been merged in 707dbfd. |
Summary: Pull Request resolved: facebookincubator#9447 CAST(JSON as ROW(ARRAY()) used to fail with ``` OUT_OF_ORDER_ITERATION: Objects and arrays can only be iterated when they are first encountered. ``` According to simdjson documentation, https://github.com/simdjson/simdjson/blob/master/doc/basics.md, it is not allowed to store object values for later processing. These must be consumed or copied before proceeding. Also, fixed behavior when JSON object contains duplicate keys. Presto throws, but previous implementation used to allow duplicates. Also, fix the test to actually verify JSON objects with mixed case keys. Reviewed By: xiaoxmeng, Yuhta Differential Revision: D56013293 fbshipit-source-id: 28337280e1943bca4d6b46ff6f1f62341656bdda
Summary: Pull Request resolved: facebookincubator#9447 CAST(JSON as ROW(ARRAY()) used to fail with ``` OUT_OF_ORDER_ITERATION: Objects and arrays can only be iterated when they are first encountered. ``` According to simdjson documentation, https://github.com/simdjson/simdjson/blob/master/doc/basics.md, it is not allowed to store object values for later processing. These must be consumed or copied before proceeding. Also, fixed behavior when JSON object contains duplicate keys. Presto throws, but previous implementation used to allow duplicates. Also, fix the test to actually verify JSON objects with mixed case keys. Reviewed By: xiaoxmeng, Yuhta Differential Revision: D56013293 fbshipit-source-id: 28337280e1943bca4d6b46ff6f1f62341656bdda
Summary:
CAST(JSON as ROW(ARRAY()) used to fail with
According to simdjson documentation, https://github.com/simdjson/simdjson/blob/master/doc/basics.md, it is not allowed to store object values for later processing. These must be consumed or copied before proceeding.
Also, fixed behavior when JSON object contains duplicate keys. Presto throws, but previous implementation used to allow duplicates.
Also, fix the test to actually verify JSON objects with mixed case keys.
Differential Revision: D56013293