Adds shell script for unbounded table read, modifies table_read to execute the same. #77
This module is similar to the BigQueryExample. A few changes to count the number of records and log them.
This test reads a simpleTable. Shell script and python script to check the number of records read.
comments CODECOV_TOKEN usage.
…bounded and unbounded source.
…ble with complex schema.
…ds to different tables required for the e2e tests.
…nded-read-2-3 # Conflicts: # cloudbuild/python-scripts/utils/utils.py
…o fresh partitioned table creation.
…' into nightly-tests-unbounded-read-3-4
…rgument. Reformats the file.
…s-unbounded-read-3-4
else
  echo "Unbounded Mode!"
  source cloudbuild/e2e-test-scripts/unbounded_table_read.sh
We should explicitly check if mode is unbounded, just like the bounded case. If mode is neither bounded nor unbounded, throw an error.
python3 cloudbuild/python-scripts/insert_dynamic_partitions.py -- --project_name "$PROJECT_NAME" --dataset_name "$DATASET_NAME" --table_name "$TABLE_NAME" --refresh_interval "$PARTITION_DISCOVERY_INTERVAL"

# Wait for a bit, as mapping and output of records takes some time.
sleep 3m
How did we arrive at this number?
In our code we wait 2.5 minutes for the read streams to form (prior to insertion).
This also makes sure that the previously inserted records are read properly.
After the last insertion, however, the Python code does not wait, so we wait in the script instead.
The extra 30 seconds can be removed.
Having a cushion is alright. Just wanted to understand the reason behind this number.
@@ -36,7 +36,13 @@ if [ "$MODE" == "bounded" ]
then
  echo "Bounded Mode!"
  source cloudbuild/e2e-test-scripts/bounded_table_read.sh

elif [ "$MODE" == "unbounded" ]
Let's use a shell case statement, as here
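A minimal sketch of what the suggested case statement could look like, combined with the earlier request to reject any mode that is neither bounded nor unbounded. The function name run_mode is hypothetical and the source lines are commented out so the snippet runs on its own; the script paths are the ones from the diff.

```shell
#!/usr/bin/env bash
# Sketch only: run_mode is a hypothetical wrapper, not part of the PR.

run_mode() {
  local mode="$1"
  case "$mode" in
    bounded)
      echo "Bounded Mode!"
      # source cloudbuild/e2e-test-scripts/bounded_table_read.sh
      ;;
    unbounded)
      echo "Unbounded Mode!"
      # source cloudbuild/e2e-test-scripts/unbounded_table_read.sh
      ;;
    *)
      # Any other value is a configuration error: report it and fail.
      echo "Invalid mode '$mode': expected 'bounded' or 'unbounded'" >&2
      return 1
      ;;
  esac
}
```

Calling run_mode "$MODE" || exit 1 from the main script would then stop the build on an unrecognized mode instead of silently falling through to the unbounded branch.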
Adds a shell script that:
- Creates the asynchronously running unbounded job: the Dataproc job is created in unbounded mode.
- Dynamically adds partitions: insert_dynamic_partitions.py is executed with the necessary parameters.
- Kills the Dataproc job, as the read has completed and its correctness needs to be checked further.
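The steps above can be sketched roughly as follows. This is not the script from the PR: the gcloud invocations, the JOB_ID, CLUSTER_NAME, REGION and JAR_LOCATION variables, and all placeholder default values are assumptions for illustration, and how the job ID is actually obtained is not shown in the conversation. Only the python3 command and the sleep 3m come from the diff. With DRY_RUN=1 (the default here) each command is printed rather than executed.

```shell
#!/usr/bin/env bash
# Sketch only: placeholder defaults so the snippet runs standalone.
DRY_RUN="${DRY_RUN:-1}"
PROJECT_NAME="${PROJECT_NAME:-example-project}"
DATASET_NAME="${DATASET_NAME:-example_dataset}"
TABLE_NAME="${TABLE_NAME:-example_table}"
PARTITION_DISCOVERY_INTERVAL="${PARTITION_DISCOVERY_INTERVAL:-1}"
CLUSTER_NAME="${CLUSTER_NAME:-example-cluster}"   # assumed name
REGION="${REGION:-us-central1}"                   # assumed name
JAR_LOCATION="${JAR_LOCATION:-gs://example-bucket/example.jar}"  # assumed name
JOB_ID="${JOB_ID:-example-job-id}"                # assumed to be known already

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

unbounded_table_read() {
  # 1. Start the unbounded job without blocking the script (assumed command).
  run gcloud dataproc jobs submit flink --cluster "$CLUSTER_NAME" \
    --region "$REGION" --async --jar "$JAR_LOCATION"

  # 2. Insert partitions while the job is running (command from the diff).
  run python3 cloudbuild/python-scripts/insert_dynamic_partitions.py -- \
    --project_name "$PROJECT_NAME" --dataset_name "$DATASET_NAME" \
    --table_name "$TABLE_NAME" --refresh_interval "$PARTITION_DISCOVERY_INTERVAL"

  # 3. Wait for the last inserted records to be mapped and written out.
  run sleep 3m

  # 4. Stop the job so the record count can be checked (assumed command).
  run gcloud dataproc jobs kill "$JOB_ID" --region "$REGION"
}
```

Running unbounded_table_read in dry-run mode prints the four commands in order, which makes it easy to review the sequence before pointing it at a real cluster.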
/gcbrun