Loader speed: Query and DDL optimization #195
Comments
The slowness is because querying the table schema takes many minutes to complete:
https://github.com/practo/tipoca-stream/blob/master/redshiftsink/pkg/redshift/redshift.go#L56-L76 Please fix this.
The bottleneck for loader performance is the Redshift load. If thousands of queries are occupying Redshift with loads and reads, Redshift cannot load fast. Check the graph below: the dip happens when all the loads have stopped and only one table is allowed to load. #207 has more info.
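One way to keep thousands of loads from hitting Redshift at once is to cap how many run concurrently. The sketch below is only an illustration of that idea, not the loader's actual code: `runLoads`, its `load` callback, and the table names are hypothetical, and the callback stands in for whatever issues the real load.

```go
package main

import (
	"fmt"
	"sync"
)

// runLoads calls load once per table, keeping at most maxConcurrent loads
// in flight. It returns the highest number of loads seen running together.
// Hypothetical helper for illustration; not part of redshiftsink.
func runLoads(tables []string, maxConcurrent int, load func(table string)) int {
	if maxConcurrent < 1 {
		maxConcurrent = 1 // avoid blocking forever on a zero-capacity semaphore
	}
	sem := make(chan struct{}, maxConcurrent) // buffered channel used as a semaphore
	var wg sync.WaitGroup
	var mu sync.Mutex
	running, peak := 0, 0

	for _, table := range tables {
		wg.Add(1)
		go func(table string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release the slot

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			load(table)

			mu.Lock()
			running--
			mu.Unlock()
		}(table)
	}
	wg.Wait()
	return peak
}

func main() {
	// Table names are made up for the example.
	tables := []string{"orders", "customers", "payments", "events"}
	peak := runLoads(tables, 2, func(table string) {
		// Placeholder: a real loader would run its Redshift load here.
	})
	fmt.Println("peak concurrent loads:", peak)
}
```

The buffered channel bounds the number of loads in flight, so the remaining tables wait their turn instead of competing for the cluster. The right limit is workload dependent and would need to be found by experiment.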
Let's experiment with compression and maybe changing the data format in S3, as specified in https://docs.aws.amazon.com/redshift/latest/dg/c_loading-data-best-practices.html
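As a rough sketch of the compression experiment, a batch can be gzipped before it is uploaded to S3. The helper names `gzipBytes` and `gunzipBytes` and the sample JSON rows are assumptions made for this example, not the project's code. Redshift's COPY command can read gzip-compressed files when the GZIP option is given.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"log"
)

// gzipBytes compresses data with gzip. Hypothetical helper for illustration.
func gzipBytes(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	// Close flushes the remaining compressed data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// gunzipBytes reverses gzipBytes; used here only to check the round trip.
func gunzipBytes(data []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	// Made-up batch of repetitive JSON rows, standing in for a real batch.
	batch := bytes.Repeat([]byte(`{"id":1,"status":"ok"}`+"\n"), 1000)

	compressed, err := gzipBytes(batch)
	if err != nil {
		log.Fatal(err)
	}
	restored, err := gunzipBytes(compressed)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("raw=%d bytes, gzip=%d bytes, round trip ok=%v\n",
		len(batch), len(compressed), bytes.Equal(batch, restored))
}
```

How much this helps depends on the data: repetitive rows like the ones above compress well, while already-compact data gains less.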
Compression is live. Now experimenting with
This issue was for DDL optimizations; loader speed is actually being tracked in #186.
now
This was seen to happen in the first batch being processed when the loader pod was created.
Loader pod is only handling one topic.
The bug appeared after the recent schema call optimizations.
It used to finish and move on to load staging within milliseconds of starting; now it takes minutes.
before