Wait replication sync command #91
@option(
    "-t",
    "--timeout",
    type=int,
I suggest using TimeSpanParamType here instead of type=int.
Usage example, from ch-tools/ch_tools/chadmin/cli/object_storage_group.py (lines 123 to 131 in baaf843):
@option(
    "-g",
    "--guard-interval",
    "--to-time",
    "to_time",
    default=DEFAULT_GUARD_INTERVAL,
    type=TimeSpanParamType(),
    help=("End of inspecting interval in human-friendly format."),
)
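To illustrate what a "human-friendly format" for a timeout could look like, here is a minimal sketch of a time span parser. It is not the implementation of TimeSpanParamType: the function name parse_timespan and the accepted suffixes (s, m, h, d) are assumptions made for illustration, and the real type may accept a different syntax.

```python
import re
from datetime import timedelta

# Hypothetical unit table; the suffixes accepted by the real
# TimeSpanParamType may differ.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
_TOKEN = re.compile(r"(\d+)([smhd])")


def parse_timespan(text: str) -> timedelta:
    """Parse strings such as '3d', '1h30m' or '45s' into a timedelta."""
    compact = text.strip().lower()
    pos = 0
    total = 0
    for match in _TOKEN.finditer(compact):
        if match.start() != pos:
            break  # a gap between tokens means unparsable characters
        total += int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    # Reject empty input and input with leftover characters.
    if pos == 0 or pos != len(compact):
        raise ValueError(f"invalid time span: {text!r}")
    return timedelta(seconds=total)
```

With a parser like this, a user could write a value such as 3d on the command line instead of computing the number of seconds by hand.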
    default=3 * 24 * 60 * 60,
    help="Max amount of time to wait, in seconds. Default is 30 days.",
The default is 3 days, right? Not 30.
Also, default values should be added to help messages automatically, so there is no need to write them manually.
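A quick check of the arithmetic behind this remark, using only the expression from the diff (the constant name below is an assumption for illustration):

```python
from datetime import timedelta

# Expression copied from the option's default in the diff.
DEFAULT_TIMEOUT_SECONDS = 3 * 24 * 60 * 60

default_timeout = timedelta(seconds=DEFAULT_TIMEOUT_SECONDS)
print(DEFAULT_TIMEOUT_SECONDS)  # 259200
print(default_timeout.days)     # 3
```

A 30-day default would instead be 30 * 24 * 60 * 60 = 2592000 seconds, so the help text and the value disagree by a factor of ten.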
    help="Max amount of time to wait, in seconds. Default is 30 days.",
)
@pass_context
def wait_replication_sync_command(ctx, status, pause, timeout):
I suggest moving the command to chadmin, and moving the estimate_replication_lag function from ch_replication_lag.py to somewhere under https://github.com/yandex/ch-tools/blob/master/ch_tools/common
    When we execute query on clickhouse01
    """
    SYSTEM STOP FETCHES
    """
    And we execute query on clickhouse02
    """
    INSERT INTO test.table_01 SELECT number FROM numbers(100)
    """
    And we sleep for 5 seconds
    And we execute command on clickhouse01
    """
    ch-monitoring replication-lag -w 4
    """
    Then we get response contains
    """
    1;
    """
I was thinking about a similar test that detects replication lag for the replication-sync command, but the default timeout for replication-lag is 300 seconds. The test would have to sleep for 300 seconds, which would slow the tests down a lot.
Maybe we should pass the replication-lag arguments through to replication-sync, as discussed above, to avoid this.
I think it makes sense
Also I discovered that running a |
At first sight, we can pin |
The wait replication sync script has been moved to ch-monitoring, since it is basically a call to a ch-monitoring command inside.
It might be worth moving it to chadmin.
I'm also not sure how we should call one click command from inside another. The most direct way is to call the command with ctx.invoke() or ctx.forward(), but the docs suggest this is not the recommended way to do it. That's why I moved the replication-lag command logic into a separate method for now.
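One way to picture the "separate method" approach is a plain polling function that receives the lag estimator as an argument, so that both commands can call it without going through click at all. The sketch below is an illustration under assumptions, not the code in this PR: the name wait_replication_sync, the max_lag parameter, and the injectable sleep and clock arguments are hypothetical. Only pause and timeout correspond to parameters visible in the diff.

```python
import time
from typing import Callable


def wait_replication_sync(
    estimate_lag: Callable[[], float],
    max_lag: float,
    timeout: float,
    pause: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll estimate_lag() until it is <= max_lag or timeout seconds pass.

    Returns True if the lag dropped to max_lag or below, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if estimate_lag() <= max_lag:
            return True
        if clock() >= deadline:
            return False
        sleep(pause)  # wait before asking for the lag again
```

Passing the estimator in as a callable also keeps the loop easy to test: a test can supply a fake estimator and a no-op sleep instead of waiting on a real cluster.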