-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(outputs.sql): Add option to automate table schema updates #16544
base: master
Are you sure you want to change the base?
Conversation
Thanks so much for the pull request! |
30b5ab0
to
ae1da3c
Compare
Pushed a new version to address the linter complaints and took the opportunity to:
Note: left the Update: It seems the Integration tests failing, are addressed in another PR. Will rebase, once merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hautecodure thank you very much for your contribution! Much appreciated! I do have some comments in the code, the biggest work items are
- Strip out the column sorting to another PR to only have one thing in each PR!
- I would rather like to have table-altering as an opt-in to not accidentally break existing configurations.
Looking forward to the next round!
Pushed a new update:
While a bit unpleasant with the per-driver overrides, there was not much of a choice, unless we want to pester users with writing queries for each and every edge-case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @hautecodure for the update and your patience! I do have some styling comments in the code. Additionally, I have the following items:
- Please do not remove existing options (
table_exists_template
) or render them non-functional! This might break some users! I agree that most likely this option is useless and should be removed, but to do so, you need to deprecate it first and keep it for a certain time. - The default value of
table_column_template
is empty and therefore, the line in thesample.conf
file must reflect this! You can add a line
## Use the following setting for automatically adding columns:
## table_update_template = "ALTER TABLE {TABLE} ADD COLUMN {COLUMN}"
- Using
tables map[string]map[string]bool
might simplify the code and improve lookups. What do you think?
Automate table schema updates, when the table does not contain a field (or tag) by using template, provided by the user. Note: this feature is opt-in, it requires the user to explicitly modify their configuration files.
Hi, thanks for the feedback, pushed a new iteration:
|
Download PR build artifacts for linux_amd64.tar.gz, darwin_arm64.tar.gz, and windows_amd64.zip. 📦 Click here to get additional PR build artifactsArtifact URLs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @hautecodure. Some more cosmetic things like moving the conditions to the caller side so that the reader directly sees that the create/update functions are only called on purpose or opt-in.
if !p.tables[tablename] && !p.tableExists(tablename) { | ||
createStmt := p.generateCreateTable(metric) | ||
_, err := p.db.Exec(createStmt) | ||
if err != nil { | ||
return err | ||
} | ||
if err := p.createTable(metric); err != nil { | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I would like to keep the condition here as otherwise this reads like "always create the table"!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to keep the Write loop easier to read and handle the minutia in the respective function.
Should i revert back to the previous statement?
@@ -288,7 +337,6 @@ func TestPostgresIntegration(t *testing.T) { | |||
require.NoError(t, p.Connect()) | |||
defer p.Close() | |||
require.NoError(t, p.Write(testMetrics)) | |||
require.NoError(t, p.Close()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you remove this? This line makes sure all metrics are written...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It closes the connection and i want to re-use the existing connection, instead of setting up a new one.
The tests seem to pass fine. Are you sure Close is required to flush the metrics? Is this a pgx quirk?
|
||
p.TableUpdateTemplate = "ALTER TABLE {TABLE} ADD COLUMN {COLUMN}" | ||
|
||
require.NoError(t, p.Write(postCreateMetrics)) | ||
|
||
fields := []string{ | ||
"tag_add_after_create text", | ||
"bool_add_after_create boolean", | ||
} | ||
for _, column := range fields { | ||
require.Eventually(t, func() bool { | ||
rc, out, err := container.Exec([]string{ | ||
"bash", | ||
"-c", | ||
"pg_dump" + | ||
" --username=" + username + | ||
" --no-comments" + | ||
" " + dbname + | ||
// pg_dump's output has comments that include build info | ||
// of postgres and pg_dump. The build info changes with | ||
// each release. To prevent these changes from causing the | ||
// test to fail, we strip out comments. Also strip out | ||
// blank lines. | ||
"|grep -E -v '(^--|^$|^SET )'", | ||
}) | ||
require.NoError(t, err) | ||
require.Equal(t, 0, rc) | ||
|
||
b, err := io.ReadAll(out) | ||
require.NoError(t, err) | ||
|
||
return bytes.Contains(b, []byte(column)) | ||
}, 5*time.Second, 500*time.Millisecond, column) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please create a new testcase for this or directly set the update template! Furthermore, a comment would be nice saying something like
// Write a metric that targets the same table but has additional fields to test the automatic
// column update functionality.
require.NoError(t, p.Write(postCreateMetrics))
fields := []string{
"tag_add_after_create text",
"bool_add_after_create boolean",
}
for _, column := range fields {
Same for the other spots below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking of creating new cases, but i have to duplicate the existing ones as i have to create a table, insert metrics, add new fields.
I'm not familiar with the test infrastructure, is there a way to keep the containers running so i don't have to re-create the database?
Or should i split the cases into separate functions and re-use them?
Automate table updates, when the table does not contain a field
Summary
Currently, updates to the Table are left to the administrator and whenever a field is not present (in the table), the entire Telegraf flush is discarded, losing data in the process.
The new approach exposes two templates, one to update a table and another on to query a table for it's columns.
To avoid unnecessary (and expensive) requests, we cache the table columns and update it, whenever there is a cache miss from a field.
Checklist
Related issues
N/A