Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Save job status tables option #311

Merged
merged 18 commits into from
Jan 24, 2022

Conversation

Aryex
Copy link
Collaborator

@Aryex Aryex commented Jan 18, 2022

Summary

Added save_job_tatus_tables:Boolean option for jobs status table.

Description

With save_job_tatus_tables = true, a job status table will be initialized. Commits to vertica or external tables are recorded to vertica.

Related Issue

Closes #308

Additional Reviewers

@alexey-temnikov
@alexr-bq
@jonathanl-bq
@jeremyp-bq

Comment on lines 506 to 510
_ <- if (config.saveMetadataTables) {
tableUtils.updateJobStatusTable(config.tablename, config.jdbcConfig.auth.user, 0.0, config.sessionId, success = true)
} else {
Right(())
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we should create the status table for external tables, even if the save_metadata_tables option is true. Since there is no fault tolerance information saved, the only benefit is to track the number of times an external table is created and the session id it was created under.

@alexey-temnikov, thoughts?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. We should have it documented and highlighted in the changelog, that this behaviour will be changed.
@jeremyp-bq, @Aryex, minor nit - should we consider another name instead of save_metadata_tables since it is specific to the rejected rows? Perhaps save_rejected_rows?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexey-temnikov This PR is to turn off the status table by default and add an option to enable it, but I was thinking we could lump together the creation of the rejected rows table in the future under the same option, which is why it has the generic name save_metadata_tables.

Unless you think it is better to have an option for creation of each (e.g. save_status_table and later save_rejected_rows_table)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we should create the status table for external tables, even if the save_metadata_tables option is true. Since there is no fault tolerance information saved, the only benefit is to track the number of times an external table is created and the session id it was created under.

This was purely to mimic the old behavior. That said, it could be extended to provide an option to not record jobs of external tables. Something like: save_metadata_tables: all | none | vertica-only.

This PR is to turn off the status table by default and add an option to enable it, but I was thinking we could lump together the creation of the rejected rows table in the future under the same option.

I think using specific options would be better in the long run?

Copy link
Collaborator

@jeremyprime jeremyprime Jan 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking the logic would be like this, unless we see value in creating the status table for external tables:

if(config.saveMetadataTables && !config.createExternalTable.isDefined) {
  // create table
} else {
  Right(())
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not following the condition. What do you mean by createExternalTable is defined AND not defined?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo, fixed it.

To summarize my questions:

  1. Should we create the status table for the external table case, or not since it doesn't provide much value for the external tables case (even if the option is set to true)?
  2. Should we have separate options for saving the status table and rejected rows table, or use the same option for both?

@jeremyprime
Copy link
Collaborator

@Aryex We should also update the README.md to list the new option.

Also note that one of the functional tests is failing (click the details link for the integration tests on the PR, then search the logs for "TEST FAILED" to see the exact one - looks to be an integration test that is looking for the job status table).

@Aryex Aryex marked this pull request as ready for review January 20, 2022 00:41
@Aryex Aryex changed the title Save metadata tables option Save job status tables option Jan 24, 2022
val query = "SELECT COUNT (1) FROM S2V_JOB_STATUS_USER_" + writeOpts("user") +
" WHERE target_table_name = '" + targetTableName + "'"
val rs = stmt.executeQuery(query)
var count:Int = 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple notes here:

  1. General style recommendation is to put a space after the colon in type annotations. This is probably something that we should be using an automated code formatter for, but we haven't spent the time to do this yet.
  2. Avoid var. In this case, the mutation is localized and probably not a problem, especially since it's in test code, but it helps to get out of the mindset of reaching for var when it isn't necessary. Here, we could use the following snippet instead:
val count: Int = if (rs.next) rs.getInt(1) else 0
rs.close()
count

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make creation of job status table optional
4 participants