Add recalculate_checksum command
If you clone a database, you will want to recalculate the checksums for
repeatable scripts: the scripts have already been applied, but the stored
checksums are wrong because the database name has changed.

deploy_command and recalculate_checksum share a very similar body. The only
differences are:

1) Deploy handles V, R, and A migrations, while recalculation handles only
R migrations.
2) Deploy also applies the script, while recalculation only updates the
schemachange tables.
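
The checksum mismatch described above can be illustrated with a minimal sketch. It assumes the checksum is a SHA-224 hex digest of the rendered script text and that the script embeds the database name through a template variable; `script_checksum`, the view name, and the database names are hypothetical and are not taken from this commit.

```python
import hashlib


def script_checksum(rendered_sql: str) -> str:
    # Assumption: the stored checksum is a SHA-224 hex digest of the rendered script.
    return hashlib.sha224(rendered_sql.encode("utf-8")).hexdigest()


# Hypothetical repeatable script whose rendered text contains the database name.
template = (
    "CREATE OR REPLACE VIEW {database}.PUBLIC.V_ORDERS AS "
    "SELECT * FROM {database}.RAW.ORDERS;"
)

original = script_checksum(template.format(database="PROD"))
clone = script_checksum(template.format(database="PROD_CLONE"))

# The rendered text differs, so the checksum stored in the cloned change
# history table no longer matches what a deploy against the clone computes.
print(original != clone)
```

Under these assumptions, a deploy against the clone would treat every such repeatable script as changed and rerun it, which is the situation the new subcommand is meant to avoid.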
riiwo committed Feb 9, 2024
1 parent 94deeb5 commit 5fd538d
Showing 5 changed files with 115 additions and 40 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/test.yml
@@ -7,7 +7,7 @@ jobs:

strategy:
matrix:
python-version: [3.7, 3.8, 3.9]
python-version: [3.8, 3.9]
github-runner: ['ubuntu-latest', 'windows-latest']

runs-on: ${{ matrix.github-runner }}
6 changes: 5 additions & 1 deletion CHANGELOG.md
@@ -3,6 +3,10 @@ All notable changes to this project will be documented in this file.

*The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).*

## [3.5.5] - 2024-02-09
### Changed
- Added `recalculate_checksum` subcommand, which recalculates the checksums of repeatable scripts

## [3.5.4] - 2024-02-07
### Changed
- Updated `snowflake-connector-python` dependency to use 3.7.0. This allows the use of OpenSSL 3.0
@@ -12,7 +16,7 @@ All notable changes to this project will be documented in this file.

## [3.5.3] - 2023-02-18
### Changed
- Added `undo` subcommand, introducing Undo scripts
- Added `undo` subcommand, introducing Undo script

## [3.5.2] - 2023-02-14
### Changed
43 changes: 34 additions & 9 deletions README.md
@@ -243,7 +243,7 @@ Default [Password](https://docs.snowflake.com/en/user-guide/python-connector-exa
[Browser based SSO](https://docs.snowflake.com/en/user-guide/admin-security-fed-auth-use.html#setting-up-browser-based-sso) | `externalbrowser`
[Programmatic SSO](https://docs.snowflake.com/en/user-guide/admin-security-fed-auth-use.html#native-sso-okta-only) (Okta Only) | Okta URL endpoint for your Okta account typically in the form `https://<okta_account_name>.okta.com` OR `https://<okta_account_name>.oktapreview.com`

In the event both authentication criteria for the default authenticator are provided, schemachange will prioritize password authentication over key pair authentication.
In the event both authentication criteria for the default authenticator are provided, schemachange will prioritize password authentication over key pair authentication.

### Password Authentication
The Snowflake user password for `SNOWFLAKE_USER` is required to be set in the environment variable `SNOWFLAKE_PASSWORD` prior to calling the script. schemachange will fail if the `SNOWFLAKE_PASSWORD` environment variable is not set.
@@ -260,20 +260,20 @@ The URL of the authenticator resource that will be receive the POST request.
* token-response-name
The Expected name of the JSON element containing the Token in the return response from the authenticator resource.
* token-request-payload
The Set of variables passed as a dictionary to the `data` element of the request.
The Set of variables passed as a dictionary to the `data` element of the request.
* token-request-headers
The Set of variables passed as a dictionary to the `headers` element of the request.
The Set of variables passed as a dictionary to the `headers` element of the request.

It is recommended to use the YAML file and pass oauth secrets into the configuration using the templating engine instead of the command line option.
It is recommended to use the YAML file and pass oauth secrets into the configuration using the templating engine instead of the command line option.


### External Browser Authentication
External browser authentication can be used for local development by setting the environment variable `SNOWFLAKE_AUTHENTICATOR` to the value `externalbrowser` prior to calling schemachange.
External browser authentication can be used for local development by setting the environment variable `SNOWFLAKE_AUTHENTICATOR` to the value `externalbrowser` prior to calling schemachange.
The client will be prompted to authenticate in a browser that pops up. Refer to the [documentation](https://docs.snowflake.com/en/user-guide/admin-security-fed-auth-use.html#setting-up-browser-based-sso) to cache the token to minimize the number of times the browser pops up to authenticate the user.

### Okta Authentication
Clients that do not have a browser can use the popular SaaS IdP option to connect via Okta. This will require the Okta URL that you utilize for SSO.
Okta authentication can be used by setting the environment variable `SNOWFLAKE_AUTHENTICATOR` to the value of your Okta endpoint as a fully formed URL (e.g. `https://<org_name>.okta.com`) prior to calling schemachange.
Clients that do not have a browser can use the popular SaaS IdP option to connect via Okta. This will require the Okta URL that you utilize for SSO.
Okta authentication can be used by setting the environment variable `SNOWFLAKE_AUTHENTICATOR` to the value of your Okta endpoint as a fully formed URL (e.g. `https://<org_name>.okta.com`) prior to calling schemachange.

_** NOTE**: Please disable Okta MFA for the user who uses Native SSO authentication with client drivers. Please consult your Okta administrator for more information._

@@ -348,14 +348,14 @@ dry-run: false
# A string to include in the QUERY_TAG that is attached to every SQL statement executed
query-tag: 'QUERY_TAG'
# Information for Oauth token requests
# Information for Oauth token requests
oauthconfig:
# url Where token request are posted to
token-provider-url: 'https://login.microsoftonline.com/{{ env_var('AZURE_ORG_GUID', 'default') }}/oauth2/v2.0/token'
# name of Json entity returned by request
token-response-name: 'access_token'
# Headers needed for successful post or other security markings ( multiple labeled items permitted
token-request-headers:
token-request-headers:
Content-Type: "application/x-www-form-urlencoded"
User-Agent: "python/schemachange"
# Request Payload for Token (it is recommended pass
@@ -438,6 +438,31 @@ Parameter | Description
--query-tag | A string to include in the QUERY_TAG that is attached to every SQL statement executed.
--oauth-config | Define values for the variables to Make Oauth Token requests (e.g. {"token-provider-url": "https//...", "token-request-payload": {"client_id": "GUID_xyz",...},... })'

#### recalculate_checksum
This subcommand recalculates the checksums of repeatable migrations. It is useful after cloning a database, so that you don't need to rerun the repeatable migrations.

`usage: schemachange recalculate_checksum [-h] [--config-folder CONFIG_FOLDER] [-f ROOT_FOLDER] [-m MODULES_FOLDER] [-a SNOWFLAKE_ACCOUNT] [-u SNOWFLAKE_USER] [-r SNOWFLAKE_ROLE] [-w SNOWFLAKE_WAREHOUSE] [-d SNOWFLAKE_DATABASE] [-c CHANGE_HISTORY_TABLE] [--vars VARS] [--create-change-history-table] [-ac] [-v] [--dry-run] [--query-tag QUERY_TAG]`

Parameter | Description
--- | ---
-h, --help | Show the help message and exit
--config-folder CONFIG_FOLDER | The folder to look in for the schemachange-config.yml file (the default is the current working directory)
-f ROOT_FOLDER, --root-folder ROOT_FOLDER | The root folder for the database change scripts. The default is the current directory.
-m MODULES_FOLDER, --modules-folder MODULES_FOLDER | The modules folder for jinja macros and templates to be used across multiple scripts
-a SNOWFLAKE_ACCOUNT, --snowflake-account SNOWFLAKE_ACCOUNT | The name of the snowflake account (e.g. xy12345.east-us-2.azure).
-u SNOWFLAKE_USER, --snowflake-user SNOWFLAKE_USER | The name of the snowflake user
-r SNOWFLAKE_ROLE, --snowflake-role SNOWFLAKE_ROLE | The name of the role to use
-w SNOWFLAKE_WAREHOUSE, --snowflake-warehouse SNOWFLAKE_WAREHOUSE | The name of the default warehouse to use. Can be overridden in the change scripts.
-d SNOWFLAKE_DATABASE, --snowflake-database SNOWFLAKE_DATABASE | The name of the default database to use. Can be overridden in the change scripts.
-c CHANGE_HISTORY_TABLE, --change-history-table CHANGE_HISTORY_TABLE | Used to override the default name of the change history table (which is METADATA.SCHEMACHANGE.CHANGE_HISTORY)
--vars VARS | Define values for the variables to be replaced in change scripts, given in JSON format (e.g. '{"variable1": "value1", "variable2": "value2"}')
--create-change-history-table | Create the change history table if it does not exist. The default is 'False'.
-ac, --autocommit | Enable autocommit feature for DML commands. The default is 'False'.
-v, --verbose | Display verbose debugging details during execution. The default is 'False'.
--dry-run | Run schemachange in dry run mode. The default is 'False'.
--query-tag | A string to include in the QUERY_TAG that is attached to every SQL statement executed.
--oauth-config | Define values for the variables to Make Oauth Token requests (e.g. {"token-provider-url": "https//...", "token-request-payload": {"client_id": "GUID_xyz",...},... })'

#### render
This subcommand is used to render a single script to the console. It is intended to support the development and troubleshooting of script that use features from the jinja template engine.

102 changes: 74 additions & 28 deletions schemachange/cli.py
@@ -21,7 +21,7 @@

#region Global Variables
# metadata
_schemachange_version = '3.5.4'
_schemachange_version = '3.5.5'
_config_file_name = 'schemachange-config.yml'
_metadata_database_name = 'METADATA'
_metadata_schema_name = 'SCHEMACHANGE'
@@ -58,6 +58,7 @@
+ "execution"
_log_apply = "Applying change script {script_name}"
_log_undo = "Applying undo script {script_name}"
_log_recalculate = "Recalculate checksum for change script {script_name}"
_log_apply_set_complete = "Successfully applied {scripts_applied} change scripts (skipping " \
+ "{scripts_skipped}) \nCompleted successfully"
_log_undo_set_complete = "Successfully applied {scripts_applied} undo scripts"
@@ -469,8 +470,7 @@ def record_change_script(self, script, script_content, change_history_table, exe
query = self._q_ch_log.format(**frmt_args)
self.execute_snowflake_query(query)


def deploy_command(config):
def setup_session(config):
req_args = set(['snowflake_account','snowflake_user','snowflake_role','snowflake_warehouse'])
validate_auth_config(config, req_args)

@@ -480,11 +480,9 @@ def deploy_command(config):
print(_log_config_details.format(**config))

#connect to snowflake and maintain connection
session = SnowflakeSchemachangeSession(config)

scripts_skipped = 0
scripts_applied = 0
return SnowflakeSchemachangeSession(config)

def calculate_repeatable_migration_checksum(config, session):
# Deal with the change history table (create if specified)
change_history_table = get_change_history_table_details(config['change_history_table'])
change_history_metadata = session.fetch_change_history_metadata(change_history_table)
@@ -514,15 +512,20 @@ def deploy_command(config):
max_published_version_display = 'None'
print(_log_ch_max_version.format(max_published_version_display=max_published_version_display))

# Find all scripts in the root folder (recursively) and sort them correctly
all_scripts = get_all_scripts_recursively(config['root_folder'], config['verbose'])
all_script_names = list(all_scripts.keys())
# Sort scripts such that versioned scripts get applied first and then the repeatable ones.
all_script_names_sorted = sorted_alphanumeric([script for script in all_script_names if script[0] == 'V']) \
+ sorted_alphanumeric([script for script in all_script_names if script[0] == 'R']) \
+ sorted_alphanumeric([script for script in all_script_names if script[0] == 'A'])
return [change_history_table, r_scripts_checksum, max_published_version]

def apply_scripts(config, all_scripts, all_script_names_sorted, apply = True):
session = setup_session(config)

scripts_skipped = 0
scripts_applied = 0

[
change_history_table,
r_scripts_checksum,
max_published_version
] = calculate_repeatable_migration_checksum(config, session)

# Loop through each script in order and apply any required changes
for script_name in all_script_names_sorted:
script = all_scripts[script_name]

@@ -556,26 +559,36 @@ def deploy_command(config):
scripts_skipped += 1
continue

print(_log_apply.format(**script))
if apply:
print(_log_apply.format(**script))
else:
print(_log_recalculate.format(**script))

if not config['dry_run']:
execution_time = session.apply_change_script(script, content, change_history_table)
execution_time = 0
if apply:
execution_time = session.apply_change_script(script, content, change_history_table)
session.record_change_script(script, content, change_history_table, execution_time)
scripts_applied += 1

print(_log_apply_set_complete.format(scripts_applied=scripts_applied, scripts_skipped=scripts_skipped))
return [scripts_skipped, scripts_applied]
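
The effect of the `apply` flag in the hunk above can be sketched in isolation. This is a simplified illustration, not the project's code: `FakeSession` and `process_script` are hypothetical stand-ins, and the signatures are reduced to what the example needs.

```python
class FakeSession:
    """Hypothetical stand-in for the Snowflake session; it records calls instead of running SQL."""

    def __init__(self):
        self.applied = []
        self.recorded = []

    def apply_change_script(self, script_name, content):
        self.applied.append(script_name)
        return 7  # pretend execution time

    def record_change_script(self, script_name, content, execution_time):
        self.recorded.append((script_name, execution_time))


def process_script(session, script_name, content, apply=True, dry_run=False):
    # Mirrors the branch structure in the diff: nothing is touched in dry-run mode.
    if dry_run:
        return
    execution_time = 0
    if apply:
        # Deploy path: run the script and keep its execution time.
        execution_time = session.apply_change_script(script_name, content)
    # Both paths write a change history entry; recalculation records 0 as the time.
    session.record_change_script(script_name, content, execution_time)


deploy_session = FakeSession()
process_script(deploy_session, "R__views.sql", "SELECT 1;", apply=True)

recalc_session = FakeSession()
process_script(recalc_session, "R__views.sql", "SELECT 1;", apply=False)
```

In this sketch the recalculation path never calls `apply_change_script`, so only the history entry (with its new checksum in the real tool) is written.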

def undo_command(config):
req_args = set(['snowflake_account','snowflake_user','snowflake_role','snowflake_warehouse', 'step'])
validate_auth_config(config, req_args)
def deploy_command(config):
# Find all scripts in the root folder (recursively) and sort them correctly
all_scripts = get_all_scripts_recursively(config['root_folder'], config['verbose'])
all_script_names = list(all_scripts.keys())
# Sort scripts such that versioned scripts get applied first and then the repeatable ones.
all_script_names_sorted = sorted_alphanumeric([script for script in all_script_names if script[0] == 'V']) \
+ sorted_alphanumeric([script for script in all_script_names if script[0] == 'R']) \
+ sorted_alphanumeric([script for script in all_script_names if script[0] == 'A'])

# Log some additional details
if config['dry_run']:
print("Running in dry-run mode")
print(_log_config_details.format(**config))
# Loop through each script in order and apply any required changes
[scripts_skipped, scripts_applied] = apply_scripts(config, all_scripts, all_script_names_sorted, True)

#connect to snowflake and maintain connection
session = SnowflakeSchemachangeSession(config)
print(_log_apply_set_complete.format(scripts_applied=scripts_applied, scripts_skipped=scripts_skipped))

def undo_command(config):
session = setup_session(config)

# Deal with the change history table (raise if not provided)
change_history_table = get_change_history_table_details(config['change_history_table'])
@@ -613,6 +626,17 @@ def undo_command(config):

print(_log_undo_set_complete.format(scripts_applied=scripts_applied))

def recalculate_checksum_command(config):
# Find all scripts in the root folder (recursively) and sort them correctly
all_scripts = get_all_scripts_recursively(config['root_folder'], config['verbose'])
all_script_names = list(all_scripts.keys())
# Only repeatable scripts are considered when recalculating checksums.
all_script_names_sorted = sorted_alphanumeric([script for script in all_script_names if script[0] == 'R'])

# apply_scripts returns [scripts_skipped, scripts_applied], in that order.
[scripts_skipped, scripts_applied] = apply_scripts(config, all_scripts, all_script_names_sorted, False)

print(_log_apply_set_complete.format(scripts_applied=scripts_applied, scripts_skipped=scripts_skipped))
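
The difference in script selection between the two commands can be sketched as follows. The helper names (`natural_key`, `order_for_deploy`, `order_for_recalculate`) and the file names are hypothetical, and the natural-sort key is only an assumption about what `sorted_alphanumeric` does.

```python
import re


def natural_key(name):
    # Split into digit and non-digit runs so that "V1.10" sorts after "V1.2".
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def order_for_deploy(script_names):
    # Versioned scripts first, then repeatable, then always scripts.
    ordered = []
    for prefix in ("V", "R", "A"):
        ordered += sorted((n for n in script_names if n[0] == prefix), key=natural_key)
    return ordered


def order_for_recalculate(script_names):
    # Only repeatable scripts take part in checksum recalculation.
    return sorted((n for n in script_names if n[0] == "R"), key=natural_key)


names = ["R__views.sql", "V1.10__b.sql", "A__grants.sql", "V1.2__a.sql", "R__procs.sql"]
deploy_order = order_for_deploy(names)
recalculate_order = order_for_recalculate(names)
```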

def render_command(config, script_path):
"""
Renders the provided script.
@@ -894,6 +918,24 @@ def main(argv=sys.argv):
parser = argparse.ArgumentParser(prog = 'schemachange', description = 'Apply schema changes to a Snowflake account. Full readme at https://github.com/Snowflake-Labs/schemachange', formatter_class = argparse.RawTextHelpFormatter)
subcommands = parser.add_subparsers(dest='subcommand')

parser_recalculate = subcommands.add_parser("recalculate_checksum")
parser_recalculate.add_argument('--config-folder', type = str, default = '.', help = 'The folder to look in for the schemachange-config.yml file (the default is the current working directory)', required = False)
parser_recalculate.add_argument('-s', '--step', type = int, default = 1, help = 'Amount of versioned migrations to be undone in the reverse of their applied order', required = False)
parser_recalculate.add_argument('-f', '--root-folder', type = str, help = 'The root folder for the database change scripts', required = False)
parser_recalculate.add_argument('-m', '--modules-folder', type = str, help = 'The modules folder for jinja macros and templates to be used across multiple scripts', required = False)
parser_recalculate.add_argument('-a', '--snowflake-account', type = str, help = 'The name of the snowflake account (e.g. xy12345.east-us-2.azure)', required = False)
parser_recalculate.add_argument('-u', '--snowflake-user', type = str, help = 'The name of the snowflake user', required = False)
parser_recalculate.add_argument('-r', '--snowflake-role', type = str, help = 'The name of the default role to use', required = False)
parser_recalculate.add_argument('-w', '--snowflake-warehouse', type = str, help = 'The name of the default warehouse to use. Can be overridden in the change scripts.', required = False)
parser_recalculate.add_argument('-d', '--snowflake-database', type = str, help = 'The name of the default database to use. Can be overridden in the change scripts.', required = False)
parser_recalculate.add_argument('-c', '--change-history-table', type = str, help = 'Used to override the default name of the change history table (the default is METADATA.SCHEMACHANGE.CHANGE_HISTORY)', required = False)
parser_recalculate.add_argument('--vars', type = json.loads, help = 'Define values for the variables to replaced in change scripts, given in JSON format (e.g. {"variable1": "value1", "variable2": "value2"})', required = False)
parser_recalculate.add_argument('-ac', '--autocommit', action='store_true', help = 'Enable autocommit feature for DML commands (the default is False)', required = False)
parser_recalculate.add_argument('-v','--verbose', action='store_true', help = 'Display verbose debugging details during execution (the default is False)', required = False)
parser_recalculate.add_argument('--dry-run', action='store_true', help = 'Run schemachange in dry run mode (the default is False)', required = False)
parser_recalculate.add_argument('--query-tag', type = str, help = 'The string to add to the Snowflake QUERY_TAG session value for each query executed', required = False)
parser_recalculate.add_argument('--oauth-config', type = json.loads, help = 'Define values for the variables to Make Oauth Token requests (e.g. {"token-provider-url": "https//...", "token-request-payload": {"client_id": "GUID_xyz",...},... })', required = False)

parser_undo = subcommands.add_parser("undo")
parser_undo.add_argument('--config-folder', type = str, default = '.', help = 'The folder to look in for the schemachange-config.yml file (the default is the current working directory)', required = False)
parser_undo.add_argument('-s', '--step', type = int, default = 1, help = 'Amount of versioned migrations to be undone in the reverse of their applied order', required = False)
@@ -942,7 +984,7 @@ def main(argv=sys.argv):
# The original parameters did not support subcommands. Check if a subcommand has been supplied
# if not default to deploy to match original behaviour.
args = argv[1:]
if len(args) == 0 or not any(subcommand in args[0].upper() for subcommand in ["DEPLOY", "RENDER", "UNDO"]):
if len(args) == 0 or not any(subcommand in args[0].upper() for subcommand in ["DEPLOY", "RENDER", "UNDO", "RECALCULATE_CHECKSUM"]):
args = ["deploy"] + args

args = parser.parse_args(args)
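
The subcommand defaulting in the hunk above can be sketched as a standalone function. `with_default_subcommand` and `KNOWN_SUBCOMMANDS` are hypothetical names introduced for illustration; the condition itself is copied from the changed line.

```python
KNOWN_SUBCOMMANDS = ["DEPLOY", "RENDER", "UNDO", "RECALCULATE_CHECKSUM"]


def with_default_subcommand(args):
    # If the first argument does not contain a known subcommand name,
    # fall back to "deploy" to match the original, subcommand-free behaviour.
    if len(args) == 0 or not any(sub in args[0].upper() for sub in KNOWN_SUBCOMMANDS):
        return ["deploy"] + list(args)
    return list(args)


no_args = with_default_subcommand([])
legacy = with_default_subcommand(["-f", "migrations"])
explicit = with_default_subcommand(["recalculate_checksum", "-d", "MY_CLONE"])
```

Without the new entry in the list, `recalculate_checksum` would not be recognised as a subcommand and `deploy` would be prepended in front of it. The check is a substring match, so any first argument that contains one of the names is treated as a subcommand.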
@@ -964,6 +1006,8 @@ def main(argv=sys.argv):
"create_change_history_table":None,"autocommit":None,"dry_run":None,"query_tag":None,"oauth_config":None,"step":None }
elif args.subcommand == 'undo':
renderoveride = {"create_change_history_table":None}
elif args.subcommand == 'recalculate_checksum':
renderoveride = {"create_change_history_table":None}
elif args.subcommand == 'deploy':
renderoveride = {"step":None}

@@ -997,6 +1041,8 @@ def main(argv=sys.argv):
render_command(config, args.script)
elif args.subcommand == 'undo':
undo_command(config)
elif args.subcommand == 'recalculate_checksum':
recalculate_checksum_command(config)
else:
deploy_command(config)

2 changes: 1 addition & 1 deletion setup.cfg
@@ -1,6 +1,6 @@
[metadata]
name = schemachange
version = 3.5.4
version = 3.5.5
author = jamesweakley/jeremiahhansen
description = A Database Change Management tool for Snowflake
long_description = file: README.md