PES-3645 Soda <> Redshift IAM Role profiling #13
Merged
Change Summary
Adds support for Soda profiling with a Redshift IAM role.
When an IAM role is configured, we now send a few more attributes to soda-core.
Why? See the RCA below.
Packages PR: https://github.com/atlanhq/marketplace-packages/pull/12019
Initial Problem
We were getting an error from the boto library stating that partial credentials were provided.
To fix this, we updated the logic in the sodaConnectionTemplate so that, with an IAM role, only the ARN is sent to the boto library, allowing it to assume the role correctly.
Problems while assuming the role
Reason: we need to send the ExternalId along with the ARN to boto so the role can be assumed (if the user has configured an External_Id).
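A minimal sketch of the role assumption described above, assuming boto3's STS client is used directly. The helper names and the session name `soda-redshift-profiling` are hypothetical and not taken from this PR; only `RoleArn`, `RoleSessionName`, and `ExternalId` are real `assume_role` parameters.

```python
from typing import Optional


def build_assume_role_kwargs(
    role_arn: str,
    external_id: Optional[str] = None,
    session_name: str = "soda-redshift-profiling",  # hypothetical session name
) -> dict:
    """Build the keyword arguments for STS AssumeRole."""
    if not role_arn:
        raise ValueError("role_arn is required for IAM role authentication")
    kwargs = {"RoleArn": role_arn, "RoleSessionName": session_name}
    # ExternalId is only sent when the user configured one on the role's trust policy.
    if external_id:
        kwargs["ExternalId"] = external_id
    return kwargs


def assume_role(role_arn: str, external_id: Optional[str] = None) -> dict:
    """Assume the role and return the temporary credentials dict."""
    import boto3  # imported lazily so the builder above has no dependencies

    sts = boto3.client("sts")
    response = sts.assume_role(**build_assume_role_kwargs(role_arn, external_id))
    return response["Credentials"]
```

Omitting `ExternalId` when it is empty, rather than sending an empty string, keeps the request valid for roles whose trust policy does not require it.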
Problems during the get_cluster_credentials call
Reason: get_cluster_credentials expects a dbuser as well as a dbname in the request, and we were passing null.
Because we were not passing these initially, we had to update the sodaConnectionTemplate to use these attributes (we already ask for these details in the Redshift crawler setup).
This still failed, this time with a policy issue.
Reason: if we look closely, we're passing the region as eu-west-1 (which is the default region), and we're seeing the cluster_name as mercury (which we get after resolving the host name).
To fix this, pass the cluster ID explicitly and ask for the cluster region in the Redshift config.
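The requirements for this call can be sketched as follows, assuming boto3's Redshift client is called directly with the temporary credentials from the assumed role. The helper names are hypothetical; `DbUser`, `DbName`, and `ClusterIdentifier` are real `get_cluster_credentials` parameters. The cluster ID and region are taken from explicit config rather than derived from the host name or a default region.

```python
def build_cluster_credentials_kwargs(db_user: str, db_name: str, cluster_id: str) -> dict:
    """Validate inputs and build kwargs for Redshift GetClusterCredentials."""
    required = (("db_user", db_user), ("db_name", db_name), ("cluster_id", cluster_id))
    missing = [name for name, value in required if not value]
    if missing:
        # Fail early instead of sending null values to the API.
        raise ValueError("missing required values: " + ", ".join(missing))
    return {"DbUser": db_user, "DbName": db_name, "ClusterIdentifier": cluster_id}


def get_cluster_credentials(db_user, db_name, cluster_id, region, assumed_credentials):
    """Fetch temporary database credentials for the given cluster."""
    import boto3  # imported lazily so the builder above has no dependencies

    if not region:
        raise ValueError("region must be set explicitly to the cluster's region")
    client = boto3.client(
        "redshift",
        region_name=region,  # the cluster's own region, not a library default
        aws_access_key_id=assumed_credentials["AccessKeyId"],
        aws_secret_access_key=assumed_credentials["SecretAccessKey"],
        aws_session_token=assumed_credentials["SessionToken"],
    )
    response = client.get_cluster_credentials(
        **build_cluster_credentials_kwargs(db_user, db_name, cluster_id)
    )
    return response["DbUser"], response["DbPassword"]
```

If the region or cluster ID does not match the cluster that the role's policy grants access to, the call is rejected as a policy issue, which matches the failure described above.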
Jira Issues Resolved
https://atlanhq.atlassian.net/browse/PES-3645