Add LPDAAC S3 credential rotation dynamic tiler lambda (for HLS) #25
Comments
I deployed a separate edl-credential-rotation stack for delta-backend-dev. In the raster-api handler in delta-backend-dev a small change is needed to use the EDL AWS session credentials from the lambda environment. With these changes the raster-api is able to pick up and use the credentials; however, the API is currently deployed in an isolated subnet, which is preventing us from accessing the external S3 files. Unfortunately the CDK VPC configuration is blocking the deployment of private-with-nat lambdas due to poor CIDR block planning. This change management plan includes steps to resolve the VPC issue. feature/edl-4-rasterapi contains the lambda changes as well as some minor changes to GDAL environment variables.
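For context, a minimal sketch of the kind of handler change involved, assuming the rotation stack writes the temporary EDL credentials into the raster-api lambda's environment under non-reserved names (the EDL_* variable names and the helper below are illustrative, not the actual feature/edl-4-rasterapi code):

```python
import os

import boto3
from rasterio.session import AWSSession


def get_edl_session() -> AWSSession:
    """Build a rasterio AWS session from the rotated EDL credentials.

    Assumes the edl-credential-rotation stack injects temporary LPDAAC
    credentials under these (hypothetical) non-reserved variable names.
    """
    return AWSSession(
        boto3.Session(
            aws_access_key_id=os.environ["EDL_ACCESS_KEY_ID"],
            aws_secret_access_key=os.environ["EDL_SECRET_ACCESS_KEY"],
            aws_session_token=os.environ["EDL_SESSION_TOKEN"],
            region_name="us-west-2",
        )
    )
```

Dataset reads would then be wrapped in `rasterio.Env(get_edl_session())` so GDAL uses the temporary credentials instead of the lambda execution role.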
@anayeaye @vincentsarago is helping investigate some GDAL optimizations for our use cases that will most likely affect https://github.com/NASA-IMPACT/delta-backend/compare/feature/edl-4-rasterapi#diff-08a35aa423ced1c2c9aeb17d6a439c22744578a3b1cbdfee77f2f26be39554c1. Is this a good location to ping you with updates as we learn more?
@sharkinsspatial thanks--this is a great place for updates! |
@anayeaye change management plan looks great, we should add it to a VEDA project folder so we can re-use it or reference it in the future. Thanks for writing it up. A few questions below, but I think we want to send this to the front end developers (Daniel, Ricardo, Hanbyul), data publishers (Iksha, Slesa), and the ESA development team which has been using the Staging API ASAP so they are aware staging may go down for 1-2 days next week - do you agree? Questions about the change management plan:
Can we make it clear here that the plan is to deploy a new stack and, once we have verified it's operational, to update the domain name servers to point to the new stack endpoints?
Can we make it clear that we are upgrading pgstac, which is a schema for the postgresql database in RDS (as opposed to the version of postgresql itself), from version XX to XX, and add a link to https://github.com/stac-utils/pgstac? Also add that we will be creating a snapshot of the existing database and using it to restore the existing datasets to the new database and schema?
I think adequate here just means that there is no risk of there having been changes to the database between the date of the most recent snapshot and when we use it to populate the new database, is that your definition as well?
What types of tests will you run?
@abarciauskas-bgse Thank you for your change management review comments! I have updated the document and agree that we need to share it with the wider VEDA team ASAP. As for staging going down, I think this plan ensures that staging will not be down for more than an hour or two, but we will have a window when new data ingests would be lost--the dev stack work should give us a good estimate of how long that will be. I am not sure that the RDS restore plan is even viable (I hope it is!). I think that I can test it tomorrow and then tighten up the dates for sharing.
@manilmaskey brought up a valid question in today's IMPACT meeting that made me consider the fact that we should have a broader strategy for cross account bucket access with the DAACs. I adopted the temporary S3 credential rotation strategy for the HLS tiler because our delivery timelines for integration with the FIRMS application were extremely short, and this didn't leave adequate chance to coordinate with LPDAAC on a large administrative change. @tracetechnical and I chatted a bit about this today and, given the frequent maintenance windows and periodic instability of EDL, it would be a good idea to have someone from IMPACT engage directly with the relevant DAACs and check if cross account policies with read access can be enabled for all roles in our accounts. There are several approaches for tackling this but it would be good to first determine if this is feasible from a policy perspective. cc @abarciauskas-bgse @anayeaye
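For reference, the kind of cross-account read access being discussed would amount to the DAAC attaching a bucket policy roughly like the sketch below; the account ID and bucket name are placeholders, and the DAAC, not us, would apply it:

```python
# Hypothetical cross-account read policy a DAAC would attach to its bucket so
# that all roles in the VEDA account can read objects directly. The account ID
# and bucket name below are placeholders, not real values.
CROSS_ACCOUNT_READ_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowVedaAccountRead",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111111111111:root"},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-daac-protected-bucket",
                "arn:aws:s3:::example-daac-protected-bucket/*",
            ],
        }
    ],
}
```

The bucket owner would apply this with `put_bucket_policy`, serializing the dict with `json.dumps`.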
Still pushing this EDL service forward as a temporary solution until cross account policies are established. PR #50 handles the VPC CIDR range limitations that were preventing us from adding the private-with-nat subnets needed to render HLS data on the map. Currently there is not an edl-login-service deployed for the delta-backend (I took it down while navigating the VPC changes). The feature branch for the delta-backend raster-api changes needed for EDL is still open but will need a catch-up when we come back to this issue.
This work is on hold; we should consider an alternate tiler for HLS data. PR #56 documents how credential rotation was added to a test delta backend stack and why it cannot be used as-is (tl;dr: we can only tile HLS or our own hosted COGs in a single tiler, not both).
Noting a possible solution to the issue raised in PR #56 from @abarciauskas-bgse @vincentsarago @sharkinsspatial: add an additional tiler to the delta-backend deployment that will receive EDL tokens, and use the dataset configuration or collection metadata to choose which tiler is used.
@abarciauskas-bgse Here are some notes about what I think the delta-backend can do to support HLS for the trilateral release. I think that the second scenario is what you are proposing and I can get started on it if I have the right idea...

Short term trilateral release commitment
Two possible short term solutions exist in which we provide a us-west delta-backend stack deployed with a snapshot of the staging database (and redeploy as needed to add the latest staging-stack ingests). In both, the CloudFormation stack will have a new name (like delta-backend-west) with the possibility of moving custom domain API users over to this new us-west backend in the future (i.e. cut over traffic from https://staging-stac.delta-backend.xyz to this new backend).

Scenario 1 (single delta backend in us-west only supporting LPDAAC-CLD)
Scenario 2 (multiple tilers, one delta backend deployed in us-west)
Deploy the latest delta backend with 3 provider-dedicated tilers
Work required
To summarize my conversation with @anayeaye yesterday, I believe we want to deliver a parameterized endpoint so that clients can still use the same API endpoint for visualization but pass a parameter identifying the data provider. The reasoning behind this is that, while many datasets will live in the "VEDA data store bucket", other datasets in our API will be maintained by other "data providers" - most likely DAACs. While we will probably need some things to be true for all VEDA data providers (in that we have some way of accessing the data from our systems), I think it's the case that we will have different backend implementations to make requests of these providers, such as different S3 credentials. In order to make this work we need to:
What do you think about this approach, @anayeaye @vincentsarago @sharkinsspatial?
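A rough sketch of what the parameterized endpoint could look like on the raster-api side, assuming a `provider` query parameter and per-provider credentials exposed under prefixed environment variables (all names below are illustrative, not existing delta-backend code):

```python
import os
from typing import Optional

import boto3
from fastapi import Query
from rasterio.session import AWSSession

# Hypothetical mapping from provider identifier to the environment-variable
# prefix written by that provider's credential-rotation lambda. None means
# "use the lambda execution role" (our own VEDA buckets).
PROVIDER_ENV_PREFIX = {
    "veda": None,
    "lpdaac": "LPDAAC",
}


def session_for_provider(
    provider: str = Query(default="veda", description="Data provider for the requested asset"),
) -> Optional[AWSSession]:
    """FastAPI dependency: choose AWS credentials based on the provider parameter."""
    prefix = PROVIDER_ENV_PREFIX[provider]
    if prefix is None:
        return None  # fall back to the default lambda role credentials
    return AWSSession(
        boto3.Session(
            aws_access_key_id=os.environ[f"{prefix}_ACCESS_KEY_ID"],
            aws_secret_access_key=os.environ[f"{prefix}_SECRET_ACCESS_KEY"],
            aws_session_token=os.environ[f"{prefix}_SESSION_TOKEN"],
        )
    )
```

A tile route could then depend on `session_for_provider` and open the asset inside `rasterio.Env(session)` whenever a session is returned.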
@abarciauskas-bgse the problem with this approach is that we assume that we will get
When you say we will get
@abarciauskas-bgse oh, so every 30min or so we get credentials for multiple
There are multiple lambdas, one for each provider, each gets new credentials every 30 minutes |
@abarciauskas-bgse We have a few options here. Due to restrictions on Lambda reserved environment variable keys (https://docs.aws.amazon.com/lambda/latest/dg/configuration-envvars.html) the credential environment variables
Additionally, all of these environment settings can be injected more explicitly in the
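To illustrate the restriction: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN are reserved Lambda runtime keys, so the rotated credentials have to be stored under different names and either remapped or injected into a session inside the handler. A minimal sketch of the remapping approach, with illustrative EDL_* names; note that overriding the process-level variables like this is exactly why a single tiler can then only reach the EDL buckets rather than our own:

```python
import os

# AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_SESSION_TOKEN are reserved by
# the Lambda runtime and cannot be set on the function configuration, so the
# rotation lambda stores credentials under other names (EDL_* is illustrative).
RENAMED_CREDENTIALS = {
    "EDL_ACCESS_KEY_ID": "AWS_ACCESS_KEY_ID",
    "EDL_SECRET_ACCESS_KEY": "AWS_SECRET_ACCESS_KEY",
    "EDL_SESSION_TOKEN": "AWS_SESSION_TOKEN",
}


def remap_edl_credentials() -> None:
    """Copy rotated credentials into the names GDAL reads for S3 access.

    Overriding these process-wide means every S3 read in this tiler uses the
    EDL credentials, which is the single-provider limitation noted in PR #56.
    """
    for source, target in RENAMED_CREDENTIALS.items():
        value = os.environ.get(source)
        if value:
            os.environ[target] = value
```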
Also linking to Patrick's document here for reference, which outlines potential longer-term strategies around this issue: https://docs.google.com/document/d/18GyoMZj0I2HKAXwqyeziO0ISbOwHxo1TN4eAlR4mH3U/view
@sharkinsspatial FYI we don't use
This is done at the app creation level but could in theory also be done at a request level.
The dynamic tiler may be requesting data from Earthdata Cloud buckets, such as the HLS data provided by LP.DAAC. The tiler needs to have some sort of credentials to request those files. This could be done by storing URS credentials in a .netrc file, but @sharkinsspatial has created EDL credential rotation for direct S3 access, which should be faster than authenticating through URS for each request: https://github.com/NASA-IMPACT/edl-credential-rotation. We should probably re-use this approach in our backend API.
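As a rough illustration of the rotation idea (not the actual edl-credential-rotation code; the credentials endpoint, token handling, and variable names are assumptions), a scheduled lambda could periodically fetch temporary S3 credentials with an EDL token and push them into the tiler lambda's environment:

```python
import os

import boto3
import requests

# Assumed LPDAAC temporary-credentials endpoint and an EDL bearer token held in
# EDL_TOKEN; both are assumptions for this sketch, as is the tiler function name.
S3_CREDENTIALS_ENDPOINT = "https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials"
TILER_FUNCTION_NAME = os.environ.get("TILER_FUNCTION_NAME", "delta-backend-raster-api")


def handler(event, context):
    """Fetch fresh temporary S3 credentials and inject them into the tiler lambda."""
    resp = requests.get(
        S3_CREDENTIALS_ENDPOINT,
        headers={"Authorization": f"Bearer {os.environ['EDL_TOKEN']}"},
        timeout=30,
    )
    resp.raise_for_status()
    creds = resp.json()

    client = boto3.client("lambda")
    config = client.get_function_configuration(FunctionName=TILER_FUNCTION_NAME)
    variables = config.get("Environment", {}).get("Variables", {})
    # Non-reserved names; the tiler builds its AWS session from these.
    variables.update(
        {
            "EDL_ACCESS_KEY_ID": creds["accessKeyId"],
            "EDL_SECRET_ACCESS_KEY": creds["secretAccessKey"],
            "EDL_SESSION_TOKEN": creds["sessionToken"],
        }
    )
    client.update_function_configuration(
        FunctionName=TILER_FUNCTION_NAME,
        Environment={"Variables": variables},
    )
```

Scheduling this roughly every 30 minutes (e.g. with an EventBridge rule) would match the rotation interval mentioned earlier in the thread.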