Support regional fetchers #1065

Open
caparker opened this issue Oct 19, 2023 · 2 comments

@caparker
Collaborator

To support regional fetchers we would need to make a few minor updates:

  1. Update the CDK to deploy the lambda to a different region and create a queue in that region.
  2. The scheduler is currently set up to get the QUEUE_NAME from the env variables. We would keep this as a backup but then allow the deployments to provide a preferred QUEUE_NAME.
  3. The deployments are deployed along with the rest of the stack, and since we would need information (QUEUE_NAME) from the deployment, we would need to either move the deployment config to the manager API (the long-term solution) or do that part independently of the CDK deployment.
  4. Transferring data from one region to another would also require some updates. This would just require knowing the region that our bucket is in and the region that we are currently running the lambda in. It doesn't seem like it would be too much work.
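
As a rough illustration of items 2 and 4, here is a minimal Python sketch, not the project's actual code. The `deployment` dict and its `queue_name` key are hypothetical names made up for this example; `QUEUE_NAME` is the env variable mentioned above, and `AWS_REGION` is the variable the Lambda runtime sets for the region the function runs in.

```python
import os


def resolve_queue_name(deployment, env=os.environ):
    """Prefer the queue named by the deployment; fall back to the env variable."""
    # `queue_name` is a hypothetical key on the deployment config
    preferred = deployment.get("queue_name")
    if preferred:
        return preferred
    # Backup: the QUEUE_NAME env variable the scheduler uses today
    fallback = env.get("QUEUE_NAME")
    if not fallback:
        raise ValueError("no queue name in deployment or QUEUE_NAME env variable")
    return fallback


def is_cross_region(bucket_region, env=os.environ):
    """True when the lambda runs in a different region than the bucket."""
    # AWS_REGION is set by the Lambda runtime
    lambda_region = env.get("AWS_REGION")
    return lambda_region is not None and lambda_region != bucket_region
```

Passing `env` as a parameter keeps both helpers easy to test without touching the real environment.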

Some related issues and info:
https://stackoverflow.com/questions/73780913/how-to-deploy-the-same-stack-across-multiple-regions-using-aws-cdk
https://docs.aws.amazon.com/sns/latest/dg/sns-cross-region-delivery.html
https://stackoverflow.com/questions/49707489/how-to-upload-the-file-under-different-region-of-aws-s3-bucket-using-python

@russbiggs
Member

In the proposal, do you envision that each regional deployment would place files in a regional bucket or always place them in a single bucket? I imagine there could be some significant cross-region costs for putting objects into another region's bucket, e.g. eu-west-1 -> us-east-1. It'll be key to figure out the most cost-effective way to get everything into the same region; it'll just be a matter of figuring out when that occurs.

@caparker
Collaborator Author

We would likely want to look deeper into the cost, but based on my quick look it comes down to this:

A typical file from Japan is about 350 KB (which could be reduced, but more on that later), which at the $0.02/GB transfer rate would cost about 0.0007 cents per file.

Creating each file takes from 90 to 260 sec, typically 150 sec, and at that typical duration that's about $0.0025 per file, or about 350 times the transfer cost.

So if we were to create the same size file in Tokyo, do it in 15 sec instead of 150 sec, and then transfer it to us-east-1, we would be spending about $0.0002512 vs $0.0025, or about 10x less.
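
The arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope check under stated assumptions: decimal units (1 GB = 10^9 bytes), the $0.02 rate read as per GB, and compute cost scaling linearly with run time. The function names are made up for the example.

```python
GB = 1_000_000_000  # bytes; decimal gigabyte (assumption)


def transfer_cost(file_bytes, rate_per_gb=0.02):
    """Cross-region transfer cost in dollars for one file."""
    return file_bytes / GB * rate_per_gb


def compute_cost(seconds, cost_at_typical=0.0025, typical_seconds=150):
    """Lambda cost in dollars, assuming cost scales linearly with duration."""
    return cost_at_typical * seconds / typical_seconds


file_bytes = 350_000                                      # ~350 KB file from Japan
current = compute_cost(150)                               # fetch from us-east-1
regional = compute_cost(15) + transfer_cost(file_bytes)   # fetch in Tokyo, then transfer

print(f"transfer per file: ${transfer_cost(file_bytes):.7f}")
print(f"current per file:  ${current:.7f}")
print(f"regional per file: ${regional:.7f}")
print(f"ratio:             {current / regional:.1f}x")
```

Under these assumptions the regional path comes out near $0.00026 per file, roughly a tenth of the current cost, which is in line with the estimate above.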

Scale is important here, though: improving this for just one fetcher would save us about $0.055/day, and therefore it would take a while to recoup our costs. But if we could trim seconds off of all the lambdas, I could see this being a big deal. The same goes if we were tying up someone's connection because transfer rates were so slow.

And finally, we could also reduce the file size for Japan. Right now only about 10% of a given file is new data, so we could reduce costs if we optimized the file size a bit more.
