Integrate cdp changes to ingest from datastore txmeta files #333

Merged
chowbao merged 28 commits into master from cdp-integration
Apr 22, 2024

@chowbao chowbao commented Mar 24, 2024

Integrate cdp changes to ingest from datastore txmeta files

  • Add cdp_* Airflow parameters so the new DAGs can run in parallel with captive core mode. The cdp_* DAGs should write to the cdp_* dataset.
  • Add cdp_* versions of the captive core export DAGs.
  • Add new parameters that select between captive core mode and txmeta-reading mode, both in the DAGs and in build_export_task.
  • The build time task did not need to be adjusted. The current stellar-etl implementation pulls from the history archives, which should be fine no matter which mode is being run.
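As a rough illustration of the mode switch described in the list above, the sketch below picks an ingestion mode and a target dataset from a flat dictionary of Airflow-style variables. The function name, the `use_captive_core` flag, the variable keys (`dataset`, `cdp_dataset`), and the dataset values are all hypothetical and are not taken from this PR; the real build_export_task may be structured quite differently.

```python
from typing import Dict


def resolve_export_settings(variables: Dict[str, str], use_captive_core: bool) -> Dict[str, str]:
    """Pick the ingestion mode and target dataset for an export task.

    Every key name here is a hypothetical placeholder, not one of the PR's real variables.
    """
    # The cdp_* variables sit alongside the existing ones, so both sets of
    # DAGs can run in parallel without writing to the same dataset.
    prefix = "" if use_captive_core else "cdp_"
    return {
        "mode": "captive_core" if use_captive_core else "txmeta",
        "dataset": variables[f"{prefix}dataset"],
    }


# Made-up values, for illustration only.
variables = {"dataset": "example_dataset", "cdp_dataset": "cdp_example_dataset"}
captive = resolve_export_settings(variables, use_captive_core=True)
cdp = resolve_export_settings(variables, use_captive_core=False)
```

Keeping the switch in one small function would let the captive core and cdp_* DAGs share the same task-building code and differ only in the flag they pass.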

@chowbao chowbao requested a review from a team as a code owner March 24, 2024 19:08

chowbao commented Mar 24, 2024

Oops, this was supposed to be a draft PR 🤷‍♀️

@sydneynotthecity sydneynotthecity left a comment

What are your thoughts on how we're going to deploy cdp and replace the old DAGs? Curious how you're thinking about that.

dags/cdp_state_table_dag.py (outdated, resolved)
dags/stellar_etl_airflow/build_export_task.py (resolved)

@sydneynotthecity sydneynotthecity left a comment

LGTM. Nothing blocking the merge

airflow_variables_prod.json (outdated, resolved)
  catchup=True,
  description="This DAG exports trades and operations from the history archive using CaptiveCore. This supports parsing sponsorship and AMMs.",
- schedule_interval="*/30 * * * *",
+ schedule_interval="*/10 * * * *",

Any data/benchmarking yet on how well this DAG does on a 10 min interval?

In prod, we'll need to release in a couple steps to make sure we don't mess up the schedule interval shift:

  • update old DAGs to include end date
  • delete old DAGs
  • add new DAG with new start date (that won't miss any gaps)
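The gap concern behind these steps can be checked with a small sketch: if the old 30-minute DAG stops at a cutover time and the new 10-minute DAG starts at that same time, the covered time windows should abut with no gap or overlap. The helper names and the dates below are hypothetical, and the sketch models plain time windows only. It does not reproduce Airflow's own start_date/end_date scheduling semantics, which should be verified separately before a production release.

```python
from datetime import datetime, timedelta
from typing import Iterator, List, Tuple

Window = Tuple[datetime, datetime]


def intervals(start: datetime, end: datetime, step_minutes: int) -> Iterator[Window]:
    """Yield consecutive (window_start, window_end) pairs covering [start, end)."""
    step = timedelta(minutes=step_minutes)
    cur = start
    while cur + step <= end:
        yield cur, cur + step
        cur += step


def is_contiguous(windows: List[Window]) -> bool:
    """True if the sorted windows abut exactly, with no gap and no overlap."""
    ordered = sorted(windows)
    return all(prev[1] == nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


# Hypothetical cutover: the old DAG's end and the new DAG's start share one timestamp.
cutover = datetime(2024, 4, 22, 12, 0)
old_runs = list(intervals(datetime(2024, 4, 22, 10, 0), cutover, 30))
new_runs = list(intervals(cutover, datetime(2024, 4, 22, 13, 0), 10))
aligned = is_contiguous(old_runs + new_runs)

# Starting the new DAG ten minutes late leaves an uncovered window.
late_runs = list(intervals(cutover + timedelta(minutes=10), datetime(2024, 4, 22, 13, 0), 10))
misaligned = is_contiguous(old_runs + late_runs)
```

Because every `*/30` boundary is also a `*/10` boundary, choosing the cutover on a half-hour mark keeps both schedules aligned.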

@chowbao chowbao Apr 22, 2024

In test, it takes about 10 minutes to run (mostly because of the slow startup time issue), so we won't ever fall behind the 10-minute schedule.

dags/history_tables_dag.py (outdated, resolved)
@chowbao chowbao merged commit 8a70ec4 into master Apr 22, 2024
4 checks passed
@amishas157 amishas157 deleted the cdp-integration branch July 26, 2024 18:47
3 participants