Commit
Add missing data store link and create data ingestion page (to be fleshed out)
Showing 2 changed files with 36 additions and 3 deletions.
@@ -0,0 +1,27 @@
---
title: VEDA Data Ingestion Services
subtitle: Ingestion services for VEDA
---

Welcome to the VEDA data ingestion services documentation. This document provides an overview of our data ingestion pipelines, Airflow instance, and ingest APIs. Whether you're deploying new services or interacting with existing ones, this guide will help you get started. For detailed development and contribution guidelines, please refer to our [Contributing Guide](./contributing/index.qmd).

## Overview

VEDA's data ingestion services are designed to handle and manage the flow of data from various sources efficiently. Our system uses Apache Airflow to orchestrate and schedule data pipelines, and provides APIs to facilitate and manage data ingestion tasks.

### Components

1. **Apache Airflow Instance**
   - Our Airflow instance orchestrates various data ingestion tasks and workflows. It supports scheduling, monitoring, and managing pipelines, which are described using Directed Acyclic Graphs (DAGs).
   - For details on how to deploy, configure, or modify our Airflow instance, refer to the [veda-data-airflow](https://github.com/NASA-IMPACT/veda-data-airflow) repository.

2. **Ingest APIs**
   - Our ingest API facilitates data ingestion from multiple sources and manages the data flow into VEDA's system. This API is included in [veda-backend](https://github.com/NASA-IMPACT/veda-backend).
   - To learn how to interact with these APIs as a user, consult the [Dataset Ingestion Guide](./contributing/dataset-ingestion/index.qmd).

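To illustrate what it means for a pipeline to be described as a Directed Acyclic Graph, here is a minimal sketch that uses only Python's standard-library `graphlib` module. It is not Airflow code and is not taken from veda-data-airflow; the step names and the `execution_order` helper are hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical ingestion steps (illustrative names, not from veda-data-airflow).
# Each key maps to the set of steps that must finish before it can run.
pipeline = {
    "discover_files": set(),
    "transfer_files": {"discover_files"},
    "build_metadata": {"transfer_files"},
    "validate_files": {"transfer_files"},
    "publish_dataset": {"build_metadata", "validate_files"},
}


def execution_order(dag):
    """Return one valid run order in which every step follows its dependencies."""
    return list(TopologicalSorter(dag).static_order())


order = execution_order(pipeline)
print(order)
```

Because the graph has no cycles, a valid run order always exists. `build_metadata` and `validate_files` do not depend on each other, so a scheduler could run them in either order or in parallel. If the dependencies formed a cycle, `graphlib` would raise `CycleError` instead of returning an order.
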
## Deployment and Modification

If you need to deploy or modify any of the ingestion services, please visit the [veda-data-airflow](https://github.com/NASA-IMPACT/veda-data-airflow) repository. This repo contains:

- Instructions for deploying the Airflow instance
- Examples of how to write new ingestion processes (DAGs)
- Guidance on extending and modifying existing ingestion services