diff --git a/_quarto.yml b/_quarto.yml
index 128051ba..bfbb9b26 100644
--- a/_quarto.yml
+++ b/_quarto.yml
@@ -35,9 +35,15 @@ website:
         text: Welcome
       - section: services/index.qmd
         contents:
-          - services/dashboard.qmd
-          - services/apis.qmd
-          - services/jupyterhub.qmd
+          - section: Core Services
+            contents:
+              - services/dashboard.qmd
+              - services/apis.qmd
+              - services/jupyterhub.qmd
+          - section: Additional Services
+            contents:
+              - services/data-store.qmd
+              - services/data-ingestion.qmd
       - section: notebooks/index.qmd
         contents:
           - section: Quickstarts
diff --git a/services/data-ingestion.qmd b/services/data-ingestion.qmd
new file mode 100644
index 00000000..95a44f04
--- /dev/null
+++ b/services/data-ingestion.qmd
@@ -0,0 +1,27 @@
+---
+title: VEDA Data Ingestion Services
+subtitle: Ingestion services for VEDA
+---
+
+Welcome to the VEDA data ingestion services documentation. This document provides an overview of our data ingestion pipelines, Airflow instance, and ingest APIs. Whether you're deploying new services or interacting with existing ones, this guide will help you get started. For detailed development and contribution guidelines, please refer to our [Contributing Guide](./contributing/index.qmd).
+
+## Overview
+
+VEDA's data ingestion services manage the flow of data from various sources into VEDA. The system uses Apache Airflow to orchestrate and schedule data pipelines, and ingest APIs to submit and manage data ingestion tasks.
+
+### Components
+
+1. **Apache Airflow Instance**
+   - Our Airflow instance orchestrates data ingestion tasks and workflows. It supports scheduling, monitoring, and managing pipelines, which are described as Directed Acyclic Graphs (DAGs).
+   - For details on how to deploy, configure, or modify our Airflow instance, refer to the [veda-data-airflow](https://github.com/NASA-IMPACT/veda-data-airflow) repository.
+
+2. **Ingest APIs**
+   - Our ingest APIs facilitate data ingestion from multiple sources and manage the data flow into VEDA's system. These APIs are included in [veda-backend](https://github.com/NASA-IMPACT/veda-backend).
+   - To learn how to interact with these APIs as a user, consult the [Dataset Ingestion Guide](./contributing/dataset-ingestion/index.qmd).
+
+## Deployment and Modification
+
+If you need to deploy or modify any of the ingestion services, please visit the [veda-data-airflow](https://github.com/NASA-IMPACT/veda-data-airflow) repository. This repo contains:
+- Instructions for deploying the Airflow instance
+- Examples of how to write new ingestion processes (DAGs)
+- Guidance on extending and modifying existing ingestion services