Add missing data store link and create data ingestion page (to be fleshed out)
ividito committed Oct 8, 2024
1 parent 55853cd commit e768489
Showing 2 changed files with 36 additions and 3 deletions.
12 changes: 9 additions & 3 deletions _quarto.yml
@@ -35,9 +35,15 @@ website:
       text: Welcome
     - section: services/index.qmd
       contents:
-        - services/dashboard.qmd
-        - services/apis.qmd
-        - services/jupyterhub.qmd
+        - section: Core Services
+          contents:
+            - services/dashboard.qmd
+            - services/apis.qmd
+            - services/jupyterhub.qmd
+        - section: Additional services
+          contents:
+            - services/data-store.qmd
+            - services/data-ingestion.qmd
     - section: notebooks/index.qmd
       contents:
         - section: Quickstarts
27 changes: 27 additions & 0 deletions services/data-ingestion.qmd
@@ -0,0 +1,27 @@
---
title: VEDA Data Ingestion Services
subtitle: Ingestion services for VEDA
---

Welcome to the VEDA data ingestion services documentation. This document provides an overview of our data ingestion pipelines, Airflow instance, and ingest APIs. Whether you're deploying new services or interacting with existing ones, this guide will help you get started. For detailed development and contribution guidelines, please refer to our [Contributing Guide](./contributing/index.qmd).

## Overview

VEDA's data ingestion services manage the flow of data from various sources into VEDA. The system uses Apache Airflow to orchestrate and schedule data pipelines, and it exposes ingest APIs for managing data ingestion tasks.

### Components

1. **Apache Airflow Instance**
   - Our Airflow instance orchestrates data ingestion tasks and workflows. It supports scheduling, monitoring, and managing pipelines, which are defined as directed acyclic graphs (DAGs).
   - For details on how to deploy, configure, or modify our Airflow instance, refer to the [veda-data-airflow](https://github.com/NASA-IMPACT/veda-data-airflow) repository.

2. **Ingest APIs**
   - Our ingest API facilitates data ingestion from multiple sources and manages the data flow into VEDA's system. This API is included in [veda-backend](https://github.com/NASA-IMPACT/veda-backend).
   - To learn how to interact with these APIs as a user, consult the [Dataset Ingestion Guide](./contributing/dataset-ingestion/index.qmd).
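
To make the DAG idea above concrete, here is a minimal sketch of how task dependencies determine run order. It deliberately uses only Python's standard-library `graphlib` rather than Airflow itself, and the task names (`discover_files`, `validate_files`, and so on) are hypothetical placeholders, not tasks from the VEDA pipelines. For real DAG definitions, see the veda-data-airflow repository.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks that must
# finish before it can start. These names are illustrative only.
pipeline = {
    "discover_files": set(),
    "validate_files": {"discover_files"},
    "build_metadata": {"validate_files"},
    "publish_records": {"build_metadata"},
}


def execution_order(dependencies):
    """Return one valid run order; raises CycleError if a cycle exists."""
    return list(TopologicalSorter(dependencies).static_order())


def is_acyclic(dependencies):
    """Return True if the dependency graph is a valid DAG."""
    try:
        execution_order(dependencies)
    except CycleError:
        return False
    return True


if __name__ == "__main__":
    print(execution_order(pipeline))
    print(is_acyclic({"a": {"b"}, "b": {"a"}}))
```

An orchestrator such as Airflow relies on the same property: because the graph has no cycles, every task has a well-defined point at which all of its upstream tasks are complete.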

## Deployment and Modification

If you need to deploy or modify any of the ingestion services, please visit the [veda-data-airflow](https://github.com/NASA-IMPACT/veda-data-airflow) repository. This repo contains:

- Instructions for deploying the Airflow instance
- Examples of how to write new ingestion processes (DAGs)
- Guidance on extending and modifying existing ingestion services
