docs: added psql example in getting started

getdozer · Feb 8, 2024 · e7327cd
1 parent 05237e6

Showing 3 changed files with 43 additions and 18 deletions.
diff --git a/docs/configuration.md b/docs/configuration.md
@@ -12,7 +12,7 @@ Dozer relies on a YAML configuration structure delineated in `dozer-config.yaml`
 |-------------|-----------------------------------------------------------------------------------------------------------------|
 | [Connections and Sources](configuration/data-sources) | Details the various databases, data warehouses, and other connection types, along with their tables.  |
 | [Transformations](configuration/transformations)       | Describes the transformations applied to the sourced data.                                                   |
-| [Endpoints](configuration/endpoints) | Establishes endpoints, determining how sinks are configured                              |
+| [Sinks](configuration/endpoints) | Establishes endpoints, determining how sinks are configured                              |
 | [Global Parameters and Flags](configuration/flags) | Enables or disables specific options or features                                 |
 
 

diff --git a/docs/getting_started/core/adding-transformations.mdx b/docs/getting_started/core/adding-transformations.mdx
@@ -6,30 +6,55 @@ import TabItem from '@theme/TabItem';
 We now want to join the two datasets and perform an aggregation to determine the average fare of trips between zones. For this we will add a `sql` section to the `dozer-config.yaml` that looks like this:
 
 ```yaml
-sql: |
-    SELECT 
-      PULocationID, DOLocationID, 
-      pu_zones.Zone as PULocationName, 
-      do_zones.Zone as DOLocationName, 
-      AVG(fare_amount) as avg_amount
-    INTO avg_fares
-    FROM trips
-    INNER JOIN zones pu_zones ON trips.PULocationID = pu_zones.LocationID
-    INNER JOIN zones do_zones ON trips.DOLocationID = do_zones.LocationID
-    GROUP BY PULocationID, DOLocationID;
+sources:
+  - name: actors
+    table_name: actor
+    connection: pagila_conn
+    columns:
+      - actor_id
+      - first_name
+      - last_name
+  - name: films
+    table_name: film
+    connection: pagila_conn
+    columns:
+      - film_id
+      - title
+      - rental_rate
+  - name: film_actors
+    table_name: film_actor
+    connection: pagila_conn
+    columns:
+      - actor_id
+      - film_id
+
+sql:  |
+  SELECT a.first_name AS actor_first_name, 
+         a.last_name AS actor_last_name, 
+         f.title AS film_title, 
+         f.rental_rate
+  INTO actor_films
+  FROM actors a
+  JOIN film_actors fa ON a.actor_id = fa.actor_id
+  JOIN films f ON fa.film_id = f.film_id
+  WHERE f.rental_rate > 3;
 ```
 ::::note
 The SQL you specify in the .yaml file does not run in the source database. Data is processed in real-time as it comes into Dozer by Dozer's internal streaming SQL engine.
 ::::
+Here we used `actors`, `films` and `film_actors` as sources and joined them to create a new table `actor_films`. We also filtered the data to only include films with a rental rate greater than 3. 
 
-To expose the result of this query as an API we will also need to add an additional endpoint:
-
+This new table will be replicated into the sink database.
 ```yaml
-endpoints:
-  - table_name: avg_fares
+sinks:
+  - table_name: actor_films
     config: !Dummy
 ```
 
 
-Now restart `dozer`, to re-trigger the ingestion from the CSV file and Supabase concurrently, and execute the above query in real-time.
+Now restart `dozer` to re-trigger the snapshotting and replication process.
+
+```bash
+dozer run
+```
 
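Note on the new `sql` block: its join-and-filter logic can be sanity-checked outside Dozer. The following is a minimal sketch using Python's built-in `sqlite3`, not Dozer's streaming SQL engine. The sample rows and the helper names `build_sample_db` and `actor_films` are made up for illustration, and the `INTO actor_films` clause is dropped because SQLite does not support it.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with hypothetical rows shaped like the three sources."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE actors (actor_id INTEGER, first_name TEXT, last_name TEXT);
        CREATE TABLE films (film_id INTEGER, title TEXT, rental_rate REAL);
        CREATE TABLE film_actors (actor_id INTEGER, film_id INTEGER);
        -- Illustrative data only; not taken from the real Pagila dataset.
        INSERT INTO actors VALUES (1, 'ANN', 'EXAMPLE'), (2, 'BOB', 'SAMPLE');
        INSERT INTO films VALUES (10, 'CHEAP FILM', 0.99), (11, 'PRICEY FILM', 4.99);
        INSERT INTO film_actors VALUES (1, 10), (1, 11), (2, 10);
        """
    )
    return conn


def actor_films(conn):
    """Run the same join and filter as the config's SQL, minus the INTO clause."""
    return conn.execute(
        """
        SELECT a.first_name AS actor_first_name,
               a.last_name AS actor_last_name,
               f.title AS film_title,
               f.rental_rate
        FROM actors a
        JOIN film_actors fa ON a.actor_id = fa.actor_id
        JOIN films f ON fa.film_id = f.film_id
        WHERE f.rental_rate > 3
        ORDER BY f.title
        """
    ).fetchall()


if __name__ == "__main__":
    for row in actor_films(build_sample_db()):
        print(row)
```

Only actor/film pairs whose film has a `rental_rate` above 3 survive the filter, so an actor who appears solely in cheaper films produces no rows.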
diff --git a/docs/getting_started/core/connecting-to-destinations.mdx b/docs/getting_started/core/connecting-to-destinations.mdx
@@ -1,7 +1,7 @@
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 
-# Connecting to the destination databases
+# Connecting to the Sink Database
 
 The following sections describe how to connect to the destination databases. Previously, we were transferring data to a demo sink, but now we will transfer data to a real sink.