From 2e1afaf7a49acbccef9d90047c104479a266cf49 Mon Sep 17 00:00:00 2001 From: siddiquebagwan Date: Tue, 26 Sep 2023 22:20:53 +0530 Subject: [PATCH] doc(ingestion): looker & lookml ingestion guide (#8006) Co-authored-by: MohdSiddiqueBagwan Co-authored-by: Hyejin Yoon <0327jane@gmail.com> --- docs-website/sidebars.js | 7 + .../looker/configuration.md | 212 ++++++++++++++++++ .../quick-ingestion-guides/looker/overview.md | 52 +++++ docs/quick-ingestion-guides/looker/setup.md | 156 +++++++++++++ 4 files changed, 427 insertions(+) create mode 100644 docs/quick-ingestion-guides/looker/configuration.md create mode 100644 docs/quick-ingestion-guides/looker/overview.md create mode 100644 docs/quick-ingestion-guides/looker/setup.md diff --git a/docs-website/sidebars.js b/docs-website/sidebars.js index 06396d6088277..b07cd0b03ce11 100644 --- a/docs-website/sidebars.js +++ b/docs-website/sidebars.js @@ -81,6 +81,13 @@ module.exports = { "docs/quick-ingestion-guides/powerbi/configuration", ], }, + { + Looker: [ + "docs/quick-ingestion-guides/looker/overview", + "docs/quick-ingestion-guides/looker/setup", + "docs/quick-ingestion-guides/looker/configuration", + ], + }, ], }, { diff --git a/docs/quick-ingestion-guides/looker/configuration.md b/docs/quick-ingestion-guides/looker/configuration.md new file mode 100644 index 0000000000000..d9ba1907b006e --- /dev/null +++ b/docs/quick-ingestion-guides/looker/configuration.md @@ -0,0 +1,212 @@ +--- +title: Configuration +--- +# Configuring Looker & LookML Connector + +Now that you have created a DataHub-specific API key with the relevant access in [the prior step](setup.md), it's time to set up a connection via the DataHub UI. + +## Configure Secrets + +You must create two secrets to configure a connection with Looker or LookML. + +* `LOOKER_CLIENT_ID` +* `LOOKER_CLIENT_SECRET` + +On your DataHub instance, navigate to the **Ingestion** tab in the top right corner of your screen. + +

+ Navigate to the "Ingestion Tab" +

+ +:::note +If you do not see the Ingestion tab, please get in touch with your DataHub admin to grant you the correct permissions. +::: + +Navigate to the **Secrets** tab and click **Create new secret**. + +

+ Secrets Tab +

+ +First, create a secret for the **Client ID**. The value should be the **Client ID** of the API key created in the [prior step](setup.md#create-an-api-key). + +

+ API Key Client ID +

+ +Then, create a secret for the **Client Secret**. The value should be the **Client Secret** of the API key created in the [prior step](setup.md#create-an-api-key). + +

+ API Key client secret +
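
The recipes later in this guide refer to these two secrets with the `${SECRET_NAME}` syntax, for example `${LOOKER_CLIENT_ID}`. As a rough illustration of that kind of placeholder substitution (this is not DataHub's actual implementation), the Python sketch below resolves placeholders from a plain dictionary. The helper name `resolve_placeholders` and the example values are hypothetical.

```python
import re

# Matches ${NAME} placeholders such as ${LOOKER_CLIENT_ID}.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_placeholders(value, secrets):
    """Replace every ${NAME} in `value` with secrets[NAME] (hypothetical helper)."""

    def _lookup(match):
        name = match.group(1)
        if name not in secrets:
            raise KeyError(f"secret {name!r} is not defined")
        return secrets[name]

    return _PLACEHOLDER.sub(_lookup, value)


# Placeholder values only; the real values come from the Looker API key.
example_secrets = {
    "LOOKER_CLIENT_ID": "example-client-id",
    "LOOKER_CLIENT_SECRET": "example-client-secret",
}

resolved_id = resolve_placeholders("${LOOKER_CLIENT_ID}", example_secrets)
print(resolved_id)
```

Keeping only the placeholder in the recipe means the credential itself never appears in the recipe text.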

+ + +## Configure Looker Ingestion + +### Configure Recipe + +Navigate to the **Sources** tab and click **Create new source**. + +

+ Click "Create new source" +

+ +Choose `Looker`. + +

+ Select Looker from the options +

+
+Enter the details into the Looker Recipe.
+
+* **Base URL:** This is your Looker instance URL (e.g. `https://.cloud.looker.com`).
+* **Client ID:** Use the secret LOOKER_CLIENT_ID with the format `${LOOKER_CLIENT_ID}`.
+* **Client Secret:** Use the secret LOOKER_CLIENT_SECRET with the format `${LOOKER_CLIENT_SECRET}`.
+
+
+Optionally, use the `dashboard_pattern` and `chart_pattern` fields to filter for specific dashboards and charts.
+
+    config:
+        ...
+        dashboard_pattern:
+            allow:
+                - "2"
+        chart_pattern:
+            allow:
+                - "258829b1-82b1-4bdb-b9fb-6722c718bbd3"
+
+Your recipe should look something like this:
+
+

+ Looker Recipe +
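
To reason about what an `allow` list will select, it can help to try the patterns against a few known IDs first. The sketch below assumes each entry is a regular expression matched from the start of the ID, and that a deny list (not shown in the recipe above) takes precedence over the allow list. `is_allowed` is a hypothetical helper for illustration, not part of the connector.

```python
import re
from typing import Sequence


def is_allowed(identifier: str, allow: Sequence[str], deny: Sequence[str] = ()) -> bool:
    """Return True if `identifier` matches an allow regex and no deny regex.

    Assumption: patterns are regular expressions anchored at the start of
    the identifier (re.match semantics), with deny checked first.
    """
    if any(re.match(pattern, identifier) for pattern in deny):
        return False
    return any(re.match(pattern, identifier) for pattern in allow)


# The allow list from the recipe snippet, tried against made-up dashboard IDs.
dashboard_allow = ["2"]
dashboard_ids = ["1", "2", "20", "3"]

selected = [d for d in dashboard_ids if is_allowed(d, dashboard_allow)]
print(selected)
```

Under these assumptions the pattern `"2"` also selects dashboard `20`, because the match only has to start at the beginning of the ID. An anchored pattern such as `"^2$"` selects dashboard `2` alone.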

+ + After completing the recipe, click **Next**. + +### Schedule Execution + +Now, it's time to schedule a recurring ingestion pipeline to extract metadata from your Looker instance regularly. + +Decide how often you want this ingestion to run (every day, month, year, hour, minute, etc.) and select the interval from the dropdown. + +

+ schedule selector +

+ +Ensure you've configured your correct timezone. + +

+ timezone_selector +
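
The timezone matters because the same wall-clock schedule maps to different absolute times. The sketch below assumes the selected schedule corresponds to a standard five-field cron expression (the labels in `CRON_EXAMPLES` are illustrative, not the UI's wording) and uses a fixed UTC+05:30 offset as a stand-in for a named timezone.

```python
from datetime import datetime, timedelta, timezone

# Illustrative labels mapped to standard five-field cron expressions
# (minute, hour, day-of-month, month, day-of-week).
CRON_EXAMPLES = {
    "every hour": "0 * * * *",
    "every day at midnight": "0 0 * * *",
    "first day of every month": "0 0 1 * *",
}

# Assumption: a fixed UTC+05:30 offset stands in for the timezone chosen in the UI.
local_tz = timezone(timedelta(hours=5, minutes=30))

# "Every day at midnight" in the local timezone, for one example date.
local_midnight = datetime(2023, 9, 27, 0, 0, tzinfo=local_tz)
utc_equivalent = local_midnight.astimezone(timezone.utc)

print(utc_equivalent.isoformat())
```

Here a run scheduled for local midnight actually starts at 18:30 UTC on the previous day, which is why a wrong timezone setting can shift every run by several hours.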

+ +Finally, click **Next** when you are done. + +### Finish Up + +Name your ingestion source, then click **Save and Run**. + +

+ Name your ingestion +

+ +You will now find your new ingestion source running. + +

+ ingestion_running +

+ +## Configure LookML Connector + +Now that you have created a DataHub-specific API key and Deploy Key with the relevant access in [the prior step](setup.md), it's time to set up a connection via the DataHub UI. + +### Configure Recipe + +Navigate to the **Sources** tab and click **Create new source**. + +

+ Click "Create new source" +

+ +Choose `LookML`. + +

+ Select Looker from the options +

+
+Enter the details into the LookML Recipe. You need to set a minimum of 5 fields in the recipe for this quick ingestion guide:
+
+* **GitHub Repository:** This is the GitHub repository where your LookML models are stored. You can provide either the full URL (example: https://gitlab.com/gitlab-org/gitlab) or the `organization/repo` form; in the latter case, the connector assumes it is a GitHub repo.
+* **GitHub Deploy Key:** Copy the content of `looker_datahub_deploy_key` and paste it into this field.
+* **Looker Base URL:** This is your Looker instance URL (e.g. https://abc.cloud.looker.com).
+* **Looker Client ID:** Use the secret LOOKER_CLIENT_ID with the format `${LOOKER_CLIENT_ID}`.
+* **Looker Client Secret:** Use the secret LOOKER_CLIENT_SECRET with the format `${LOOKER_CLIENT_SECRET}`.
+
+Your recipe should look something like this:
+
+

+ LookML Recipe +
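
For readers who prefer to see the five fields as structured data, the sketch below writes them out as a Python dictionary and checks that none is empty. The key names (`github_info.repo`, `github_info.deploy_key`, `api.base_url`, `api.client_id`, `api.client_secret`) are assumptions modelled on the UI labels above and should be verified against the LookML source reference. The repository name is hypothetical, and `missing_fields` is an illustrative helper.

```python
# Assumed key names; verify them against the LookML ingestion source reference.
lookml_recipe = {
    "source": {
        "type": "lookml",
        "config": {
            "github_info": {
                "repo": "my-org/my-lookml-repo",  # hypothetical repository
                "deploy_key": "<contents of looker_datahub_deploy_key>",
            },
            "api": {
                "base_url": "https://abc.cloud.looker.com",
                "client_id": "${LOOKER_CLIENT_ID}",
                "client_secret": "${LOOKER_CLIENT_SECRET}",
            },
        },
    }
}

# The five fields this guide asks for, as (section, key) pairs.
REQUIRED_PATHS = [
    ("github_info", "repo"),
    ("github_info", "deploy_key"),
    ("api", "base_url"),
    ("api", "client_id"),
    ("api", "client_secret"),
]


def missing_fields(config):
    """Return the dotted names of required fields that are absent or empty."""
    missing = []
    for section, key in REQUIRED_PATHS:
        if not config.get(section, {}).get(key):
            missing.append(f"{section}.{key}")
    return missing


print(missing_fields(lookml_recipe["source"]["config"]))
```

An empty list means all five fields are filled in; any names that are printed point at the fields still to be completed.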

+ + +After completing the recipe, click **Next**. + +### Schedule Execution + +Now, it's time to schedule a recurring ingestion pipeline to extract metadata from your Looker instance regularly. + +Decide how often you want this ingestion to run (every day, month, year, hour, minute, etc.) and select the interval from the dropdown. + +

+ schedule selector +

+ +Ensure you've configured your correct timezone. +

+ timezone_selector +

+ +Click **Next** when you are done. + +### Finish Up + +Name your ingestion source, then click **Save and Run**. +

+ Name your ingestion +

+ +You will now find your new ingestion source running. + +

+ ingestion_running +

+ +## Validate Ingestion Runs + +View the latest status of ingestion runs on the Ingestion page. + +

+ ingestion succeeded +

+ +Click the `+` sign to expand the complete list of historical runs and outcomes; click **Details** to see the results of a specific run. + +

+ ingestion_details +

+ +From the Ingestion Run Details page, pick **View All** to see which entities were ingested. + +

+ ingestion_details_view_all +

+ +Pick an entity from the list to manually validate if it contains the detail you expected. + +

+ ingestion_details_view_all +

+ + +**Congratulations!** You've successfully set up Looker & LookML as an ingestion source for DataHub! + +*Need more help? Join the conversation in [Slack](http://slack.datahubproject.io)!* diff --git a/docs/quick-ingestion-guides/looker/overview.md b/docs/quick-ingestion-guides/looker/overview.md new file mode 100644 index 0000000000000..843d704526bcc --- /dev/null +++ b/docs/quick-ingestion-guides/looker/overview.md @@ -0,0 +1,52 @@ +--- +title: Overview +--- +# Looker & LookML Ingestion Guide: Overview + +## What You Will Get Out of This Guide + +This guide will help you set up the Looker & LookML connectors to begin ingesting metadata into DataHub. +Upon completing this guide, you will have a recurring ingestion pipeline to extract metadata from Looker & LookML and load it into DataHub. + +### Looker + +The Looker connector will ingest the following Looker asset types: + +* [Dashboards](https://cloud.google.com/looker/docs/dashboards) +* [Charts](https://cloud.google.com/looker/docs/creating-visualizations) +* [Explores](https://cloud.google.com/looker/docs/reference/param-explore-explore) +* [Schemas](https://developers.looker.com/api/explorer/4.0/methods/Metadata/connection_schemas) +* [Owners of Dashboards](https://cloud.google.com/looker/docs/creating-user-defined-dashboards) + +:::note + +To get complete Looker metadata integration (including Looker views and lineage to the underlying warehouse tables), you must also use the [lookml](https://datahubproject.io/docs/generated/ingestion/sources/looker#module-lookml) connector. 
+ +::: + + +### LookML + +The LookML connector will include the following LookML asset types: + +* [LookML views from model files in a project](https://cloud.google.com/looker/docs/reference/param-view-view) +* [Metadata for dimensions](https://cloud.google.com/looker/docs/reference/param-field-dimension) +* [Metadata for measures](https://cloud.google.com/looker/docs/reference/param-measure-types) +* [Dimension Groups as tags](https://cloud.google.com/looker/docs/reference/param-field-dimension-group) + +:::note + +To get complete Looker metadata integration (including Looker views and lineage to the underlying warehouse tables), you must also use the [looker](https://datahubproject.io/docs/generated/ingestion/sources/looker#module-looker) connector. + +::: + +## Next Steps +Please continue to the [setup guide](setup.md), where we'll describe the prerequisites. + +### Reference + +If you want to ingest metadata from Looker using the DataHub CLI, check out the following resources: +* Learn about CLI Ingestion in the [Introduction to Metadata Ingestion](../../../metadata-ingestion/README.md) +* [Looker Ingestion Source](https://datahubproject.io/docs/generated/ingestion/sources/Looker) + +*Need more help? Join the conversation in [Slack](http://slack.datahubproject.io)!* diff --git a/docs/quick-ingestion-guides/looker/setup.md b/docs/quick-ingestion-guides/looker/setup.md new file mode 100644 index 0000000000000..c08de116895ea --- /dev/null +++ b/docs/quick-ingestion-guides/looker/setup.md @@ -0,0 +1,156 @@ +--- +title: Setup +--- + +# Looker & LookML Ingestion Guide: Setup + +## Looker Prerequisites + +To configure ingestion from Looker, you'll first have to ensure you have an API key to access the Looker resources. + +### Login To Looker Instance + +Log in to your Looker instance (e.g. `https://.cloud.looker.com`). + +Navigate to the **Admin Panel** and click **Roles** to open the Roles Panel. + +

+ Looker home page +

+ +

+ Looker roles search +

+ +### Create A New Permission Set + +On the **Roles Panel**, click **New Permission Set**. + +

+ Looker new permission set +

+ +Set a name for the new permission set (e.g., *DataHub Connector Permission Set*) and select the following permissions. + +
+Permission List + +- access_data +- see_lookml_dashboards +- see_looks +- see_user_dashboards +- explore +- see_sql +- see_lookml +- clear_cache_refresh +- manage_models +- see_datagroups +- see_pdts +- see_queries +- see_schedules +- see_system_activity +- see_users + +
+ +After selecting all permissions mentioned above, click **New Permission Set** at the bottom of the page. + +

+Looker permission set window +
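
Since the list is long, it is easy to miss a checkbox. The sketch below is a simple checklist aid: it compares a set of granted permissions (typed in by hand here, since no Looker API call is made) against the permissions listed above. `missing_permissions` is a hypothetical helper.

```python
# The permissions listed in this guide for the DataHub permission set.
REQUIRED_PERMISSIONS = {
    "access_data",
    "see_lookml_dashboards",
    "see_looks",
    "see_user_dashboards",
    "explore",
    "see_sql",
    "see_lookml",
    "clear_cache_refresh",
    "manage_models",
    "see_datagroups",
    "see_pdts",
    "see_queries",
    "see_schedules",
    "see_system_activity",
    "see_users",
}


def missing_permissions(granted):
    """Return the required permissions absent from `granted`, sorted by name."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


# Example: a permission set where two checkboxes were overlooked.
granted_example = REQUIRED_PERMISSIONS - {"see_pdts", "see_users"}
print(missing_permissions(granted_example))
```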

+ +### Create A Role + +On the **Roles** Panel, click **New Role**. + +

+Looker new role button +

+ +Set the name for the new role (e.g., *DataHub Extractor*) and set the following fields in this window. + +- Set **Permission Set** to the permission set created in the previous step (i.e., *DataHub Connector Permission Set*) +- Set **Model Set** to `All` + +Finally, click **New Role** at the bottom of the page. + +

+ Looker new role window +

+ +### Create A New User + +On the **Admin** Panel, click **Users** to open the users panel. + +

+ Looker user search +

+ +Click **Add Users**. + +

+ Looker add user +

+ +On **Adding a new user**, set details in the following fields. + +- Add the user's **Email Addresses**. +- Set **Roles** to the role created in the previous step (e.g. *DataHub Extractor*) + +Finally, click **Save**. + +

+Looker new user window +

+ +### Create An API Key + +On the **User** Panel, click on the newly created user. + +

+Looker user panel +

+ +Click **Edit Keys** to open the **API Key** Panel. + +

+Looker user view +

+ +On the **API Key** Panel, click **New API Key** to generate a new **Client ID** and **Client Secret**. +

+Looker new api key +

+
+## LookML Prerequisites
+
+Follow the steps below to create the GitHub Deploy Key.
+
+### Generate a private-public SSH key pair
+
+```bash
+  ssh-keygen -t rsa -f looker_datahub_deploy_key
+```
+
+This will typically generate two files, as listed below.
+* `looker_datahub_deploy_key` (private key)
+* `looker_datahub_deploy_key.pub` (public key)
+
+
+### Add Deploy Key to GitHub Repository
+
+First, log in to [GitHub](https://github.com).
+
+Navigate to **GitHub Repository** -> **Settings** -> **Deploy Keys** and add the public key (e.g. `looker_datahub_deploy_key.pub`) as a deploy key with read access.
+
+

+Looker home page +
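
Because the public key goes to GitHub while the private key is pasted into DataHub, it is easy to mix the two files up. The sketch below is a rough sanity check on the text of a key file, assuming the usual OpenSSH conventions (a private key starts with a `-----BEGIN ... PRIVATE KEY-----` line, a public key starts with an `ssh-` type prefix). `classify_key_text` is a hypothetical helper and the sample strings are not real keys.

```python
def classify_key_text(text):
    """Return 'private', 'public', or 'unknown' based on the first line."""
    stripped = text.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    if first_line.startswith("-----BEGIN") and first_line.endswith("PRIVATE KEY-----"):
        return "private"
    if first_line.startswith("ssh-"):
        return "public"
    return "unknown"


# Shortened stand-ins for the two generated files (not real key material).
private_sample = "-----BEGIN OPENSSH PRIVATE KEY-----\nAAAA...\n-----END OPENSSH PRIVATE KEY-----\n"
public_sample = "ssh-rsa AAAA... user@host\n"

print(classify_key_text(private_sample))
print(classify_key_text(public_sample))
```

In practice you would read the file contents (for example `looker_datahub_deploy_key`) and confirm the result is `private` before pasting it into the GitHub Deploy Key field.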

+ +Make a note of the private key file. You must paste the file's contents into the GitHub Deploy Key field later while [configuring](./configuration.md) ingestion on the DataHub Portal. + +## Next Steps + +Once you've done all the above steps, it's time to move on to [configuring the actual ingestion source](configuration.md) within DataHub. + +_Need more help? Join the conversation in [Slack](http://slack.datahubproject.io)!_ \ No newline at end of file