This is a Singer tap that produces JSON-formatted data following the Singer spec.
This tap:
- Pulls raw data from the Looker v4.0 API
- Extracts the following endpoint streams:
  - color_collections
  - connections
  - content_metadata
  - content_metadata_access
  - dashboards
  - datagroups
  - folders
  - groups
  - integration_hubs
  - integrations
  - lookml_dashboards
  - lookml_models
  - looks
  - projects
  - queries
  - merge_queries
  - query_history (POST)
  - roles
  - scheduled_plans
  - themes
  - user_attributes
  - user_login_lockouts
  - users
  - versions
  - workspaces
- All endpoints replicate FULL_TABLE (ALL records, every time). Currently, the Looker API does not support paginating, sorting, filtering, or providing audit fields (like created/modified datetimes).
- Primary Key field(s): Almost all endpoints have an id primary key. Exceptions:
  - lookml_models, models, git_branches use a combination key of name and project_name
  - git_branches use a combination key of name and project_id
  - project_files use id and project_id
  - connections use name
  - user_attribute_values use user_id and user_attribute_id
  - group_attribute_values use group_id and attribute_value_id
- Transformations: removes "can" nodes, converts IDs to strings, and fixes JSON validation errors (datatypes)
- All JSON schemas are generated from the Looker API Swagger definitions
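The "can" removal and ID-to-string transformations can be pictured with a small sketch. This is not the tap's actual code: the function name transform_record and the rule that a field counts as an ID when it is named id or ends in _id are assumptions made for illustration.

```python
from typing import Any


def transform_record(node: Any, parent_key: str = "") -> Any:
    """Recursively drop "can" nodes and cast ID values to strings."""
    if isinstance(node, dict):
        cleaned = {}
        for key, value in node.items():
            if key == "can":
                # Skip the "can" node entirely.
                continue
            cleaned[key] = transform_record(value, key)
        return cleaned
    if isinstance(node, list):
        return [transform_record(item, parent_key) for item in node]
    # Assumption: a field is an ID if it is named "id" or ends in "_id".
    is_id_key = parent_key == "id" or parent_key.endswith("_id")
    if is_id_key and node is not None and not isinstance(node, bool):
        return str(node)
    return node


# Hypothetical record, shaped loosely like an API response.
record = {
    "id": 7,
    "title": "Sales",
    "can": {"show": True},
    "folder": {"id": 3, "can": {}},
    "user_id": None,
}
print(transform_record(record))
```

Null IDs are left as null rather than turned into the string "None", so a nullable string schema would still validate.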
- Install
  Clone this repository, and then install using setup.py. We recommend using a virtualenv:
  > virtualenv -p python3 venv
  > source venv/bin/activate
  > python setup.py install
  OR
  > cd .../tap-looker
  > pip install .
- Dependent libraries
  The following dependent libraries were installed:
  > pip install singer-python
  > pip install singer-tools
  > pip install target-stitch
  > pip install target-json
- Create your tap's config.json file.
  - subdomain is the leading part of your Looker URL before .looker.com; for example, bytecode in https://bytecode.looker.com.
  - client_id and client_secret are your API3 Keys, which may be generated and provided by a Looker Admin.
  - domain is usually looker.com, unless you have your own white-labeled URL.
  - api_port is usually 19999, unless you are hosting Looker internally and are using a different port for the API.
  - start_date is not currently used. The Looker API does not provide audit dates or allow query filtering, sorting, and paging.
  - user_agent is used to identify yourself in the API logs.
  {
    "subdomain": "YOUR_SUBDOMAIN",
    "client_id": "YOUR_LOOKER_CLIENT_ID",
    "client_secret": "YOUR_LOOKER_CLIENT_SECRET",
    "domain": "looker.com",
    "api_port": "19999",
    "start_date": "2019-01-01T00:00:00Z",
    "user_agent": "tap-looker <api_user_email@your_company.com>"
  }
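To see how the config values fit together, here is a minimal sketch that validates a config and assembles an API base URL from subdomain, domain, and api_port. The helper name build_base_url, the choice of required keys, and the /api/4.0 path layout are assumptions for illustration; the tap itself may build its URL differently.

```python
import json

# Assumption: these three keys have no sensible default.
REQUIRED_KEYS = ("subdomain", "client_id", "client_secret")


def build_base_url(config: dict) -> str:
    """Validate the config and compose a base URL for the Looker API."""
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"config is missing required keys: {missing}")
    # Fall back to the usual values described above.
    domain = config.get("domain", "looker.com")
    api_port = config.get("api_port", "19999")
    # Assumption: the v4.0 API is served under /api/4.0 on the API port.
    return f"https://{config['subdomain']}.{domain}:{api_port}/api/4.0"


example_config = json.loads("""
{
  "subdomain": "bytecode",
  "client_id": "YOUR_LOOKER_CLIENT_ID",
  "client_secret": "YOUR_LOOKER_CLIENT_SECRET",
  "domain": "looker.com",
  "api_port": "19999"
}
""")
print(build_base_url(example_config))
```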
- Run the Tap in Discovery Mode
  This creates a catalog.json for selecting objects/fields to integrate:
  > tap-looker --config config.json --discover > catalog.json
  See the Singer docs on discovery mode for details.
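Once the catalog exists, streams are typically selected by setting "selected": true in each stream's top-level metadata entry (the one with an empty breadcrumb). The sketch below shows one way to do that in Python; the helper name select_streams and the tiny in-memory catalog are illustrative only, and a real catalog.json carries schemas and field-level metadata as well.

```python
import json


def select_streams(catalog: dict, wanted: set) -> dict:
    """Mark wanted streams as selected in their stream-level metadata."""
    for stream in catalog.get("streams", []):
        selected = stream.get("tap_stream_id") in wanted
        metadata = stream.setdefault("metadata", [])
        # The stream-level entry is the one with an empty breadcrumb.
        top = next((m for m in metadata if m.get("breadcrumb") == []), None)
        if top is None:
            top = {"breadcrumb": [], "metadata": {}}
            metadata.append(top)
        top.setdefault("metadata", {})["selected"] = selected
    return catalog


# Hypothetical, heavily trimmed catalog for demonstration.
catalog = {
    "streams": [
        {"tap_stream_id": "dashboards", "metadata": [{"breadcrumb": [], "metadata": {}}]},
        {"tap_stream_id": "looks", "metadata": []},
    ]
}
select_streams(catalog, {"dashboards"})
print(json.dumps(catalog, indent=2))
```

In practice you would load catalog.json with json.load, call the helper, and write the result back before running sync mode.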
- Run the Tap in Sync Mode (with catalog) and write out to state file
  For Sync mode:
  > tap-looker --config tap_config.json --catalog catalog.json > state.json
  > tail -1 state.json > state.json.tmp && mv state.json.tmp state.json
  To load to JSON files to verify outputs:
  > tap-looker --config tap_config.json --catalog catalog.json | target-json > state.json
  > tail -1 state.json > state.json.tmp && mv state.json.tmp state.json
  To pseudo-load to Stitch Import API with dry run:
  > tap-looker --config tap_config.json --catalog catalog.json | target-stitch --config target_config.json --dry-run > state.json
  > tail -1 state.json > state.json.tmp && mv state.json.tmp state.json
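The tail -1 step keeps only the final line of output, which is expected to be the most recent state. A rough Python equivalent for raw tap output is sketched below; it scans Singer messages and returns the value of the last STATE message. The function name last_state and the sample messages (including the currently_syncing key) are hypothetical.

```python
import json
from typing import Iterable, Optional


def last_state(lines: Iterable[str]) -> Optional[dict]:
    """Return the value of the final STATE message, or None if there is none."""
    state = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        if message.get("type") == "STATE":
            # Later STATE messages supersede earlier ones.
            state = message.get("value")
    return state


# Hypothetical tap output, one Singer message per line.
output = [
    '{"type": "SCHEMA", "stream": "users", "schema": {}, "key_properties": ["id"]}',
    '{"type": "RECORD", "stream": "users", "record": {"id": "1"}}',
    '{"type": "STATE", "value": {"currently_syncing": "users"}}',
    '{"type": "STATE", "value": {"currently_syncing": null}}',
]
print(last_state(output))
```

Unlike tail -1, this ignores any trailing line that is not a STATE message, so it does not depend on the state being the very last line.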
- Test the Tap
  While developing the Looker tap, the following utilities were run in accordance with Singer.io best practices.
  Pylint to improve code quality:
  > pylint tap_looker -d missing-docstring -d logging-format-interpolation -d too-many-locals -d too-many-arguments
  Pylint test resulted in the following score:
  Your code has been rated at 9.73/10
  To check the tap and verify working:
  > tap-looker --config tap_config.json --catalog catalog.json | singer-check-tap > state.json
  > tail -1 state.json > state.json.tmp && mv state.json.tmp state.json
  Check tap resulted in the following:
  The output is valid.
  It contained 3000 messages for 41 streams.
    155 schema messages
    2795 record messages
    50 state messages
  Details by stream:
  +-----------------------------+---------+---------+
  | stream                      | records | schemas |
  +-----------------------------+---------+---------+
  | datagroups                  | 19      | 1       |
  | lookml_dashboards           | 2       | 1       |
  | scheduled_plans             | 2       | 4       |
  | users                       | 62      | 1       |
  | user_attribute_values       | 1302    | 1       |
  | user_sessions               | 26      | 1       |
  | integration_hubs            | 1       | 1       |
  | workspaces                  | 2       | 1       |
  | integrations                | 24      | 1       |
  | themes                      | 2       | 1       |
  | roles                       | 8       | 1       |
  | role_groups                 | 3       | 1       |
  | dashboards                  | 44      | 1       |
  | dashboard_filters           | 20      | 1       |
  | content_metadata            | 183     | 5       |
  | dashboard_elements          | 131     | 1       |
  | merge_queries               | 1       | 44      |
  | queries                     | 155     | 46      |
  | dashboard_layouts           | 44      | 1       |
  | projects                    | 13      | 1       |
  | git_branches                | 166     | 1       |
  | project_files               | 151     | 1       |
  | user_attributes             | 17      | 1       |
  | user_attribute_group_values | 2       | 1       |
  | lookml_models               | 18      | 1       |
  | models                      | 18      | 1       |
  | explores                    | 25      | 18      |
  | looks                       | 69      | 1       |
  | color_collections           | 15      | 1       |
  | permissions                 | 40      | 1       |
  | permission_sets             | 9       | 1       |
  | content_metadata_access     | 106     | 3       |
  | model_sets                  | 9       | 1       |
  | versions                    | 1       | 1       |
  | user_login_lockouts         | 0       | 1       |
  | groups                      | 9       | 1       |
  | groups_in_group             | 10      | 1       |
  | connections                 | 14      | 1       |
  | folders                     | 35      | 1       |
  +-----------------------------+---------+---------+
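A summary like the one above can be approximated by tallying Singer messages by type and by stream. The sketch below is not how singer-check-tap is implemented and performs no schema validation; the function name summarize and the sample messages are made up for illustration.

```python
import json
from collections import Counter, defaultdict
from typing import Iterable


def summarize(lines: Iterable[str]):
    """Count Singer messages by type, and RECORD/SCHEMA messages per stream."""
    totals = Counter()
    per_stream = defaultdict(Counter)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        kind = message.get("type")
        totals[kind] += 1
        # STATE messages carry no stream name, so only these two are grouped.
        if kind in ("RECORD", "SCHEMA"):
            per_stream[message["stream"]][kind] += 1
    return totals, per_stream


# Hypothetical tap output, one Singer message per line.
sample = [
    '{"type": "SCHEMA", "stream": "users", "schema": {}, "key_properties": ["id"]}',
    '{"type": "RECORD", "stream": "users", "record": {"id": "1"}}',
    '{"type": "RECORD", "stream": "users", "record": {"id": "2"}}',
    '{"type": "STATE", "value": {}}',
]
totals, per_stream = summarize(sample)
for stream, counts in sorted(per_stream.items()):
    print(f"{stream}: {counts['RECORD']} records, {counts['SCHEMA']} schemas")
print(f"{totals['STATE']} state messages")
```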
Copyright © 2019 Stitch