Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support other JSON schemas for dbt #3948

Closed
wants to merge 7 commits into from

Conversation

yu-iskw
Copy link

@yu-iskw yu-iskw commented Jul 26, 2024

yu-iskw added 4 commits July 26, 2024 10:34
Signed-off-by: Yu Ishikawa <[email protected]>
Signed-off-by: Yu Ishikawa <[email protected]>
Signed-off-by: Yu Ishikawa <[email protected]>
Signed-off-by: Yu Ishikawa <[email protected]>
{
"name": "dbt Packages",
"description": "dbt packages configurations",
"fileMatch": ["**/packages.yml"],
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to make it work even in a nested directory, because users potentially manage a dbt project in a nested directory, not the root directory.

{
"name": "dbt Selectors",
"description": "dbt selectors configurations",
"fileMatch": ["**/selectors.yml"],
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Comment on lines 1011 to 1023
"fileMatch": [
"**/macros/*.yml",
"**/macros/**/*.yml",
"**/macros/*.yaml",
"**/macros/**/*.yaml",

"**/models/*.yml",
"**/models/**/*.yml",
"**/models/*.yaml",
"**/models/**/*.yaml",

"**/seeds/*.yml",
"**/seeds/**/*.yml",
"**/seeds/*.yaml",
"**/seeds/**/*.yaml",

"**/snapshots/*.yml",
"**/snapshots/**/*.yml",
"**/snapshots/*.yaml",
"**/snapshots/**/*.yaml"
],
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can technically configure the directories instead of the default directories respectively. If we want the rules more flexible, we need to parse all YAML files. But, I'm not sure we can apply file match rules like **/*.yml, **/**/*.yml.

@hyperupcall
Copy link
Member

hyperupcall commented Jul 26, 2024

Thanks for PR; could the fileMatch be narrowed further to prevent false positives? Ideally, something with "dbt" in its name;

@yu-iskw yu-iskw force-pushed the add-dbt-jsonschemas branch from 41ba5bc to cb46470 Compare July 26, 2024 03:49
@yu-iskw
Copy link
Author

yu-iskw commented Jul 26, 2024

@hyperupcall Sure. I have modified the patterns. Can you take a look again?

cb46470

Signed-off-by: Yu Ishikawa <[email protected]>
@yu-iskw
Copy link
Author

yu-iskw commented Jul 26, 2024

@hyperupcall Can you give me advice to solve this? Though I have changed the pattern from **/macros/*.yml to macros/*.yml, I got the workflow errorn.

################ Error message
>> fileMatch with directory must start with "**/" => macros/*.yml

@hyperupcall
Copy link
Member

hyperupcall commented Jul 26, 2024

@yu-iskw I believe that rule is set up because the glob patterns are recursive by default, so effectively macros/*.yml is the same as **/macros/*.yml. Making it /macros/*.yml should silence the error since the pattern explicitly matches the root. (Same thing with packages.yml, doing /packages.yml makes it unambiguously match the root directory).

But in my opinion, matching every yaml file in snapshots, models, etc. still is way to broad, even if its only at the root directory. I'm somewhat hesitant with packages.yml (something like packages.dbt.yml would be ideal), but I'm okay with matching selectors.yml. If this is merged as-is, I worry there will be too many false positive matches. For example, I myself have written some ad-hoc packages.(json|yaml) files for use in personal projects. @madskristensen what do you think

@joellabes
Copy link
Contributor

Hello! I'm a DX Advocate at dbt Labs (makers of dbt) - the way that the dbt CLI app knows it's in a dbt project is that it can find a dbt_project.yml file at its cwd or parents, unless explicitly specified elsewhere.

Because of this, there isn’t a required folder pattern, but I would guess most project directories would contain dbt in them somewhere (and it'd be a reasonable convention to require for schemastore). Would you be open to a pattern like **/*dbt*/**/models/**/*.{yaml,yml}?

I ran it through Digital Ocean's glob checker and it seems to work:
image

"snapshots/*.yaml",
"snapshots/**/*.yaml"
],
"url": "https://raw.githubusercontent.com/dbt-labs/dbt-jsonschema/main/schemas/latest/selectors-latest.json"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"url": "https://raw.githubusercontent.com/dbt-labs/dbt-jsonschema/main/schemas/latest/selectors-latest.json"
"url": "https://raw.githubusercontent.com/dbt-labs/dbt-jsonschema/main/schemas/latest/dbt_yml_files-latest.json"

@hyperupcall
Copy link
Member

hyperupcall commented Aug 1, 2024

find a dbt_project.yml file at its cwd or parents, unless explicitly specified elsewhere

Ideally, schema fileMatches would "activate" and "deactivate" based on the existence of a particular file like dbt_project.yml. I remember this being useful in other cases as well, so I've opened #3965 to track it.

Would you be open to a pattern like **/*dbt*/**/models/**/*.{yaml,yml}?

Yes, that works!

@joellabes
Copy link
Contributor

I can't contribute directly to this branch, but I've opened #3977 to implement the agreed approach

@hyperupcall
Copy link
Member

Closing as superseded by #3977

@hyperupcall hyperupcall closed this Aug 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants