Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve ergonomics for glob matching #9

Open
zimbatm opened this issue May 5, 2021 · 9 comments
Open

Improve ergonomics for glob matching #9

zimbatm opened this issue May 5, 2021 · 9 comments
Labels
enhancement New feature or request

Comments

@zimbatm
Copy link
Member

zimbatm commented May 5, 2021

One thing I noticed when using this on a real-world project, is that matchExt is a bit too wide.

What I mean is that typically I want to match under a specific folder. Eg: frontend/**/*.js. Not for the whole project. So matchExt doesn't scale really well with larger source bases.

@zimbatm zimbatm added the enhancement New feature or request label May 5, 2021
@zimbatm
Copy link
Member Author

zimbatm commented May 5, 2021

Like in #8, the solution right now is pretty verbose:

include = with nix-filter; [
      (and
        (inDirectory "frontend")
        (or_ isDirectory (matchExt "js"))
    ];
];

@wpcarro
Copy link

wpcarro commented Jul 25, 2022

One thing I noticed when using this on a real-world project, is that matchExt is a bit too wide.

I was expecting (matchExt "md") to find all of the Markdown files in my root directory recursively, but it seems to only grab the top-level matches. I'm guessing this is because with a setup like...

files = nix-filter {
  root = /repo;
  include = [
    (matchExt "md")
  ];
  exclude = [
    (and isDirectory (matchName ".git"))
  ];
};

...there is nothing in the include section that returns true for a directory to support the recursive search. How would I support a basic glob like: **/*.md with nix-filter?

When I try (or_ isDirectory (matchExt "md")), the output includes a bunch of empty directories (i.e. the directories that have no Markdown files).

@wpcarro
Copy link

wpcarro commented Jul 25, 2022

I assume from the docs that this maybe by design, but how might I support the recursive search for all Markdown files in a repository with nix-filter without whitelisting directories to search? Or might I be better-off using a different tool?

While traversing the filesystem, starting from path, it will call filter on each file and folder recursively. If the filter returns false then the file or folder is ignored. If a folder is ignored, it won't recurse into it anymore.

Because of that, it makes it difficult to implement recursive glob matchers. Something like **/*.js would necessarily have to add every folder, to be able to traverse them. And those empty folders will end-up in the output.

If we want to control rebuild, it's important to have a fixed set of folders.

@zimbatm
Copy link
Member Author

zimbatm commented Jul 25, 2022

It's a limitation of the underlying nix builtin. I think the best strategy is to add isDirectory to the includes and blacklist the few folders that don't have md files in them.

In theory, it's possible to achieve what you want, but it would require nix-filter to use two passes. With two additions to the /nix/store.

@wpcarro
Copy link

wpcarro commented Jul 25, 2022

Thanks for getting back to me. The README for this repo has helped me get up-to-speed on this problem, so thanks for taking the time to include that:

One possibility is to use a two-pass system, where first all the folders are being added, and then the empty ones are being filtered out. But all of this happens at Nix evaluation time. Nix evaluation is already slow enough like that.

Idea

I'm thinking that maybe I could try an approach that uses readDir to crawl a directory and build an attrset/tree that converts something like...

├── bar.md
├── baz.js
├── foo.txt
└── subdir
    ├── bar.txt
    ├── baz.md
    └── foo.md

...into...

{
  "bar.md" = true;
  subdir = {
    "baz.md" = true;
    "foo.md" = true;
  };
}

Which I could then maybe transform into builtins.path calls that move the files to /nix/store. Anything immediately stand-out as a pitfall with this approach?

@zimbatm
Copy link
Member Author

zimbatm commented Jul 28, 2022

Nice, I didn't think of that combo before. That would make things much more ergonomic. The only downside is more nix evaluation.

@ilkecan
Copy link
Contributor

ilkecan commented Aug 11, 2022

When I try (or_ isDirectory (matchExt "md")), the output includes a bunch of empty directories (i.e. the directories that have no Markdown files).

This should work and works for me (tested with this and the latest commit). Are you using a somewhat recent version of the project?

Also if you are only interested in including a pattern like **/*.md, you don't need to use or_ since elements of include attribute will be ored anyway. So this should also work:

src = nix-filter {
  root = ~/zettelkasten;

  include = with nix-filter; [
    isDirectory
    (matchExt "md")
  ];
}

@zimbatm
Copy link
Member Author

zimbatm commented Aug 13, 2022

The issue is that isDirectory will include all the directories, even if they don't contain .md files. Imagine you add a ./photos folder in your zettelkasten that only includes .jpeg files. That folder will also show up, empty.

@infinisil
Copy link

I recently developed a safer and easier-to-use abstraction for source filtering in a PR to Nixpkgs, please take a look, try it out, and give feedback! https://discourse.nixos.org/t/easy-source-filtering-with-file-sets/29117

Particularly for this issue, it allows you to recursively specify files to include like this:

let
  fs = lib.fileset;
in
fs.unions [
  # Only js files in the frontend directory
  (fs.fileFilter (file: file.ext == "js") ./frontend)
]

And this takes care of not including empty directories automatically.

We can also easily define convenience functions like fileExtFilter [ "js" ] if desired!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants