-
Something like this definitely seems cleaner than the current nested bazel invocation workaround that we're using. However, my biggest concern is that while my proposal feels like a solution (although admittedly one that's potentially very difficult to get implemented), this proposal feels like it's still a hack, just less of one. I'm not necessarily opposed to that, mind you, but my main issue is that if this were implemented, I'd want a migration plan to a proper solution, ideally in a backwards compatible way. I really don't want to have something like this implemented, find that it mostly works, but that we need something like my proposal to get a proper clean solution, and find out that doing that wouldn't be backwards compatible.
I guess my other concern is that when you're talking about sharing things between the outer and inner bazel invocation, it seems like it'll start off as less work than my proposal, but as you share more and more things, it seems like it'd become harder and harder to maintain. My proposal, while more work, kind of just gives you everything at once for free.
In terms of the viability of the idea, it seems mostly good, but the biggest issue I can think of off the top of my head is that knowing when you need to rerun a repo rule is nontrivial (though maybe that's not the case if you have access to the bazel internals). Given that bazel's motto is
Currently, my implementation uses In conclusion, I'm not explicitly opposed to the idea, but I'd want to be very careful about it.
-
I could be misreading it (and please do correct me if I'm wrong), but I think the proposal could be generalized to a simple design problem: at bottom, a Bazel build is a DAG comprising artifact nodes and action edges. The ask is whether we could support a self-expanding DAG, that is, whether some of the artifact nodes (i.e. Starlark rules and BUILD files) produced during the initial analysis of the DAG could be fed back in to generate more nodes and edges. Being able to run Bazel-in-Bazel is one way to solve it, which essentially uses 2 (or more) DAGs, with the results of the upstream ones being fed to one final downstream DAG. However, as @matts1 mentioned, it does feel like a short-term, bloated solution. Realistically, it could create 2 (or more) Bazel Java servers on the same client machine, which might significantly worsen the user experience.
A self-expanding DAG is a challenge indeed. As the comments in the proposal mentioned, Bazel is currently operating under some constraints, such as
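To make the "self-expanding DAG" idea from this comment concrete, here is a toy sketch in Python. It is not how Bazel or Skyframe works; the function name, the `expanders` map, and the node names are all made up for illustration. The only point is that building one node can add nodes and edges that the evaluator then has to pick up.

```python
from typing import Callable, Dict, List, Set

# Hypothetical toy model: each node maps to the nodes it depends on.
Graph = Dict[str, List[str]]


def build_self_expanding(
    graph: Graph,
    expanders: Dict[str, Callable[[], Graph]],
    top_level: str,
) -> List[str]:
    """Build `top_level` and return the order in which nodes were built.

    `expanders[name]` stands in for "building this artifact yields a BUILD
    file": the graph fragment it returns is merged in before evaluation
    continues.
    """
    graph = {node: list(deps) for node, deps in graph.items()}  # keep input intact
    order: List[str] = []
    done: Set[str] = set()
    in_progress: Set[str] = set()

    def visit(node: str) -> None:
        if node in done:
            return
        if node in in_progress:
            raise ValueError(f"dependency cycle through {node!r}")
        in_progress.add(node)
        # Walk by index: building a dependency may append new edges to this node.
        i = 0
        while i < len(graph.get(node, [])):
            visit(graph[node][i])
            i += 1
        in_progress.discard(node)
        done.add(node)
        order.append(node)
        if node in expanders:
            for new_node, new_deps in expanders[node]().items():
                if new_node in done:
                    raise ValueError(f"{new_node!r} was built before its edges were known")
                existing = graph.setdefault(new_node, [])
                for dep in new_deps:
                    if dep not in existing:
                        existing.append(dep)

    visit(top_level)
    return order


# "app" initially only knows it needs the generated BUILD file.
initial = {"app": ["gen_build"], "gen_build": ["gen.py"]}
# Building "gen_build" reveals that "app" also depends on "lib".
expanders = {"gen_build": lambda: {"app": ["lib"], "lib": ["lib.c"]}}
order = build_self_expanding(initial, expanders, "app")
```

The error raised when an expansion touches an already built node hints at the kind of constraint a real implementation would have to define: edges discovered late cannot retroactively change something that was already evaluated.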
-
re: debugging nested Bazel invocations, yeah, it's not my idea of a fun time either. But then again, doing the same within a single Bazel invocation would also increase the state space of Bazel, so the question is not whether we need to add debugging tools, but what kind of debugging tools we'd need to add. Assuming we provide good enough tools, I can't tell which case would make debugging simpler.
re: Windows, I think it's extra work, but somewhat surprisingly, process handling on Windows is actually nicer than on Linux, so I'm not worried that we wouldn't be able to do what we want. It may be extra work indeed, though.
Touché! @Wyverald, WDYT?
-
Summarizing the above discussions, it looks like we haven't identified any fundamental issues, the long list of semantic problems we'll need to work through seems to be shorter than I initially anticipated, and it seems like the complications this entails would not be much more hairy than the complications any workaround would entail. Which means that the main obstacle is engineering time (aka. opportunity cost). @meteorcloudy would you be fine with scheduling a prototype (without commitment) sometime in the first half of 2024?
-
Every once in a while we get a request of the sort "I want to generate a BUILD file from a Bazel action" or "I want to run Bazel as part of fetching an external repository" (see the most recent such proposal here).
The ideological objection to this is that fetching repositories should be about collecting the source code Bazel then builds, not about building things. However, there are some issues with this dogma; the most salient I know of is that fetching source code (especially if one wants to do it hermetically) sometimes requires unusual tools, and currently the only two options Bazel offers for building these tools are either to make them build-less (e.g. a shell script) or to reimplement building them in a hacky way inside Bazel, whose whole business is building software.
The architectural objection to this is that making BUILD files depend on the result of actions would alter how Bazel works very significantly. For example, the semantics of `query`, `cquery` and `aquery` would become much more complicated: `query` in particular offers the guarantee that it can return all the files needed to load and analyze a given top-level target. If reading BUILD files required running actions, this would also require returning all the input files of those actions there. They would also need to run actions, talk to RBE, have the pertinent command line options, etc.
At the very least, the above makes implementing generating BUILD files from Bazel actions very difficult. That in practice translates to calendar time and bugs.
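A toy Python model of the `query` point above, to show why the set of files to return grows once a BUILD file can be an action output. This is only an illustration under made-up assumptions: `files_needed_to_load`, the `action_inputs` map, and the file names are hypothetical and do not correspond to anything in Bazel.

```python
from typing import Dict, List, Set


def files_needed_to_load(
    build_file: str,
    action_inputs: Dict[str, List[str]],
) -> Set[str]:
    """Return every source file needed to obtain `build_file`.

    `action_inputs` maps a generated file to the inputs of the (hypothetical)
    action producing it; files absent from the map are plain source files.
    """
    needed: Set[str] = set()
    seen: Set[str] = set()
    pending = [build_file]
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in action_inputs:
            # Generated file: what matters is whatever the action reads.
            pending.extend(action_inputs[current])
        else:
            needed.add(current)
    return needed


# Today: the BUILD file is a source file, so it is the whole answer.
today = files_needed_to_load("pkg/BUILD", {})

# With action-generated BUILD files, the answer is the transitive action inputs.
generated = files_needed_to_load(
    "pkg/BUILD",
    {
        "pkg/BUILD": ["gen/gen.py", "gen/template.in"],
        "gen/template.in": ["gen/raw.txt"],
    },
)
```

Even in this toy, the second answer can only be computed if the action graph is known, which is the part that would force `query` to know about actions and their configuration.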
The straightforward solution to this is to just invoke Bazel in a subprocess during repository fetching. This can be done today and it works, but it is clunky because one needs to manually make sure that various caches are shared by the outer and inner Bazel instances and slow because repository rules prohibit starting a long-lived Bazel server (#20447).
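For reference, a rough Python sketch of the "manually make sure that various caches are shared" part: it only assembles the command line for the inner instance. `--output_base`, `--disk_cache` and `--repository_cache` are existing Bazel options; the helper name, the paths, and the target label are made up, and whether sharing exactly these two caches is enough for a given setup is an assumption.

```python
from typing import List, Sequence


def inner_bazel_command(
    bazel: str,
    inner_output_base: str,
    disk_cache: str,
    repository_cache: str,
    targets: Sequence[str],
) -> List[str]:
    """Assemble argv for an inner `bazel build` that reuses the outer caches."""
    if not targets:
        raise ValueError("at least one target is required")
    return [
        bazel,
        # Startup option, so it goes before the command. The inner instance
        # gets its own output base here so it does not contend with the outer one.
        f"--output_base={inner_output_base}",
        "build",
        # Point both caches at the directories the outer instance uses.
        f"--disk_cache={disk_cache}",
        f"--repository_cache={repository_cache}",
        *targets,
    ]


command = inner_bazel_command(
    bazel="bazel",
    inner_output_base="/tmp/inner_output_base",
    disk_cache="/home/me/.cache/bazel-disk",
    repository_cache="/home/me/.cache/bazel-repo",
    targets=["//tools:fetcher"],
)
```

The resulting list could be passed to `subprocess.run(command, cwd=inner_workspace, check=True)`. Every additional thing that should be shared means another flag to thread through by hand, which is where the clunkiness comes from.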
I don't particularly like the idea of directly fixing #20447 because it pokes a hole in the nice property that repositories clean up after themselves after they fetch things and detecting whether a subprocess is Bazel sounds like too much DWIM.
However, we could teach Bazel to invoke Bazel as part of fetching a repository not as a free-form command line that happens to invoke Bazel, but explicitly. This would have a number of advantages:
`bazel clean`, `bazel shutdown` and the like on the outer instance can be arranged for.
All we need to do is to make sure that the interface is such that it's amenable to closer integration. Off the top of my head, I could imagine an interface that says "run Bazel in this repository, build those top-level targets, put their files from `DefaultInfo` in these given locations, make that a new repository". (This is pretty much what the proposal by @matts1 does.)
@matts1 @Wyverald @meteorcloudy WDYT?
cc @lizkammer
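To make the interface idea above a bit more tangible, here is a minimal Python sketch of what such a request could carry and how the output files might be laid out. Every name in it (`NestedBuildRequest`, `plan_repository_layout`, the fields, the labels and paths) is hypothetical; it is not an existing or proposed Bazel API, just one way to read "build those top-level targets, put their files from `DefaultInfo` in these given locations".

```python
import posixpath
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class NestedBuildRequest:
    """Hypothetical description of one explicit Bazel-in-Bazel fetch."""

    workspace: str                    # "run Bazel in this repository"
    targets: List[str]                # "build those top-level targets"
    output_locations: Dict[str, str]  # target label -> directory in the new repository


def plan_repository_layout(
    request: NestedBuildRequest,
    default_info_files: Dict[str, List[str]],
) -> Dict[str, str]:
    """Map destination paths in the new repository to the built files.

    `default_info_files` stands in for the files each target reports in
    `DefaultInfo` once the inner build has finished.
    """
    missing = [t for t in request.targets if t not in request.output_locations]
    if missing:
        raise ValueError(f"no output location given for: {missing}")
    layout: Dict[str, str] = {}
    for target in request.targets:
        for src in default_info_files.get(target, []):
            dest = posixpath.join(
                request.output_locations[target], posixpath.basename(src)
            )
            if dest in layout:
                raise ValueError(f"two files would land at {dest!r}")
            layout[dest] = src
    return layout


request = NestedBuildRequest(
    workspace="tools",
    targets=["//gen:build_files"],
    output_locations={"//gen:build_files": "pkg"},
)
layout = plan_repository_layout(
    request,
    {"//gen:build_files": ["bazel-out/k8-fastbuild/bin/gen/BUILD.bazel"]},
)
```

Because the request is plain data rather than a free-form command line, the outer instance could in principle decide how to run it (subprocess today, something more integrated later) without the caller changing anything.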