
Aggregate build context assumptions #19

Open
virtuald opened this issue Nov 16, 2015 · 1 comment

Comments

@virtuald
Contributor

Here's my enhanced version of the aggregate build context, along with the problems I encountered with the original context and how I solved them. You can see my code in this gist. I used this to help me generate Python packages from a set of input files. My use case:

  • From each input file, generate one or more output files (easy enough using the default build context)
  • Python 2 packages require an __init__.py file to be present in each generated subdirectory, and I had to generate their contents based on the input files (the easiest solution seemed to be the aggregate context, which I enhanced).
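
To make the second requirement concrete, here is a small, self-contained sketch of the directory logic involved. This is not the incrementalbuild API; all names here are mine. Given the relative paths of the generated output files, it computes which `__init__.py` files the generated subdirectories need:

```java
import java.util.*;

public class InitFilePlanner {
    // Given relative paths of generated output files, compute the
    // __init__.py each generated subdirectory needs (Python 2 packages).
    static Set<String> requiredInitFiles(Collection<String> outputs) {
        Set<String> inits = new TreeSet<>();
        for (String out : outputs) {
            String dir = out;
            int slash;
            // Walk up the directory chain, adding an __init__.py per level.
            while ((slash = dir.lastIndexOf('/')) > 0) {
                dir = dir.substring(0, slash);
                inits.add(dir + "/__init__.py");
            }
        }
        return inits;
    }

    public static void main(String[] args) {
        System.out.println(requiredInitFiles(
            List.of("pkg/sub/a.py", "pkg/sub/b.py", "pkg/other/c.py")));
        // [pkg/__init__.py, pkg/other/__init__.py, pkg/sub/__init__.py]
    }
}
```

The catch, as described below, is that the *contents* of each `__init__.py` depend on the processed inputs, which is exactly what the aggregate context is for.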

Problems with the original aggregate builder:

  • It assumes that you know what each output file is before processing the input files -- you must call newInputSet() for each output file on each run (regardless of incremental state)
    • I cannot determine the output files without processing the input files
    • Solved here and here by assuming that if an input file hasn't changed, then there is no reason for the compiler to call newInputSet() for the output.
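
The assumption described in that last bullet can be modeled in isolation. The sketch below is my own illustration, not the library's API: each aggregate output remembers (via the context's persisted state) which inputs fed it, and newInputSet() only needs to be re-invoked for an output when one of its inputs changed:

```java
import java.util.*;

public class IncrementalAggregateSketch {
    // Hypothetical model of persisted per-output state: a map from
    // input path to its last-seen timestamp.
    static boolean needsRebuild(Map<String, Long> previousInputs,
                                Map<String, Long> currentInputs) {
        // Rebuild if any input was added, removed, or modified;
        // otherwise the output (and its newInputSet() call) can be skipped.
        return !previousInputs.equals(currentInputs);
    }

    public static void main(String[] args) {
        Map<String, Long> prev = Map.of("a.in", 100L, "b.in", 200L);
        System.out.println(needsRebuild(prev, Map.of("a.in", 100L, "b.in", 200L))); // false
        System.out.println(needsRebuild(prev, Map.of("a.in", 100L, "b.in", 250L))); // true
    }
}
```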

Additionally, the MetadataAggregator is a really cool idea, but...

  • The glean operation does not allow you to associate metadata with a particular output file, because the attribute key used was the class name. So if you had multiple output files generated from the same input files, there could be a conflict. The output key should be something like what I implemented.
  • Once again, it assumes you know which input files are associated with each output file
  • There isn't a way to associate any metadata with the output file that wasn't associated with an input file, so I added something for that too
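
The key-conflict problem from the first bullet is easy to demonstrate with plain maps (this is an illustration of the keying scheme, not the library's internal storage):

```java
import java.util.*;

public class GleanKeyConflict {
    // Gleaned metadata keyed by aggregator class name alone: two outputs
    // built from the same inputs share one key and clobber each other.
    static Map<String, String> keyByClassName() {
        Map<String, String> m = new HashMap<>();
        m.put("MyAggregator", "metadata for out1");
        m.put("MyAggregator", "metadata for out2"); // overwrites out1's entry
        return m;
    }

    // Keying by (aggregator class, output path) keeps each output's
    // metadata distinct.
    static Map<List<String>, String> keyByClassAndOutput() {
        Map<List<String>, String> m = new HashMap<>();
        m.put(List.of("MyAggregator", "out1"), "metadata for out1");
        m.put(List.of("MyAggregator", "out2"), "metadata for out2");
        return m;
    }

    public static void main(String[] args) {
        System.out.println(keyByClassName().size());      // 1: one entry survives
        System.out.println(keyByClassAndOutput().size()); // 2: no conflict
    }
}
```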

Once again, thanks so much for the work on this library -- it's super useful, and I was able to use it to accomplish my goals. Hopefully my feedback can make it better.

@ifedorenko
Contributor

Thank you for your feedback. The incremental build library is still in the early stages of development and adoption, and it is important to get input based on as many use cases as possible. To make sure I understand the problems correctly, let me rephrase them in my own words:

  • there is a problem using BuildContext and AggregatorBuildContext in the same mojo. Currently the two context implementations compete for the same state file and only one is able to retain state between builds.
  • it is not possible to create aggregate outputs based on inputs returned by BuildContext#registerAndProcessInputs. If there is no input change, aggregate outputs are deleted, which is not correct. As a side note, I think a reasonable workaround here would be to use BuildContext#registerInputs and check for ResourceMetadata#getStatus in your mojo.
  • AggregatorBuildContext does not support generation of multiple outputs from the same input using MetadataAggregator because internally context state only retains metadata gleaned by the last aggregator.
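
The workaround suggested in the second bullet -- register the inputs without processing them, then inspect each input's change status before deciding whether the aggregate output is stale -- can be sketched like this. The enum and metadata record below are stand-ins of my own, not the library's actual types:

```java
import java.util.*;

public class StatusCheckSketch {
    // Stand-in for a resource change-status enum.
    enum ResourceStatus { NEW, MODIFIED, UNMODIFIED, REMOVED }

    // Stand-in for per-input metadata returned by input registration.
    record InputMetadata(String path, ResourceStatus status) {}

    // The aggregate output only needs regenerating if some input changed.
    static boolean aggregateOutputStale(List<InputMetadata> inputs) {
        return inputs.stream()
                     .anyMatch(m -> m.status() != ResourceStatus.UNMODIFIED);
    }

    public static void main(String[] args) {
        List<InputMetadata> noChanges = List.of(
            new InputMetadata("a.in", ResourceStatus.UNMODIFIED),
            new InputMetadata("b.in", ResourceStatus.UNMODIFIED));
        List<InputMetadata> oneModified = List.of(
            new InputMetadata("a.in", ResourceStatus.UNMODIFIED),
            new InputMetadata("b.in", ResourceStatus.MODIFIED));
        System.out.println(aggregateOutputStale(noChanges));   // false
        System.out.println(aggregateOutputStale(oneModified)); // true
    }
}
```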

Did I get this right, or did I miss/misunderstand something?

On the last bullet, do you think we need a way to produce several aggregate outputs from the same aggregator, or is it reasonable to assume one aggregator only produces a single output?
