Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rpm PoC: Allow specifying DNF options via CLI input JSON #522

Merged
merged 6 commits into from
Jun 12, 2024

Conversation

eskultety
Copy link
Member

@eskultety eskultety commented Apr 18, 2024

Introduce a new input JSON attribute 'options' for the PoC rpm package
manager allowing consumers to pass arbitrary DNF repo configuration
options that would be applied later through the inject-files command.

The general schema for the input JSON is as follows:

{
    type: <package_manager>
    path: <relative_path>
    options: {
        <namespace>: {
            <key1>..<keyN>: Union[str, Any]<val1>
        }
    }
}

which translated to DNF repo options for RPM might then look like:

{
    type: rpm
    path: .
    options: {
        dnf: {
             repoid1: {gpgcheck: 0, deltarpm: 1},
             repoid2: {fastestmirro: 1, sslverify: 0}
        }
    }
}

which in turn would end up being dumped to the .build-config.json similarly to this:

{
  "environment_variables": [],
  "project_files": [],
  "options": {
    "rpm": {
      "dnf": {
        "releases": {
          "gpgcheck": 0,
          "enabled": 0,
          "type": "yum"
        }
      }
    }
  }

and finally the resulting generated .repo file might look like:

[releases]
gpgcheck = 0
enabled = 0
type = yum
baseurl = file:///etc/yum.repos.d/deps/rpm/x86_64/releases

Notes:

  • worth mentioning is that the way this is currently designed is that options are only ever formatted in the build config if some were actually set on the CLI (rather than being formatted as options: {}) so that this would be transparent to other package manager for which we don't intend to support options at the moment; this was all achieved by setting exclude_none=True globally for buildconfig.model_dump_json
  • the reason why there's an additional level of nesting in the .build-config.json namespaced by "rpm" is due to the fact that it's possible to specify more than one package with the input JSON on CLI, thus theoretically possible to provide various options for various package managers (if ever turned on) so we'd need to know which options relate to which package manager type which wouldn't otherwise be possible from current build config contents.

Maintainers will complete the following section

  • Commit messages are descriptive enough
  • Code coverage from testing does not decrease and new code is covered
  • Docs updated (if applicable)
  • Docs links in the code are still valid (if docs were updated)

Note: if the contribution is external (not from an organization member), the CI
pipeline will not run automatically. After verifying that the CI is safe to run:

@eskultety
Copy link
Member Author

@brunoapimentel I wonder if I should extract the ConfigParser bits on .repo file generation and propose that as a separate PR as I didn't get any comments on that and I'm not sure whether we're in agreement or it got lost in context of the main change - options in RpmPackageInput.

@eskultety eskultety force-pushed the rpm_prefetch_options branch from 21f454f to 28d3b3c Compare April 26, 2024 09:29
@brunoapimentel
Copy link
Contributor

@brunoapimentel I wonder if I should extract the ConfigParser bits on .repo file generation and propose that as a separate PR as I didn't get any comments on that and I'm not sure whether we're in agreement or it got lost in context of the main change - options in RpmPackageInput.

I did some tests with the implementation based on ConfigParser, and it LGTM. So I'd probably keep it the way it is since the implementation is already done.

The only thing I was in middle of reviewing before getting carried away with something else was the custom validation you added in the last commit. From the few manual tests I did, it seemed that Pydantic handled these same use cases pretty well, and I thought that the error messages were reasonable. But I've likely missed the issues you had with unit tests.

@eskultety
Copy link
Member Author

The only thing I was in middle of reviewing before getting carried away with something else was the custom validation you added in the last commit. From the few manual tests I did, it seemed that Pydantic handled these same use cases pretty well, and I thought that the error messages were reasonable. But I've likely missed the issues you had with unit tests.

Soo, let me illustrate the issue on 1 example:

x = _DNFOptions.model_validate({'type': 'rpm', 'options': 'bad_type'})
*** pydantic_core._pydantic_core.ValidationError: 3 validation errors for _DNFOptions
dnf
  Field required [type=missing, input_value={'type': 'rpm', 'options': 'bad_type'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.6/v/missing
type
  Extra inputs are not permitted [type=extra_forbidden, input_value='rpm', input_type=str]
    For further information visit https://errors.pydantic.dev/2.6/v/extra_forbidden
options
  Extra inputs are not permitted [type=extra_forbidden, input_value='bad_type', input_type=str]
    For further information visit https://errors.pydantic.dev/2.6/v/extra_forbidden

^This is what I get without the validator, IOW the only bit of information I get is 'Field required' but it doesn't tell me anything else like what field is missing, so despite input_value the error message is IMO very vague.

vs

This is after introducing the validator

x = _DNFOptions.model_validate({'type': 'rpm', 'options': 'bad_type'})
*** pydantic_core._pydantic_core.ValidationError: 1 validation error for _DNFOptions
  Value error, Missing required namespace attribute in '{'type': 'rpm', 'options': 'bad_type'}': 'dnf' [type=value_error, input_value={'type': 'rpm', 'options': 'bad_type'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.6/v/value_error

It's at least IMO clear that a keyword 'dnf' is missing, so from that perspective I think that message is clearer.
Besides being vague, based on the fact that you get multiple error messages with the default behaviour I had some troubles figuring out the error msg regex for unit tests properly and so I created a custom validator with more deterministic messages. @brunoapimentel do you think there's a better way? I'm fine with dropping the validator if there is :) .

@brunoapimentel
Copy link
Contributor

brunoapimentel commented Apr 30, 2024

@brunoapimentel do you think there's a better way? I'm fine with dropping the validator if there is :) .

Oh, I see it now. When you invoke the actual CLI, the output you get is different (although I'm not sure which piece is making the conversion).

I started with your example here, and extrapolated a little:

$ cachi2 fetch-deps --source ../test-repos/cachi2-rpm --dev-package-managers '{"type": "rpm", "options": "bad_type"}'
2024-04-29 22:18:48,960 ERROR InvalidInput: 1 validation error for user input\npackages -> 0 -> rpm -> options\n  Value error, Unexpected data type for 'options.bad_type' in input JSON: expected 'dict'
Error: InvalidInput: 1 validation error for user input
packages -> 0 -> rpm -> options
  Value error, Unexpected data type for 'options.bad_type' in input JSON: expected 'dict'

$ cachi2 fetch-deps --source ../test-repos/cachi2-rpm --dev-package-managers '{"type": "rpm", "options": {"bad_type": "this"}}'
2024-04-29 22:19:39,650 ERROR InvalidInput: 1 validation error for user input\npackages -> 0 -> rpm -> options\n  Value error, Missing required namespace attribute in '{'bad_type': 'this'}': 'dnf'
Error: InvalidInput: 1 validation error for user input
packages -> 0 -> rpm -> options
  Value error, Missing required namespace attribute in '{'bad_type': 'this'}': 'dnf'

$ cachi2 fetch-deps --source ../test-repos/cachi2-rpm --dev-package-managers '{"type": "rpm", "options": {"dnf": "this"}}'
2024-04-29 22:19:47,235 ERROR InvalidInput: 1 validation error for user input\npackages -> 0 -> rpm -> options\n  Value error, Unexpected data type for 'options.dnf.this' in input JSON: expected 'dict'
Error: InvalidInput: 1 validation error for user input
packages -> 0 -> rpm -> options
  Value error, Unexpected data type for 'options.dnf.this' in input JSON: expected 'dict'

I think these messages are a lot clearer than the ones you posted before, so maybe we could just go with them. Wdyt?

@eskultety
Copy link
Member Author

I think these messages are a lot clearer than the ones you posted before, so maybe we could just go with them. Wdyt?

Uhm, I'm sorry, I'm afraid you lost me a bit in your examples, so do you want me to modify the messages I emit in the validator in any way or are we okay with what is proposed?

Copy link
Contributor

@brunoapimentel brunoapimentel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only nitpick here is about implementing the custom validation, but I don't see it as blocker, so I'm leaving my ack.

@eskultety eskultety mentioned this pull request May 14, 2024
2 tasks
@eskultety eskultety force-pushed the rpm_prefetch_options branch from 28d3b3c to e213c13 Compare May 20, 2024 15:35
@eskultety eskultety force-pushed the rpm_prefetch_options branch from e213c13 to 596f705 Compare May 28, 2024 11:40
@eskultety
Copy link
Member Author

@brunoapimentel I needed to update this PR to reflect your recent integration test changes by adjusting them to compare on the parsed structure contents rather than verbatim file contents when it comes to the Configparser instance I introduced. Let me know if you still agree with that adjustment so that I can finally merge this. (this time for real if CI passes)

@eskultety eskultety force-pushed the rpm_prefetch_options branch from 596f705 to 192536d Compare May 28, 2024 12:10
@brunoapimentel
Copy link
Contributor

@brunoapimentel I needed to update this PR to reflect your recent integration test changes by adjusting them to compare on the parsed structure contents rather than verbatim file contents when it comes to the Configparser instance I introduced. Let me know if you still agree with that adjustment so that I can finally merge this. (this time for real if CI passes)

The changes on the integration test LGTM. Let's get this merged :)

Cosmetic change - this is mainly to improve the UX of the test results
run summary helping to identify/describe what test cases we run for the
input validation.

Signed-off-by: Erik Skultety <[email protected]>
Cosmetic change - we can convey the same message on a single
line and enhance code readability just a tiny bit without line splits.

Signed-off-by: Erik Skultety <[email protected]>
This is called as a callback from the cli.py module where we always
have both from_output_dir and for_output_dir resolved. It's fair to
assume any inject_files post action would need both of these to
accomplish whatever it is the action needs to do, so listing both
arguments as positional in the actual callback implementation makes the
code a bit easier to read.

Signed-off-by: Erik Skultety <[email protected]>
Repo files follow the INI syntax for which we can use ConfigParser.
The motivation behind using a library abstraction to generate the INI
repo files stems from the fact that we already have some options
rendered conditionally with a fair chance of extending this in the
future with more custom user-supplied DNF repo options which means even
more conditionals which would stretch the capabilities of a plain
triple string (or a template for that matter) a bit too far.
Therefore, this patch proposes using ConfigParser instance for creating
and dumping the repo files in the inject_files post action.

Note that to make things a tiny bit more ergonomic, we subclass the
ConfigParser class to define a couple utility methods that can tell us
whether the ConfigParser instance is effectively empty (because an
empty instance of ConfigParser still evaluates to True) and thus can
be dumped to a file as well as some extra handling of default options,
e.g. defaulting to 'gpgcheck=1' explicitly in the generated .repo file.

Signed-off-by: Erik Skultety <[email protected]>
The sorting was added solely to make comparing of the expected and
generated repo files' contents more deterministic in unit tests, but
now that we're driving the generation process with ConfigParser, we can
perform the comparisons on the object level rather than actual file
contents, and so we no longer need to sacrifice the beauty of using
iterdir's generator by converting it to a list just to keep tests
deterministic.

Signed-off-by: Erik Skultety <[email protected]>
Introduce a new input JSON attribute 'options' for the PoC rpm package
manager allowing consumers to pass arbitrary DNF repo configuration
options that would be applied later through the inject-files command.

The general schema for the input JSON is as follows:

{
    type: <package_manager>
    path: <relative_path>
    options: {
        <namespace>: {
            <key1>..<keyN>: Union[str, Any]<val1>
        }
    }
}

which translated to DNF repo options for RPM might then look like:

{
    type: rpm
    path: .
    options: {
        dnf: {
             repoid1: {gpgcheck: 0, deltarpm: 1},
             repoid2: {fastestmirro: 1, sslverify: 0}
        }
    }
}

This implementation is trying to be generic enough to be potentially
later extended to other package managers as well which also means that
due to the fairly complex (multi-level nested dictionary) input
a new intermediary model _DNFOptions was needed to be introduced in
this patch, but most importantly a custom 'before' model validator for
this model had to be implemented to generate stable and deterministic
errors which could be reliably unit-tested (default pydantic error
messages for such complex structures are madness).

Worth noting would be that unless options were specified, they would
not appear in the build-config.json --> exclude_none=True

Signed-off-by: Erik Skultety <[email protected]>
@eskultety eskultety force-pushed the rpm_prefetch_options branch from 192536d to a0497be Compare June 12, 2024 12:05
@eskultety eskultety added this pull request to the merge queue Jun 12, 2024
Merged via the queue into containerbuildsystem:main with commit e5caa8c Jun 12, 2024
13 checks passed
@eskultety eskultety deleted the rpm_prefetch_options branch July 17, 2024 11:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants