Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Settings Pydantic and use the power of BaseSettings to simplify CLI #700

Closed
wants to merge 7 commits into from

Conversation

stan-dot
Copy link
Contributor

@stan-dot stan-dot commented Nov 5, 2024

Fixes #495

Using Pydantic for logic to do with file loading. It has a bit of magic but will lower our lines of code. ApplicationConfig will extend BaseSettings, while the nested configs like ScratchConfig are still BlueapiBaseModels.

@stan-dot stan-dot added enhancement New feature or request python Pull requests that update Python code cli Relates to CLI code labels Nov 5, 2024
@stan-dot stan-dot self-assigned this Nov 5, 2024
@stan-dot stan-dot linked an issue Nov 5, 2024 that may be closed by this pull request
@stan-dot
Copy link
Contributor Author

stan-dot commented Nov 5, 2024

File "/workspaces/blueapi/src/blueapi/service/interface.py", line 21, in <module>
INTERNALERROR>     _CONFIG: ApplicationConfig = ApplicationConfig()

now it's the time when the use of global constants bites us

@stan-dot
Copy link
Contributor Author

stan-dot commented Nov 5, 2024

@callumforrester why do we get another instance of ApplicationConfig inside the interface.py?

"""This module provides interface between web application and underlying Bluesky
context and worker"""


_CONFIG: ApplicationConfig = ApplicationConfig()


def config() -> ApplicationConfig:
    return _CONFIG


def set_config(new_config: ApplicationConfig):
    global _CONFIG

    _CONFIG = new_config

@callumforrester
Copy link
Contributor

@stan-dot that's for the subprocess, you need an instance in the main process, which is passed down to the subprocess via set_config. I'm not sure why it defaults to the default config rather than None, though.

@stan-dot
Copy link
Contributor Author

stan-dot commented Nov 6, 2024

I think that the global keyword made it more confusing to me. Would SUBPROCESS_CONFIG be a correct name?

why would we want it to be None? Pydantic expects it to not be None.

I haven't worked with python subprocesses before and not sure how to debug successfully

src/blueapi/config.py Show resolved Hide resolved
)

@classmethod
def customize_sources(cls, init_settings, env_settings, file_secret_settings):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like settings_customise_sources in the docs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

class Settings(BaseSettings):
    foobar: str
    nested: Nested
    model_config = SettingsConfigDict(toml_file='config.toml')

    @classmethod
    def settings_customise_sources(
        cls,
        settings_cls: Type[BaseSettings],
        init_settings: PydanticBaseSettingsSource,
        env_settings: PydanticBaseSettingsSource,
        dotenv_settings: PydanticBaseSettingsSource,
        file_secret_settings: PydanticBaseSettingsSource,
    ) -> Tuple[PydanticBaseSettingsSource, ...]:
        return (TomlConfigSettingsSource(settings_cls),)

Looking at how a toml file is used by an equivalent

And

settings_customise_sources takes four callables as arguments and returns any number of callables as a tuple. In turn these callables are called to build the inputs to the fields of the settings class.

I think you need the signature to exactly match what is shown?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, it needs to be an override. Not sure why did this got changes

@callumforrester
Copy link
Contributor

why would we want it to be None? Pydantic expects it to not be None.

So currently the following sequence of events happens:

  1. Subprocess spawned by main process
  2. config set to default config, the result of calling ApplicationConfig()
  3. Main process calls set_config(), overriding the default config with whatever was in the config file

We could instead make it None at step 2 (and yes, change the type annotation), since whatever it is gets immediately erased anyway.

@stan-dot stan-dot force-pushed the 699-refactor-settings-to-pydantic branch from 9ec6d18 to bf47630 Compare November 8, 2024 17:35
@stan-dot
Copy link
Contributor Author

@callumforrester do main and subprocess share a global variable? you didn't mention where does step 2 happen - whether in main process or the subprocess

@callumforrester
Copy link
Contributor

@stan-dot No they can't share a global variable. The main process passes the idea into the subprocess via set_config

@stan-dot
Copy link
Contributor Author

The main process passes the idea into the subprocess via set_config

I thought usually it's values or references passed, not ideas...

@callumforrester
Copy link
Contributor

Can you tell I haven't finished my morning coffee yet? Honestly no idea what I intended to type there, "config", maybe?

@stan-dot
Copy link
Contributor Author

no, I cannot tell that. I'd appreciate concise technical terms used

@stan-dot stan-dot marked this pull request as ready for review November 11, 2024 10:39
@stan-dot stan-dot force-pushed the 699-refactor-settings-to-pydantic branch 2 times, most recently from 51d5f52 to ab8aaf1 Compare November 12, 2024 13:25
@stan-dot stan-dot force-pushed the 699-refactor-settings-to-pydantic branch from ab8aaf1 to 7cd6dea Compare November 14, 2024 10:53
@stan-dot
Copy link
Contributor Author

very odd errors, trying to fix now

@stan-dot
Copy link
Contributor Author

trying to refactor the interface to make it testable more easily.

@callumforrester it looks like unlike the patterns I can see in the codebase for worker and rest api the interface was designed to work as a callable module holding its own state, not as a class in its own.

It makes me feel confused, where can I read the ADR for this?

image

@stan-dot
Copy link
Contributor Author

huh, I'd need to refactor main as well


def config() -> ApplicationConfig:
return _CONFIG
config_manager = ConfigManager()


def set_config(new_config: ApplicationConfig):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could: Can we remove this now?

Comment on lines +38 to +39
ApplicationConfig.model_config["yaml_file"] = config
app_config = ApplicationConfig() # Instantiates with customized sources
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should: I'm concerned this. mutation of global state is an antipattern. Could you instead subclass the config?

Suggested change
ApplicationConfig.model_config["yaml_file"] = config
app_config = ApplicationConfig() # Instantiates with customized sources
class YamlConfig(ApplicationConfig):
model_config = {**ApplicationConfig.model_config, **{"yaml_file": config}}
app_config = YamlConfig()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's how Pydantic uses it and this is not regular state but settings

also this is not Java

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mutation of global state is an antipattern in any programming language (good thread).

Now I have to remember to address this if I use ApplicationConfig somewhere else, just like you have:

ApplicationConfig.model_config["yaml_file"] = None

There is nothing that makes this obvious to me as a developer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking into the base settings docs now

re: mutation - it was there anyway and this config is just the init, and anywhere else you'd sooner use configManager getter methods, should you need access to the config.

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the mutable settings option is here:
and we're not using it
https://docs.pydantic.dev/latest/concepts/pydantic_settings/#removing-sources

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like what we're trying to do was just not a well-envisioned use case: pydantic/pydantic-settings#346

Should maybe reconsider the use of BaseSettings and/or raise a PR upstream?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is true, turns out that pydantic-settings isn't as well-finished product like pydantic itself


def __init__(self, config: ApplicationConfig = None):
if config is None:
ApplicationConfig.model_config["yaml_file"] = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should: Again, I think you should find an alternative to mutating global state, if you fix the instance in CLI then you shouldn't need this line.

@DiamondJoseph
Copy link
Contributor

DiamondJoseph commented Nov 25, 2024

Since instance level file configuration for BaseSettings isn't going to be in pydantic_settings until v3; and v3 will release at some time next year; and that the work on this in pydantic_settings has not yet begun, I think we need to decide either to shelve this until such time we can instantiate our settings in a way we're happy with, or consider other workarounds.

We could, for example, deprecate the ability to pass -c config.yaml and instead pass the config file path(s) through an EnvVar, which is then read and set just the once; or maintain the current -c but have it temporarily set the equivalent EnvVar, clearing it on exit?

See also: pydantic/pydantic-settings#259 for various approaches.

@stan-dot
Copy link
Contributor Author

should this be closed as not at the best time now @callumforrester ?

#700 (comment)

if so I think it's best that I also unassign myself from the set config issue (#495)

@callumforrester
Copy link
Contributor

callumforrester commented Nov 25, 2024

@stan-dot happy for you to do so, thanks for your work so far

@stan-dot stan-dot closed this Nov 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cli Relates to CLI code enhancement New feature or request python Pull requests that update Python code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor Config to use Pydantic BaseSettings Set configuration options via environment variables
3 participants