ESConfig: configurable Elasticsearch document types to allow splitting index #1296

Merged 31 commits on Oct 28, 2024

Commits on Oct 27, 2024

  1. add MetaCPAN::ESConfig module to centralize ES config

    Centralize Elasticsearch configuration in MetaCPAN::ESConfig. Allow
    values to be overridden from the main config file.
    
    This module is not meant to have any behavior aside from holding the
    configuration.
    haarg committed Oct 27, 2024
    1489794
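
The "defaults plus overrides from the main config file" idea can be sketched roughly as follows. This is a minimal Python illustration, not the MetaCPAN::ESConfig implementation: the `DEFAULTS` layout, the document names, the index names, and the `merge` helper are all assumptions made up for the example.

```python
from copy import deepcopy

# Hypothetical defaults: document name -> where it lives in Elasticsearch.
DEFAULTS = {
    "documents": {
        "author": {"index": "cpan", "type": "author"},
        "release": {"index": "cpan", "type": "release"},
    }
}


def merge(base, override):
    """Return a copy of base with override applied, recursing into nested dicts."""
    result = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result


# An override such as the main config file might supply (hypothetical values):
# move only the "author" document into its own index.
config = merge(DEFAULTS, {"documents": {"author": {"index": "author"}}})
print(config["documents"]["author"])
```

Because the holder only stores data, splitting a type into its own index becomes a configuration change rather than a code change.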
  2. adapt Mapping script to use ESConfig module

    ESConfig knows how to find mapping data. Use it to find the mapping data
    as well as index configuration.
    
    The mapping data should be able to be moved into json files rather than
    json wrapped in a module. This can happen in the future.
    haarg committed Oct 27, 2024
    edb8e78
  3. remove analysis configuration from MetaCPAN::Model

    The analysis settings declared in MetaCPAN::Model weren't used for
    anything directly, including generating the index deployment statements.
    The index settings we actually use live in
    MetaCPAN::Script::Mapping::DeployStatement, so the declarations in
    MetaCPAN::Model were used for nothing.
    haarg committed Oct 27, 2024
    4e2c206
  4. 1ce41ec
  5. configure MetaCPAN::Model via ESConfig

    Rather than searching for modules on disk, use the explicit
    configuration in ESConfig to configure MetaCPAN::Model.
    haarg committed Oct 27, 2024
    94c84ea
  6. check for compilation errors in document set modules

    ElasticSearchX::Model ignores all errors if a ::Set package can't be
    loaded, and falls back to a generic ElasticSearchX::Model::Document::Set
    object.
    
    It's fine for the module to be missing, but compilation errors should be
    reported.
    haarg committed Oct 27, 2024
    491e640
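
The distinction drawn in commit 6 (a missing module is acceptable, a module that fails to compile is not) can be illustrated with a Python analogue. This is only a sketch of the idea; the actual change is in Perl, and `load_optional` is a hypothetical helper name.

```python
import importlib


def load_optional(module_name):
    """Import module_name, returning None only when that module itself is absent."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as err:
        # err.name is the module that could not be found. It may be the target
        # or one of its parent packages; in both cases the target is missing.
        missing = err.name or ""
        if module_name == missing or module_name.startswith(missing + "."):
            return None
        # The target exists but one of its own imports is missing: report it.
        raise
    # SyntaxError and any other load-time error propagate to the caller.
```

A caller can then fall back to a generic class when the result is `None`, while a broken module still fails loudly.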
  7. disable critic rule prohibiting prototypes

    PPI and thus Perl::Critic don't understand signatures, so the rule ends up prohibiting signatures as well
    haarg committed Oct 27, 2024
    aff19bf
  8. add doc method to MetaCPAN::Model

    Allows getting a "type" object from a document name rather than needing
    to specify an index and type.
    haarg committed Oct 27, 2024
    967b34e
  9. f9f7dea
  10. Scripts: refresh all indices

    Previously when trying to refresh indices, scripts would call
    $self->index->refresh. This would refresh the "currently used" index.
    That doesn't make any sense when splitting each type into its own index.
    This was also using ElasticSearchX::Model, which we want to get rid of.
    
    Instead, call ->indices->refresh via the Search::Elasticsearch object.
    This will refresh all indices, which is fine for our purposes. In the
    future, we could consider being more selective about which indices we
    are refreshing, but this is no worse than the old behavior.
    haarg committed Oct 27, 2024
    a2e9202
  11. find ESXM types via model rather than index

    Rather than using the same index to find other types, find them via the
    model. This means the types don't need to be in the same index.
    haarg committed Oct 27, 2024
    2f5b7d1
  12. find es index/type via ESConfig rather than passing index around

    Many parts of the code treated the index as the parent of all data, so
    it was the thing being passed around. That will no longer be true in
    the future.
    
    Instead, ESConfig can give the path (index+type) of each named document
    type. Convert most places passing around index to use es_doc_path.
    haarg committed Oct 27, 2024
    0e55e2e
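
The lookup described in commit 12 might look something like the sketch below, written in Python for illustration. Only the name `es_doc_path` comes from the commit message; the `DOCUMENTS` table, its entries, the returned shape, and `search_request` are assumptions.

```python
# Hypothetical document table; the real names and layout live in MetaCPAN::ESConfig.
DOCUMENTS = {
    "author": {"index": "cpan", "type": "author"},
    "account": {"index": "user", "type": "account"},
}


def es_doc_path(name):
    """Return the index and type for a named document as request parameters."""
    if name not in DOCUMENTS:
        raise KeyError(f"unknown document {name!r}")
    doc = DOCUMENTS[name]
    return {"index": doc["index"], "type": doc["type"]}


def search_request(name, body):
    """Build search parameters without the caller ever naming an index."""
    return {**es_doc_path(name), "body": body}
```

Callers name the document they want, so moving a type to a different index only touches the table.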
  13. fix model type alias for ESBool

    haarg committed Oct 27, 2024
    d709a5e
  14. 373da0d
  15. remove out of date comment

    haarg committed Oct 27, 2024
    74fcce0
  16. 75b7021
  17. add ES and ESModel Catalyst model classes

    Replaces the CPAN and User model classes. Removes magic namespace
    creation. Just return the Search::Elasticsearch object, or the model
    object.
    haarg committed Oct 27, 2024
    7097883
  18. f72bdf1
  19. c44ac51
  20. scripts don't need index method

    haarg committed Oct 27, 2024
    c61c516
  21. e5c5376
  22. create distributions with upsert

    Trying to count distribution documents before creating is vulnerable to
    concurrency and consistency issues. Instead, use an upsert to create it.
    haarg committed Oct 27, 2024
    ca7b31f
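
As an illustration of the upsert approach in commit 22, the body of an Elasticsearch update request can ask the server to create the document when it is missing, so no separate count is needed. The sketch below is in Python and uses the `doc` / `doc_as_upsert` form of the update API; which form the commit actually uses is not stated, and the `name` field and `upsert_body` helper are assumptions.

```python
def upsert_body(name):
    """Body for an update request that creates the document if it is missing."""
    return {
        # Partial document merged into an existing document, if there is one.
        "doc": {"name": name},
        # When no document exists yet, index "doc" as the new document.
        "doc_as_upsert": True,
    }
```

The existence check and the creation happen in one request on the server, which avoids the window between counting and creating.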
  23. default backup all indexes

    haarg committed Oct 27, 2024
    762f54d
  24. 407b552
  25. b9c246d
  26. 374abed
  27. always use suggester for autocomplete

    We have two autocomplete end points. The old one is no longer used by
    the front end. The new one uses the suggest API.
    
    Rewrite both end points to use the suggest API, just returning data in
    different forms.
    haarg committed Oct 27, 2024
    a29eff9
  28. c0b70ae
  29. fix script query syntax when using newer Elasticsearch

    Older versions expect the key "inline", newer expect "source".
    haarg committed Oct 27, 2024
    38877ad
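
One way to handle the "inline" versus "source" difference from commit 29 is to choose the key from the server's major version. The Python sketch below is an assumption-laden illustration: `script_clause` is a hypothetical helper, and the `switch_at` threshold is a parameter because the exact version at which a deployment must switch is not stated in the commit.

```python
def script_clause(source, es_major_version, params=None, switch_at=6):
    """Build a script clause using the key the given Elasticsearch version expects."""
    # Older versions expect "inline"; newer versions expect "source".
    key = "source" if es_major_version >= switch_at else "inline"
    clause = {key: source}
    if params:
        clause["params"] = dict(params)
    return clause
```

For example, `script_clause("doc['x'].value", 2)` yields a clause keyed by `inline`, while the same call with version 7 yields one keyed by `source`.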
  30. remove use_dis_max from query_string query

    use_dis_max is the default, and isn't supported in newer versions
    haarg committed Oct 27, 2024
    8d40746
  31. d6571ec