Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(agents-api): Performance improvements #749

Merged
merged 6 commits into from
Oct 26, 2024
Merged

feat(agents-api): Performance improvements #749

merged 6 commits into from
Oct 26, 2024

Conversation

creatorrr
Copy link
Contributor

@creatorrr creatorrr commented Oct 26, 2024

  • feat(agents-api,typespec): Limit max_k for search to 50; max items for embed to 100
  • refactor(typespec): Remove older 0.4.0 version of typespec
  • feat: Add metadata filter argument to doc search
  • feat(agents-api): Performance improvements - use KNN unless dataset too big
  • feat(agents-api): Performance improvements - merge metadata_filter

Important

This PR improves agents API performance by limiting search and embed parameters, adding metadata filtering, updating typespec versions, and optimizing search functions.

  • Behavior:
    • Limit max_k for search to 50 and max items for embed to 100 in Docs.py.
    • Add metadata_filter argument to document search functions in search_docs_by_embedding.py, search_docs_by_text.py, and search_docs_hybrid.py.
    • Use KNN for search unless dataset exceeds ann_threshold in search_docs_by_embedding.py.
  • Refactor:
    • Remove older 0.4.0 version of typespec.
    • Update openapi.yaml to 1.0.0 version.
  • Models:
    • Split EmbedQueryRequest into SingleEmbedQueryRequest and MultipleEmbedQueryRequest in Docs.py and models.tsp.
  • Misc:
    • Adjust workers and add timeout in gunicorn_conf.py.
    • Update docker-compose.yml to sync gunicorn_conf.py.

This description was created by Ellipsis for c0bd25c. It will automatically update as commits are pushed.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to c0bd25c in 52 seconds

More details
  • Looked at 7232 lines of code in 24 files
  • Skipped 0 files when reviewing.
  • Skipped posting 5 drafted comments based on config settings.
1. typespec/versions.tsp:4
  • Draft comment:
    Ensure that all references to the older version (0.4.0) are updated or removed throughout the codebase to prevent any inconsistencies or errors.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The removal of the older version (0.4.0) from the enum is consistent with the PR description, which mentions removing the older version of typespec. This change is appropriate and aligns with the update to version 1.0.0.
2. agents-api/agents_api/autogen/Docs.py:13
  • Draft comment:
    The limit field has been changed from 100 to 50. Ensure that this change is reflected in all relevant parts of the codebase to maintain consistency.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The change in the limit field from 100 to 50 is consistent with the PR description and aligns with the intent to limit the maximum k for search to 50. This change is appropriate.
3. agents-api/agents_api/autogen/Docs.py:18
  • Draft comment:
    The metadata_filter field has been added. Ensure that this new field is handled correctly in all relevant parts of the codebase, including validation and usage.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The addition of metadata_filter is a new feature as per the PR description. This change should be checked for consistency across the codebase.
4. agents-api/agents_api/autogen/Docs.py:142
  • Draft comment:
    The EmbedQueryRequest class has been removed and replaced with SingleEmbedQueryRequest and MultipleEmbedQueryRequest. Ensure that all references to EmbedQueryRequest are updated accordingly.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The removal of EmbedQueryRequest and addition of SingleEmbedQueryRequest and MultipleEmbedQueryRequest aligns with the PR description. This change should be checked for consistency across the codebase.
5. agents-api/gunicorn_conf.py:5
  • Draft comment:
    The workers calculation has been changed from multiprocessing.cpu_count() * 2 + 1 to multiprocessing.cpu_count() - 1. Ensure that this change is optimal for the expected workload and does not degrade performance.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The change in the workers calculation in the gunicorn configuration is a performance improvement as per the PR description. This change should be verified for its impact on performance.

Workflow ID: wflow_xmzG11wG3M9FDwxh


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@creatorrr creatorrr merged commit bb6b238 into dev Oct 26, 2024
14 checks passed
@creatorrr creatorrr deleted the f/search-fixes branch October 26, 2024 03:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants