feat(agents-api): Performance improvements #749
Merged
…r embed to 100 Signed-off-by: Diwank Singh Tomer <[email protected]>
Signed-off-by: Diwank Singh Tomer <[email protected]>
…oo big Signed-off-by: Diwank Singh Tomer <[email protected]>
Signed-off-by: Diwank Singh Tomer <[email protected]>
👍 Looks good to me! Reviewed everything up to c0bd25c in 52 seconds
More details
- Looked at 7232 lines of code in 24 files
- Skipped 0 files when reviewing.
- Skipped posting 5 drafted comments based on config settings.
1. typespec/versions.tsp:4
- Draft comment:
Ensure that all references to the older version (0.4.0) are updated or removed throughout the codebase to prevent any inconsistencies or errors.
- Reason this comment was not posted:
Confidence changes required: 50%
The removal of the older version (0.4.0) from the enum is consistent with the PR description, which mentions removing the older version of typespec. This change is appropriate and aligns with the update to version 1.0.0.
2. agents-api/agents_api/autogen/Docs.py:13
- Draft comment:
The `limit` field has been changed from 100 to 50. Ensure that this change is reflected in all relevant parts of the codebase to maintain consistency.
- Reason this comment was not posted:
Confidence changes required: 50%
The change in the `limit` field from 100 to 50 is consistent with the PR description and aligns with the intent to limit the maximum `k` for search to 50. This change is appropriate.
3. agents-api/agents_api/autogen/Docs.py:18
- Draft comment:
The `metadata_filter` field has been added. Ensure that this new field is handled correctly in all relevant parts of the codebase, including validation and usage.
- Reason this comment was not posted:
Confidence changes required: 50%
The addition of `metadata_filter` is a new feature as per the PR description. This change should be checked for consistency across the codebase.
4. agents-api/agents_api/autogen/Docs.py:142
- Draft comment:
The `EmbedQueryRequest` class has been removed and replaced with `SingleEmbedQueryRequest` and `MultipleEmbedQueryRequest`. Ensure that all references to `EmbedQueryRequest` are updated accordingly.
- Reason this comment was not posted:
Confidence changes required: 50%
The removal of `EmbedQueryRequest` and addition of `SingleEmbedQueryRequest` and `MultipleEmbedQueryRequest` aligns with the PR description. This change should be checked for consistency across the codebase.
5. agents-api/gunicorn_conf.py:5
- Draft comment:
The `workers` calculation has been changed from `multiprocessing.cpu_count() * 2 + 1` to `multiprocessing.cpu_count() - 1`. Ensure that this change is optimal for the expected workload and does not degrade performance.
- Reason this comment was not posted:
Confidence changes required: 50%
The change in the `workers` calculation in the gunicorn configuration is a performance improvement as per the PR description. This change should be verified for its impact on performance.
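
The worker-count change described in draft comment 5 can be sketched as below. This is a minimal illustration, not the PR's actual `gunicorn_conf.py`: the helper names `old_worker_count` and `new_worker_count` are hypothetical, and the `max(1, ...)` floor is an assumption added here so that a single-core host still gets one worker. The PR also adds a `timeout` setting, but its value does not appear on this page, so it is left out.

```python
import multiprocessing


def old_worker_count(cpus: int) -> int:
    """Previous formula: two workers per core, plus one."""
    return cpus * 2 + 1


def new_worker_count(cpus: int) -> int:
    """New formula: one fewer worker than available cores.

    The max(1, ...) floor is an assumption of this sketch; without it,
    a single-core machine would compute zero workers.
    """
    return max(1, cpus - 1)


# Gunicorn reads the module-level name `workers` from its config file.
workers = new_worker_count(multiprocessing.cpu_count())
```

On a 4-core host the old formula yields 9 workers and the new one yields 3, which trades request concurrency for lower per-host memory and CPU contention. Whether that is the right trade depends on the workload, as the draft comment notes.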
Workflow ID: wflow_xmzG11wG3M9FDwxh
Important
This PR improves agents API performance by limiting search and embed parameters, adding metadata filtering, updating typespec versions, and optimizing search functions.
- Limit `max_k` for search to 50 and max items for embed to 100 in `Docs.py`.
- Add `metadata_filter` argument to document search functions in `search_docs_by_embedding.py`, `search_docs_by_text.py`, and `search_docs_hybrid.py`.
- `ann_threshold` in `search_docs_by_embedding.py`.
- Remove `0.4.0` version of typespec.
- Update `openapi.yaml` to `1.0.0` version.
- Split `EmbedQueryRequest` into `SingleEmbedQueryRequest` and `MultipleEmbedQueryRequest` in `Docs.py` and `models.tsp`.
- Update `workers` and add `timeout` in `gunicorn_conf.py`.
- Update `docker-compose.yml` to sync `gunicorn_conf.py`.

This description was created by Ellipsis for c0bd25c. It will automatically update as commits are pushed.
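
The request limits and the `EmbedQueryRequest` split listed in the summary can be illustrated roughly as follows. This is a sketch under stated assumptions, not the generated code in `Docs.py`: it uses standard-library dataclasses rather than whatever the autogenerated models use, and the `text` field name, the `validate_search_limit` helper, and the lower bound of 1 are all hypothetical. Only the caps of 50 and 100 and the two class names come from this page.

```python
from __future__ import annotations

from dataclasses import dataclass

MAX_SEARCH_LIMIT = 50   # cap on search results, per the PR summary
MAX_EMBED_ITEMS = 100   # cap on texts per embed request, per the PR summary


@dataclass
class SingleEmbedQueryRequest:
    # Hypothetical field name: one text to embed.
    text: str


@dataclass
class MultipleEmbedQueryRequest:
    # Hypothetical field name: a batch of texts to embed.
    text: list[str]

    def __post_init__(self) -> None:
        # Reject empty batches (assumed) and batches above the cap.
        if not 1 <= len(self.text) <= MAX_EMBED_ITEMS:
            raise ValueError(
                f"expected 1 to {MAX_EMBED_ITEMS} texts, got {len(self.text)}"
            )


def validate_search_limit(limit: int) -> int:
    """Return `limit` unchanged if it is within the allowed range."""
    if not 1 <= limit <= MAX_SEARCH_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_SEARCH_LIMIT}")
    return limit
```

Splitting the request into two types lets the batch size be checked only where a batch exists, while the single-text type needs no length check at all.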