-
Notifications
You must be signed in to change notification settings - Fork 905
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
wip(agents-api): Add Doc sql queries #979
base: f/switch-to-pg
Are you sure you want to change the base?
Conversation
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Explore these optional code suggestions:
|
CI Failure Feedback 🧐(Checks updated until commit 74add36)
✨ CI feedback usage guide:The CI feedback tool (
In addition to being automatically triggered, the tool can also be invoked manually by commenting on a PR:
where Configuration options
See more information about the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Reviewed everything up to 6c77490 in 1 minute and 30 seconds
More details
- Looked at
848
lines of code in10
files - Skipped
0
files when reviewing. - Skipped posting
8
drafted comments based on config settings.
1. agents-api/agents_api/queries/docs/search_docs_hybrid.py:13
- Draft comment:
The import statement forrun_concurrently
is unnecessary since it's not used in the code. Consider removing it to clean up the imports. - Reason this comment was not posted:
Confidence changes required:10%
The import statement forrun_concurrently
is unnecessary since it's not used in the code.
2. agents-api/agents_api/queries/docs/search_docs_hybrid.py:134
- Draft comment:
Appending an empty list totasks
is incorrect. You should append a coroutine or task. Consider usingNone
or a similar placeholder if you intend to skip adding a task. - Reason this comment was not posted:
Confidence changes required:50%
The code usestasks.append([])
which is incorrect for appending tasks. It should append a coroutine instead.
3. agents-api/agents_api/queries/docs/search_docs_hybrid.py:147
- Draft comment:
Appending an empty list totasks
is incorrect. You should append a coroutine or task. Consider usingNone
or a similar placeholder if you intend to skip adding a task. - Reason this comment was not posted:
Confidence changes required:50%
The code usestasks.append([])
which is incorrect for appending tasks. It should append a coroutine instead.
4. agents-api/agents_api/queries/docs/search_docs_hybrid.py:154
- Draft comment:
Ensure that at least one valid task is added totasks
before callinggather
. If bothtext_query
andembedding
are empty,tasks
will contain only empty lists, leading to an error. - Reason this comment was not posted:
Confidence changes required:50%
The code does not handle the case where bothtext_query
andembedding
are empty, which could lead to an error when callinggather
.
5. agents-api/agents_api/queries/docs/search_docs_hybrid.py:110
- Draft comment:
Theembedding
parameter should have a default value of an empty list[]
instead ofNone
to avoid type issues and simplify checks. - Reason this comment was not posted:
Confidence changes required:50%
Theembedding
parameter should have a default value of an empty list instead ofNone
to avoid type issues.
6. agents-api/agents_api/queries/docs/create_doc.py:69
- Draft comment:
Consider adding more exception handling for other potential database errors, such asasyncpg.DataError
orasyncpg.SyntaxOrAccessError
, to make the function more robust. - Reason this comment was not posted:
Confidence changes required:50%
Therewrap_exceptions
decorator is used to handle exceptions, but it doesn't cover all possible exceptions that might occur during database operations.
7. agents-api/agents_api/queries/docs/delete_doc.py:43
- Draft comment:
Consider adding more exception handling for other potential database errors, such asasyncpg.DataError
orasyncpg.SyntaxOrAccessError
, to make the function more robust. - Reason this comment was not posted:
Confidence changes required:50%
Therewrap_exceptions
decorator is used to handle exceptions, but it doesn't cover all possible exceptions that might occur during database operations.
8. agents-api/agents_api/queries/docs/get_doc.py:29
- Draft comment:
Consider adding exception handling for potential database errors, such asasyncpg.DataError
orasyncpg.SyntaxOrAccessError
, to make the function more robust. - Reason this comment was not posted:
Confidence changes required:50%
Therewrap_exceptions
decorator is used to handle exceptions, but it doesn't cover all possible exceptions that might occur during database operations.
Workflow ID: wflow_0SgPnESfFL0Scwfr
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
return similarity | ||
|
||
|
||
def maximal_marginal_relevance( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an exact duplicate of the existing implementation.
- function
maximal_marginal_relevance
(mmr.py)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 93673b7 in 1 minute and 3 seconds
More details
- Looked at
1195
lines of code in21
files - Skipped
0
files when reviewing. - Skipped posting
5
drafted comments based on config settings.
1. agents-api/agents_api/queries/docs/get_doc.py:5
- Draft comment:
The import statement for 'asyncpg' is unnecessary and can be removed to clean up the code. This is also applicable in other files like 'list_docs.py', 'search_docs_by_embedding.py', and 'search_docs_by_text.py'. - Reason this comment was not posted:
Confidence changes required:50%
The import statement for 'asyncpg' is unnecessary in multiple files as it is not used anywhere in the code. Removing it will clean up the code.
2. agents-api/agents_api/queries/docs/list_docs.py:5
- Draft comment:
The import statement for 'asyncpg' is unnecessary and can be removed to clean up the code. This is also applicable in other files like 'search_docs_by_embedding.py', and 'search_docs_by_text.py'. - Reason this comment was not posted:
Confidence changes required:50%
The import statement for 'asyncpg' is unnecessary in multiple files as it is not used anywhere in the code. Removing it will clean up the code.
3. agents-api/agents_api/queries/docs/search_docs_by_embedding.py:6
- Draft comment:
The import statement for 'asyncpg' is unnecessary and can be removed to clean up the code. This is also applicable in 'search_docs_by_text.py'. - Reason this comment was not posted:
Confidence changes required:50%
The import statement for 'asyncpg' is unnecessary in multiple files as it is not used anywhere in the code. Removing it will clean up the code.
4. agents-api/agents_api/queries/docs/search_docs_by_text.py:6
- Draft comment:
The import statement for 'asyncpg' is unnecessary and can be removed to clean up the code. - Reason this comment was not posted:
Confidence changes required:50%
The import statement for 'asyncpg' is unnecessary in multiple files as it is not used anywhere in the code. Removing it will clean up the code.
5. agents-api/agents_api/queries/docs/create_doc.py:98
- Draft comment:
The 'org' option is removed from the owner_type Literal, but this change is not reflected in the PR description. This should be documented for clarity. This change is also applicable in 'delete_doc.py', 'get_doc.py', and 'list_docs.py'. - Reason this comment was not posted:
Confidence changes required:50%
The 'org' option is removed from the owner_type Literal in multiple files, but this change is not reflected in the PR description. This should be documented for clarity.
Workflow ID: wflow_pnbwiXEmTCJVnbrw
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on dc0ec36 in 48 seconds
More details
- Looked at
939
lines of code in28
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. agents-api/agents_api/queries/users/delete_user.py:59
- Draft comment:
Theasyncpg.exceptions.UniqueViolationError
should not be handled here as it is not relevant to delete operations. Consider removing this exception handling. - Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_WUrKdAU41fxw9Cd0
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
status_code=404, | ||
detail="The specified developer does not exist.", | ||
), | ||
asyncpg.exceptions.UniqueViolationError: partialclass( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The asyncpg.exceptions.UniqueViolationError
should not be handled here as it is not relevant to delete operations. Consider removing this exception handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 831e950 in 51 seconds
More details
- Looked at
440
lines of code in11
files - Skipped
0
files when reviewing. - Skipped posting
3
drafted comments based on config settings.
1. agents-api/agents_api/queries/docs/get_doc.py:36
- Draft comment:
Usingast.literal_eval
ond["content"]
can be unsafe if the content is not guaranteed to be a valid Python literal. Consider using a safer method to parse or handle the content. - Reason this comment was not posted:
Comment was on unchanged code.
2. agents-api/agents_api/queries/docs/embed_snippets.py:10
- Draft comment:
vectorizer_query
is set toNone
. This is a placeholder and should be replaced with an actual query before deployment. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is essentially repeating information that's already explicitly stated in a TODO comment one line above. The TODO comment is more visible and serves the same purpose. This makes the PR comment redundant and not adding any new information or value.
Perhaps the comment is trying to emphasize the importance of not deploying with a None query, which could be a critical issue.
While deployment concerns are valid, the existing TODO comment already makes it clear this needs to be replaced, and deployment issues would be caught by basic testing since the function would fail immediately.
Delete the comment as it's redundant with the existing TODO comment and doesn't provide additional actionable value.
3. agents-api/agents_api/queries/entries/list_entries.py:88
- Draft comment:
Ensuresort_by
anddirection
are validated and sanitized to prevent SQL injection, as they are used in string interpolation for SQL queries. - Reason this comment was not posted:
Comment did not seem useful.
Workflow ID: wflow_TPwdOI1YwvBHnQ9P
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
PR Type
Enhancement
Description
Added comprehensive document management system with the following features:
Core document operations:
Advanced search capabilities:
List and filtering:
Performance optimizations:
Changes walkthrough 📝
9 files
__init__.py
Initialize document management module with core operations
agents-api/agents_api/queries/docs/init.py
create_doc.py
Document creation with ownership management
agents-api/agents_api/queries/docs/create_doc.py
constraints
delete_doc.py
Document deletion with ownership validation
agents-api/agents_api/queries/docs/delete_doc.py
get_doc.py
Single document retrieval functionality
agents-api/agents_api/queries/docs/get_doc.py
list_docs.py
Paginated document listing with filters
agents-api/agents_api/queries/docs/list_docs.py
mmr.py
Maximal Marginal Relevance implementation for search
agents-api/agents_api/queries/docs/mmr.py
search_docs_by_embedding.py
Vector-based document search implementation
agents-api/agents_api/queries/docs/search_docs_by_embedding.py
search_docs_by_text.py
Full-text document search implementation
agents-api/agents_api/queries/docs/search_docs_by_text.py
search_docs_hybrid.py
Hybrid document search with score fusion
agents-api/agents_api/queries/docs/search_docs_hybrid.py
Important
Add comprehensive document management system with CRUD, advanced search, and performance optimizations.
create_doc.py
,delete_doc.py
,get_doc.py
, andlist_docs.py
.search_docs_by_text.py
.search_docs_by_embedding.py
.search_docs_hybrid.py
.mmr.py
.Doc
model inDocs.py
with new fields likemodality
,language
,index
,embedding_model
, andembedding_dimensions
.test_docs_queries.py
for CRUD operations and listing.This description was created by for 831e950. It will automatically update as commits are pushed.