Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

F/file queries: Add file sql queries #974

Merged
merged 9 commits into from
Dec 20, 2024
Merged

Conversation

Vedantsahai18
Copy link
Member

@Vedantsahai18 Vedantsahai18 commented Dec 19, 2024

PR Type

Enhancement, Tests


Description

  • Implemented comprehensive file management system with ownership tracking, CRUD operations, and proper validation
  • Enhanced session management with system template support, tool call forwarding, and participant lookup
  • Added entry management functionality with relations support, history tracking, and filtering capabilities
  • Refactored database schema to use unified ownership tables for files and documents
  • Improved SQL query utilities with type safety, batching, and timeout support
  • Added extensive test coverage for files, sessions, and entries
  • Enhanced error handling and validation across all query operations
  • Added model field to entries and updated OpenAPI specifications
  • Improved database connection handling and configuration

Changes walkthrough 📝

Relevant files
Enhancement
47 files
Sessions.py
Add system template and tool call forwarding to session models

agents-api/agents_api/autogen/Sessions.py

  • Added system_template field to session models for custom system
    prompts
  • Added forward_tool_calls flag to control tool call forwarding behavior

  • +40/-0   
    utils.py
    Improve SQL query utilities with type safety and batching

    agents-api/agents_api/queries/utils.py

  • Added type definitions and validation for SQL query parameters
  • Enhanced query preparation and execution with timeout support
  • Added better error handling and query batching capabilities
  • +111/-45
    create_session.py
    Add session creation with system template and participant lookup

    agents-api/agents_api/queries/sessions/create_session.py

  • Implemented session creation with system template support
  • Added validation for agent requirements
  • Added participant lookup table population
  • +132/-0 
    create_or_update_session.py
    Add session upsert with system template support                   

    agents-api/agents_api/queries/sessions/create_or_update_session.py

  • Added upsert functionality for sessions with system template
  • Implemented participant lookup management
  • Added validation for agent requirements
  • +152/-0 
    update_session.py
    Add session update with system template support                   

    agents-api/agents_api/queries/sessions/update_session.py

  • Added session update functionality with system template
  • Implemented participant lookup updates
  • Added validation for agent requirements
  • +129/-0 
    patch_session.py
    Add session patch functionality with metadata merging       

    agents-api/agents_api/queries/sessions/patch_session.py

  • Added session patching with system template support
  • Implemented selective participant updates
  • Added metadata merging functionality
  • +131/-0 
    get_session.py
    Add session retrieval with system template support             

    agents-api/agents_api/queries/sessions/get_session.py

  • Added session retrieval with system template
  • Implemented participant lookup joins
  • Added error handling for missing sessions
  • +83/-0   
    list_sessions.py
    Add paginated session listing with filtering                         

    agents-api/agents_api/queries/sessions/list_sessions.py

  • Added session listing with filtering and pagination
  • Implemented metadata filtering
  • Added sorting by creation/update time
  • +106/-0 
    create_file.py
    Add file creation with ownership tracking                               

    agents-api/agents_api/queries/files/create_file.py

  • Implemented file creation with ownership support
  • Added hash calculation and size tracking
  • Added support for user/agent file ownership
  • +143/-0 
    get_file.py
    Add file retrieval with ownership validation                         

    agents-api/agents_api/queries/files/get_file.py

  • Implemented file retrieval with ownership checks
  • Added support for user/agent file access
  • Added content placeholder for blob storage
  • +81/-0   
    list_files.py
    Add file listing with ownership filtering                               

    agents-api/agents_api/queries/files/list_files.py

  • Added file listing with ownership filtering
  • Implemented pagination and sorting
  • Added optimized queries for owner-specific listings
  • +122/-0 
    delete_file.py
    Add file deletion with ownership cleanup                                 

    agents-api/agents_api/queries/files/delete_file.py

  • Implemented file deletion with ownership checks
  • Added cascading deletion of ownership records
  • Added validation for file existence
  • +87/-0   
    create_entries.py
    Add entry creation with relations support                               

    agents-api/agents_api/queries/entries/create_entries.py

  • Implemented entry creation with session validation
  • Added support for entry relations
  • Added content JSON handling
  • +187/-0 
    delete_entries.py
    Add entry deletion with relations cleanup                               

    agents-api/agents_api/queries/entries/delete_entries.py

  • Implemented entry deletion with cascading relations
  • Added session validation
  • Added support for batch deletion
  • +132/-0 
    list_entries.py
    Add entry listing with filtering options                                 

    agents-api/agents_api/queries/entries/list_entries.py

  • Added entry listing with source filtering
  • Implemented pagination and sorting
  • Added relation exclusion support
  • +119/-0 
    get_agent.py
    Refactor agent query and remove unused code                           

    agents-api/agents_api/queries/agents/get_agent.py

  • Removed unused imports and type variables
  • Renamed raw query to agent_query and simplified query definition
  • Removed metrics counter decorator
  • +7/-13   
    get_history.py
    Add new entry history retrieval functionality                       

    agents-api/agents_api/queries/entries/get_history.py

  • Added new query file for retrieving entry history
  • Implemented history query with developer check
  • Added proper error handling and response transformation
  • +74/-0   
    create_developer.py
    Enhance developer creation with error handling                     

    agents-api/agents_api/queries/developers/create_developer.py

  • Added SQL comments for query parameters
  • Implemented proper error handling with HTTPException
  • Renamed query to developer_query for consistency
  • +21/-13 
    get_user.py
    Improve user query with better error handling                       

    agents-api/agents_api/queries/users/get_user.py

  • Added SQL comments for query parameters
  • Enhanced error handling for user not found cases
  • Removed unused imports and metrics counter
  • +16/-17 
    delete_session.py
    Add session deletion functionality                                             

    agents-api/agents_api/queries/sessions/delete_session.py

  • Added new file for session deletion functionality
  • Implemented cascading delete for session and lookup data
  • Added proper error handling and metrics tracking
  • +69/-0   
    list_agents.py
    Clean up agent listing query implementation                           

    agents-api/agents_api/queries/agents/list_agents.py

  • Removed unused imports and type variables
  • Simplified query variable naming
  • Removed metrics counter decorator
  • +6/-10   
    get_developer.py
    Improve developer query with better error handling             

    agents-api/agents_api/queries/developers/get_developer.py

  • Added SQL comments for query parameters
  • Enhanced error handling for developer not found
  • Renamed query to developer_query for consistency
  • +16/-9   
    patch_developer.py
    Enhance developer patch functionality                                       

    agents-api/agents_api/queries/developers/patch_developer.py

  • Added SQL comments for query parameters
  • Enhanced error handling for developer not found
  • Renamed query to developer_query for consistency
  • +18/-10 
    update_user.py
    Improve user update with better error handling                     

    agents-api/agents_api/queries/users/update_user.py

  • Added SQL comments for query parameters
  • Enhanced error handling for user not found cases
  • Simplified query definition and naming
  • +14/-13 
    update_developer.py
    Enhance developer update with error handling                         

    agents-api/agents_api/queries/developers/update_developer.py

  • Added SQL comments and proper error handling
  • Enhanced foreign key violation handling
  • Renamed query to developer_query for consistency
  • +16/-8   
    patch_agent.py
    Clean up agent patch query implementation                               

    agents-api/agents_api/queries/agents/patch_agent.py

  • Removed unused imports and type variables
  • Simplified query definition and naming
  • Improved code formatting
  • +7/-11   
    update_agent.py
    Clean up agent update query implementation                             

    agents-api/agents_api/queries/agents/update_agent.py

  • Removed unused imports and type variables
  • Simplified query definition and naming
  • Improved code formatting
  • +7/-11   
    create_or_update_agent.py
    Clean up agent create/update implementation                           

    agents-api/agents_api/queries/agents/create_or_update_agent.py

  • Removed unused imports and type variables
  • Simplified query definition and naming
  • Improved code formatting
  • +7/-11   
    count_sessions.py
    Add session counting functionality                                             

    agents-api/agents_api/queries/sessions/count_sessions.py

  • Added new file for counting sessions
  • Implemented session counting with developer check
  • Added proper error handling and metrics tracking
  • +55/-0   
    __init__.py
    Add sessions query module initialization                                 

    agents-api/agents_api/queries/sessions/init.py

  • Added new sessions module with comprehensive documentation
  • Defined all session-related query functions
  • Added proper exports
  • +30/-0   
    app.py
    Improve database connection handling                                         

    agents-api/agents_api/app.py

  • Added null checks for postgres pool
  • Improved connection pool management
  • Removed unused imports
  • +6/-4     
    __init__.py
    Add explicit exports to developers module                               

    agents-api/agents_api/queries/developers/init.py

    • Added explicit exports for developer query functions
    +7/-0     
    __init__.py
    Add explicit exports to agents module                                       

    agents-api/agents_api/queries/agents/init.py

    • Added explicit exports for agent query functions
    +10/-0   
    __init__.py
    Add entries query module initialization                                   

    agents-api/agents_api/queries/entries/init.py

  • Added new entries module with documentation
  • Defined entry-related query functions
  • Added proper exports
  • +21/-0   
    __init__.py
    Add files query module initialization                                       

    agents-api/agents_api/queries/files/init.py

  • Added new files module with documentation
  • Defined file-related query functions
  • Added proper exports
  • +16/-0   
    Entries.py
    Add model field to entry definition                                           

    agents-api/agents_api/autogen/Entries.py

  • Added model field to BaseEntry class
  • Set default model value to "gpt-4o-mini"
  • +1/-0     
    openapi_model.py
    Update model input handling                                                           

    agents-api/agents_api/autogen/openapi_model.py

    • Added model parameter to from_model_input function
    +1/-0     
    Entries.py
    Add model field to entry definition                                           

    integrations-service/integrations/autogen/Entries.py

  • Added model field to BaseEntry class
  • Set default model value to "gpt-4o-mini"
  • +1/-0     
    openapi-1.0.0.yaml
    Update OpenAPI schema with new fields                                       

    typespec/tsp-output/@typespec/openapi3/openapi-1.0.0.yaml

  • Added model field to entries schema
  • Added system_template and forward_tool_calls to session schema
  • Updated session requirements
  • +57/-0   
    000006_docs.up.sql
    Refactor document ownership schema                                             

    memory-store/migrations/000006_docs.up.sql

  • Replaced separate user_docs and agent_docs tables with unified
    doc_owners table
  • Added owner type validation
  • Created indexes and triggers
  • +38/-18 
    000005_files.up.sql
    Refactor file ownership schema                                                     

    memory-store/migrations/000005_files.up.sql

  • Replaced separate user_files and agent_files tables with unified
    file_owners table
  • Added owner type validation
  • Created indexes and triggers
  • +37/-19 
    000006_docs.down.sql
    Update docs schema down migration                                               

    memory-store/migrations/000006_docs.down.sql

  • Updated down migration for new doc_owners schema
  • Removed old table drops
  • Simplified cleanup process
  • +14/-28 
    000015_entries.up.sql
    Enhance entries schema with new features                                 

    memory-store/migrations/000015_entries.up.sql

  • Added developer role to chat_role enum
  • Added trigger to update session's updated_at timestamp
  • +18/-2   
    models.tsp
    Add new session model fields                                                         

    typespec/sessions/models.tsp

  • Added system_template field to Session model
  • Added forward_tool_calls boolean field
  • +6/-0     
    000005_files.down.sql
    Update files schema down migration                                             

    memory-store/migrations/000005_files.down.sql

  • Updated down migration for new file_owners schema
  • Removed old table drops
  • Simplified cleanup process
  • +4/-6     
    000009_sessions.up.sql
    Add updated_at field to sessions                                                 

    memory-store/migrations/000009_sessions.up.sql

  • Added updated_at field to sessions table
  • Removed outdated comment
  • +1/-2     
    models.tsp
    Add model field to entry type spec                                             

    typespec/entries/models.tsp

  • Added model field to BaseEntry
  • Set default model value to "gpt-4o-mini"
  • +1/-0     
    Tests
    5 files
    test_entry_queries.py
    Update entry tests for PostgreSQL migration                           

    agents-api/tests/test_entry_queries.py

  • Updated tests for PostgreSQL database
  • Added error case testing
  • Simplified test structure
  • +158/-182
    test_session_queries.py
    Add comprehensive session query tests                                       

    agents-api/tests/test_session_queries.py

  • Added comprehensive session CRUD tests
  • Added system template testing
  • Added participant management tests
  • +198/-97
    test_files_queries.py
    Add file management test suite                                                     

    agents-api/tests/test_files_queries.py

  • Added file CRUD operation tests
  • Added ownership validation tests
  • Added listing and filtering tests
  • +252/-54
    fixtures.py
    Update test fixtures for PostgreSQL support                           

    agents-api/tests/fixtures.py

  • Updated fixtures for PostgreSQL support
  • Added file and session fixtures
  • Simplified cleanup handling
  • +80/-60 
    test_agent_queries.py
    Improve agent deletion test coverage                                         

    agents-api/tests/test_agent_queries.py

  • Improved delete agent test by creating test agent first
  • Removed unnecessary imports
  • Enhanced test assertions
  • +12/-5   
    Formatting
    1 files
    web.py
    Remove unused FastAPI import                                                         

    agents-api/agents_api/web.py

    • Removed unused Depends import
    +1/-2     
    Configuration changes
    1 files
    env.py
    Add database query timeout configuration                                 

    agents-api/agents_api/env.py

    • Added query_timeout configuration option
    +2/-0     

    💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information


    Important

    Add comprehensive file management system with SQL queries, ownership tracking, and extensive test coverage.

    • File Management:
      • Added SQL queries for file creation (create_file.py), retrieval (get_file.py), listing (list_files.py), and deletion (delete_file.py) with ownership tracking.
      • Introduced file_owners table for unified ownership management in 000005_files.up.sql.
    • Database Schema:
      • Refactored file and document ownership schema to use file_owners and doc_owners tables in 000005_files.up.sql and 000006_docs.up.sql.
      • Added triggers and validation functions for ownership integrity.
    • Tests:
      • Added tests for file operations in test_files_queries.py.
      • Updated fixtures in fixtures.py to support new file management features.
    • Miscellaneous:
      • Improved error handling and validation in SQL query utilities.
      • Updated OpenAPI specifications to reflect new model fields.

    This description was created by Ellipsis for c88e8d7. It will automatically update as commits are pushed.

    Copy link

    gitguardian bot commented Dec 19, 2024

    ️✅ There are no secrets present in this pull request anymore.

    If these secrets were true positive and are still valid, we highly recommend you to revoke them.
    While these secrets were previously flagged, we no longer have a reference to the
    specific commits where they were detected. Once a secret has been leaked into a git
    repository, you should consider it compromised, even if it was deleted immediately.
    Find here more information about risks.


    🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

    Copy link
    Contributor

    qodo-merge-pro-for-open-source bot commented Dec 19, 2024

    CI Failure Feedback 🧐

    (Checks updated until commit 41739ee)

    Action: Typecheck

    Failed stage: Typecheck [❌]

    Failed test name: agents_api.common.utils.cozo

    Failure summary:

    The pytype check failed due to multiple issues:

  • Import error: Cannot find module pycozo in file agents_api/common/utils/cozo.py
  • Stray type comments found in tests/test_workflow_routes.py (lines 65 and 114)
  • Invalid type error in pytype matcher related to ParamSpecMatch
    These errors prevented the build from
    completing successfully.

  • Relevant error logs:
    1:  ##[group]Operating System
    2:  Ubuntu
    ...
    
    1195:  [17/360] check agents_api.autogen.Files
    1196:  [18/360] check agents_api.autogen.Executions
    1197:  [19/360] check agents_api.autogen.Entries
    1198:  [20/360] check agents_api.autogen.Agents
    1199:  [21/360] check agents_api.clients.__init__
    1200:  [22/360] check agents_api.activities.sync_items_remote
    1201:  [23/360] check agents_api.common.utils.datetime
    1202:  [24/360] check agents_api.common.utils.cozo
    1203:  FAILED: /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/common/utils/cozo.pyi 
    1204:  /home/runner/work/julep/julep/agents-api/.venv/bin/python -m pytype.main --disable pyi-error --imports_info /home/runner/work/julep/julep/agents-api/.pytype/imports/agents_api.common.utils.cozo.imports --module-name agents_api.common.utils.cozo --platform linux -V 3.12 -o /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/common/utils/cozo.pyi --analyze-annotated --nofail --none-is-not-bool --quick --strict-none-binding /home/runner/work/julep/julep/agents-api/agents_api/common/utils/cozo.py
    1205:  /home/runner/work/julep/julep/agents-api/agents_api/common/utils/cozo.py:9:1: error: in <module>: Can't find module 'pycozo'. [import-error]
    1206:  from pycozo import Client
    1207:  ~~~~~~~~~~~~~~~~~~~~~~~~~
    1208:  For more details, see https://google.github.io/pytype/errors.html#import-error
    ...
    
    1214:  [30/360] check agents_api.common.utils.json
    1215:  [31/360] check agents_api.metrics.counters
    1216:  [32/360] check agents_api.common.utils.types
    1217:  [33/360] check agents_api.common.nlp
    1218:  [34/360] check agents_api.common.storage_handler
    1219:  [35/360] check agents_api.common.protocol.developers
    1220:  [36/360] check agents_api.dependencies.exceptions
    1221:  [37/360] check agents_api.queries.utils
    1222:  ERROR:pytype.matcher Invalid type: <class 'pytype.abstract.function.ParamSpecMatch'>
    ...
    
    1290:  [105/360] check agents_api.queries.sessions.get_session
    1291:  [106/360] check agents_api.queries.sessions.delete_session
    1292:  [107/360] check agents_api.queries.sessions.create_or_update_session
    1293:  [108/360] check agents_api.queries.sessions.count_sessions
    1294:  [109/360] check agents_api.queries.files.list_files
    1295:  [110/360] check agents_api.queries.files.get_file
    1296:  [111/360] check agents_api.queries.files.delete_file
    1297:  [112/360] check tests.test_workflow_routes
    1298:  /home/runner/work/julep/julep/agents-api/tests/test_workflow_routes.py:65:1: error: : Stray type comment: object [ignored-type-comment]
    1299:  #   type: object~~~~~~~~~~~~~~~
    1300:  #   type: object
    1301:  /home/runner/work/julep/julep/agents-api/tests/test_workflow_routes.py:114:1: error: : Stray type comment: object [ignored-type-comment]
    1302:  #   type: object~~~~~~~~~~~~~~~
    1303:  #   type: object
    1304:  For more details, see https://google.github.io/pytype/errors.html#ignored-type-comment
    ...
    
    1346:  [154/360] check agents_api.rec_sum.entities
    1347:  [155/360] check agents_api.dependencies.__init__
    1348:  [156/360] check tests.test_execution_workflow
    1349:  [157/360] check agents_api.queries.users.__init__
    1350:  [158/360] check agents_api.common.exceptions.agents
    1351:  [159/360] check agents_api.clients.worker.__init__
    1352:  [160/360] check agents_api.queries.sessions.__init__
    1353:  [161/360] check agents_api.workflows.__init__
    1354:  ninja: build stopped: cannot make progress due to previous errors.
    1355:  Computing dependencies
    1356:  Generated API key since not set in the environment: 36252688689580375791732150223197
    1357:  Sentry DSN not found. Sentry will not be enabled.
    1358:  Analyzing 332 sources with 0 local dependencies
    1359:  Leaving directory '.pytype'
    1360:  ##[error]Process completed with exit code 1.
    

    ✨ CI feedback usage guide:

    The CI feedback tool (/checks) automatically triggers when a PR has a failed check.
    The tool analyzes the failed checks and provides several feedbacks:

    • Failed stage
    • Failed test name
    • Failure summary
    • Relevant error logs

    In addition to being automatically triggered, the tool can also be invoked manually by commenting on a PR:

    /checks "https://github.com/{repo_name}/actions/runs/{run_number}/job/{job_number}"
    

    where {repo_name} is the name of the repository, {run_number} is the run number of the failed check, and {job_number} is the job number of the failed check.

    Configuration options

    • enable_auto_checks_feedback - if set to true, the tool will automatically provide feedback when a check is failed. Default is true.
    • excluded_checks_list - a list of checks to exclude from the feedback, for example: ["check1", "check2"]. Default is an empty list.
    • enable_help_text - if set to true, the tool will provide a help message with the feedback. Default is true.
    • persistent_comment - if set to true, the tool will overwrite a previous checks comment with the new feedback. Default is true.
    • final_update_message - if persistent_comment is true and updating a previous checks message, the tool will also create a new message: "Persistent checks updated to latest commit". Default is true.

    See more information about the checks tool in the docs.

    Copy link
    Contributor

    @ellipsis-dev ellipsis-dev bot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    ❌ Changes requested. Reviewed everything up to cc2a5bf in 1 minute and 55 seconds

    More details
    • Looked at 5776 lines of code in 67 files
    • Skipped 0 files when reviewing.
    • Skipped posting 4 drafted comments based on config settings.
    1. agents-api/agents_api/queries/utils.py:111
    • Draft comment:
      asyncpg.NoDataFoundError is not a standard exception in asyncpg. Ensure this exception is defined elsewhere in the codebase or replace it with a standard exception.
    • Reason this comment was not posted:
      Decided after close inspection that this draft comment was likely wrong and/or not actionable:
      The comment is technically correct that NoDataFoundError is not standard. However, this appears to be an intentional custom exception used to handle the specific case of no data being found in fetchrow queries. The code is consistently using this exception and handling it appropriately. The comment suggests replacing it but that would actually make the code worse by removing the semantic meaning.
      I could be wrong about whether this is an intentional design choice - maybe it's actually a bug where they meant to use a different exception.
      Even if they meant to use a different exception, suggesting to "ensure it's defined elsewhere" or "replace with standard exception" is not helpful - they clearly want some kind of "no data found" exception here for this specific case.
      The comment should be deleted. The code is intentionally using a custom exception for semantic clarity, and the comment's suggestions would make the code worse.
    2. agents-api/agents_api/queries/entries/create_entries.py:85
    • Draft comment:
      The rewrap_exceptions decorator is commented out here. Ensure consistency in exception handling across the codebase to avoid unhandled exceptions.
    • Reason this comment was not posted:
      Comment did not seem useful.
    3. agents-api/agents_api/queries/agents/delete_agent.py:76
    • Draft comment:
      The increase_counter decorator is removed. If metrics collection is intended, ensure this is added back or replaced with an alternative solution.
    • Reason this comment was not posted:
      Comment did not seem useful.
    4. agents-api/agents_api/queries/utils.py:75
    • Draft comment:
      The match-case statement is a Python 3.10 feature. Ensure the project supports Python 3.10 or later, or refactor for compatibility with older versions.
    • Reason this comment was not posted:
      Confidence changes required: 50%
      The prepare_pg_query_args function uses a match-case statement which is a Python 3.10 feature. Ensure compatibility if the project supports older Python versions.

    Workflow ID: wflow_P1ix6oN8CQ1bdkDY


    Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

    Copy link
    Contributor

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 PR contains tests
    🔒 Security concerns

    SQL Injection:
    While the code uses parameterized queries which is good practice, some of the dynamic SQL construction in files like list_files.py and create_file.py could potentially be vulnerable if not properly sanitized. The code should ensure all user inputs are properly escaped and validated before being used in SQL queries.

    ⚡ Recommended focus areas for review

    Debug Code
    Debug print statements left in production code that should be removed

    Error Handling
    Error handling could be improved by adding more specific error types and messages in the rewrap_exceptions decorator

    Security Concern
    File content is being stored as base64 without size validation, which could lead to memory issues with large files

    Copy link
    Contributor

    qodo-merge-pro-for-open-source bot commented Dec 19, 2024

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Score
    Security
    Fix SQL injection vulnerability in ORDER BY clause

    SQL injection vulnerability in ORDER BY clause due to direct string interpolation of
    sort_by and direction parameters.

    agents-api/agents_api/queries/entries/list_entries.py [45]

    -ORDER BY e.{sort_by} {direction} -- safe to interpolate
    +ORDER BY CASE 
    +    WHEN $7 = 'created_at' THEN e.created_at 
    +    WHEN $7 = 'timestamp' THEN e.timestamp 
    +END
    +CASE WHEN $8 = 'desc' THEN DESC ELSE ASC END
    • Apply this suggestion
    Suggestion importance[1-10]: 10

    Why: Direct string interpolation in ORDER BY clause creates a critical SQL injection vulnerability that could allow malicious SQL execution.

    10
    ✅ Remove debug print statements that could expose sensitive information
    Suggestion Impact:The commit removed the debug print statements that were exposing lookup_params, exactly as suggested

    code diff:

    -    print("*" * 100)
    -    print(lookup_params)
    -    print("*" * 100)

    Remove debug print statements that expose lookup parameters, which could leak
    sensitive information in logs.

    agents-api/agents_api/queries/sessions/create_session.py [126-128]

    -print("*" * 100)
    -print(lookup_params)
    -print("*" * 100)
    +# Remove debug prints
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: Debug print statements exposing lookup_params could leak sensitive user/agent IDs and session information in logs, creating a security risk.

    8
    Possible issue
    ✅ Enable proper error handling for database operations to prevent unhandled exceptions
    Suggestion Impact:The commit implemented the suggestion by uncommenting and enabling the @rewrap_exceptions decorator in two places to handle database errors

    code diff:

    -# @rewrap_exceptions(
    -#     {
    -#         asyncpg.ForeignKeyViolationError: partialclass(
    -#             HTTPException,
    -#             status_code=404,
    -#             detail="Session not found",
    -#         ),
    -#         asyncpg.UniqueViolationError: partialclass(
    -#             HTTPException,
    -#             status_code=409,
    -#             detail="Entry already exists",
    -#         ),
    -#         asyncpg.NotNullViolationError: partialclass(
    -#             HTTPException,
    -#             status_code=400,
    -#             detail="Not null violation",
    -#         ),
    -#         asyncpg.NoDataFoundError: partialclass(
    -#             HTTPException,
    -#             status_code=404,
    -#             detail="Session not found",
    -#         ),
    -#     }
    -# )
    +@rewrap_exceptions(
    +    {
    +        asyncpg.ForeignKeyViolationError: partialclass(
    +            HTTPException,
    +            status_code=404,
    +            detail="Session not found",
    +        ),
    +        asyncpg.UniqueViolationError: partialclass(
    +            HTTPException,
    +            status_code=409,
    +            detail="Entry already exists",
    +        ),
    +        asyncpg.NotNullViolationError: partialclass(
    +            HTTPException,
    +            status_code=400,
    +            detail="Not null violation",
    +        ),
    +        asyncpg.NoDataFoundError: partialclass(
    +            HTTPException,
    +            status_code=404,
    +            detail="Session not found",
    +        ),
    +    }
    +)
     @wrap_in_class(
         Entry,
         transform=lambda d: {
    -        "id": UUID(d.pop("entry_id")),
    +        "id": d.pop("entry_id"),
             **d,
         },
     )
    @@ -92,7 +93,7 @@
         developer_id: UUID,
         session_id: UUID,
         data: list[CreateEntryRequest],
    -) -> list[tuple[str, list, Literal["fetch", "fetchmany"]]]:
    +) -> list[tuple[str, list, Literal["fetch", "fetchmany", "fetchrow"]]]:
         # Convert the data to a list of dictionaries
         data_dicts = [item.model_dump(mode="json") for item in data]
     
    @@ -103,7 +104,7 @@
             params.append(
                 [
                     session_id,  # $1
    -                item.pop("id", None) or str(uuid7()),  # $2
    +                item.pop("id", None) or uuid7(),  # $2
                     item.get("source"),  # $3
                     item.get("role"),  # $4
                     item.get("event_type") or "message.create",  # $5
    @@ -113,8 +114,9 @@
                     content_to_json(item.get("tool_calls") or {}),  # $9
                     item.get("model"),  # $10
                     item.get("token_count"),  # $11
    -                item.get("created_at") or utcnow(),  # $12
    -                utcnow(),  # $13
    +                select_tokenizer(item.get("model"))["type"],  # $12
    +                item.get("created_at") or utcnow(),  # $13
    +                utcnow().timestamp(),  # $14
                 ]
             )
     
    @@ -122,7 +124,7 @@
             (
                 session_exists_query,
                 [session_id, developer_id],
    -            "fetch",
    +            "fetchrow",
             ),
             (
                 entry_query,
    @@ -132,20 +134,25 @@
         ]
     
     
    -# @rewrap_exceptions(
    -#     {
    -#         asyncpg.ForeignKeyViolationError: partialclass(
    -#             HTTPException,
    -#             status_code=404,
    -#             detail="Session not found",
    -#         ),
    -#         asyncpg.UniqueViolationError: partialclass(
    -#             HTTPException,
    -#             status_code=409,
    -#             detail="Entry already exists",
    -#         ),
    -#     }
    -# )
    +@rewrap_exceptions(
    +    {
    +        asyncpg.ForeignKeyViolationError: partialclass(
    +            HTTPException,
    +            status_code=404,
    +            detail="Session not found",
    +        ),
    +        asyncpg.UniqueViolationError: partialclass(
    +            HTTPException,
    +            status_code=409,
    +            detail="Entry already exists",
    +        ),
    +        asyncpg.NoDataFoundError: partialclass(
    +            HTTPException,
    +            status_code=404,
    +            detail="Session not found",
    +        ),
    +    }
    +)

    Uncomment the @rewrap_exceptions decorator to properly handle database errors. The
    current code lacks error handling for foreign key violations, unique violations and
    null violations which could lead to unhandled exceptions.

    agents-api/agents_api/queries/entries/create_entries.py [56-79]

    -# @rewrap_exceptions(
    -#     {
    -#         asyncpg.ForeignKeyViolationError: partialclass(
    -#             HTTPException,
    -#             status_code=404,
    -#             detail="Session not found",
    -#         ),
    -#         asyncpg.UniqueViolationError: partialclass(
    -#             HTTPException,
    -#             status_code=409,
    -#             detail="Entry already exists",
    -#         ),
    -#         asyncpg.NotNullViolationError: partialclass(
    -#             HTTPException,
    -#             status_code=400,
    -#             detail="Not null violation",
    -#         ),
    -#         asyncpg.NoDataFoundError: partialclass(
    -#             HTTPException,
    -#             status_code=404,
    -#             detail="Session not found",
    -#         ),
    -#     }
    -# )
    +@rewrap_exceptions(
    +    {
    +        asyncpg.ForeignKeyViolationError: partialclass(
    +            HTTPException,
    +            status_code=404,
    +            detail="Session not found",
    +        ),
    +        asyncpg.UniqueViolationError: partialclass(
    +            HTTPException,
    +            status_code=409,
    +            detail="Entry already exists",
    +        ),
    +        asyncpg.NotNullViolationError: partialclass(
    +            HTTPException,
    +            status_code=400,
    +            detail="Not null violation",
    +        ),
    +        asyncpg.NoDataFoundError: partialclass(
    +            HTTPException,
    +            status_code=404,
    +            detail="Session not found",
    +        ),
    +    }
    +)
    • Apply this suggestion
    Suggestion importance[1-10]: 9

    Why: Uncommented error handling is crucial for database operations to properly handle and communicate errors to clients. Without it, database exceptions could crash the application or expose sensitive information.

    9
    Add explicit timeout handling for database queries to prevent silent failures and provide clear error messages

    Add error handling for the case when query timeout occurs. Currently, if a query
    exceeds the timeout value, it may fail silently or with an unclear error.

    agents-api/agents_api/queries/utils.py [169-171]

    -results: list[Record] = await method(
    -    query, *args, timeout=timeout
    -)
    +try:
    +    results: list[Record] = await method(
    +        query, *args, timeout=timeout
    +    )
    +except asyncpg.QueryTimeoutError:
    +    raise HTTPException(
    +        status_code=504,
    +        detail="Database query timed out"
    +    )
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: The suggestion addresses an important error handling case that could affect system reliability. Adding explicit timeout handling prevents silent failures and improves error reporting.

    8
    Fix incorrect error message for unique constraint violations

    The error message for UniqueViolationError incorrectly states "user does not exist"
    when it should handle unique constraint violations.

    agents-api/agents_api/queries/users/delete_user.py [59-63]

     asyncpg.UniqueViolationError: partialclass(
         HTTPException,
    -    status_code=404,
    -    detail="The specified user does not exist.",
    +    status_code=409,
    +    detail="A unique constraint was violated.",
     ),
    • Apply this suggestion
    Suggestion importance[1-10]: 7

    Why: The error handler incorrectly uses 404 (Not Found) status code and misleading message for unique constraint violations, which should use 409 (Conflict).

    7

    Copy link
    Contributor

    @ellipsis-dev ellipsis-dev bot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    👍 Looks good to me! Incremental review on c88e8d7 in 56 seconds

    More details
    • Looked at 269 lines of code in 2 files
    • Skipped 0 files when reviewing.
    • Skipped posting 0 drafted comments based on config settings.

    Workflow ID: wflow_hzpvy7u9Zhv0IDRE


    You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

    @Vedantsahai18 Vedantsahai18 merged commit 0e32cbe into f/switch-to-pg Dec 20, 2024
    1 check passed
    @Vedantsahai18 Vedantsahai18 deleted the f/file-queries branch December 20, 2024 01:23
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    2 participants