Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

F/migrate workflows to pg #989

Merged
merged 5 commits into from
Dec 25, 2024
Merged

Conversation

Ahmad-mtos
Copy link
Contributor

@Ahmad-mtos Ahmad-mtos commented Dec 25, 2024

PR Type

Enhancement


Description

Major changes to migrate from Cozo to PostgreSQL:

  • Removed all Cozo-related code and configurations
  • Updated database connection configuration to use PostgreSQL (PG_DSN)
  • Refactored database queries and models to work with PostgreSQL
  • Simplified document creation by removing embedding workflow
  • Updated Docker configurations to use PostgreSQL instead of Cozo
  • Added new PostgreSQL query step functionality
  • Modified execution and task-related queries for PostgreSQL compatibility
  • Updated environment variables and configurations across the application

Changes walkthrough 📝

Relevant files
Refactoring
utils.py
Update import paths for database queries                                 

agents-api/agents_api/activities/utils.py

  • Updated import paths from models to queries directory
+22/-22 
Enhancement
create_doc.py
Remove document embedding workflow                                             

agents-api/agents_api/routers/docs/create_doc.py

  • Removed embed docs workflow functionality
  • Simplified document creation response
  • +2/-70   
    prepare_execution_input.py
    Refactor execution input preparation query                             

    agents-api/agents_api/queries/executions/prepare_execution_input.py

  • Updated SQL query structure for execution preparation
  • Modified query parameters and response format
  • +20/-20 
    Configuration changes
    app.py
    Update database configuration and middleware                         

    agents-api/agents_api/app.py

  • Changed DB_DSN to PG_DSN environment variable
  • Commented out content length validation middleware
  • +16/-14 
    env.py
    Replace Cozo with PostgreSQL configuration                             

    agents-api/agents_api/env.py

  • Removed Cozo-related environment variables
  • Added PostgreSQL configuration
  • +5/-16   

    💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

    Copy link
    Contributor

    qodo-merge-pro-for-open-source bot commented Dec 25, 2024

    CI Failure Feedback 🧐

    (Checks updated until commit c0acb49)

    Action: Typecheck

    Failed stage: Typecheck [❌]

    Failure summary:

    The pytype type checking action failed due to multiple type-related errors:

  • In agents_api.queries.users.create_user: Attempting to call model_dump() on a dict or None object
    that doesn't have this method
  • In agents_api.queries.docs.search_docs_hybrid: Type annotation mismatch error for embedding variable
  • In agents_api.activities.execute_integration: get_tool_args_from_metadata is being called but is not
    a callable object
  • In agents_api.routers.healthz.check_health: Import error for module 'router'
  • Additional type comment errors in test files

  • Relevant error logs:
    1:  ##[group]Operating System
    2:  Ubuntu
    ...
    
    1211:  [17/316] check agents_api.queries.executions.constants
    1212:  [18/316] check agents_api.common.utils.types
    1213:  [19/316] check agents_api.app
    1214:  [20/316] check agents_api.metrics.counters
    1215:  [21/316] check agents_api.autogen.openapi_model
    1216:  [22/316] check agents_api.common.protocol.developers
    1217:  [23/316] check agents_api.exceptions
    1218:  [24/316] check agents_api.queries.utils
    1219:  ERROR:pytype.matcher Invalid type: <class 'pytype.abstract.function.ParamSpecMatch'>
    ...
    
    1221:  [26/316] check agents_api.common.protocol.tasks
    1222:  [27/316] check agents_api.common.utils.messages
    1223:  [28/316] check agents_api.queries.users.update_user
    1224:  [29/316] check agents_api.queries.users.patch_user
    1225:  [30/316] check agents_api.queries.users.list_users
    1226:  [31/316] check agents_api.queries.users.get_user
    1227:  [32/316] check agents_api.queries.users.delete_user
    1228:  [33/316] check agents_api.queries.users.create_user
    1229:  FAILED: /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/queries/users/create_user.pyi 
    1230:  /home/runner/work/julep/julep/agents-api/.venv/bin/python -m pytype.main --disable pyi-error --imports_info /home/runner/work/julep/julep/agents-api/.pytype/imports/agents_api.queries.users.create_user.imports --module-name agents_api.queries.users.create_user --platform linux -V 3.12 -o /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/queries/users/create_user.pyi --analyze-annotated --nofail --none-is-not-bool --quick --strict-none-binding /home/runner/work/julep/julep/agents-api/agents_api/queries/users/create_user.py
    1231:  /home/runner/work/julep/julep/agents-api/agents_api/queries/users/create_user.py:76:16: error: in create_user: No attribute 'model_dump' on dict[str, Any] [attribute-error]
    1232:  In Optional[dict[str, Any]]
    1233:  metadata = data.metadata.model_dump(mode="json") or {}
    1234:  ~~~~~~~~~~~~~~~~~~~~~~~~
    1235:  /home/runner/work/julep/julep/agents-api/agents_api/queries/users/create_user.py:76:16: error: in create_user: No attribute 'model_dump' on None [attribute-error]
    1236:  In Optional[dict[str, Any]]
    1237:  metadata = data.metadata.model_dump(mode="json") or {}
    1238:  ~~~~~~~~~~~~~~~~~~~~~~~~
    1239:  For more details, see https://google.github.io/pytype/errors.html#attribute-error
    ...
    
    1273:  [67/316] check agents_api.queries.executions.create_execution_transition
    1274:  [68/316] check agents_api.queries.executions.count_executions
    1275:  [69/316] check agents_api.queries.executions.create_execution
    1276:  [70/316] check agents_api.queries.entries.list_entries
    1277:  [71/316] check agents_api.queries.entries.get_history
    1278:  [72/316] check agents_api.queries.entries.delete_entries
    1279:  [73/316] check agents_api.queries.entries.create_entries
    1280:  [74/316] check agents_api.queries.docs.search_docs_hybrid
    1281:  FAILED: /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/queries/docs/search_docs_hybrid.pyi 
    1282:  /home/runner/work/julep/julep/agents-api/.venv/bin/python -m pytype.main --disable pyi-error --imports_info /home/runner/work/julep/julep/agents-api/.pytype/imports/agents_api.queries.docs.search_docs_hybrid.imports --module-name agents_api.queries.docs.search_docs_hybrid --platform linux -V 3.12 -o /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/queries/docs/search_docs_hybrid.pyi --analyze-annotated --nofail --none-is-not-bool --quick --strict-none-binding /home/runner/work/julep/julep/agents-api/agents_api/queries/docs/search_docs_hybrid.py
    1283:  /home/runner/work/julep/julep/agents-api/agents_api/queries/docs/search_docs_hybrid.py:50:1: error: in <module>: Type annotation for embedding does not match type of assignment [annotation-type-mismatch]
    ...
    
    1380:  metadata_filter,
    1381:  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    1382:  search_language,
    1383:  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    1384:  ],
    1385:  ~~~~~~~~~~
    1386:  )
    1387:  ~~~~~
    1388:  For more details, see https://google.github.io/pytype/errors.html#annotation-type-mismatch
    ...
    
    1427:  [113/316] check agents_api.routers.sessions.metrics
    1428:  [114/316] check agents_api.queries.docs.mmr
    1429:  [115/316] check agents_api.common.utils.yaml
    1430:  [116/316] check agents_api.queries.chat.prepare_chat_context
    1431:  [117/316] check agents_api.common.nlp
    1432:  [118/316] check agents_api.common.interceptors
    1433:  [119/316] check agents_api.activities.task_steps.get_value_step
    1434:  [120/316] check agents_api.activities.execute_integration
    1435:  FAILED: /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/activities/execute_integration.pyi 
    1436:  /home/runner/work/julep/julep/agents-api/.venv/bin/python -m pytype.main --disable pyi-error --imports_info /home/runner/work/julep/julep/agents-api/.pytype/imports/agents_api.activities.execute_integration.imports --module-name agents_api.activities.execute_integration --platform linux -V 3.12 -o /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/activities/execute_integration.pyi --analyze-annotated --nofail --none-is-not-bool --quick --strict-none-binding /home/runner/work/julep/julep/agents-api/agents_api/activities/execute_integration.py
    1437:  /home/runner/work/julep/julep/agents-api/agents_api/activities/execute_integration.py:29:24: error: in execute_integration: 'agents_api.queries.tools.get_tool_args_from_metadata' object is not callable [not-callable]
    1438:  merged_tool_args = get_tool_args_from_metadata(
    1439:  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    1440:  developer_id=developer_id, agent_id=agent_id, task_id=task_id, arg_type="args"
    1441:  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    1442:  )
    1443:  ~~~~~
    1444:  /home/runner/work/julep/julep/agents-api/agents_api/activities/execute_integration.py:33:25: error: in execute_integration: 'agents_api.queries.tools.get_tool_args_from_metadata' object is not callable [not-callable]
    1445:  merged_tool_setup = get_tool_args_from_metadata(
    1446:  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    1447:  developer_id=developer_id, agent_id=agent_id, task_id=task_id, arg_type="setup"
    1448:  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    1449:  )
    1450:  ~~~~~
    1451:  For more details, see https://google.github.io/pytype/errors.html#not-callable
    ...
    
    1530:  [199/316] check agents_api.common.utils.json
    1531:  [200/316] check tests.test_messages_truncation
    1532:  [201/316] check agents_api.common.protocol.__init__
    1533:  [202/316] check agents_api.autogen.__init__
    1534:  [203/316] check agents_api.rec_sum.__init__
    1535:  [204/316] check agents_api.common.__init__
    1536:  [205/316] check agents_api.__init__
    1537:  [206/316] check agents_api.routers.healthz.check_health
    1538:  FAILED: /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/routers/healthz/check_health.pyi 
    1539:  /home/runner/work/julep/julep/agents-api/.venv/bin/python -m pytype.main --disable pyi-error --imports_info /home/runner/work/julep/julep/agents-api/.pytype/imports/agents_api.routers.healthz.check_health.imports --module-name agents_api.routers.healthz.check_health --platform linux -V 3.12 -o /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/routers/healthz/check_health.pyi --analyze-annotated --nofail --none-is-not-bool --quick --strict-none-binding /home/runner/work/julep/julep/agents-api/agents_api/routers/healthz/check_health.py
    1540:  /home/runner/work/julep/julep/agents-api/agents_api/routers/healthz/check_health.py:5:1: error: in <module>: Can't find module 'router'. [import-error]
    1541:  from .router import router
    1542:  ~~~~~~~~~~~~~~~~~~~~~~~~~~
    1543:  For more details, see https://google.github.io/pytype/errors.html#import-error
    1544:  [207/316] check tests.test_chat_routes
    1545:  [208/316] check tests.test_workflow_routes
    1546:  /home/runner/work/julep/julep/agents-api/tests/test_workflow_routes.py:65:1: error: : Stray type comment: object [ignored-type-comment]
    1547:  #   type: object~~~~~~~~~~~~~~~
    1548:  #   type: object
    1549:  /home/runner/work/julep/julep/agents-api/tests/test_workflow_routes.py:114:1: error: : Stray type comment: object [ignored-type-comment]
    1550:  #   type: object~~~~~~~~~~~~~~~
    1551:  #   type: object
    1552:  For more details, see https://google.github.io/pytype/errors.html#ignored-type-comment
    ...
    
    1565:  [221/316] check agents_api.routers.agents.create_agent_tool
    1566:  [222/316] check tests.sample_tasks.test_find_selector
    1567:  [223/316] check agents_api.routers.tasks.patch_execution
    1568:  [224/316] check agents_api.common.utils.__init__
    1569:  [225/316] check tests.__init__
    1570:  [226/316] check agents_api.rec_sum.summarize
    1571:  [227/316] check agents_api.worker.__init__
    1572:  [228/316] check agents_api.activities.logger
    1573:  ninja: build stopped: cannot make progress due to previous errors.
    1574:  Computing dependencies
    1575:  Generated API key since not set in the environment: 45990282359289001044345596229352
    1576:  Sentry DSN not found. Sentry will not be enabled.
    1577:  Analyzing 291 sources with 0 local dependencies
    1578:  Leaving directory '.pytype'
    1579:  ##[error]Process completed with exit code 1.
    

    ✨ CI feedback usage guide:

    The CI feedback tool (/checks) automatically triggers when a PR has a failed check.
    The tool analyzes the failed checks and provides several feedbacks:

    • Failed stage
    • Failed test name
    • Failure summary
    • Relevant error logs

    In addition to being automatically triggered, the tool can also be invoked manually by commenting on a PR:

    /checks "https://github.com/{repo_name}/actions/runs/{run_number}/job/{job_number}"
    

    where {repo_name} is the name of the repository, {run_number} is the run number of the failed check, and {job_number} is the job number of the failed check.

    Configuration options

    • enable_auto_checks_feedback - if set to true, the tool will automatically provide feedback when a check is failed. Default is true.
    • excluded_checks_list - a list of checks to exclude from the feedback, for example: ["check1", "check2"]. Default is an empty list.
    • enable_help_text - if set to true, the tool will provide a help message with the feedback. Default is true.
    • persistent_comment - if set to true, the tool will overwrite a previous checks comment with the new feedback. Default is true.
    • final_update_message - if persistent_comment is true and updating a previous checks message, the tool will also create a new message: "Persistent checks updated to latest commit". Default is true.

    See more information about the checks tool in the docs.

    Copy link

    gitguardian bot commented Dec 25, 2024

    ⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

    Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

    🔎 Detected hardcoded secret in your pull request
    GitGuardian id GitGuardian status Secret Commit Filename
    14144715 Triggered Generic Password 7798826 memory-store/docker-compose.yml View secret
    🛠 Guidelines to remediate hardcoded secrets
    1. Understand the implications of revoking this secret by investigating where it is used in your code.
    2. Replace and store your secret safely. Learn here the best practices.
    3. Revoke and rotate this secret.
    4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

    To avoid such incidents in the future consider


    🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

    @Ahmad-mtos Ahmad-mtos marked this pull request as ready for review December 25, 2024 16:55
    Copy link
    Contributor

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 PR contains tests
    🔒 Security concerns

    Payload size validation:
    The content length validation middleware has been commented out in app.py, removing protection against large payload attacks that could lead to denial of service. The code indicates this was done to fix scalar access issues, but removing this security check entirely is not a proper solution.

    ⚡ Recommended focus areas for review

    Security Risk

    Content length validation middleware was commented out, which could expose the API to large payload attacks. The code indicates this is blocking access to scalar but should be fixed properly rather than disabled.

    # Global dependencies
    # FIXME: This is blocking access to scalar
    # dependencies=[Depends(valid_content_length)],

    Possible Issue
    The execution_id parameter is passed to the function but not used in the SQL query, which could lead to incorrect execution data being retrieved.

    Incomplete Feature
    The entire patch_execution endpoint functionality is commented out with a FIXME note. This could indicate missing or broken functionality for updating executions.

    Copy link
    Contributor

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Score
    Possible issue
    Add environment variable validation to prevent runtime errors from missing configuration

    Add validation to ensure pg_dsn is not None before attempting to create the database
    pool, as this could cause runtime errors if the environment variable is missing.

    agents-api/agents_api/app.py [20-23]

     pg_dsn = os.environ.get("PG_DSN")
    +if not pg_dsn:
    +    raise ValueError("PG_DSN environment variable is required")
     
     if not getattr(app.state, "postgres_pool", None):
         app.state.postgres_pool = await create_db_pool(pg_dsn)
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: The suggestion addresses a critical issue by adding validation for a required environment variable, preventing potential runtime errors that could crash the application when the database connection fails.

    8
    Add field validation and documentation for optional execution field to prevent runtime errors

    Making execution optional in ExecutionInput model could lead to runtime errors if
    code assumes it exists. Add validation or update dependent code to handle None case.

    agents-api/agents_api/common/protocol/tasks.py [142]

    -execution: Execution | None = None
    +execution: Execution | None = Field(
    +    default=None,
    +    description="Optional execution context. Code using this field must handle None case."
    +)
    • Apply this suggestion
    Suggestion importance[1-10]: 5

    Why: The suggestion adds helpful documentation and explicit field validation for an optional parameter, which can help prevent potential runtime errors and improve code maintainability.

    5
    General
    Improve error message clarity for debugging database query results

    The assertion error message should include both the expected and actual count for
    better debugging, as the current message only states that more than one result was
    found.

    agents-api/agents_api/queries/utils.py [231]

    -assert len(data) == 1, f"Expected one result, got {len(data)}"
    +assert len(data) == 1, f"Expected exactly one result, got {len(data)} results"
    • Apply this suggestion
    Suggestion importance[1-10]: 3

    Why: The suggestion offers a minor improvement to error message clarity, making debugging slightly easier, but doesn't address any functional issues.

    3

    Copy link
    Contributor

    @ellipsis-dev ellipsis-dev bot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    👍 Looks good to me! Reviewed everything up to c0acb49 in 1 minute and 31 seconds

    More details
    • Looked at 8596 lines of code in 96 files
    • Skipped 0 files when reviewing.
    • Skipped posting 3 drafted comments based on config settings.
    1. agents-api/agents_api/worker/worker.py:55
    • Draft comment:
      The EmbedDocsWorkflow is not included in the worker initialization. Ensure this is intentional and not an oversight due to the migration to PostgreSQL.
    • Reason this comment was not posted:
      Confidence changes required: 50%
      The create_worker function in worker.py initializes a Temporal worker but does not include the EmbedDocsWorkflow. This might be intentional due to the migration to PostgreSQL, but it's worth noting in case it was an oversight.
    2. agents-api/agents_api/routers/docs/create_doc.py:46
    • Draft comment:
      The EmbedDocsWorkflow is no longer started in create_user_doc. Ensure this change aligns with the intended functionality after migrating to PostgreSQL.
    • Reason this comment was not posted:
      Confidence changes required: 50%
      The create_user_doc and create_agent_doc functions in create_doc.py no longer start the EmbedDocsWorkflow. This is likely due to the migration to PostgreSQL, but it's important to confirm that this change aligns with the intended functionality.
    3. agents-api/docker-compose.yml:11
    • Draft comment:
      Ensure that the transition from Cozo to PostgreSQL configurations is complete and correct, as this is a significant change.
    • Reason this comment was not posted:
      Confidence changes required: 50%
      The docker-compose.yml for the agents-api service has removed the cozo related environment variables and services, replacing them with PostgreSQL configurations. This is a significant change and should be verified for correctness.

    Workflow ID: wflow_qVzcCBZPxJsSq0hl


    You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

    @Ahmad-mtos Ahmad-mtos merged commit 3caf682 into f/switch-to-pg Dec 25, 2024
    1 check failed
    @Ahmad-mtos Ahmad-mtos deleted the f/migrate-workflows-to-pg branch December 25, 2024 17:05
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    1 participant