fix(agents-api, memory-store): fix temporal workflows, queries, and migrations #994

Ahmad-mtos · 2024-12-26T15:11:30Z

PR Type

Bug fix, Enhancement

Description

Major fixes and enhancements to temporal workflows and task execution:

Fixed critical bugs in temporal workflow handling:
- Corrected arguments processing in task execution workflow
- Fixed continuous view aggregation in memory store
- Resolved issues with task queries and transitions
Enhanced error handling and monitoring:
- Added detailed error output in task execution
- Improved transition handling with proper error states
- Added health checks for memory store
Improved query performance and data consistency:
- Enhanced task query with proper workflow/tools aggregation
- Fixed execution transitions validation
- Optimized continuous view aggregation
Added configuration improvements:
- Added transition rate limiting configuration
- Enhanced docker compose dependencies
- Added proper health checks

Changes walkthrough 📝

Relevant files

Enhancement

transition_step.py `Simplify transition step creation parameters` agents-api/agents_api/activities/task_steps/transition_step.py Removed unnecessary parameters from create_execution_transition function call	+0/-2
get_task.py `Enhance task query with proper aggregation` agents-api/agents_api/queries/tasks/get_task.py Improved SQL query to properly aggregate workflows and tools Added DISTINCT to avoid duplicates Added tools to query results	+21/-11
create_task_execution.py `Improve task execution error handling` agents-api/agents_api/routers/tasks/create_task_execution.py Added task specification to execution input Enhanced error handling with detailed error output Added transition target information	+16/-0

Bug fix

temporal.py `Fix temporal workflow arguments handling` agents-api/agents_api/clients/temporal.py Fixed incorrect arguments handling in run_task_execution_workflow Added FIXME comment about wrong logic Modified previous_inputs initialization	+4/-5
000013_executions_continuous_view.up.sql `Fix executions continuous view aggregation` memory-store/migrations/000013_executions_continuous_view.up.sql Fixed continuous view aggregation Added output to coalesce statement Removed redundant transition_id from GROUP BY	+12/-12

💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

Signed-off-by: Diwank Singh Tomer <[email protected]>

gitguardian · 2024-12-26T15:11:35Z

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request

GitGuardian id	GitGuardian status	Secret	Commit	Filename
14144715	Triggered	Generic Password	`15f5a89`	memory-store/docker-compose.yml	View secret

🛠 Guidelines to remediate hardcoded secrets

Understand the implications of revoking this secret by investigating where it is used in your code.
Replace and store your secret safely. Learn here the best practices.
Revoke and rotate this secret.
If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider

following these best practices for managing and storing secrets including API keys and other credentials
install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.

^{🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.}

qodo-merge-pro-for-open-source · 2024-12-26T15:12:13Z

CI Failure Feedback 🧐

(Checks updated until commit `b5a8313`)

Action: Typecheck
Failed stage: Typecheck [❌]
Failed test name: pytype
Failure summary: The pytype static type checker failed with multiple errors: Type mismatch in `search_docs_hybrid.py`: The type annotation for `embedding` doesn't match the assignment type Multiple errors in `execute_integration.py`: - Trying to access `.id` attribute on a potentially None value - `get_tool_args_from_metadata` object is not callable Import error in `check_health.py`: Unable to find module `router` Stray type comments in `test_workflow_routes.py`
Relevant error logs: 1: ##[group]Operating System 2: Ubuntu ... 1191: [14/301] check agents_api.autogen.Agents 1192: [15/301] check agents_api.autogen.Tasks 1193: [16/301] check agents_api.dependencies.exceptions 1194: [17/301] check agents_api.common.utils.datetime 1195: [18/301] check agents_api.common.protocol.developers 1196: [19/301] check agents_api.common.protocol.agents 1197: [20/301] check agents_api.metrics.counters 1198: [21/301] check agents_api.queries.utils 1199: ERROR:pytype.matcher Invalid type: <class 'pytype.abstract.function.ParamSpecMatch'> 1200: [22/301] check agents_api.autogen.openapi_model 1201: [23/301] check agents_api.common.exceptions.__init__ 1202: [24/301] check agents_api.routers.docs.router 1203: [25/301] check agents_api.exceptions 1204: [26/301] check agents_api.queries.developers.get_developer 1205: [27/301] check agents_api.queries.docs.search_docs_hybrid 1206: FAILED: /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/queries/docs/search_docs_hybrid.pyi 1207: /home/runner/work/julep/julep/agents-api/.venv/bin/python -m pytype.main --disable pyi-error --imports_info /home/runner/work/julep/julep/agents-api/.pytype/imports/agents_api.queries.docs.search_docs_hybrid.imports --module-name agents_api.queries.docs.search_docs_hybrid --platform linux -V 3.12 -o /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/queries/docs/search_docs_hybrid.pyi --analyze-annotated --nofail --none-is-not-bool --quick --strict-none-binding /home/runner/work/julep/julep/agents-api/agents_api/queries/docs/search_docs_hybrid.py 1208: /home/runner/work/julep/julep/agents-api/agents_api/queries/docs/search_docs_hybrid.py:50:1: error: in <module>: Type annotation for embedding does not match type of assignment [annotation-type-mismatch] ... 1305: metadata_filter, 1306: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1307: search_language, 1308: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1309: ], 1310: ~~~~~~~~~~ 1311: ) 1312: ~~~~~ 1313: For more details, see https://google.github.io/pytype/errors.html#annotation-type-mismatch ... 1400: [114/301] check agents_api.queries.executions.__init__ 1401: [115/301] check agents_api.queries.developers.__init__ 1402: [116/301] check agents_api.queries.entries.__init__ 1403: [117/301] check agents_api.queries.agents.__init__ 1404: [118/301] check agents_api.common.retry_policies 1405: [119/301] check agents_api.clients.integrations 1406: [120/301] check agents_api.activities.task_steps.get_value_step 1407: [121/301] check agents_api.activities.execute_integration 1408: FAILED: /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/activities/execute_integration.pyi 1409: /home/runner/work/julep/julep/agents-api/.venv/bin/python -m pytype.main --disable pyi-error --imports_info /home/runner/work/julep/julep/agents-api/.pytype/imports/agents_api.activities.execute_integration.imports --module-name agents_api.activities.execute_integration --platform linux -V 3.12 -o /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/activities/execute_integration.pyi --analyze-annotated --nofail --none-is-not-bool --quick --strict-none-binding /home/runner/work/julep/julep/agents-api/agents_api/activities/execute_integration.py 1410: /home/runner/work/julep/julep/agents-api/agents_api/activities/execute_integration.py:27:15: error: in execute_integration: No attribute 'id' on None [attribute-error] 1411: In Optional[agents_api.autogen.openapi_model.TaskSpecDef] 1412: task_id = context.execution_input.task.id 1413: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1414: /home/runner/work/julep/julep/agents-api/agents_api/activities/execute_integration.py:29:24: error: in execute_integration: 'agents_api.queries.tools.get_tool_args_from_metadata' object is not callable [not-callable] 1415: merged_tool_args = get_tool_args_from_metadata( 1416: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1417: developer_id=developer_id, agent_id=agent_id, task_id=task_id, arg_type="args" 1418: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1419: ) 1420: ~~~~~ 1421: /home/runner/work/julep/julep/agents-api/agents_api/activities/execute_integration.py:33:25: error: in execute_integration: 'agents_api.queries.tools.get_tool_args_from_metadata' object is not callable [not-callable] 1422: merged_tool_setup = get_tool_args_from_metadata( 1423: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1424: developer_id=developer_id, agent_id=agent_id, task_id=task_id, arg_type="setup" 1425: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1426: ) 1427: ~~~~~ 1428: For more details, see https://google.github.io/pytype/errors.html ... 1494: [187/301] check agents_api.queries.executions.get_paused_execution_token 1495: [188/301] check agents_api.rec_sum.data 1496: [189/301] check agents_api.rec_sum.generate 1497: [190/301] check agents_api.routers.sessions.exceptions 1498: [191/301] check tests.test_chat_routes 1499: [192/301] check agents_api.clients.worker.types 1500: [193/301] check agents_api.workflows.__init__ 1501: [194/301] check tests.test_workflow_routes 1502: /home/runner/work/julep/julep/agents-api/tests/test_workflow_routes.py:65:1: error: : Stray type comment: object [ignored-type-comment] 1503: # type: object~~~~~~~~~~~~~~~ 1504: # type: object 1505: /home/runner/work/julep/julep/agents-api/tests/test_workflow_routes.py:114:1: error: : Stray type comment: object [ignored-type-comment] 1506: # type: object~~~~~~~~~~~~~~~ 1507: # type: object 1508: For more details, see https://google.github.io/pytype/errors.html#ignored-type-comment ... 1521: [207/301] check agents_api.rec_sum.__init__ 1522: [208/301] check agents_api.worker.__init__ 1523: [209/301] check agents_api.model_registry 1524: [210/301] check agents_api.common.exceptions.sessions 1525: [211/301] check agents_api.__init__ 1526: [212/301] check tests.test_sessions 1527: [213/301] check agents_api.common.exceptions.agents 1528: [214/301] check agents_api.routers.healthz.check_health 1529: FAILED: /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/routers/healthz/check_health.pyi 1530: /home/runner/work/julep/julep/agents-api/.venv/bin/python -m pytype.main --disable pyi-error --imports_info /home/runner/work/julep/julep/agents-api/.pytype/imports/agents_api.routers.healthz.check_health.imports --module-name agents_api.routers.healthz.check_health --platform linux -V 3.12 -o /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/routers/healthz/check_health.pyi --analyze-annotated --nofail --none-is-not-bool --quick --strict-none-binding /home/runner/work/julep/julep/agents-api/agents_api/routers/healthz/check_health.py 1531: /home/runner/work/julep/julep/agents-api/agents_api/routers/healthz/check_health.py:5:1: error: in <module>: Can't find module 'router'. [import-error] 1532: from .router import router 1533: ~~~~~~~~~~~~~~~~~~~~~~~~~~ 1534: For more details, see https://google.github.io/pytype/errors.html#import-error ... 1544: [224/301] check agents_api.clients.worker.__init__ 1545: [225/301] check agents_api.rec_sum.trim 1546: [226/301] check agents_api.routers.agents.delete_agent_tool 1547: [227/301] check agents_api.clients.worker.worker 1548: [228/301] check agents_api.rec_sum.entities 1549: [229/301] check agents_api.routers.agents.update_agent_tool 1550: [230/301] check agents_api.common.utils.debug 1551: [231/301] check agents_api.routers.agents.patch_agent_tool 1552: ninja: build stopped: cannot make progress due to previous errors. 1553: Computing dependencies 1554: Generated API key since not set in the environment: 23756738964773662262192138776881 1555: Sentry DSN not found. Sentry will not be enabled. 1556: Analyzing 291 sources with 0 local dependencies 1557: Leaving directory '.pytype' 1558: ##[error]Process completed with exit code 1.

✨ CI feedback usage guide:

The CI feedback tool (/checks) automatically triggers when a PR has a failed check.
The tool analyzes the failed checks and provides several feedbacks:

Failed stage
Failed test name
Failure summary
Relevant error logs

In addition to being automatically triggered, the tool can also be invoked manually by commenting on a PR:

/checks "https://github.com/{repo_name}/actions/runs/{run_number}/job/{job_number}"

where {repo_name} is the name of the repository, {run_number} is the run number of the failed check, and {job_number} is the job number of the failed check.

Configuration options

enable_auto_checks_feedback - if set to true, the tool will automatically provide feedback when a check is failed. Default is true.
excluded_checks_list - a list of checks to exclude from the feedback, for example: ["check1", "check2"]. Default is an empty list.
enable_help_text - if set to true, the tool will provide a help message with the feedback. Default is true.
persistent_comment - if set to true, the tool will overwrite a previous checks comment with the new feedback. Default is true.
final_update_message - if persistent_comment is true and updating a previous checks message, the tool will also create a new message: "Persistent checks updated to latest commit". Default is true.

See more information about the checks tool in the docs.

qodo-merge-pro-for-open-source · 2024-12-26T15:12:20Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review Data Handling The FIXME comment indicates wrong logic in arguments handling. The code modifies execution_input.arguments directly which could have side effects. Consider creating a copy before modification. # FIXME: This is wrong logic old_args = execution_input.arguments execution_input.arguments = await offload_if_large(old_args) previous_inputs: list[dict] = previous_inputs or [execution_input.arguments] Parameter Change The workflow run method signature changed to require start and previous_inputs parameters that were previously optional. This could break existing callers that don't provide these parameters. async def run( self, execution_input: ExecutionInput, start: TransitionTarget, previous_inputs: list, ) -> Any: Data Migration The continuous view aggregation changes could affect existing data. Need to verify data consistency after migration and ensure proper handling of existing records. time_bucket ('1 day', created_at) AS bucket, execution_id, last(transition_id, created_at) AS transition_id, count(*) AS total_transitions, state_agg(created_at, to_text(type)) AS state, max(created_at) AS created_at, last(type, created_at) AS type, last(step_definition, created_at) AS step_definition, last(step_label, created_at) AS step_label, last(current_step, created_at) AS current_step, last(next_step, created_at) AS next_step, last(output, created_at) AS output, last(task_token, created_at) AS task_token, last(metadata, created_at) AS metadata FROM

qodo-merge-pro-for-open-source · 2024-12-26T15:12:51Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Score
Possible issue	Maintain backward compatibility by providing default values for optional parameters The workflow run method has required parameters `start` and `previous_inputs` but they were previously optional with default values. This breaking change could cause failures in existing code that doesn't provide these parameters. agents-api/agents_api/workflows/task_execution/init.py [140-145] async def run( self, execution_input: ExecutionInput, - start: TransitionTarget, - previous_inputs: list, + start: TransitionTarget = TransitionTarget(workflow="main", step=0), + previous_inputs: list \| None = None, ) -> Any: + previous_inputs = previous_inputs or [execution_input.arguments] Apply this suggestion Suggestion importance[1-10]: 8 Why: This is a critical suggestion as removing default parameter values is a breaking change that could cause runtime failures in existing code that relies on these defaults.	8
Possible issue	Add validation or safe access methods when making required fields optional to prevent null pointer exceptions The `task` field in ExecutionInput was changed from required to optional, but code accessing this field may not handle the None case, potentially causing NullPointerExceptions. agents-api/agents_api/common/protocol/tasks.py [140-143] class ExecutionInput(BaseModel): developer_id: UUID execution: Execution \| None = None task: TaskSpecDef \| None = None + + def get_task(self) -> TaskSpecDef: + if self.task is None: + raise ValueError("Task is required but was not provided") + return self.task Apply this suggestion Suggestion importance[1-10]: 7 Why: Important suggestion that addresses potential runtime errors by adding proper validation for a newly optional field that was previously required.	7
General	Unclear FIXME comments should be either removed or properly documented with explanation and fix The FIXME comment indicates wrong logic but doesn't explain why. Either remove the comment if the logic is correct, or fix the underlying issue and document the fix. Unclear comments can mislead future developers. agents-api/agents_api/clients/temporal.py [99-101] -# FIXME: This is wrong logic +# Handle large arguments by offloading to blob storage if needed old_args = execution_input.arguments execution_input.arguments = await offload_if_large(old_args) Apply this suggestion Suggestion importance[1-10]: 4 Why: The suggestion correctly identifies an unclear FIXME comment that could confuse developers, but the impact is moderate since it's mainly about code documentation clarity rather than functionality.	4

ellipsis-dev

❌ Changes requested. Reviewed everything up to b5a8313 in 2 minutes and 28 seconds

More details

Looked at 846 lines of code in 20 files
Skipped 0 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. agents-api/agents_api/common/protocol/tasks.py:245

Draft comment:
The include_remote parameter is marked as deprecated. Consider removing or replacing it to avoid confusion and potential maintenance issues.
Reason this comment was not posted:
Confidence changes required: 50%
In the prepare_execution_input function, the include_remote parameter is marked as deprecated with a FIXME comment. This indicates that the parameter should be removed or replaced in the future. However, the current implementation still uses it, which might lead to confusion or maintenance issues.

2. agents-api/agents_api/queries/tasks/create_or_update_task.py:94

Draft comment:
The ON CONFLICT clause for the tasks table updates the version by incrementing tasks.version + 1, which seems incorrect. The version should only be incremented when a new version is created, not on conflict. Consider revisiting this logic.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The version handling is complex here. The CTE calculates next_version as MAX(version) + 1. When inserting, it uses this next_version. If there's a conflict, it increments the version again. This double increment could be intentional to skip a version number on conflicts, ensuring unique versions even in race conditions. Without deeper knowledge of the system's versioning requirements, I can't be certain this is wrong.
I might be missing important business logic around version numbering. Perhaps there's a requirement that conflict resolution should maintain sequential version numbers without gaps.
While the versioning logic seems unusual, without clear evidence that it's incorrect and understanding of the system's versioning requirements, we shouldn't assume it's a bug.
The comment makes an assumption about versioning requirements without strong evidence. The current implementation could be intentional for handling race conditions.

3. agents-api/agents_api/queries/agents/create_or_update_agent.py:56

Draft comment:
Updating canonical_name in the ON CONFLICT clause might lead to inconsistencies, as it is typically a unique identifier that should not change once set. Consider revisiting this logic.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The code actually handles canonical_name carefully - it first tries to preserve the existing one through the CTE, and only generates a new one if needed. The ON CONFLICT clause isn't blindly overwriting it, it's using the same value that was just computed in the INSERT portion. This makes the comment's concern invalid - there won't be inconsistencies because the canonical_name logic is consistent between INSERT and UPDATE paths.
I could be missing some business logic requirements around canonical_name immutability. There might be other parts of the system that depend on canonical_name never changing.
Even if there were such requirements, the code already preserves existing canonical_names through the CTE, so this implementation is safe. The comment is raising a theoretical concern that the code already handles correctly.
The comment should be deleted because it raises a concern that is already properly handled by the code's careful treatment of canonical_name values.

Workflow ID: wflow_keWXJzUuAqLdRTxB

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

agents-api/agents_api/clients/temporal.py

Ahmad-mtos and others added 11 commits December 26, 2024 13:02

fix(agents-api): resolve conflicts

f4d449b

refactor: Lint agents-api (CI)

b87f1f8

fix(agents-api): Fixed create update queries and get task

cde3773

misc

31ca260

refactor: Lint agents-api (CI)

3b188f3

fix(agents-api): Fix failing tests

c038e97

Signed-off-by: Diwank Singh Tomer <[email protected]>

refactor: Lint agents-api (CI)

7011246

fix(agents-api): Ghost in the machine

15f5a89

Signed-off-by: Diwank Singh Tomer <[email protected]>

refactor: Lint agents-api (CI)

798c625

fix(agents-api): fix get task query

35344f1

fix(agents-api): add agents_api_transitions to docker compose

b5a8313

qodo-merge-pro-for-open-source bot added the Review effort [1-5]: 4 label Dec 26, 2024

ellipsis-dev bot reviewed Dec 26, 2024

View reviewed changes

agents-api/agents_api/clients/temporal.py Show resolved Hide resolved

HamadaSalhab approved these changes Dec 26, 2024

View reviewed changes

whiterabbit1983 merged commit 9b9bd73 into f/switch-to-pg Dec 26, 2024
6 of 11 checks passed

whiterabbit1983 deleted the x/fix-temporal-workflows branch December 26, 2024 19:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(agents-api, memory-store): fix temporal workflows, queries, and migrations #994

fix(agents-api, memory-store): fix temporal workflows, queries, and migrations #994

Ahmad-mtos commented Dec 26, 2024 •

edited by qodo-merge-pro-for-open-source bot

Loading

gitguardian bot commented Dec 26, 2024

qodo-merge-pro-for-open-source bot commented Dec 26, 2024 •

edited

Loading

Configuration options

qodo-merge-pro-for-open-source bot commented Dec 26, 2024

qodo-merge-pro-for-open-source bot commented Dec 26, 2024

ellipsis-dev bot left a comment

fix(agents-api, memory-store): fix temporal workflows, queries, and migrations #994

fix(agents-api, memory-store): fix temporal workflows, queries, and migrations #994

Conversation

Ahmad-mtos commented Dec 26, 2024 • edited by qodo-merge-pro-for-open-source bot Loading

PR Type

Description

Changes walkthrough 📝

gitguardian bot commented Dec 26, 2024

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

qodo-merge-pro-for-open-source bot commented Dec 26, 2024 • edited Loading

CI Failure Feedback 🧐

(Checks updated until commit b5a8313)

Configuration options

qodo-merge-pro-for-open-source bot commented Dec 26, 2024

PR Reviewer Guide 🔍

qodo-merge-pro-for-open-source bot commented Dec 26, 2024

PR Code Suggestions ✨

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

Ahmad-mtos commented Dec 26, 2024 •

edited by qodo-merge-pro-for-open-source bot

Loading

qodo-merge-pro-for-open-source bot commented Dec 26, 2024 •

edited

Loading

(Checks updated until commit `b5a8313`)