feat(agents-api): Add parallelism option to map-reduce step #490

creatorrr · 2024-09-04T22:26:32Z

fix(agents-api,typespec): Fix chat/entry typespec models to include tool_calls
refactor(agents-api): Minor refactor to make task execution workflow easier to read
feat(agents-api): Add parallel flag to map reduce
feat(agents-api): Add parallelism option to map-reduce step

🚀	This description was created by Ellipsis for commit `888887a`

Summary:

Introduced a parallelism option for map-reduce steps in agents-api, with updates to task execution workflows, TypeScript models, and tests, including fixes and refactoring.

Key points:

fix(agents-api,typespec): Fix chat/entry typespec models to include tool_calls
refactor(agents-api): Minor refactor to make task execution workflow easier to read
feat(agents-api): Add parallel flag to map reduce
feat(agents-api): Add parallelism option to map-reduce step
Feature Addition: Added parallelism option for map-reduce steps in agents-api.
Code Changes:
- Updated agents-api/agents_api/activities/task_steps/base_evaluate.py to handle extra lambda functions.
- Modified agents-api/agents_api/activities/utils.py to include accumulate, map, range, and reduce in ALLOWED_FUNCTIONS.
- Added parallelism attribute to MapReduceStep in agents-api/agents_api/autogen/Tasks.py.
- Updated agents-api/agents_api/env.py to include task_max_parallelism.
- Refactored agents-api/agents_api/workflows/task_execution/__init__.py to handle parallel map-reduce execution.
- Created agents-api/agents_api/workflows/task_execution/helpers.py for helper functions supporting parallel execution.
Tests: Added tests in agents-api/tests/test_execution_workflow.py for parallel map-reduce functionality.
TypeScript SDK: Updated models and schemas in sdks/ts/src/api/models and sdks/ts/src/api/schemas to support parallelism in map-reduce steps.
Typespec Update: Modified typespec/tasks/steps.tsp to include parallelism in MapReduceStep.

Generated with ❤️ by ellipsis.dev

…easier to read Signed-off-by: Diwank Singh Tomer <[email protected]>

Signed-off-by: Diwank Singh Tomer <[email protected]>

ellipsis-dev

❌ Changes requested. Reviewed everything up to c96e0fb in 32 seconds

More details

Looked at 969 lines of code in 16 files
Skipped 0 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. agents-api/agents_api/activities/utils.py:28

Draft comment:
Ensure that the use of map and reduce in ALLOWED_FUNCTIONS is safe and does not introduce security vulnerabilities. Consider restricting their use or validating inputs.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is speculative, asking to ensure safety, which violates the rule against speculative comments. It doesn't point out a definite issue but rather suggests a potential concern. The comment doesn't provide a specific actionable change or point out a definite problem.
The comment does touch on a valid concern about security, which is important. However, it doesn't specify a concrete issue or suggest a specific change, making it less actionable.
While security is important, the comment should specify a definite issue or suggest a specific change to be actionable. Speculative comments are not useful according to the rules.
The comment should be removed because it is speculative and does not point out a definite issue or suggest a specific change.

2. agents-api/agents_api/workflows/task_execution/helpers.py:22

Draft comment:
Ensure that all usages of continue_as_child handle errors appropriately and consistently across the codebase.
Reason this comment was not posted:
Confidence changes required: 50%
The continue_as_child function is used multiple times across the codebase. It might be beneficial to ensure that all its usages are consistent and handle errors appropriately.

3. agents-api/agents_api/workflows/task_execution/helpers.py:193

Draft comment:
Ensure that the parallelism value is validated and does not exceed system capabilities to prevent resource exhaustion.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is suggesting a validation step that is not present in the code. The code does limit parallelism to task_max_parallelism, but it does not check against system capabilities. This could be a valid concern if task_max_parallelism is not set appropriately. However, the comment is speculative as it assumes task_max_parallelism might not be set correctly.
The comment might be unnecessary if task_max_parallelism is already configured to match system capabilities. Without knowing the configuration context, it's hard to judge the necessity of the comment.
The comment could be useful if there's a risk that task_max_parallelism is not aligned with system capabilities, but this is speculative without additional context.
The comment is speculative and assumes a potential misconfiguration of task_max_parallelism. Without evidence of such a misconfiguration, the comment should be removed.

Workflow ID: wflow_zTq2u3XyqQ43gIjj

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

agents-api/agents_api/activities/task_steps/base_evaluate.py

Signed-off-by: Diwank Singh Tomer <[email protected]>

ellipsis-dev

❌ Changes requested. Incremental review on 36526f9 in 24 seconds

More details

Looked at 105 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_7hjinBgeEiji4Gzd

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

agents-api/tests/test_execution_workflow.py

ellipsis-dev

👍 Looks good to me! Incremental review on 67d1d63 in 44 seconds

More details

Looked at 226 lines of code in 4 files
Skipped 0 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. agents-api/agents_api/workflows/task_execution/helpers.py:210

Draft comment:
The assert parallelism > 1 check is too restrictive. Consider allowing parallelism of 1, which means no parallel execution but is still valid.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is about a change in the diff, specifically the assertion assert parallelism > 1. The suggestion to allow parallelism of 1 could be a valid point if there's a use case for non-parallel execution within this function. However, the current implementation seems to assume that parallelism should always be greater than 1 for this function to make sense. The comment is actionable as it suggests a specific change to the code.
The comment does not provide a strong justification for why allowing parallelism of 1 would be beneficial. It assumes that non-parallel execution is a valid use case without evidence.
The comment is still actionable as it suggests a specific change. The decision to allow parallelism of 1 would depend on the intended use cases for this function, which the comment does not address.
The comment is about a change in the diff and suggests a specific, actionable change. It should be kept for further consideration by the PR author.

2. agents-api/agents_api/workflows/task_execution/helpers.py:224

Draft comment:
Ensure that all tasks created with asyncio.create_task are awaited to prevent potential resource leaks.
Reason this comment was not posted:
Confidence changes required: 50%
The asyncio.create_task usage in execute_map_reduce_step_parallel is correct, but ensure that all tasks are awaited to prevent potential resource leaks.

3. agents-api/agents_api/workflows/task_execution/helpers.py:246

Draft comment:
The schedule_to_close_timeout was increased from 2 to 5 seconds. Ensure this aligns with expected execution times and doesn't introduce unnecessary delays.
Reason this comment was not posted:
Confidence changes required: 50%
The schedule_to_close_timeout in execute_map_reduce_step_parallel was increased from 2 to 5 seconds. This change should be validated to ensure it aligns with expected execution times.

Workflow ID: wflow_dMepXVCXJ0Ffcq0I

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Signed-off-by: Diwank Singh Tomer <[email protected]>

ellipsis-dev

❌ Changes requested. Incremental review on 45106f2 in 42 seconds

More details

Looked at 245 lines of code in 5 files
Skipped 0 files when reviewing.
Skipped posting 2 drafted comments based on config settings.

1. agents-api/agents_api/workflows/task_execution/helpers.py:246

Draft comment:
The increase in schedule_to_close_timeout from 2 to 5 seconds should be justified or documented to ensure it aligns with performance expectations.
Reason this comment was not posted:
Confidence changes required: 50%
The schedule_to_close_timeout was increased from 2 to 5 seconds. This change should be justified or documented to ensure it aligns with performance expectations.

2. agents-api/tests/test_execution_workflow.py:700

Draft comment:
Consider testing various levels of parallelism, not just a fixed value of 10, to ensure robustness.
Reason this comment was not posted:
Confidence changes required: 50%
The test for map-reduce step parallelism was changed to a fixed parallelism of 10. It would be beneficial to test various levels of parallelism to ensure robustness.

Workflow ID: wflow_6BlEWpLQSn23Oxj0

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

agents-api/agents_api/activities/task_steps/base_evaluate.py

ellipsis-dev

❌ Changes requested. Incremental review on d139570 in 42 seconds

More details

Looked at 262 lines of code in 8 files
Skipped 0 files when reviewing.
Skipped posting 5 drafted comments based on config settings.

1. sdks/ts/src/api/schemas/$Tasks_CreateTaskRequest.ts:142

Draft comment:
The parallelism field is correctly set with a minimum value of 1, ensuring valid configurations.
Reason this comment was not posted:
Confidence changes required: 0%
The parallelism field in the TypeScript schemas is correctly set with a minimum value of 1, ensuring that parallelism is always a positive integer. This is a good practice to prevent invalid configurations.

2. sdks/ts/src/api/schemas/$Tasks_PatchTaskRequest.ts:136

Draft comment:
The parallelism field is correctly set with a minimum value of 1, ensuring valid configurations. This applies to similar lines in other TypeScript schema files.
Reason this comment was not posted:
Confidence changes required: 0%
The parallelism field in the TypeScript schemas is correctly set with a minimum value of 1, ensuring that parallelism is always a positive integer. This is a good practice to prevent invalid configurations. This comment applies to similar lines in other TypeScript schema files.

3. sdks/ts/src/api/schemas/$Tasks_Task.ts:142

Draft comment:
The parallelism field is correctly set with a minimum value of 1, ensuring valid configurations. This applies to similar lines in other TypeScript schema files.
Reason this comment was not posted:
Confidence changes required: 0%
The parallelism field in the TypeScript schemas is correctly set with a minimum value of 1, ensuring that parallelism is always a positive integer. This is a good practice to prevent invalid configurations. This comment applies to similar lines in other TypeScript schema files.

4. sdks/ts/src/api/schemas/$Tasks_UpdateTaskRequest.ts:142

Draft comment:
The parallelism field is correctly set with a minimum value of 1, ensuring valid configurations. This applies to similar lines in other TypeScript schema files.
Reason this comment was not posted:
Confidence changes required: 0%
The parallelism field in the TypeScript schemas is correctly set with a minimum value of 1, ensuring that parallelism is always a positive integer. This is a good practice to prevent invalid configurations. This comment applies to similar lines in other TypeScript schema files.

5. typespec/tasks/steps.tsp:255

Draft comment:
The parallelism field is correctly set with a minimum value of 1, ensuring valid configurations.
Reason this comment was not posted:
Confidence changes required: 0%
The parallelism field in the TypeSpec file is correctly set with a minimum value of 1, ensuring that parallelism is always a positive integer. This is a good practice to prevent invalid configurations.

Workflow ID: wflow_QW2WELGI7MRvxN6I

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

agents-api/agents_api/activities/task_steps/base_evaluate.py

Signed-off-by: Diwank Singh Tomer <[email protected]>

ellipsis-dev

👍 Looks good to me! Incremental review on 50c8e66 in 37 seconds

More details

Looked at 241 lines of code in 8 files
Skipped 0 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. agents-api/agents_api/activities/task_steps/base_evaluate.py:36

Draft comment:
Using eval is a security risk as it can execute arbitrary code. Consider using a safer alternative, such as ast.literal_eval, if possible.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment correctly identifies a security risk with the use of eval, which is relevant to the changes made in the diff. However, the suggestion to use ast.literal_eval is not applicable for evaluating lambda functions. The comment is partially useful as it highlights a security concern, but the suggested alternative is not viable.
The comment could be misleading because it suggests an alternative that doesn't work for the use case. It might be better to suggest a more appropriate solution or simply highlight the risk without suggesting an incorrect alternative.
While the suggestion is incorrect, the identification of a security risk is still valuable. The comment could be improved by removing the incorrect suggestion and focusing on the risk.
Keep the comment but acknowledge that the suggestion is incorrect. The security risk is valid, but the alternative provided is not applicable.

2. agents-api/agents_api/autogen/Tasks.py:353

Draft comment:
The parallelism field is correctly annotated with a minimum value of 1, ensuring it is always a positive integer. This is a good practice.
Reason this comment was not posted:
Confidence changes required: 0%
The parallelism field in the MapReduceStep model in Tasks.py is correctly annotated with a minimum value of 1, ensuring that parallelism is always a positive integer. This is a good practice to prevent invalid configurations.

3. agents-api/tests/test_execution_workflow.py:702

Draft comment:
The loop for testing different parallelism values in map-reduce steps is a good approach for comprehensive testing.
Reason this comment was not posted:
Confidence changes required: 0%
The test for map-reduce step parallelism in test_execution_workflow.py is well-structured, using a loop to test different parallelism values. This ensures comprehensive testing of the feature.

Workflow ID: wflow_Iy7jUbiDda5BDRcH

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on c54f172 in 22 seconds

More details

Looked at 87 lines of code in 6 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. agents-api/agents_api/autogen/Tasks.py:355

Draft comment:
The description for the parallelism field is slightly misleading. It mentions running the reduce expression in parallel, but it should be clarified that it applies to the map step as well.
Reason this comment was not posted:
Confidence changes required: 50%
The parallelism field has been added to the MapReduceStep class with a range constraint of 1 to 100. This is a good practice to prevent invalid values. However, the description of the parallelism field in the docstring is slightly misleading. It mentions running the reduce expression in parallel, but it should be clarified that it applies to the map step as well.

Workflow ID: wflow_5K8waNlE76T5G9G2

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Signed-off-by: Diwank Singh Tomer <[email protected]>

ellipsis-dev

❌ Changes requested. Incremental review on 888887a in 58 seconds

More details

Looked at 205 lines of code in 12 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. agents-api/agents_api/activities/utils.py:42

Draft comment:
Adding time to ALLOWED_FUNCTIONS can lead to non-deterministic behavior. Ensure this is intended and consider the implications for reproducibility.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is about a change made in the diff, specifically the addition of the 'time' function. It raises a valid concern about non-deterministic behavior, which is a potential issue when using the 'time' function in evaluations. However, the comment asks to ensure the behavior is intended, which violates the rule against asking for confirmation or ensuring intentions.
The comment does raise a valid point about non-deterministic behavior, which is important for code quality. However, it also asks for confirmation, which is against the rules.
While the comment does ask for confirmation, the concern about non-deterministic behavior is significant enough to warrant keeping the comment, as it directly relates to a potential issue introduced by the change.
Keep the comment because it raises a valid concern about non-deterministic behavior, which is directly related to the change made in the diff.

Workflow ID: wflow_HZiZuGG5Zu90FfTW

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev · 2024-09-07T02:05:57Z

agents-api/agents_api/activities/utils.py

    "match_regex": lambda pattern, string: bool(re2.fullmatch(pattern, string)),
    "max": max,
    "min": min,
+    "random": random,


Adding random to ALLOWED_FUNCTIONS can lead to non-deterministic behavior. Ensure this is intended and consider the implications for reproducibility.

creatorrr added 3 commits September 4, 2024 16:06

refactor(agents-api): Minor refactor to make task execution workflow …

95e0df6

…easier to read Signed-off-by: Diwank Singh Tomer <[email protected]>

feat(agents-api): Add parallel flag to map reduce

1860574

Signed-off-by: Diwank Singh Tomer <[email protected]>

feat(agents-api): Add parallelism option to map-reduce step

c96e0fb

Signed-off-by: Diwank Singh Tomer <[email protected]>