Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(agents-api): Add parallelism option to map-reduce step #490

Merged
merged 9 commits into from
Sep 7, 2024

Conversation

creatorrr
Copy link
Contributor

@creatorrr creatorrr commented Sep 4, 2024

  • fix(agents-api,typespec): Fix chat/entry typespec models to include tool_calls
  • refactor(agents-api): Minor refactor to make task execution workflow easier to read
  • feat(agents-api): Add parallel flag to map reduce
  • feat(agents-api): Add parallelism option to map-reduce step

🚀 This description was created by Ellipsis for commit 888887a

Summary:

Introduced a parallelism option for map-reduce steps in agents-api, with updates to task execution workflows, TypeScript models, and tests, including fixes and refactoring.

Key points:

  • fix(agents-api,typespec): Fix chat/entry typespec models to include tool_calls
  • refactor(agents-api): Minor refactor to make task execution workflow easier to read
  • feat(agents-api): Add parallel flag to map reduce
  • feat(agents-api): Add parallelism option to map-reduce step
  • Feature Addition: Added parallelism option for map-reduce steps in agents-api.
  • Code Changes:
    • Updated agents-api/agents_api/activities/task_steps/base_evaluate.py to handle extra lambda functions.
    • Modified agents-api/agents_api/activities/utils.py to include accumulate, map, range, and reduce in ALLOWED_FUNCTIONS.
    • Added parallelism attribute to MapReduceStep in agents-api/agents_api/autogen/Tasks.py.
    • Updated agents-api/agents_api/env.py to include task_max_parallelism.
    • Refactored agents-api/agents_api/workflows/task_execution/__init__.py to handle parallel map-reduce execution.
    • Created agents-api/agents_api/workflows/task_execution/helpers.py for helper functions supporting parallel execution.
  • Tests: Added tests in agents-api/tests/test_execution_workflow.py for parallel map-reduce functionality.
  • TypeScript SDK: Updated models and schemas in sdks/ts/src/api/models and sdks/ts/src/api/schemas to support parallelism in map-reduce steps.
  • Typespec Update: Modified typespec/tasks/steps.tsp to include parallelism in MapReduceStep.

Generated with ❤️ by ellipsis.dev

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Reviewed everything up to c96e0fb in 32 seconds

More details
  • Looked at 969 lines of code in 16 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. agents-api/agents_api/activities/utils.py:28
  • Draft comment:
    Ensure that the use of map and reduce in ALLOWED_FUNCTIONS is safe and does not introduce security vulnerabilities. Consider restricting their use or validating inputs.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment is speculative, asking to ensure safety, which violates the rule against speculative comments. It doesn't point out a definite issue but rather suggests a potential concern. The comment doesn't provide a specific actionable change or point out a definite problem.
    The comment does touch on a valid concern about security, which is important. However, it doesn't specify a concrete issue or suggest a specific change, making it less actionable.
    While security is important, the comment should specify a definite issue or suggest a specific change to be actionable. Speculative comments are not useful according to the rules.
    The comment should be removed because it is speculative and does not point out a definite issue or suggest a specific change.
2. agents-api/agents_api/workflows/task_execution/helpers.py:22
  • Draft comment:
    Ensure that all usages of continue_as_child handle errors appropriately and consistently across the codebase.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The continue_as_child function is used multiple times across the codebase. It might be beneficial to ensure that all its usages are consistent and handle errors appropriately.
3. agents-api/agents_api/workflows/task_execution/helpers.py:193
  • Draft comment:
    Ensure that the parallelism value is validated and does not exceed system capabilities to prevent resource exhaustion.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment is suggesting a validation step that is not present in the code. The code does limit parallelism to task_max_parallelism, but it does not check against system capabilities. This could be a valid concern if task_max_parallelism is not set appropriately. However, the comment is speculative as it assumes task_max_parallelism might not be set correctly.
    The comment might be unnecessary if task_max_parallelism is already configured to match system capabilities. Without knowing the configuration context, it's hard to judge the necessity of the comment.
    The comment could be useful if there's a risk that task_max_parallelism is not aligned with system capabilities, but this is speculative without additional context.
    The comment is speculative and assumes a potential misconfiguration of task_max_parallelism. Without evidence of such a misconfiguration, the comment should be removed.

Workflow ID: wflow_zTq2u3XyqQ43gIjj


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Incremental review on 36526f9 in 24 seconds

More details
  • Looked at 105 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_7hjinBgeEiji4Gzd


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

agents-api/tests/test_execution_workflow.py Outdated Show resolved Hide resolved
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 67d1d63 in 44 seconds

More details
  • Looked at 226 lines of code in 4 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. agents-api/agents_api/workflows/task_execution/helpers.py:210
  • Draft comment:
    The assert parallelism > 1 check is too restrictive. Consider allowing parallelism of 1, which means no parallel execution but is still valid.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment is about a change in the diff, specifically the assertion assert parallelism > 1. The suggestion to allow parallelism of 1 could be a valid point if there's a use case for non-parallel execution within this function. However, the current implementation seems to assume that parallelism should always be greater than 1 for this function to make sense. The comment is actionable as it suggests a specific change to the code.
    The comment does not provide a strong justification for why allowing parallelism of 1 would be beneficial. It assumes that non-parallel execution is a valid use case without evidence.
    The comment is still actionable as it suggests a specific change. The decision to allow parallelism of 1 would depend on the intended use cases for this function, which the comment does not address.
    The comment is about a change in the diff and suggests a specific, actionable change. It should be kept for further consideration by the PR author.
2. agents-api/agents_api/workflows/task_execution/helpers.py:224
  • Draft comment:
    Ensure that all tasks created with asyncio.create_task are awaited to prevent potential resource leaks.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The asyncio.create_task usage in execute_map_reduce_step_parallel is correct, but ensure that all tasks are awaited to prevent potential resource leaks.
3. agents-api/agents_api/workflows/task_execution/helpers.py:246
  • Draft comment:
    The schedule_to_close_timeout was increased from 2 to 5 seconds. Ensure this aligns with expected execution times and doesn't introduce unnecessary delays.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The schedule_to_close_timeout in execute_map_reduce_step_parallel was increased from 2 to 5 seconds. This change should be validated to ensure it aligns with expected execution times.

Workflow ID: wflow_dMepXVCXJ0Ffcq0I


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@creatorrr creatorrr force-pushed the f/parallel-map-reduce branch from 67d1d63 to 45106f2 Compare September 4, 2024 23:27
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Incremental review on 45106f2 in 42 seconds

More details
  • Looked at 245 lines of code in 5 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. agents-api/agents_api/workflows/task_execution/helpers.py:246
  • Draft comment:
    The increase in schedule_to_close_timeout from 2 to 5 seconds should be justified or documented to ensure it aligns with performance expectations.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The schedule_to_close_timeout was increased from 2 to 5 seconds. This change should be justified or documented to ensure it aligns with performance expectations.
2. agents-api/tests/test_execution_workflow.py:700
  • Draft comment:
    Consider testing various levels of parallelism, not just a fixed value of 10, to ensure robustness.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The test for map-reduce step parallelism was changed to a fixed parallelism of 10. It would be beneficial to test various levels of parallelism to ensure robustness.

Workflow ID: wflow_6BlEWpLQSn23Oxj0


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Incremental review on d139570 in 42 seconds

More details
  • Looked at 262 lines of code in 8 files
  • Skipped 0 files when reviewing.
  • Skipped posting 5 drafted comments based on config settings.
1. sdks/ts/src/api/schemas/$Tasks_CreateTaskRequest.ts:142
  • Draft comment:
    The parallelism field is correctly set with a minimum value of 1, ensuring valid configurations.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The parallelism field in the TypeScript schemas is correctly set with a minimum value of 1, ensuring that parallelism is always a positive integer. This is a good practice to prevent invalid configurations.
2. sdks/ts/src/api/schemas/$Tasks_PatchTaskRequest.ts:136
  • Draft comment:
    The parallelism field is correctly set with a minimum value of 1, ensuring valid configurations. This applies to similar lines in other TypeScript schema files.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The parallelism field in the TypeScript schemas is correctly set with a minimum value of 1, ensuring that parallelism is always a positive integer. This is a good practice to prevent invalid configurations. This comment applies to similar lines in other TypeScript schema files.
3. sdks/ts/src/api/schemas/$Tasks_Task.ts:142
  • Draft comment:
    The parallelism field is correctly set with a minimum value of 1, ensuring valid configurations. This applies to similar lines in other TypeScript schema files.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The parallelism field in the TypeScript schemas is correctly set with a minimum value of 1, ensuring that parallelism is always a positive integer. This is a good practice to prevent invalid configurations. This comment applies to similar lines in other TypeScript schema files.
4. sdks/ts/src/api/schemas/$Tasks_UpdateTaskRequest.ts:142
  • Draft comment:
    The parallelism field is correctly set with a minimum value of 1, ensuring valid configurations. This applies to similar lines in other TypeScript schema files.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The parallelism field in the TypeScript schemas is correctly set with a minimum value of 1, ensuring that parallelism is always a positive integer. This is a good practice to prevent invalid configurations. This comment applies to similar lines in other TypeScript schema files.
5. typespec/tasks/steps.tsp:255
  • Draft comment:
    The parallelism field is correctly set with a minimum value of 1, ensuring valid configurations.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The parallelism field in the TypeSpec file is correctly set with a minimum value of 1, ensuring that parallelism is always a positive integer. This is a good practice to prevent invalid configurations.

Workflow ID: wflow_QW2WELGI7MRvxN6I


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@creatorrr creatorrr force-pushed the f/parallel-map-reduce branch from d139570 to 50c8e66 Compare September 4, 2024 23:45
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 50c8e66 in 37 seconds

More details
  • Looked at 241 lines of code in 8 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. agents-api/agents_api/activities/task_steps/base_evaluate.py:36
  • Draft comment:
    Using eval is a security risk as it can execute arbitrary code. Consider using a safer alternative, such as ast.literal_eval, if possible.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment correctly identifies a security risk with the use of eval, which is relevant to the changes made in the diff. However, the suggestion to use ast.literal_eval is not applicable for evaluating lambda functions. The comment is partially useful as it highlights a security concern, but the suggested alternative is not viable.
    The comment could be misleading because it suggests an alternative that doesn't work for the use case. It might be better to suggest a more appropriate solution or simply highlight the risk without suggesting an incorrect alternative.
    While the suggestion is incorrect, the identification of a security risk is still valuable. The comment could be improved by removing the incorrect suggestion and focusing on the risk.
    Keep the comment but acknowledge that the suggestion is incorrect. The security risk is valid, but the alternative provided is not applicable.
2. agents-api/agents_api/autogen/Tasks.py:353
  • Draft comment:
    The parallelism field is correctly annotated with a minimum value of 1, ensuring it is always a positive integer. This is a good practice.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The parallelism field in the MapReduceStep model in Tasks.py is correctly annotated with a minimum value of 1, ensuring that parallelism is always a positive integer. This is a good practice to prevent invalid configurations.
3. agents-api/tests/test_execution_workflow.py:702
  • Draft comment:
    The loop for testing different parallelism values in map-reduce steps is a good approach for comprehensive testing.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The test for map-reduce step parallelism in test_execution_workflow.py is well-structured, using a loop to test different parallelism values. This ensures comprehensive testing of the feature.

Workflow ID: wflow_Iy7jUbiDda5BDRcH


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on c54f172 in 22 seconds

More details
  • Looked at 87 lines of code in 6 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. agents-api/agents_api/autogen/Tasks.py:355
  • Draft comment:
    The description for the parallelism field is slightly misleading. It mentions running the reduce expression in parallel, but it should be clarified that it applies to the map step as well.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The parallelism field has been added to the MapReduceStep class with a range constraint of 1 to 100. This is a good practice to prevent invalid values. However, the description of the parallelism field in the docstring is slightly misleading. It mentions running the reduce expression in parallel, but it should be clarified that it applies to the map step as well.

Workflow ID: wflow_5K8waNlE76T5G9G2


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@creatorrr creatorrr merged commit 60a0129 into dev-tasks Sep 7, 2024
1 of 5 checks passed
@creatorrr creatorrr deleted the f/parallel-map-reduce branch September 7, 2024 02:05
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Incremental review on 888887a in 58 seconds

More details
  • Looked at 205 lines of code in 12 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. agents-api/agents_api/activities/utils.py:42
  • Draft comment:
    Adding time to ALLOWED_FUNCTIONS can lead to non-deterministic behavior. Ensure this is intended and consider the implications for reproducibility.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment is about a change made in the diff, specifically the addition of the 'time' function. It raises a valid concern about non-deterministic behavior, which is a potential issue when using the 'time' function in evaluations. However, the comment asks to ensure the behavior is intended, which violates the rule against asking for confirmation or ensuring intentions.
    The comment does raise a valid point about non-deterministic behavior, which is important for code quality. However, it also asks for confirmation, which is against the rules.
    While the comment does ask for confirmation, the concern about non-deterministic behavior is significant enough to warrant keeping the comment, as it directly relates to a potential issue introduced by the change.
    Keep the comment because it raises a valid concern about non-deterministic behavior, which is directly related to the change made in the diff.

Workflow ID: wflow_HZiZuGG5Zu90FfTW


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

"match_regex": lambda pattern, string: bool(re2.fullmatch(pattern, string)),
"max": max,
"min": min,
"random": random,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding random to ALLOWED_FUNCTIONS can lead to non-deterministic behavior. Ensure this is intended and consider the implications for reproducibility.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant