Added the mapper tool to map trace to vSwarm proxies #541

Draft
wants to merge 1 commit into
base: main
from

Conversation

KarthikL1729
Collaborator

Added documentation for the mapper tool.

Summary

A small summary of the requirements (in one/two sentences).

The mapper tool maps each Azure trace function to a vSwarm proxy function by comparing the profiles of the vSwarm functions against the durations and memory utilization of the trace functions.

Implementation Notes ⚒️

  • Briefly outline the overall technical solution. If necessary, identify talking points where the reviewer's attention should be drawn to.

This PR adds the mapper tool, which loads the function duration and memory traces with Pandas, models the mapping as a linear sum assignment problem to find the optimal assignment, and writes an output file that maps each HashFunction in the trace to a proxy function in vSwarm. The profile ships as a tar.gz archive in the mapper directory and must be extracted before running the mapper.
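
To make the formulation concrete, here is a minimal sketch of a linear sum assignment on toy data. The function names, the profile values, and the cost metric (sum of relative differences in duration and memory) are assumptions for illustration and are not taken from the PR's code. The sketch brute-forces all one-to-one assignments with the standard library, which only works for a handful of functions; a real tool would more likely call a polynomial-time solver such as `scipy.optimize.linear_sum_assignment`.

```python
from itertools import permutations

# Hypothetical toy data: (duration in ms, memory in MB) per function.
trace_functions = {
    "hash_a": (120.0, 150.0),
    "hash_b": (15.0, 60.0),
    "hash_c": (900.0, 400.0),
}
proxy_functions = {
    "proxy_fast": (10.0, 64.0),
    "proxy_medium": (100.0, 128.0),
    "proxy_slow": (1000.0, 512.0),
}


def cost(trace, proxy):
    """Assumed cost: sum of relative differences in duration and memory."""
    return sum(abs(t - p) / max(t, p) for t, p in zip(trace, proxy))


def assign_unique(traces, proxies):
    """Return the one-to-one mapping with the minimum total cost.

    Brute force over all permutations, so only suitable for tiny inputs.
    """
    if len(traces) > len(proxies):
        raise ValueError("not enough proxy functions for a unique mapping")
    trace_names = list(traces)
    best_total, best_mapping = float("inf"), None
    for chosen in permutations(proxies, len(trace_names)):
        total = sum(
            cost(traces[t], proxies[p]) for t, p in zip(trace_names, chosen)
        )
        if total < best_total:
            best_total = total
            best_mapping = dict(zip(trace_names, chosen))
    return best_mapping


mapping = assign_unique(trace_functions, proxy_functions)
```

Each trace function ends up with a distinct proxy, and the total cost over all pairs is minimal among all such one-to-one mappings.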

External Dependencies 🍀

  • N/A

Breaking API Changes ⚠️

  • N/A

Simply specify none (N/A) if not applicable.

Added documentation for the mapper tool.

Signed-off-by: KarthikL1729 <[email protected]>
Contributor

@leokondrashov leokondrashov left a comment

This is a very brief look at a single function of the mapper. Please fix all possible problems before submitting the PR.

log.info(
    f"Getting closest proxy function for every trace function. Note: Mapping may not be unique"
)
trace_functions, err = get_closest_proxy_function(
Contributor

That is extremely unexpected behavior. If something goes sideways in the unique assignment, I'll get a non-unique one. I would be pretty annoyed to find that my tool tried to be too smart about the operation I asked it to do. If the operation is not possible, the tool should just exit.
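
A minimal sketch of the fail-fast behavior requested here, assuming a hypothetical `map_unique` step and `MappingError` exception (neither name comes from the PR). When the unique mapping cannot be produced, the tool reports the error and returns a non-zero exit code instead of falling back to a different mode.

```python
import sys


class MappingError(Exception):
    """Hypothetical error: a one-to-one mapping cannot be produced."""


def map_unique(trace_names, proxy_names):
    """Hypothetical stand-in for the unique-assignment step."""
    if len(trace_names) > len(proxy_names):
        raise MappingError(
            f"{len(trace_names)} trace functions but only "
            f"{len(proxy_names)} proxy functions"
        )
    return dict(zip(trace_names, proxy_names))


def run(trace_names, proxy_names):
    """Return a process exit code; never silently switch mapping modes."""
    try:
        mapping = map_unique(trace_names, proxy_names)
    except MappingError as exc:
        print(f"error: unique mapping is not possible: {exc}", file=sys.stderr)
        return 1
    print(mapping)
    return 0
```

An entry point would then call `sys.exit(run(trace_names, proxy_names))`, so the failure is visible to the user and to any calling script.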

try:

trace_functions = OrderedDict(trace_functions)
proxy_functions = OrderedDict(proxy_functions)
Contributor

Again, why do you need an OrderedDict here?
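
One possible reading of this comment: since Python 3.7 the built-in `dict` is guaranteed to preserve insertion order, so if only the iteration order matters, a plain dict already provides it. The sketch below uses hypothetical trace entries, not data from the PR.

```python
# Hypothetical trace entries, inserted in index order.
trace_functions = {}
for index, name in enumerate(["hash_c", "hash_a", "hash_b"]):
    trace_functions[name] = {"index": index}

# Iterating a plain dict yields keys in insertion order.
ordered_names = list(trace_functions)
```

`OrderedDict` still differs in a few ways, such as order-sensitive equality and `move_to_end`, so it is only needed if the code relies on those.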

Comment on lines +98 to +101
for tf in trace_functions:
    if row_index == trace_functions[tf]["index"]:
        trace = tf
        break
Contributor

You search the dict linearly. Are you sure this data structure is suitable for the task? You also literally created trace_list, which has all of them in index order.
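
A small sketch of the alternative this comment points at, using hypothetical data shaped like the snippet under review. It assumes `trace_list` holds the trace function names at the position given by their `"index"` field, as the comment describes; the helper names are made up for illustration.

```python
# Hypothetical data shaped like the snippet under review.
trace_functions = {
    "hash_a": {"index": 0},
    "hash_b": {"index": 1},
    "hash_c": {"index": 2},
}

# Names stored at the position given by their "index" field.
trace_list = [None] * len(trace_functions)
for name, info in trace_functions.items():
    trace_list[info["index"]] = name


def find_linear(row_index):
    """The pattern under review: scan every entry, O(n) per lookup."""
    for tf in trace_functions:
        if row_index == trace_functions[tf]["index"]:
            return tf
    return None


def find_direct(row_index):
    """Index straight into the list, O(1) per lookup."""
    return trace_list[row_index]
```

Both helpers return the same name for a valid row index, but the direct lookup avoids rescanning the dict for every row of the assignment result.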

@leokondrashov leokondrashov marked this pull request as draft November 1, 2024 12:23
2 participants