-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Container support for AWS #205
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice progress! I left some comments.
The important issues are:
- CLI option and documentation
- Testing Node.js
- Moving resource allocation to proper resource classes.
- Checking the "update" of container-based function on AWS.
- Regression testing
Hi, Marcin, I have reviewed the changes. Also, I performed the regression tests for both Node.js and Python. |
@coderabbitai review |
Actions performedReview triggered.
|
Caution Review failedThe pull request is closed. WalkthroughThis pull request introduces a variety of changes across multiple files to enhance support for containerized deployment in serverless benchmarks. Key modifications include the addition of a new setup script for a Python package, updates to configuration files to reflect AWS deployment settings, and enhancements to Dockerfiles for both Python and Node.js functions. Additionally, several classes and methods across the codebase have been updated to accommodate new parameters related to container deployment, improving the functionality and flexibility of the deployment process. Changes
Possibly related issues
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 13
Outside diff range and nitpick comments (4)
sebs/openwhisk/openwhisk.py (2)
Line range hint
123-206
: Optimize Docker Image Build and Push Logic.The method includes redundant checks and can be optimized for clarity and performance.
- if registry_name is not None and registry_name != "": - repository_name = f"{registry_name}/{repository_name}" - else: - registry_name = "Docker Hub" + repository_name = f"{registry_name}/{repository_name}" if registry_name else "Docker Hub" # Check if the image is already in the registry. if is_cached and self.find_image(self.docker_client, repository_name, image_tag): self.logging.info( f"Skipping building OpenWhisk Docker package for {benchmark}, using " f"Docker image {repository_name}:{image_tag} from registry: " f"{registry_name}." ) return False, repository_name self.logging.info( f"Image {repository_name}:{image_tag} doesn't exist in the registry, " f"building OpenWhisk package for {benchmark}." ) build_dir = os.path.join(directory, "docker") os.makedirs(build_dir, exist_ok=True) shutil.copy( os.path.join(DOCKER_DIR, self.name(), language_name, "Dockerfile.function"), os.path.join(build_dir, "Dockerfile"), ) for fn in os.listdir(directory): if fn not in ("index.js", "__main__.py"): file = os.path.join(directory, fn) shutil.move(file, build_dir) with open(os.path.join(build_dir, ".dockerignore"), "w") as f: f.write("Dockerfile") builder_image = self.system_config.benchmark_base_images(self.name(), language_name)[ language_version ] self.logging.info(f"Build the benchmark base image {repository_name}:{image_tag}.") buildargs = {"VERSION": language_version, "BASE_IMAGE": builder_image} image, _ = self.docker_client.images.build( tag=f"{repository_name}:{image_tag}", path=build_dir, buildargs=buildargs ) # Now push the image to the registry self.logging.info( f"Push the benchmark base image {repository_name}:{image_tag} " f"to registry: {registry_name}." ) self.push_image_to_repository(self.docker_client, repository_name, image_tag) return True, repository_name
Line range hint
255-335
: Optimize Function Creation Logic.The method includes redundant checks and can be optimized for clarity and performance.
- function_found = False - docker_image = "" + function_found = any( + func_name in line.split()[0] for line in actions.stdout.decode().split("\n") if line + ) function_cfg = OpenWhiskFunctionConfig.from_benchmark(code_package) function_cfg.storage = cast(Minio, self.get_storage()).config if function_found: # docker image is overwritten by the update res = OpenWhiskFunction( func_name, code_package.benchmark, code_package.hash, function_cfg ) # Update function - we don't know what version is stored self.logging.info(f"Retrieved existing OpenWhisk action {func_name}.") self.update_function(res, code_package, container_deployment, container_uri) else: try: self.logging.info(f"Creating new OpenWhisk action {func_name}") docker_image = self.system_config.benchmark_image_name( self.name(), code_package.benchmark, code_package.language_name, code_package.language_version, ) subprocess.run( [ *self.get_wsk_cmd(), "action", "create", func_name, "--web", "true", "--docker", docker_image, "--memory", str(code_package.benchmark_config.memory), "--timeout", str(code_package.benchmark_config.timeout * 1000), *self.storage_arguments(), code_package.code_location, ], stderr=subprocess.PIPE, stdout=subprocess.PIPE, check=True, ) function_cfg.docker_image = docker_image res = OpenWhiskFunction( func_name, code_package.benchmark, code_package.hash, function_cfg ) except subprocess.CalledProcessError as e: self.logging.error(f"Cannot create action {func_name}.") self.logging.error(f"Output: {e.stderr.decode('utf-8')}") raise RuntimeError(e) except FileNotFoundError: self.logging.error("Could not retrieve OpenWhisk functions - is path to wsk correct?") raise RuntimeError("Failed to access wsk binary") # Add LibraryTrigger to a new function trigger = LibraryTrigger(func_name, self.get_wsk_cmd()) trigger.logging_handlers = self.logging_handlers res.add_trigger(trigger) return ressebs.py (1)
Line range hint
541-568
: Optimize Resource Listing Logic.The function includes redundant variables and can be optimized for clarity and performance.
- for idx, bucket in enumerate(buckets): + for bucket in buckets: sebs_client.logging.info(f"({idx}) {bucket}") elif resource == "resource-groups": if deployment_client.name() != "azure": sebs_client.logging.error("Resource groups are only supported on Azure!") return - groups = deployment_client.config.resources.list_resource_groups( - deployment_client.cli_instance - ) - sebs_client.logging.info("Resource grup:") - for idx, bucket in enumerate(groups): - sebs_client.logging.info(f"({idx}) {bucket}") + groups = deployment_client.config.resources.list_resource_groups(deployment_client.cli_instance) + sebs_client.logging.info("Resource group:") + for group in groups: + sebs_client.logging.info(f"{group}")sebs/aws/aws.py (1)
Line range hint
150-188
: Use f-strings for improved readability.Convert format calls to f-strings for better readability.
- execute("zip -qu -r9 {}.zip * .".format(benchmark), shell=True, cwd=directory) - benchmark_archive = "{}.zip".format(os.path.join(directory, benchmark)) - self.logging.info("Created {} archive".format(benchmark_archive)) + execute(f"zip -qu -r9 {benchmark}.zip * .", shell=True, cwd=directory) + benchmark_archive = f"{os.path.join(directory, benchmark)}.zip" + self.logging.info(f"Created {benchmark_archive} archive")
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (22)
- benchmarks/wrappers/aws/python/setup.py (1 hunks)
- config/example.json (2 hunks)
- config/systems.json (2 hunks)
- dockerfiles/aws/nodejs/Dockerfile.function (1 hunks)
- dockerfiles/aws/python/Dockerfile.function (1 hunks)
- docs/modularity.md (1 hunks)
- docs/usage.md (1 hunks)
- sebs.py (14 hunks)
- sebs/aws/aws.py (10 hunks)
- sebs/aws/config.py (5 hunks)
- sebs/azure/azure.py (6 hunks)
- sebs/benchmark.py (6 hunks)
- sebs/cache.py (4 hunks)
- sebs/config.py (2 hunks)
- sebs/experiments/config.py (4 hunks)
- sebs/experiments/invocation_overhead.py (1 hunks)
- sebs/faas/system.py (4 hunks)
- sebs/gcp/gcp.py (4 hunks)
- sebs/local/local.py (3 hunks)
- sebs/openwhisk/openwhisk.py (9 hunks)
- sebs/regression.py (1 hunks)
- sebs/sebs.py (2 hunks)
Files not reviewed due to errors (1)
- sebs/gcp/gcp.py (no review received)
Files skipped from review due to trivial changes (3)
- benchmarks/wrappers/aws/python/setup.py
- dockerfiles/aws/nodejs/Dockerfile.function
- dockerfiles/aws/python/Dockerfile.function
Additional context used
Ruff
sebs/experiments/config.py
64-64: Use
config.get("flags", {})
instead of anif
blockReplace with
config.get("flags", {})
(SIM401)
sebs/sebs.py
110-110: Use f-string instead of
format
callConvert to f-string
(UP032)
115-115: Use f-string instead of
format
callConvert to f-string
(UP032)
sebs/local/local.py
154-154: Use f-string instead of
format
callConvert to f-string
(UP032)
sebs/experiments/invocation_overhead.py
41-41: Avoid extraneous parentheses
Remove extraneous parentheses
(UP034)
sebs/faas/system.py
263-263: Use f-string instead of
format
callConvert to f-string
(UP032)
sebs/aws/config.py
3-3:
uuid
imported but unusedRemove unused import:
uuid
(F401)
243-243: Use f-string instead of
format
callConvert to f-string
(UP032)
266-266: Use
dct.get("lambda-role", "")
instead of anif
blockReplace with
dct.get("lambda-role", "")
(SIM401)
sebs/openwhisk/openwhisk.py
238-238: Use f-string instead of
format
callConvert to f-string
(UP032)
sebs.py
139-139: Use f-string instead of
format
callConvert to f-string
(UP032)
370-370: Local variable
config
is assigned to but never usedRemove assignment to unused variable
config
(F841)
618-618: Loop control variable
idx
not used within loop bodyRename unused
idx
to_idx
(B007)
sebs/azure/azure.py
217-217: Use f-string instead of
format
callConvert to f-string
(UP032)
sebs/benchmark.py
180-180: Use f-string instead of
format
callConvert to f-string
(UP032)
490-490: Use f-string instead of
format
callConvert to f-string
(UP032)
sebs/gcp/gcp.py
199-199: Use f-string instead of
format
callConvert to f-string
(UP032)
sebs/aws/aws.py
80-80: Do not use mutable data structures for argument defaults
Replace with
None
; initialize within function(B006)
176-176: Use f-string instead of
format
callConvert to f-string
(UP032)
182-182: Use f-string instead of
format
callConvert to f-string
(UP032)
185-185: Use f-string instead of
format
callConvert to f-string
(UP032)
378-378: Use f-string instead of
format
callConvert to f-string
(UP032)
393-393: Use context handler for opening files
(SIM115)
401-401: Use f-string instead of
format
callConvert to f-string
(UP032)
408-410: Use f-string instead of
format
callConvert to f-string
(UP032)
516-520: Use f-string instead of
format
callConvert to f-string
(UP032)
Additional comments not posted (48)
config/example.json (2)
7-7
: Reminder: Ensure the new parametercontainer_deployment
is documented.The parameter
container_deployment
is a new addition. Ensure it is documented as an optional feature.
50-54
: Clarification ondocker_registry
key.The
docker_registry
key includes registry name, username, password, and container repository. Ensure these details are correctly configured.sebs/experiments/config.py (4)
10-10
: New attribute_container_deployment
added.The attribute
_container_deployment
is a new addition to theConfig
class.
35-38
: New propertycontainer_deployment
added.The property
container_deployment
is a new addition to theConfig
class.
50-50
: Includecontainer_deployment
in serialization.The
container_deployment
attribute is now included in the serialized output.
62-62
: Includecontainer_deployment
in deserialization.The
container_deployment
attribute is now included in the deserialized input.sebs/config.py (2)
42-44
: New methodsupported_container_deployment
added.The method
supported_container_deployment
is a new addition to theSeBSConfig
class.
60-62
: Updatebenchmark_image_name
method.The method
benchmark_image_name
now includes an optionalregistry
parameter.config/systems.json (2)
3-4
: New parametercontainer_deployment
added.The parameter
container_deployment
is a new addition under thegeneral
section.
74-75
: Includestorage.py
andsetup.py
in deployment configuration.The files
storage.py
andsetup.py
are now included in theaws
deployment configuration for Python.docs/usage.md (1)
7-7
: Clarify the scope of the--container-deployment
flag.The sentence "To enforce redeployment of code, benchmark inputs, container deployment (supported in AWS) please use flags
--update-code
,--update-storage
and--container-deployment
respectively." could be clearer. Consider rephrasing for better readability.- To enforce redeployment of code, benchmark inputs, container deployment (supported in AWS) please use flags `--update-code`, `--update-storage` and `--container-deployment` respectively. + To enforce redeployment of code, benchmark inputs, and container deployment (supported in AWS), please use the flags `--update-code`, `--update-storage`, and `--container-deployment` respectively.sebs/sebs.py (1)
88-89
: Use f-string instead offormat
call.Convert to f-string for better readability and performance.
- raise RuntimeError("Deployment {name} not supported!".format(name=name)) + raise RuntimeError(f"Deployment {name} not supported!")Likely invalid or redundant comment.
sebs/local/local.py (1)
135-136
: Document the new parametercontainer_deployment
.The
container_deployment
parameter is added but not documented. Add a docstring to explain its purpose.""" It would be sufficient to just pack the code and ship it as zip to AWS. However, to have a compatible function implementation across providers, we create a small module. Issue: relative imports in Python when using storage wrapper. Azure expects a relative import inside a module. Structure: function - function.py - storage.py - resources handler.py dir: directory where code is located benchmark: benchmark name container_deployment: flag indicating if container deployment is used """sebs/faas/system.py (7)
170-171
: LGTM!The addition of the
container_deployment
parameter in the abstract method signature is consistent with the changes required for container deployment support.
175-183
: LGTM!The abstract method
build_base_image
with the new signature is consistent with the changes required for container deployment support.
186-187
: LGTM!The abstract method
find_image
with the new signature is consistent with the changes required for container deployment support.
190-191
: LGTM!The abstract method
repository_authorization
with the new signature is consistent with the changes required for container deployment support.
194-195
: LGTM!The abstract method
push_image_to_repository
with the new signature is consistent with the changes required for container deployment support.
198-204
: LGTM!The addition of the
container_deployment
andcontainer_uri
parameters in the abstract methodcreate_function
is consistent with the changes required for container deployment support.
212-218
: LGTM!The addition of the
container_deployment
andcontainer_uri
parameters in the abstract methodupdate_function
is consistent with the changes required for container deployment support.sebs/cache.py (2)
165-171
: LGTM!The addition of the
container_uri
parameter in theadd_code_package
function is consistent with the changes required for container deployment support.
241-245
: LGTM!The addition of the
container_uri
parameter in theupdate_code_package
function is consistent with the changes required for container deployment support.sebs/aws/config.py (4)
229-238
: LGTM!The function
check_ecr_repository_exists
correctly checks if an ECR repository exists and handles exceptions appropriately.
239-255
: LGTM!The function
create_ecr_repository
correctly creates an ECR repository if it does not exist and handles exceptions appropriately.Tools
Ruff
243-243: Use f-string instead of
format
callConvert to f-string
(UP032)
256-265
: LGTM!The
initialize
function correctly initializes the Docker registry details.
287-298
: LGTM!The
update_cache
function correctly updates the cache with Docker registry details.sebs/regression.py (1)
346-364
: LGTM!The
regression_suite
function correctly includes assertions for deployment keys.docs/modularity.md (1)
343-343
: Update Documentation for New Parameters inupdate_function
.The
update_function
method now acceptscontainer_deployment
andcontainer_uri
parameters, which should be documented to explain their purpose and usage.sebs/openwhisk/openwhisk.py (5)
114-122
: LGTM! Proper Error Handling inpush_image_to_repository
.The method correctly handles errors by logging and raising a RuntimeError if an error is found in the response.
Line range hint
215-239
: LGTM! Convert Format Call to f-String.The method correctly handles packaging the code for deployment. Convert the format call to an f-string for consistency.
- self.logging.info("Zip archive size {:2f} MB".format(bytes_size / 1024.0 / 1024.0)) + self.logging.info(f"Zip archive size {bytes_size / 1024.0 / 1024.0:.2f} MB")
Line range hint
337-365
: LGTM! Proper Function Update Handling.The method correctly updates the function and handles errors.
Line range hint
518-518
: LGTM! Proper Cold Start Enforcement.The method correctly updates the function configuration to enforce a cold start.
Line range hint
97-112
: Ensure Proper Handling of Docker Manifest Inspection.The
find_image
method attempts to inspect a Docker manifest. Ensure that theexperimentalManifest
configuration is correctly set and that theexecute
function handles errors gracefully.Verification successful
Ensure Proper Handling of Docker Manifest Inspection.
The
execute
function insebs/utils.py
raises aRuntimeError
if the command fails, which ensures that errors are handled gracefully. TheexperimentalManifest
configuration is correctly set and used in the codebase.
sebs/utils.py
: Theexecute
function raises aRuntimeError
on failure, ensuring proper error handling.sebs/openwhisk/openwhisk.py
: Usage ofexperimentalManifest
configuration is appropriate.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the experimentalManifest configuration and error handling in the execute function. # Test: Search for the experimentalManifest configuration usage. Expect: Proper error handling and configuration. rg 'experimentalManifest'Length of output: 596
Script:
#!/bin/bash # Description: Verify the error handling in the execute function. # Search for the definition of the execute function rg 'def execute'Length of output: 292
Script:
#!/bin/bash # Description: Inspect the implementation of the execute function in sebs/utils.py for error handling. # Display the implementation of the execute function in sebs/utils.py rg --context 10 'def execute' sebs/utils.pyLength of output: 719
sebs.py (4)
100-104
: LGTM! Proper Addition of Container Deployment Option.The new option for container deployment is correctly added and documented.
Line range hint
125-151
: LGTM! Convert Format Call to f-String.The function correctly updates the configuration and handles the new option. Convert the format call to an f-string for consistency.
- sebs_client.logging.info("Created experiment output at {}".format(output_dir)) + sebs_client.logging.info(f"Created experiment output at {output_dir}")
Line range hint
231-243
: LGTM! Proper Invocation Logic Update.The function correctly updates the invocation logic and handles the new container deployment option.
347-347
: LGTM! Proper Regression Logic Update.The function correctly updates the regression logic and handles the new container deployment option.
sebs/azure/azure.py (2)
Line range hint
163-218
: LGTM! Convert Format Call to f-String.The method correctly handles packaging the code for deployment and raises an error for unsupported container deployment. Convert the format call to an f-string for consistency.
- execute("zip -qu -r9 {}.zip * .".format(benchmark), shell=True, cwd=directory) + execute(f"zip -qu -r9 {benchmark}.zip * .", shell=True, cwd=directory)
292-298
: LGTM! Proper Function Update Handling.The method correctly updates the function and handles errors.
sebs/benchmark.py (5)
127-131
: LGTM!The
container_uri
property is correctly implemented.
144-147
: LGTM!The
container_deployment
property is correctly implemented.
177-180
: LGTM! Consider using an f-string.The
_container_deployment
attribute is initialized correctly. You can optimize the string formatting.- raise RuntimeError("Benchmark {benchmark} not found!".format(benchmark=self._benchmark)) + raise RuntimeError(f"Benchmark {self._benchmark} not found!")Tools
Ruff
180-180: Use f-string instead of
format
callConvert to f-string
(UP032)
483-495
: LGTM!The
build
method handles the container deployment flag and URI correctly.Also applies to: 516-547
Tools
Ruff
490-490: Use f-string instead of
format
callConvert to f-string
(UP032)
538-544
: LGTM!The cache update logic correctly includes the container URI.
sebs/aws/aws.py (4)
202-208
: LGTM!The code changes are approved.
494-496
: Use f-strings for improved readability.Convert format calls to f-strings for better readability.
- self.client.update_function_configuration( - FunctionName=name, - Timeout=function.config.timeout, - MemorySize=function.config.memory, - ) + self.client.update_function_configuration(FunctionName=name, Timeout=function.config.timeout, MemorySize=function.config.memory)Likely invalid or redundant comment.
185-185
: Use context manager for file operations.Use a context manager to open files to ensure they are properly closed after use.
- package_body = open(package, "rb").read() + with open(package, "rb") as file: + package_body = file.read()Likely invalid or redundant comment.
Tools
Ruff
185-185: Use f-string instead of
format
callConvert to f-string
(UP032)
67-79
: Avoid using mutable default arguments.Replace mutable default arguments with
None
and initialize them within the function.- if self.config.resources.docker_username: - if self.config.resources.docker_registry: - docker_client.login( - username=self.config.resources.docker_username, - password=self.config.resources.docker_password, - registry=self.config.resources.docker_registry, - ) - else: - docker_client.login( - username=self.config.resources.docker_username, - password=self.config.resources.docker_password, - ) + docker_username = self.config.resources.docker_username + docker_password = self.config.resources.docker_password + docker_registry = self.config.resources.docker_registry + if docker_username and docker_password: + if docker_registry: + docker_client.login( + username=docker_username, + password=docker_password, + registry=docker_registry, + ) + else: + docker_client.login( + username=docker_username, + password=docker_password, + )Likely invalid or redundant comment.
@@ -53,6 +59,7 @@ def deserialize(config: dict) -> "Config": | |||
cfg._update_code = config["update_code"] | |||
cfg._update_storage = config["update_storage"] | |||
cfg._download_results = config["download_results"] | |||
cfg._container_deployment = config["container_deployment"] | |||
cfg._runtime = Runtime.deserialize(config["runtime"]) | |||
cfg._flags = config["flags"] if "flags" in config else {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optimize dictionary access.
Use config.get("flags", {})
instead of an if
block.
- cfg._flags = config["flags"] if "flags" in config else {}
+ cfg._flags = config.get("flags", {})
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
cfg._flags = config["flags"] if "flags" in config else {} | |
cfg._flags = config.get("flags", {}) |
Tools
Ruff
64-64: Use
config.get("flags", {})
instead of anif
blockReplace with
config.get("flags", {})
(SIM401)
@@ -151,9 +153,15 @@ def package_code( | |||
mbytes = bytes_size / 1024.0 / 1024.0 | |||
self.logging.info("Function size {:2f} MB".format(mbytes)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use f-string instead of format
call.
Convert to f-string for better readability and performance.
- self.logging.info("Function size {:2f} MB".format(mbytes))
+ self.logging.info(f"Function size {mbytes:.2f} MB")
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
self.logging.info("Function size {:2f} MB".format(mbytes)) | |
self.logging.info(f"Function size {mbytes:.2f} MB") |
Tools
Ruff
154-154: Use f-string instead of
format
callConvert to f-string
(UP032)
sebs/aws/aws.py
Outdated
code_package.benchmark, | ||
code_package.language_name, | ||
code_package.language_version, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use f-strings for improved readability.
Convert format calls to f-strings for better readability.
- func_name = "{}-{}-{}".format(
- code_package.benchmark,
- code_package.language_name,
- code_package.language_version,
- )
+ func_name = f"{code_package.benchmark}-{code_package.language_name}-{code_package.language_version}"
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
code_package.benchmark, | |
code_package.language_name, | |
code_package.language_version, | |
func_name = f"{code_package.benchmark}-{code_package.language_name}-{code_package.language_version}" |
@@ -41,7 +41,7 @@ def before_sample(self, size: int, input_benchmark: dict): | |||
arr = bytearray((random.getrandbits(8) for i in range(size))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid extraneous parentheses.
Remove the extraneous parentheses for better readability.
- arr = bytearray((random.getrandbits(8) for i in range(size)))
+ arr = bytearray(random.getrandbits(8) for i in range(size))
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
arr = bytearray((random.getrandbits(8) for i in range(size))) | |
arr = bytearray(random.getrandbits(8) for i in range(size)) |
Tools
Ruff
41-41: Avoid extraneous parentheses
Remove extraneous parentheses
(UP034)
Remaining issues:
|
…iders that support container deployment]
…ate function with container URI
…uthorization, and rase NotImplementedError for GCP and Azure
df21c86
to
9df1aa0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 27
🧹 Outside diff range and nitpick comments (34)
dockerfiles/aws/nodejs/Dockerfile.function (3)
1-2
: Consider adding a default base image value and version constraints.While using ARG for base image is flexible, providing a default value would prevent build failures when BASE_IMAGE is not specified.
-ARG BASE_IMAGE +ARG BASE_IMAGE=public.ecr.aws/lambda/nodejs:18
9-9
: Set explicit working directory for handler resolution.While the CMD is correctly formatted for AWS Lambda, setting an explicit WORKDIR would make the handler resolution more reliable.
+WORKDIR /function CMD ["handler.handler"]
1-9
: Consider multi-architecture build support.Since AWS Lambda supports both x86_64 and arm64 architectures, and PR objectives mention testing for both architectures, consider implementing multi-arch builds:
- Use BuildKit's
--platform
flag during build- Tag images appropriately with architecture suffix
- Consider using Docker manifests for transparent arch selection
Example build command structure:
docker buildx build --platform linux/amd64,linux/arm64 \ --build-arg BASE_IMAGE=public.ecr.aws/lambda/nodejs:18 .
dockerfiles/aws/python/Dockerfile.function (2)
1-4
: Consider adding version validationWhile the setup is correct, consider adding a build-time check to validate the Python version to prevent issues with unsupported versions.
ARG BASE_IMAGE FROM $BASE_IMAGE ARG VERSION +RUN if [ -z "$VERSION" ]; then echo "Python version not specified" && exit 1; fi ENV PYTHON_VERSION=${VERSION}
1-11
: Address security concerns in container configurationSeveral security improvements should be considered:
- Run the container as a non-root user
- Add security scanning for dependencies
- Set appropriate file permissions
Add these security enhancements:
ARG BASE_IMAGE FROM $BASE_IMAGE ARG VERSION ENV PYTHON_VERSION=${VERSION} +# Create non-root user +RUN useradd -m -s /bin/bash appuser + COPY . function/ RUN touch function/__init__.py \ && if test -f "function/reqUirements.txt.${PYTHON_VERSION}"; then pip install --no-cache-dir -r function/requirements.txt -r function/requirements.txt.${PYTHON_VERSION} function/ ; else pip install --no-cache-dir -r function/requirements.txt function/ ; fi +# Set appropriate permissions +RUN chown -R appuser:appuser /function + +# Switch to non-root user +USER appuser + CMD ["function/handler.handler"]Consider also adding a security scanning step in your CI/CD pipeline using tools like Snyk or Trivy to scan for vulnerabilities in dependencies and container configuration.
config/example.json (1)
9-12
: Document multi-runtime container support.While this example shows Python 3.8 configuration, the PR objectives mention Node.js support and testing. Consider adding documentation or a separate example for Node.js container deployment configuration.
Consider adding comments or creating a separate example file:
"runtime": { "language": "python", # Supported languages: python, nodejs "version": "3.8" # Python: 3.8, Node.js: 16 },sebs/local/local.py (1)
Container deployment implementation needs cleanup handling improvements
The verification reveals several concerns with the container deployment implementation:
- Unlike Azure and GCP that explicitly reject container deployment with
NotImplementedError
, the Local platform silently accepts it without validation.- While there is a
remove_containers
flag (defaulting toTrue
), it only handles container removal during normal operation. There's no cleanup implementation for:
- Container images built during deployment
- Containers that may remain if the process crashes
- Allocated ports in the
allocated_ports
set🔗 Analysis chain
Line range hint
134-298
: Verify container deployment support for local platform.The container deployment implementation needs additional considerations:
- Early validation of container deployment support
- Cleanup of container images when functions are destroyed
Let's verify the container deployment implementation:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for container deployment support in other files # Test 1: Check if container deployment is validated echo "Checking for container deployment validation..." rg -l "container_deployment.*not.*supported" . # Test 2: Check for container cleanup implementation echo "Checking for container cleanup..." rg -l "container.*cleanup|cleanup.*container" .Length of output: 280
Script:
#!/bin/bash # Let's try a different approach to verify container deployment implementation # Check for container-related configuration and cleanup in the Local class echo "Checking Local class implementation..." rg -A 5 -B 5 "class Local" . # Check for container cleanup in destructor or cleanup methods echo "Checking destructor and cleanup methods..." ast-grep --pattern 'def __del__($$$) { $$$ }' # Check container deployment flag usage echo "Checking container_deployment flag usage..." rg -A 3 "container_deployment:" . # Check for docker-related cleanup echo "Checking docker cleanup..." rg -A 5 "docker.*clean|remove_container" .Length of output: 10543
sebs/cache.py (1)
249-252
: Consider adding container URI validation.The method correctly forwards the container URI to
add_code_package
. However, consider adding validation to ensure the URI follows the ECR format (e.g.,{account}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}
).Example validation:
def _validate_container_uri(uri: str) -> bool: import re ecr_pattern = r'^\d+\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com/[a-z0-9-_/]+:[a-zA-Z0-9-_.]+' return bool(re.match(ecr_pattern, uri))Also applies to: 295-295
docs/modularity.md (2)
343-343
: Document the new parameters in the update_function method.The method signature has been updated to support container deployment, but the new parameters need to be documented:
container_deployment
: Purpose and usage of this flagcontainer_uri
: Format and structure of the container URI
Line range hint
1-343
: Add documentation for container deployment support.The documentation needs to be updated to include the new container deployment feature for AWS. Consider adding:
A new section explaining container-based deployments:
- How to build container images locally
- Process for pushing images to ECR
- Configuration of container_deployment flag
- Format and usage of container_uri
- Caching mechanism for container-based functions
Updates to existing sections:
- Mention container deployment in the introduction
- Add container-specific configuration in the "New Benchmarks" section
- Update platform-specific documentation for AWS
Would you like me to help draft the new documentation sections?
sebs.py (1)
647-647
: Rename unused loop variable.The loop variable
idx
is not used within the loop body. Rename it to_
or_idx
to indicate it's intentionally unused.- for idx, group in enumerate(groups): + for _, group in enumerate(groups):🧰 Tools
🪛 Ruff
647-647: Loop control variable
idx
not used within loop bodyRename unused
idx
to_idx
(B007)
sebs/faas/system.py (7)
170-178
: Update docstring forpackage_code
methodThe
package_code
method signature has been updated to include thecontainer_deployment
parameter and now returns a tuple of(str, int, str)
. Please update the method's docstring to reflect the new parameter and explain the purpose of the additional return value.
175-183
: Add docstring for the newbuild_base_image
methodThe newly added
build_base_image
method lacks a docstring. Adding a docstring will improve code readability and maintainability by explaining the method's purpose, parameters, and return values.
185-187
: Add type annotations tofind_image
method parametersThe
find_image
method is missing type annotations for its parameters. Adding type hints enhances code clarity and aids static type checkers.Apply this diff to add type annotations:
@abstractmethod -def find_image(self, repository_client, repository_name, image_tag) -> bool: +def find_image( + self, + repository_client: Any, # Replace 'Any' with the specific type + repository_name: str, + image_tag: str, +) -> bool: pass
189-191
: Add type annotations topush_image_to_repository
method parametersSimilarly, the
push_image_to_repository
method lacks type annotations. Consider adding them for consistency and improved code quality.Apply this diff:
@abstractmethod -def push_image_to_repository(self, repository_client, repository_uri, image_tag): +def push_image_to_repository( + self, + repository_client: Any, # Replace 'Any' with the specific type + repository_uri: str, + image_tag: str, +): pass
193-200
: Updatecreate_function
method with type hints and docstringThe
create_function
method parameterscontainer_deployment
andcontainer_uri
are missing type annotations and a docstring. Adding these will enhance code clarity.Apply this diff:
@abstractmethod def create_function( self, code_package: Benchmark, func_name: str, - container_deployment, - container_uri, + container_deployment: bool, + container_uri: str, ) -> Function: + """ + Create a new function in the FaaS system. + + :param code_package: The benchmark code package. + :param func_name: The name of the function to create. + :param container_deployment: Flag indicating if container deployment is used. + :param container_uri: The URI of the container image. + :return: The created Function object. + """ pass
208-214
: Include type annotations and docstring forupdate_function
methodThe
update_function
method should also have type hints for the new parameters and a descriptive docstring.Apply this diff:
@abstractmethod def update_function( self, function: Function, code_package: Benchmark, - container_deployment, - container_uri, + container_deployment: bool, + container_uri: str, ): + """ + Update an existing function's code and configuration. + + :param function: The function to update. + :param code_package: The new benchmark code package. + :param container_deployment: Flag indicating if container deployment is used. + :param container_uri: The URI of the new container image. + """ pass
296-296
: Typographical error in log messageThere's a double "of" in the log message.
Apply this diff to correct it:
self.logging.info( f"Enforcing rebuild and update of of cached function " f"{func_name} with hash {function.code_package_hash}." ) + f"Enforcing rebuild and update of cached function " )
sebs/aws/config.py (4)
119-124
: Simplify default parameter handling in__init__
In the
__init__
method, you're checking for empty strings to assignNone
, which can be simplified by using default empty strings and theor
operator.Apply this diff to streamline the initialization:
def __init__( self, - registry: Optional[str] = None, - username: Optional[str] = None, - password: Optional[str] = None, + registry: str = "", + username: str = "", + password: str = "", ): super().__init__(name="aws") - self._docker_registry: Optional[str] = registry if registry != "" else None - self._docker_username: Optional[str] = username if username != "" else None - self._docker_password: Optional[str] = password if password != "" else None + self._docker_registry: Optional[str] = registry or None + self._docker_username: Optional[str] = username or None + self._docker_password: Optional[str] = password or None
126-128
: Use truthiness checks for empty stringsChecking if a string is not empty can be simplified by leveraging the truthy nature of non-empty strings in Python.
Simplify the assignments as follows:
-self._docker_registry: Optional[str] = registry if registry != "" else None -self._docker_username: Optional[str] = username if username != "" else None -self._docker_password: Optional[str] = password if password != "" else None +self._docker_registry: Optional[str] = registry or None +self._docker_username: Optional[str] = username or None +self._docker_password: Optional[str] = password or None
319-323
: Clarify omission ofdocker_password
in serializationThe
docker_password
is omitted from theserialize
method, which might be intentional for security reasons.
- If intentional, consider adding a comment explaining why
docker_password
is excluded to improve code readability.- If not, and if secure to do so, include it in the serialization:
"docker": { "registry": self.docker_registry, "username": self.docker_username, + "password": self.docker_password, },
Ensure that serializing sensitive information complies with security policies.
329-337
: Update cache only with valid Docker valuesWhen updating the cache, ensure that
None
values are not written unintentionally, which can overwrite valid entries or cause inconsistencies.Modify the cache update logic:
if self.docker_registry: cache.update_config( val=self.docker_registry, keys=["aws", "resources", "docker", "registry"] ) if self.docker_username: cache.update_config( val=self.docker_username, keys=["aws", "resources", "docker", "username"] ) if self.container_repository: cache.update_config( val=self.container_repository, keys=["aws", "resources", "container_repository"] )This ensures only meaningful values are stored in the cache.
sebs/openwhisk/openwhisk.py (4)
Line range hint
97-104
: Userepository_client
parameter instead ofself.docker_client
infind_image
The method
find_image
acceptsrepository_client
as a parameter but usesself.docker_client
inside the method. To ensure consistency and avoid potential issues, replaceself.docker_client
withrepository_client
.Apply this diff to fix the issue:
def find_image(self, repository_client, repository_name, image_tag) -> bool: if self.config.experimentalManifest: try: # This requires enabling experimental Docker features # Furthermore, it's not yet supported in the Python library execute(f"docker manifest inspect {repository_name}:{image_tag}") return True except RuntimeError: return False else: try: # default version requires pulling for an image - self.docker_client.images.pull(repository=repository_name, tag=image_tag) + repository_client.images.pull(repository=repository_name, tag=image_tag) return True except docker.errors.NotFound: return False
255-261
: Unused parameterscontainer_deployment
andcontainer_uri
increate_function
The parameters
container_deployment
andcontainer_uri
are added to thecreate_function
method but are not utilized within the method body. This may indicate incomplete implementation or redundant parameters. Consider implementing the logic that uses these parameters or removing them if they're unnecessary.
286-286
: Passingcontainer_deployment
andcontainer_uri
toupdate_function
When calling
update_function
, you passcontainer_deployment
andcontainer_uri
, but insideupdate_function
, these parameters are not used. Ensure thatupdate_function
utilizes these parameters as intended or remove them if they are not needed.
337-343
: Unused parameters inupdate_function
The
update_function
method acceptscontainer_deployment
andcontainer_uri
but doesn't use them within the method. This could be an oversight or indicate incomplete implementation. Make sure to use these parameters if they are necessary, or remove them if they are not required.sebs/gcp/gcp.py (5)
143-149
: Updatepackage_code
method docstring to reflect signature changesThe
package_code
method now includes thecontainer_deployment
parameter and returns an additionalcontainer_uri
. Please update the method's docstring to reflect these changes for clarity and maintainability.
146-149
: Clarify the usage ofcontainer_uri
The variable
container_uri
is initialized to an empty string but is not utilized within the method since container deployment is not supported for GCP. Consider adding a comment to explain its purpose or remove it if it's unnecessary.
199-199
: Includecontainer_uri
in the return statement only if necessaryThe
package_code
method returnscontainer_uri
, which is always an empty string in this context. Ifcontainer_uri
is not used downstream, consider removing it from the return statement to simplify the method's interface.
205-207
: Unused parameterscontainer_deployment
andcontainer_uri
increate_function
The parameters
container_deployment
andcontainer_uri
are added to thecreate_function
method but are not used within this method except when passed toupdate_function
. If they are not required here, consider removing them to enhance code clarity.
345-347
: Remove unused parameters fromupdate_function
The
update_function
method includescontainer_deployment
andcontainer_uri
in its signature, but these parameters are not used within the method body. To improve maintainability, consider removing these unused parameters.Apply this diff to remove the unused parameters:
def update_function( self, function: Function, code_package: Benchmark, - container_deployment: bool, - container_uri: str, ):sebs/aws/aws.py (3)
201-201
: Move import statements to the top of the fileFor consistency and to follow best practices, all import statements should be placed at the top of the file.
Apply this diff to move the imports:
import math import os import shutil import time import uuid +import json +from rich.progress import Progress from typing import cast, Dict, List, Optional, Tuple, Type, Union # noqa import boto3 import docker from botocore.exceptions import ClientErrorAnd remove the imports from inside the methods:
- import json
- from rich.progress import Progress
Also applies to: 238-239
207-210
: Add exception handling when parsing JSONWhen parsing JSON strings, add exception handling to handle potential
JSONDecodeError
exceptions.Apply this diff:
def show_progress(line, progress): if isinstance(line, str): + try: line = json.loads(line) + except json.JSONDecodeError as e: + self.logging.error(f"Failed to parse JSON line: {line}. Error: {e}") + return
266-271
: Use exception chaining when re-raising exceptionsWhen re-raising an exception, use exception chaining to preserve the original traceback and provide more context.
Apply this diff:
self.logging.error( f"Failed to push the image to registry {repository_uri} after retry." ) self.logging.error(f"Error: {str(e)}") - raise RuntimeError("Couldn't push to Docker registry") + raise RuntimeError("Couldn't push to Docker registry") from e🧰 Tools
🪛 Ruff
270-270: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (23)
benchmarks/wrappers/aws/python/setup.py
(1 hunks)config/example.json
(1 hunks)config/systems.json
(2 hunks)dockerfiles/aws/nodejs/Dockerfile.function
(1 hunks)dockerfiles/aws/python/Dockerfile.function
(1 hunks)docs/modularity.md
(1 hunks)docs/usage.md
(1 hunks)requirements.aws.txt
(1 hunks)requirements.txt
(1 hunks)sebs.py
(12 hunks)sebs/aws/aws.py
(7 hunks)sebs/aws/config.py
(4 hunks)sebs/azure/azure.py
(6 hunks)sebs/benchmark.py
(6 hunks)sebs/cache.py
(4 hunks)sebs/config.py
(2 hunks)sebs/experiments/config.py
(4 hunks)sebs/experiments/invocation_overhead.py
(1 hunks)sebs/faas/system.py
(4 hunks)sebs/gcp/gcp.py
(4 hunks)sebs/local/local.py
(3 hunks)sebs/openwhisk/openwhisk.py
(9 hunks)sebs/sebs.py
(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- benchmarks/wrappers/aws/python/setup.py
- requirements.txt
🧰 Additional context used
🪛 Ruff
sebs.py
647-647: Loop control variable idx
not used within loop body
Rename unused idx
to _idx
(B007)
sebs/aws/aws.py
270-270: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
419-419: Use a context manager for opening files
(SIM115)
🔇 Additional comments (33)
requirements.aws.txt (1)
6-6
: LGTM! ECR support added correctly.
The addition of ecr
to the boto3-stubs
extras is appropriate for supporting container deployment to Amazon Elastic Container Registry (ECR) as outlined in the PR objectives.
dockerfiles/aws/nodejs/Dockerfile.function (1)
5-7
: 🛠️ Refactor suggestion
Reconsider package management strategy.
While the cache cleaning is good for image size optimization, not using package-lock.json could lead to inconsistent builds across different environments.
Consider these improvements:
RUN cd function \
- && npm install --no-package-lock --production \
+ && npm ci --production \
&& npm cache clean --force
config/example.json (1)
7-7
: Verify ARM64 architecture support.
The configuration specifies "architecture": "arm64"
. Ensure that the container builds and ECR pushes work correctly for ARM64 architecture.
sebs/experiments/config.py (2)
10-10
: LGTM: Well-structured attribute initialization.
The new attribute follows the class's conventions with proper typing and a sensible default value.
56-56
: LGTM: Proper serialization of the new field.
The serialization is implemented correctly and follows the established pattern.
sebs/config.py (1)
45-46
: LGTM: Well-structured method for container deployment support.
The new method cleanly exposes container deployment options from the configuration, supporting the new AWS container deployment feature.
docs/usage.md (1)
7-7
: LGTM! The container deployment flag is correctly documented.
The --container-deployment
flag is properly documented and correctly specified as AWS-specific.
config/systems.json (4)
Line range hint 102-102
: Verify ARM64 support for AWS Lambda base images.
The addition of arm64 architecture support is significant. Please verify:
- All AWS Lambda base images listed in the configuration support arm64
- Container build and deployment processes handle architecture selection correctly
- Testing has been performed on both architectures
#!/bin/bash
# Description: Check AWS Lambda base images for arm64 support
# Test: Extract and verify AWS Lambda base images
echo "AWS Lambda Python images:"
rg -A 5 '"base_images".*python.*aws' config/systems.json
echo "AWS Lambda Node.js images:"
rg -A 5 '"base_images".*nodejs.*aws' config/systems.json
76-77
: Verify Python package setup configuration.
The addition of setup.py
to deployment files is good. However, ensure that:
- The corresponding
setup.py
file exists in the repository - All required dependencies are properly listed in the setup file
#!/bin/bash
# Description: Verify setup.py existence and content
# Test: Check if setup.py exists in the expected location
fd -t f "setup.py" benchmarks/wrappers/aws/python/
# Test: Verify setup.py content for dependencies
if [ -f "benchmarks/wrappers/aws/python/setup.py" ]; then
cat benchmarks/wrappers/aws/python/setup.py
fi
76-77
: Review Node.js configuration for container deployment.
While Python configuration has been updated with setup.py
, verify if Node.js requires similar package management updates for container deployment. Consider:
- Adding a
package.json
to deployment files if needed - Ensuring all Node.js dependencies are properly configured for container builds
#!/bin/bash
# Description: Check for Node.js package management files
# Test: Look for package.json in Node.js AWS wrapper
fd -t f "package.json" benchmarks/wrappers/aws/nodejs/
# Test: Compare Python and Node.js deployment structures
echo "Python deployment files:"
rg -A 5 '"files".*python.*aws' config/systems.json
echo "Node.js deployment files:"
rg -A 5 '"files".*nodejs.*aws' config/systems.json
Also applies to: 82-95
3-4
: Consider making the Docker repository configurable.
The Docker repository is currently hardcoded to "spcleth/serverless-benchmarks". Consider:
- Making it configurable through environment variables
- Adding documentation about repository requirements
- Verifying repository accessibility during setup
sebs/sebs.py (1)
Line range hint 133-139
: Consider additional validation for container deployment configuration.
When container deployment is enabled, verify that the deployment platform is properly configured with necessary container-related settings (e.g., ECR configuration for AWS).
Let's verify the deployment configuration handling:
Consider adding a validation method in the deployment classes to verify container-specific configuration before initialization.
sebs/experiments/invocation_overhead.py (1)
44-44
: Verify container deployment information.
Given that this PR adds container support, consider whether container-related information should be passed in the additional_info
parameter instead of an empty string. This is particularly important if the function needs to be updated differently when using container deployment.
Let's verify how additional_info
is used in container deployments:
✅ Verification successful
Empty string for additional_info
is correct in this context
The empty string for additional_info
(container_uri) is appropriate here because:
- The AWS implementation shows that container_uri is only used when container_deployment is True
- In this invocation overhead experiment, container_deployment is explicitly set to False
- The update_function implementation correctly handles both container and non-container deployments separately
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check how additional_info is used in container deployments
# Test 1: Search for other usages of update_function with container info
rg -A 5 'update_function.*container'
# Test 2: Search for container_deployment flag usage
rg -A 5 'container_deployment.*True'
Length of output: 2909
Script:
#!/bin/bash
# Let's check the implementation of update_function in AWS client to understand container handling
# Search for update_function implementation in AWS
ast-grep --pattern 'def update_function($$$)'
# Also check for container-related configurations
rg -A 10 'container_deployment.*=.*'
Length of output: 3132
Script:
#!/bin/bash
# Let's check the specific implementation in AWS and the invocation_overhead experiment
# Check AWS update_function implementation
rg -A 15 'def update_function.*container' sebs/aws/aws.py
# Check the invocation_overhead experiment's configuration handling
rg -A 10 'class CodePackageSize' sebs/experiments/invocation_overhead.py
Length of output: 520
Script:
#!/bin/bash
# Let's check the AWS implementation and the benchmark configuration
# Check the AWS implementation of update_function
cat sebs/aws/aws.py | rg -A 20 'def update_function'
# Check if the benchmark being used supports container deployment
rg -l 'container_deployment.*=.*true' benchmarks/*/config.json
Length of output: 1711
sebs/local/local.py (1)
134-135
: 🛠️ Refactor suggestion
Update method signature and implementation for container deployment.
Several issues need to be addressed:
- The method lacks type hints for the new return tuple structure
- The
container_deployment
parameter is not utilized - The method lacks documentation for the new parameter and return value
Apply these changes:
def package_code(
self,
directory: str,
language_name: str,
language_version: str,
benchmark: str,
is_cached: bool,
- container_deployment: bool,
- ) -> Tuple[str, int, str]:
+ container_deployment: bool = False,
+ ) -> Tuple[str, int, str]:
+ """Package the code for deployment.
+
+ Args:
+ directory: Directory where code is located
+ language_name: Programming language name
+ language_version: Programming language version
+ benchmark: Benchmark name
+ is_cached: Whether the code is cached
+ container_deployment: Whether to deploy as container
+
+ Returns:
+ Tuple containing:
+ - Directory path (str)
+ - Size in bytes (int)
+ - Container URI if container deployment (str)
+ """
The initialization of container_uri
should be conditional on container_deployment
.
Also applies to: 141-141, 155-155
sebs/cache.py (2)
168-172
: LGTM: Method signature updated correctly for container support.
The addition of the container_uri
parameter aligns with the PR objectives for container deployment support.
201-201
: Verify container URI persistence.
The container URI is correctly stored in the language configuration. However, we should verify that this URI is properly retrieved and used during function updates.
✅ Verification successful
Let me analyze the container_uri usage further by checking how it's retrieved and used in the cache operations.
Let me check one final aspect - how the container_uri is persisted and loaded in the cache configuration.
Container URI persistence is properly implemented
Based on the verification results, the container_uri is:
- Correctly stored in the cache configuration when adding code packages via
add_code_package
- Properly persisted in the JSON config file structure
- Retrieved and used consistently across function updates in various cloud providers (AWS, GCP, Azure, OpenWhisk)
- Handled appropriately during both creation and updates of containerized deployments
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that container_uri is properly stored and retrieved in the cache configuration
# Test: Search for container_uri usage in the codebase
rg -A 2 'container_uri'
# Test: Check cache configuration structure
fd -e json -x cat {} \; | jq -r 'select(.container_uri != null) | .container_uri'
Length of output: 6577
Script:
#!/bin/bash
# Check cache-related operations with container_uri
rg -A 5 'add_function|update_function|add_code_package|update_code_package' sebs/cache.py
# Check the cache configuration and storage
ast-grep --pattern 'def add_function($$$)' sebs/cache.py
ast-grep --pattern 'def update_function($$$)' sebs/cache.py
Length of output: 1100
Script:
#!/bin/bash
# Check how container_uri is stored and loaded in cache configuration
rg -B 5 -A 5 'config.json' sebs/cache.py
# Check the structure of cache configuration
ast-grep --pattern 'language_config = {$$$}' sebs/cache.py
Length of output: 4414
sebs.py (1)
384-386
: LGTM: Storage configuration changes look good.
The storage configuration and deserialization changes are well-structured and maintain proper separation of concerns.
Also applies to: 420-422
sebs/faas/system.py (2)
245-245
: Verify the unpacking of return values from code_package.build
In the line:
rebuilt, _, container_deployment, container_uri = code_package.build(self.package_code)
The use of _
suggests that one of the return values from code_package.build
is being ignored. Please ensure that this is intentional and that the ignored value is not necessary for subsequent operations.
262-264
: Pass correct parameters to create_function
method
Ensure that the parameters passed to self.create_function
match the method's signature, especially after adding type annotations.
Apply this diff if necessary:
function = self.create_function(
code_package,
func_name,
- container_deployment,
- container_uri
+ container_deployment=container_deployment,
+ container_uri=container_uri,
)
sebs/aws/config.py (3)
1-9
: New imports are appropriate and necessary
The added imports base64
, ECRClient
from mypy_boto3_ecr
, and ClientError
from botocore.exceptions
are necessary for handling ECR interactions and error management.
137-152
: Properties for Docker attributes are correctly implemented
The property methods for docker_registry
, docker_username
, docker_password
, and container_repository
are well-defined and provide necessary access to the private attributes.
246-255
: Exception handling in check_ecr_repository_exists
is appropriate
The check_ecr_repository_exists
method correctly handles the absence of the repository and logs other exceptions, ensuring robust error handling.
sebs/openwhisk/openwhisk.py (2)
158-164
: Ensure repository_client
is correctly passed to find_image
When calling find_image
, you're passing self.docker_client
as the repository_client
. Confirm that self.docker_client
is the intended client to use. If different clients might be used, ensure consistency across calls.
204-204
: Pass the correct repository client to push_image_to_repository
In build_base_image
, when calling push_image_to_repository
, you're passing self.docker_client
as the repository_client
. Verify that this is the correct client to use, especially if different clients could be used for different repositories.
sebs/azure/azure.py (1)
164-170
: Properly handle container_deployment
in package_code
The package_code
method correctly raises NotImplementedError
when container_deployment
is True
, indicating that container deployment is not supported in Azure. This ensures compatibility and prevents unexpected behavior.
sebs/benchmark.py (7)
148-151
: Property container_deployment
correctly exposes _container_deployment
The new container_deployment
property provides access to the internal _container_deployment
attribute, enabling external components to check if container deployment is enabled.
182-182
: Initialize _container_deployment
from configuration
Assigning self._container_deployment = config.container_deployment
correctly initializes the attribute based on the experiment configuration.
531-538
:
Ensure deployment_build_step
accepts new parameter
The deployment_build_step
function is now called with self.container_deployment
as an additional argument. Confirm that all implementations of deployment_build_step
are updated to accept this new parameter and handle it appropriately.
Run the following script to find definitions of deployment_build_step
:
#!/bin/bash
# Description: Find implementations of `deployment_build_step` to ensure they accept the new parameter.
# Test: Search for function definitions matching `deployment_build_step`.
rg 'def .*deployment_build_step\(' -t py
553-555
:
Update cache methods to handle container_uri
The update_code_package
and add_code_package
methods now receive self._container_uri
as an additional argument. Ensure that these methods in the Cache
class are updated to accept and store the container_uri
correctly.
Run the following script to find and review the definitions of these methods:
#!/bin/bash
# Description: Find the definitions of `update_code_package` and `add_code_package` in the cache module.
# Test: Search for method definitions in the cache module.
rg 'def (update_code_package|add_code_package)\(' -t py
558-558
:
Confirm handling of additional return values from build
The build
method now returns a tuple with four elements. Verify that all code invoking this method correctly unpacks and handles all returned values to prevent runtime errors.
Run the following script to identify all calls to build
and check return value usage:
#!/bin/bash
# Description: Find all assignments from `build` method to check if they unpack four values.
# Test: Search for lines where `build` is called and the result is assigned.
rg '\s*=\s*self\.build\(' -t py
507-510
:
Verify return values match updated method signature
The build
method now returns four values, including self.container_deployment
and self.container_uri
. Ensure that all calling code unpacks these return values correctly to handle the additional information.
Run the following script to check how the return values from build
are used:
498-500
: 🛠️ Refactor suggestion
Update build
method signature and ensure consistency
The build
method signature has been updated to include the deployment_build_step
callable with new parameters and return types related to container deployment. Ensure that all overridden implementations and calls to build
are updated accordingly to prevent inconsistencies.
Run the following script to find and review all calls to the build
method:
sebs/gcp/gcp.py (1)
289-289
: Verify usage of container_deployment
and container_uri
in update_function
In the call to update_function
, the parameters container_deployment
and container_uri
are passed, but their usage inside update_function
is not evident. Ensure that these parameters are necessary and utilized properly.
ARG VERSION | ||
ENV PYTHON_VERSION=${VERSION} | ||
|
||
COPY . function/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Optimize COPY command to reduce image size
The current COPY command includes all files from the build context. Consider using .dockerignore or being more specific about what files to copy to:
- Reduce image size
- Prevent including sensitive files
- Improve build performance
-COPY . function/
+COPY requirements.txt function/
+COPY requirements.txt.* function/ 2>/dev/null || true
+COPY handler.py function/
+COPY __init__.py function/ 2>/dev/null || true
Committable suggestion skipped: line range outside the PR's diff.
sebs/aws/aws.py
Outdated
print(username, password, registry_url) | ||
print( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid printing sensitive credentials
Printing username
and password
may expose sensitive information in logs or console output.
Apply this diff to remove the print statements:
- print(username, password, registry_url)
- print(
- self.docker_client.login(
- username=username, password=password, registry=registry_url
- )
- )
+ self.docker_client.login(
+ username=username, password=password, registry=registry_url
+ )
Committable suggestion skipped: line range outside the PR's diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 11
🧹 Outside diff range and nitpick comments (8)
sebs/faas/system.py (1)
168-172
: LGTM! Consider adding docstring updates.The new parameters
architecture
andcontainer_deployment
align well with the container support objectives. Consider updating the method's docstring to document:
- The new parameters and their purpose
- The updated return type that includes container deployment info
sebs/benchmark.py (1)
512-516
: Improve type hints and add docstring for the build methodThe method would benefit from:
- Type aliases for better readability
- A docstring explaining the return tuple values
Consider applying this improvement:
+# At the top of the file with other imports +from typing import TypeAlias +# Type aliases for better readability +BuildStepCallable: TypeAlias = Callable[[str, str, str, str, str, bool, bool], Tuple[str, int, str]] +BuildResult: TypeAlias = Tuple[bool, str, bool, str] def build( self, - deployment_build_step: Callable[[str, str, str, str, str, bool, bool], Tuple[str, int, str]], -) -> Tuple[bool, str, bool, str]: + deployment_build_step: BuildStepCallable, +) -> BuildResult: + """Build the benchmark package. + + Args: + deployment_build_step: Callback for building the deployment package + + Returns: + Tuple containing: + - bool: Whether a new build was performed + - str: Location of the code package + - bool: Whether this is a container deployment + - str: Container URI if applicable, empty string otherwise + """Also applies to: 523-526
sebs/aws/aws.py (2)
139-142
: Initialize container_uri with None instead of empty string.For better type safety and explicit handling of the unset case, consider using None as the default value.
- container_uri = "" + container_uri: Optional[str] = None
333-341
: Consider early architecture validation.The architecture mismatch is detected during the build process. Consider validating this earlier to fail fast and provide better user experience.
+ def _validate_architecture(self, target_arch: str) -> None: + """Validate if the target architecture is supported on the current system.""" + isa = platform.processor() + if (isa == "x86_64" and target_arch != "x64") or ( + isa == "arm64" and target_arch != "arm64" + ): + raise ValueError( + f"Unsupported target architecture: {target_arch} on CPU architecture: {isa}. " + "Please ensure QEMU is configured for cross-architecture builds." + ) def build_base_image( self, directory: str, language_name: str, language_version: str, architecture: str, benchmark: str, is_cached: bool, ) -> Tuple[bool, str]: + self._validate_architecture(architecture)sebs/cache.py (4)
Line range hint
127-131
: Ensure consistency inget_code_package
method parametersThe
get_code_package
method signature has been updated to includelanguage_version
andarchitecture
, but the docstring and parameter descriptions have not been updated accordingly. Please update the docstring to reflect the new parameters for clarity and maintainability.Apply this diff to update the docstring:
""" Access cached version of benchmark code. :param deployment: allowed deployment clouds or local :param benchmark: - :param language: + :param language: programming language of the benchmark + :param language_version: version of the programming language + :param architecture: system architecture (e.g., x86_64)
264-272
: Ensure consistent updates to cached configurationIn the block updating
cached_config
withinadd_code_package
, consider combining the duplicate code for updatingcontainers
andcode_package
to reduce redundancy and improve maintainability.Refactor the code to unify the update logic:
# Determine the main key based on the deployment type main_key = "containers" if code_package.container_deployment else "code_package" # Update the cached configuration cached_config[deployment_name][language][main_key][key] = language_config
182-186
: Update method docstring foradd_code_package
The
add_code_package
method has been updated, but its docstring is missing or outdated. Adding a docstring with parameter descriptions will enhance code readability and maintenance.Add the following docstring:
def add_code_package( self, deployment_name: str, code_package: "Benchmark", ): """ Add a new code package or container image to the cache. :param deployment_name: Name of the deployment platform (e.g., 'aws') :param code_package: Instance of Benchmark containing code package information """ # ... rest of the method ...
293-295
: Update method docstring forupdate_code_package
Similarly, the
update_code_package
method lacks a docstring. Adding a docstring will clarify the purpose and usage of the method.Add the following docstring:
def update_code_package( self, deployment_name: str, code_package: "Benchmark", ): """ Update an existing code package or container image in the cache. :param deployment_name: Name of the deployment platform (e.g., 'aws') :param code_package: Instance of Benchmark containing updated code package information """ # ... rest of the method ...
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (8)
config/systems.json
(2 hunks)dockerfiles/aws/python/Dockerfile.function
(1 hunks)sebs/aws/aws.py
(8 hunks)sebs/benchmark.py
(9 hunks)sebs/cache.py
(9 hunks)sebs/config.py
(1 hunks)sebs/faas/system.py
(5 hunks)sebs/sebs.py
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- dockerfiles/aws/python/Dockerfile.function
- sebs/sebs.py
🧰 Additional context used
🪛 Ruff
sebs/aws/aws.py
263-263: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
423-423: Use a context manager for opening files
(SIM115)
🔇 Additional comments (13)
sebs/config.py (3)
50-55
: LGTM! Verify base image configurations.
The method correctly incorporates architecture-specific base images. The implementation maintains consistency with other architecture-related changes.
Let's verify the base image configurations:
#!/bin/bash
# Description: Check base image configurations
# Test: Extract and validate base image structure for different architectures
jq -r '.[] | select(.languages != null) | .languages[] | select(.base_images != null) | .base_images | keys[]' config/systems.json
47-48
: Add docstring and verify config structure.
While the implementation is straightforward, please add a docstring to document the purpose and expected return values. Also, let's verify the config structure matches the return type hint.
#!/bin/bash
# Description: Verify container_deployment config structure
# Test: Check if the value is a list of strings
jq -r '.general.container_deployment | if type=="array" then "Valid array" else "Invalid type: " + type end' config/systems.json
63-69
: LGTM! Image naming follows Docker conventions.
The implementation correctly:
- Incorporates architecture in image tags
- Follows Docker naming conventions
- Fixed the previous issue with spaces in tags
Let's verify the image tag format:
Also applies to: 77-84
✅ Verification successful
Image tag format follows Docker conventions and is correctly implemented
The implementation in benchmark_image_tag
correctly:
- Uses only allowed characters (alphanumeric, dots, hyphens)
- Follows a consistent format:
function.<system>.<benchmark>.<language>-<version>-<architecture>[-prefix]
- Properly incorporates the architecture parameter
- Has no spaces or invalid characters in the tag format
The code meets all Docker naming conventions and the previous review's concerns have been addressed.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify image tag format
# Test: Check for any invalid characters or formats in image tags
rg -g '*.py' 'benchmark_image_tag.*return.*tag' -A 1 | grep -v 'benchmark_image_tag' | grep -v '\-\-' | grep -v '^$' | while read -r line; do
if echo "$line" | grep -q '[^a-zA-Z0-9_.-]'; then
echo "Invalid characters found in tag format: $line"
fi
done
Length of output: 131
Script:
#!/bin/bash
# Let's first see the actual implementation of benchmark_image_tag
rg -g '*.py' -A 10 'def benchmark_image_tag'
Length of output: 581
config/systems.json (4)
84-85
: Verify setup.py integration with container builds.
The addition of setup.py
to deployment files suggests Python package installation during container builds.
#!/bin/bash
# Check if setup.py exists and contains necessary configuration
fd -t f "setup.py" -X grep -l "install_requires\|packages"
# Verify Dockerfile references setup.py
fd -t f "Dockerfile" -X grep -l "setup.py"
Line range hint 1-276
: Verify regression test coverage for all configurations.
The PR comments mention regression testing was performed, but please ensure:
- All architecture combinations (x64/arm64) are tested
- Container deployment works with both Python and Node.js
- Cache mechanism works as expected with container_uri
#!/bin/bash
# Check for test files covering container deployment
fd -t f "test_.*\.py" -X grep -l "container_deployment\|container_uri"
# Look for cache configuration handling
rg "container_uri.*cache"
92-97
:
Review Node.js version support and version tagging.
-
Limited Version Support:
- Only Node.js 16 is supported
- Consider adding support for Node.js 18 and 20 as they are widely used
-
Version Tag Inconsistency:
- x64: Generic tag (
amazon/aws-lambda-nodejs:16
) - arm64: Specific version tag (
amazon/aws-lambda-nodejs:16.2024.09.06.13
)
- x64: Generic tag (
#!/bin/bash
# Check for Node.js version compatibility in codebase
rg -l "engine.*node" package.json
# Verify Node.js function compatibility
fd -t f "handler.js" -X grep -l "async.*exports.handler"
3-4
: Consider making docker_repository configurable and document container deployment.
The hardcoded docker repository might limit flexibility. Consider:
- Making it configurable through environment variables
- Adding documentation about:
- Required repository permissions
- How to set up custom repositories
- AWS-specific container deployment limitations
sebs/benchmark.py (3)
182-182
: LGTM!
The initialization of container-related fields is properly typed and follows the class's initialization pattern.
Also applies to: 205-205
246-265
: LGTM!
The cache query logic properly handles both container and non-container deployments, with safe dictionary access for the container URI.
182-182
: Consider early validation of container deployment support
From the PR objectives, we know that container deployment is only supported for AWS. Consider adding an early validation in the constructor to fail fast if container deployment is requested for unsupported platforms.
Let's verify the current platform support:
Also applies to: 246-265
sebs/aws/aws.py (2)
185-196
: 🛠️ Refactor suggestion
Improve error handling in find_image method.
The method silently catches ClientError. Consider logging the error for better debugging.
- except ClientError:
+ except ClientError as e:
+ self.logging.debug(f"Image not found: {e}")
return False
Likely invalid or redundant comment.
423-424
:
Use context manager for file operations.
Use a context manager to ensure proper file handling and resource cleanup.
- package_body = open(package, "rb").read()
- create_function_params["Code"] = {"ZipFile": package_body}
+ with open(package, "rb") as f:
+ create_function_params["Code"] = {"ZipFile": f.read()}
Likely invalid or redundant comment.
🧰 Tools
🪛 Ruff
423-423: Use a context manager for opening files
(SIM115)
sebs/cache.py (1)
303-309
: Handle missing cache directory in update_code_package
In the update_code_package
method, if the cached_dir
does not exist, the method calls self.add_code_package
. Ensure that this behavior is intentional and does not lead to any unintended side effects. Consider logging a message to indicate that the code package was not found and is being added.
Please confirm that this fallback to add_code_package
is the desired behavior. If so, consider adding a log message.
sebs/faas/system.py
Outdated
def build_base_image( | ||
self, | ||
directory: str, | ||
language_name: str, | ||
language_version: str, | ||
architecture: str, | ||
benchmark: str, | ||
is_cached: bool, | ||
) -> Tuple[bool, str]: | ||
pass | ||
|
||
@abstractmethod | ||
def find_image(self, repository_client, repository_name, image_tag) -> bool: | ||
pass | ||
|
||
@abstractmethod | ||
def push_image_to_repository(self, repository_client, repository_uri, image_tag): | ||
pass | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add docstrings and error handling for container-related methods.
The new abstract methods for container support are well-structured, but they need:
- Docstrings describing:
- Purpose and responsibilities
- Parameters and return values
- Expected behavior and error conditions
- Error handling guidance for implementations:
- Network issues during image push
- Repository authentication failures
- Image build failures
@property | ||
def container_uri(self) -> str: | ||
assert self._container_uri is not None | ||
return self._container_uri | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace assertion with proper None check in container_uri
property
The assertion could be disabled in production Python environments (using the -O flag). Consider using a proper None check with a descriptive error message.
Apply this diff:
@property
def container_uri(self) -> str:
- assert self._container_uri is not None
+ if self._container_uri is None:
+ raise ValueError("Container URI is not set")
return self._container_uri
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
@property | |
def container_uri(self) -> str: | |
assert self._container_uri is not None | |
return self._container_uri | |
@property | |
def container_uri(self) -> str: | |
if self._container_uri is None: | |
raise ValueError("Container URI is not set") | |
return self._container_uri | |
sebs/aws/aws.py
Outdated
try: | ||
self.docker_client.login(username=username, password=password, registry=registry_url) | ||
push_image(repository_uri, image_tag) | ||
self.logging.info(f"Successfully pushed the image to registry {repository_uri}.") | ||
except docker.errors.APIError as e: | ||
self.logging.error(f"Failed to push the image to registry {repository_uri}.") | ||
self.logging.error(f"Error: {str(e)}") | ||
raise RuntimeError("Couldn't push to Docker registry") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Improve exception chaining in push_image_to_repository.
Use proper exception chaining to preserve the original error context.
- raise RuntimeError("Couldn't push to Docker registry")
+ raise RuntimeError("Couldn't push to Docker registry") from e
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
try: | |
self.docker_client.login(username=username, password=password, registry=registry_url) | |
push_image(repository_uri, image_tag) | |
self.logging.info(f"Successfully pushed the image to registry {repository_uri}.") | |
except docker.errors.APIError as e: | |
self.logging.error(f"Failed to push the image to registry {repository_uri}.") | |
self.logging.error(f"Error: {str(e)}") | |
raise RuntimeError("Couldn't push to Docker registry") | |
try: | |
self.docker_client.login(username=username, password=password, registry=registry_url) | |
push_image(repository_uri, image_tag) | |
self.logging.info(f"Successfully pushed the image to registry {repository_uri}.") | |
except docker.errors.APIError as e: | |
self.logging.error(f"Failed to push the image to registry {repository_uri}.") | |
self.logging.error(f"Error: {str(e)}") | |
raise RuntimeError("Couldn't push to Docker registry") from e |
🧰 Tools
🪛 Ruff
263-263: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
def __init__(self, cache_dir: str, docker_client: docker.DockerClient): | ||
super().__init__() | ||
self.docker_client = docker_client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider making 'docker_client' an optional parameter
Adding the required docker_client
parameter to the __init__
method may break existing code that instantiates the Cache
class without providing this parameter. To maintain backward compatibility, consider making docker_client
an optional parameter with a default value of None
. This way, existing code will continue to function, and you can handle docker_client
being None
where necessary.
Apply this diff to make docker_client
optional:
+from typing import Optional
-def __init__(self, cache_dir: str, docker_client: docker.DockerClient):
+def __init__(self, cache_dir: str, docker_client: Optional[docker.DockerClient] = None):
super().__init__()
self.docker_client = docker_client
Ensure that you handle the case where self.docker_client
might be None
in methods that use it.
Committable suggestion skipped: line range outside the PR's diff.
def get_container( | ||
self, deployment: str, benchmark: str, language: str, | ||
language_version: str, architecture: str | ||
) -> Optional[Dict[str, Any]]: | ||
cfg = self.get_benchmark_config(deployment, benchmark) | ||
|
||
key = f"{language_version}-{architecture}" | ||
if cfg and language in cfg and key in cfg[language]["containers"]: | ||
return cfg[language]["containers"][key] | ||
else: | ||
return None | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle missing 'containers' key in get_container
method
In the get_container
method, if cfg[language]
does not contain the key "containers"
, accessing it will raise a KeyError
. To prevent this, consider using the dict.get()
method with a default empty dictionary.
Apply this diff to safely access the "containers"
key:
key = f"{language_version}-{architecture}"
- if cfg and language in cfg and key in cfg[language]["containers"]:
+ if cfg and language in cfg and key in cfg[language].get("containers", {}):
return cfg[language]["containers"][key]
else:
return None
Committable suggestion skipped: line range outside the PR's diff.
image = self.docker_client.images.get(code_package.container_uri) | ||
language_config["image-uri"] = code_package.container_uri | ||
language_config["image-id"] = image.id | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add exception handling when retrieving Docker images
When retrieving the Docker image with self.docker_client.images.get(code_package.container_uri)
, an exception may be raised if the image is not found or if there is a connection issue with the Docker daemon. To make the code more robust, add exception handling to catch potential exceptions and handle them gracefully.
Apply this diff to add exception handling:
if code_package.container_deployment:
- image = self.docker_client.images.get(code_package.container_uri)
+ try:
+ image = self.docker_client.images.get(code_package.container_uri)
+ except docker.errors.ImageNotFound:
+ raise RuntimeError(f"Image '{code_package.container_uri}' not found.")
+ except docker.errors.APIError as e:
+ raise RuntimeError(f"Docker API error: {e}")
language_config["image-uri"] = code_package.container_uri
language_config["image-id"] = image.id
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
image = self.docker_client.images.get(code_package.container_uri) | |
language_config["image-uri"] = code_package.container_uri | |
language_config["image-id"] = image.id | |
if code_package.container_deployment: | |
try: | |
image = self.docker_client.images.get(code_package.container_uri) | |
except docker.errors.ImageNotFound: | |
raise RuntimeError(f"Image '{code_package.container_uri}' not found.") | |
except docker.errors.APIError as e: | |
raise RuntimeError(f"Docker API error: {e}") | |
language_config["image-uri"] = code_package.container_uri | |
language_config["image-id"] = image.id |
|
||
key = f"{language_version}-{architecture}" | ||
config[deployment_name][language]["code_package"][key]["date"][ | ||
if code_package.container_deployment: | ||
main_key = "containers" | ||
else: | ||
main_key = "code_package" | ||
|
||
config[deployment_name][language][main_key][key]["date"][ | ||
"modified" | ||
] = date | ||
config[deployment_name][language]["code_package"][key][ | ||
config[deployment_name][language][main_key][key][ | ||
"hash" | ||
] = code_package.hash | ||
config[deployment_name][language]["code_package"][key][ | ||
config[deployment_name][language][main_key][key][ | ||
"size" | ||
] = code_package.code_size | ||
|
||
if code_package.container_deployment: | ||
|
||
image = self.docker_client.images.get(code_package.container_uri) | ||
config[deployment_name][language][main_key][key][ | ||
"image-id" | ||
] = image.id | ||
config[deployment_name][language][main_key][key][ | ||
"image-uri" | ||
] = code_package.container_uri | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add exception handling when updating Docker image information
Similar to the addition in add_code_package
, when updating the Docker image information in update_code_package
, add exception handling around the self.docker_client.images.get
call to handle potential errors gracefully.
Apply this diff to add exception handling:
if code_package.container_deployment:
main_key = "containers"
else:
main_key = "code_package"
config[deployment_name][language][main_key][key]["date"]["modified"] = date
config[deployment_name][language][main_key][key]["hash"] = code_package.hash
config[deployment_name][language][main_key][key]["size"] = code_package.code_size
if code_package.container_deployment:
- image = self.docker_client.images.get(code_package.container_uri)
+ try:
+ image = self.docker_client.images.get(code_package.container_uri)
+ except docker.errors.ImageNotFound:
+ raise RuntimeError(f"Image '{code_package.container_uri}' not found.")
+ except docker.errors.APIError as e:
+ raise RuntimeError(f"Docker API error: {e}")
config[deployment_name][language][main_key][key]["image-id"] = image.id
config[deployment_name][language][main_key][key]["image-uri"] = code_package.container_uri
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
key = f"{language_version}-{architecture}" | |
config[deployment_name][language]["code_package"][key]["date"][ | |
if code_package.container_deployment: | |
main_key = "containers" | |
else: | |
main_key = "code_package" | |
config[deployment_name][language][main_key][key]["date"][ | |
"modified" | |
] = date | |
config[deployment_name][language]["code_package"][key][ | |
config[deployment_name][language][main_key][key][ | |
"hash" | |
] = code_package.hash | |
config[deployment_name][language]["code_package"][key][ | |
config[deployment_name][language][main_key][key][ | |
"size" | |
] = code_package.code_size | |
if code_package.container_deployment: | |
image = self.docker_client.images.get(code_package.container_uri) | |
config[deployment_name][language][main_key][key][ | |
"image-id" | |
] = image.id | |
config[deployment_name][language][main_key][key][ | |
"image-uri" | |
] = code_package.container_uri | |
key = f"{language_version}-{architecture}" | |
if code_package.container_deployment: | |
main_key = "containers" | |
else: | |
main_key = "code_package" | |
config[deployment_name][language][main_key][key]["date"][ | |
"modified" | |
] = date | |
config[deployment_name][language][main_key][key][ | |
"hash" | |
] = code_package.hash | |
config[deployment_name][language][main_key][key][ | |
"size" | |
] = code_package.code_size | |
if code_package.container_deployment: | |
try: | |
image = self.docker_client.images.get(code_package.container_uri) | |
except docker.errors.ImageNotFound: | |
raise RuntimeError(f"Image '{code_package.container_uri}' not found.") | |
except docker.errors.APIError as e: | |
raise RuntimeError(f"Docker API error: {e}") | |
config[deployment_name][language][main_key][key][ | |
"image-id" | |
] = image.id | |
config[deployment_name][language][main_key][key][ | |
"image-uri" | |
] = code_package.container_uri |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 11
🧹 Outside diff range and nitpick comments (13)
sebs/aws/config.py (1)
257-282
: Add retry mechanism for ECR API callsThe repository creation could fail due to transient AWS API issues. Consider implementing a retry mechanism similar to the one used in the HTTP API creation.
def get_ecr_repository(self, ecr_client: ECRClient) -> str: if self._container_repository is not None: return self._container_repository self._container_repository = "sebs-benchmarks-{}".format(self._resources_id) self._docker_registry = self.check_ecr_repository_exists( ecr_client, self._container_repository ) if self._docker_registry is None: + retries = 0 + while retries < 5: try: resp = ecr_client.create_repository(repositoryName=self._container_repository) self.logging.info(f"Created ECR repository: {self._container_repository}") self._docker_registry = resp["repository"]["repositoryUri"] + break except ecr_client.exceptions.RepositoryAlreadyExistsException: self.logging.info(f"ECR repository {self._container_repository} already exists.") self._docker_registry = self.check_ecr_repository_exists( ecr_client, self._container_repository ) + except ecr_client.exceptions.ServerException: + retries += 1 + if retries == 5: + raise + time.sleep(retries) return self._container_repositorysebs.py (3)
450-459
: Enhance function documentation.The docstring should include details about the new
measure_interval
parameter and its impact on memory measurements.def start( benchmark, benchmark_input_size, output, deployments, storage_configuration, measure_interval, remove_containers, **kwargs, ): """ Start a given number of function instances and a storage instance. + + Args: + measure_interval: Duration between memory measurements in milliseconds. + Use -1 to disable measurements. + remove_containers: If True, containers are removed after stopping. """
596-596
: Fix typo in log message.There's a typo in the log message.
- sebs_client.logging.info("Resource grup:") + sebs_client.logging.info("Resource groups:")
643-654
: Improve resource group management implementation.
- Fix the unused loop variable as suggested by static analysis.
- Add more descriptive error messages for resource group operations.
- groups = deployment_client.config.resources.list_resource_groups( - deployment_client.cli_instance - ) - for idx, group in enumerate(groups): + groups = deployment_client.config.resources.list_resource_groups( + deployment_client.cli_instance + ) + for _idx, group in enumerate(groups): if len(prefix) > 0 and not group.startswith(prefix): continue - sebs_client.logging.info(f"Removing resource group: {group}") + sebs_client.logging.info(f"Attempting to remove resource group: {group}") deployment_client.config.resources.delete_resource_group( deployment_client.cli_instance, group, wait ) + sebs_client.logging.info(f"Successfully removed resource group: {group}")🧰 Tools
🪛 Ruff
646-646: Loop control variable
idx
not used within loop bodyRename unused
idx
to_idx
(B007)
sebs/azure/azure.py (1)
162-171
: LGTM! Consider adding a docstring.The changes correctly handle container deployment parameters and properly raise NotImplementedError for unsupported functionality. The return type has been updated to include container_uri.
Consider adding a docstring to document the new parameters and return value:
def package_code( self, directory: str, language_name: str, language_version: str, architecture: str, benchmark: str, is_cached: bool, container_deployment: bool, ) -> Tuple[str, int, str]: """Package the code for deployment. Args: directory: Source directory containing the code language_name: Programming language name language_version: Language version architecture: Target architecture benchmark: Benchmark name is_cached: Whether to use cached package container_deployment: Whether to use container deployment Returns: Tuple containing: - Directory path - Code size in bytes - Container URI (empty for non-container deployments) Raises: NotImplementedError: If container deployment is True as it's not supported """sebs/aws/aws.py (3)
138-151
: Add type hints for improved code maintainability.Consider adding type hints for the new parameters and return type:
def package_code( self, directory: str, language_name: str, language_version: str, - architecture: str, - benchmark: str, - is_cached: bool, - container_deployment: bool, + architecture: str, + benchmark: str, + is_cached: bool, + container_deployment: bool, - ) -> Tuple[str, int, str]: + ) -> Tuple[str, int, str]: # (package_path, size_in_bytes, container_uri)
238-278
: Consider extracting zip deployment logic to a separate method.The create_function method handles both container and zip deployments. Consider extracting the zip deployment logic to improve readability and maintainability:
+ def _create_zip_function( + self, + create_function_params: Dict[str, Any], + package: str, + code_size: int, + benchmark: str, + architecture: str, + language: str, + language_runtime: str, + ) -> None: + if code_size < 50 * 1024 * 1024: + with open(package, "rb") as f: + create_function_params["Code"] = {"ZipFile": f.read()} + else: + storage_client = self.get_storage() + code_bucket = storage_client.get_bucket( + Resources.StorageBucketType.DEPLOYMENT + ) + code_prefix = os.path.join( + benchmark, architecture, os.path.basename(package) + ) + storage_client.upload(code_bucket, package, code_prefix) + create_function_params["Code"] = { + "S3Bucket": code_bucket, + "S3Key": code_prefix, + } + create_function_params["Runtime"] = f"{language}{self._map_language_runtime(language, language_runtime)}" + create_function_params["Handler"] = "handler.handler" def create_function( self, code_package: Benchmark, func_name: str, container_deployment: bool, container_uri: str, ) -> "LambdaFunction": # ... existing code ... if container_deployment: create_function_params["PackageType"] = "Image" create_function_params["Code"] = {"ImageUri": container_uri} else: create_function_params["PackageType"] = "Zip" self._create_zip_function( create_function_params, package, code_size, benchmark, architecture, language, language_runtime, )🧰 Tools
🪛 Ruff
253-253: Use a context manager for opening files
(SIM115)
323-367
: Consider extracting zip deployment update logic to a separate method.Similar to the create_function refactoring suggestion, consider extracting the zip deployment update logic to improve readability and maintainability.
+ def _update_zip_function( + self, + name: str, + package: str, + code_size: int, + benchmark: str, + architecture: str, + function: LambdaFunction, + ) -> None: + if code_size < 50 * 1024 * 1024: + with open(package, "rb") as code_body: + self.client.update_function_code( + FunctionName=name, + ZipFile=code_body.read(), + Architectures=[self._map_architecture(architecture)], + ) + else: + storage = cast(S3, self.get_storage()) + bucket = function.code_bucket(benchmark, storage) + code_prefix = os.path.join( + benchmark, + architecture, + os.path.basename(package) + ) + storage.upload(bucket, package, code_prefix) + self.client.update_function_code( + FunctionName=name, + S3Bucket=bucket, + S3Key=code_prefix, + Architectures=[self._map_architecture(architecture)], + ) def update_function( self, function: Function, code_package: Benchmark, container_deployment: bool, container_uri: str, ): name = function.name function = cast(LambdaFunction, function) if container_deployment: self.client.update_function_code( FunctionName=name, ImageUri=container_uri ) else: self._update_zip_function( name, code_package.code_location, code_package.code_size, code_package.benchmark, function_cfg.architecture.value, function, )sebs/faas/container.py (2)
40-43
: Simplify code by using a ternary operatorYou can make the code more concise by replacing the
if
-else
block with a ternary operator.- if isinstance(txt, str): - line = json.loads(txt) - else: - line = txt + line = json.loads(txt) if isinstance(txt, str) else txt🧰 Tools
🪛 Ruff
40-43: Use ternary operator
line = json.loads(txt) if isinstance(txt, str) else txt
instead ofif
-else
-blockReplace
if
-else
-block withline = json.loads(txt) if isinstance(txt, str) else txt
(SIM108)
66-67
: Raise a more specific exception for better error handlingCurrently, a generic
Exception
is raised when an error occurs. Consider raising a more specific exception, such asdocker.errors.APIError
or defining a custom exception class. This improves clarity and allows for more precise exception handling.- raise Exception(line["error"]) + raise docker.errors.APIError(line["error"])sebs/aws/container.py (2)
69-159
: Consider refactoring 'build_base_image' method for improved readabilityThe
build_base_image
method is lengthy and encompasses multiple logical steps, such as checking the image cache, preparing the build context, building the Docker image, and pushing the image to the repository. Breaking it down into smaller, well-named helper methods would enhance readability and maintainability.
148-150
: Add error handling for the Docker image build processTo ensure issues during the Docker build are properly managed, consider adding exception handling around the image build step. This allows for graceful error reporting and recovery.
Example modification:
+ try: image, _ = self.docker_client.images.build( tag=repository_uri, path=build_dir, buildargs=buildargs ) + except docker.errors.BuildError as e: + self.logging.error(f"Failed to build Docker image: {e}") + raise RuntimeError("Docker image build failed") from esebs/gcp/gcp.py (1)
200-200
: Consider removingcontainer_uri
from the return value if unused.Since
container_deployment
is not supported andcontainer_uri
is initialized but not utilized within thepackage_code
method, consider removing it from the return statement to simplify the code.Apply this diff to adjust the return statement:
-return os.path.join(directory, "{}.zip".format(benchmark)), bytes_size, container_uri +return os.path.join(directory, "{}.zip".format(benchmark)), bytes_sizeAnd update the method signature accordingly.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (13)
dockerfiles/aws/python/Dockerfile.function
(1 hunks)requirements.aws.txt
(1 hunks)sebs.py
(12 hunks)sebs/aws/aws.py
(8 hunks)sebs/aws/config.py
(4 hunks)sebs/aws/container.py
(1 hunks)sebs/azure/azure.py
(6 hunks)sebs/faas/container.py
(1 hunks)sebs/faas/system.py
(5 hunks)sebs/gcp/gcp.py
(4 hunks)sebs/local/local.py
(3 hunks)sebs/openwhisk/openwhisk.py
(9 hunks)sebs/sebs.py
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- dockerfiles/aws/python/Dockerfile.function
- requirements.aws.txt
- sebs/local/local.py
- sebs/openwhisk/openwhisk.py
- sebs/sebs.py
🧰 Additional context used
🪛 Ruff
sebs.py
646-646: Loop control variable idx
not used within loop body
Rename unused idx
to _idx
(B007)
sebs/aws/aws.py
253-253: Use a context manager for opening files
(SIM115)
sebs/aws/container.py
67-67: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
sebs/faas/container.py
40-43: Use ternary operator line = json.loads(txt) if isinstance(txt, str) else txt
instead of if
-else
-block
Replace if
-else
-block with line = json.loads(txt) if isinstance(txt, str) else txt
(SIM108)
🔇 Additional comments (17)
sebs/aws/config.py (4)
1-8
: LGTM: Appropriate imports for ECR support
The added imports and type hints are well-chosen for ECR integration.
136-151
: LGTM: Well-structured property definitions
The property getters are properly implemented with correct type hints and follow Python best practices.
330-338
: LGTM: Proper cache management
The cache updates are well-structured and maintain consistency with the existing caching mechanism.
303-307
: 🛠️ Refactor suggestion
Use safe dictionary access
The current implementation assumes dictionary keys exist, which could raise KeyError.
if "docker" in dct:
- ret._docker_registry = dct["docker"]["registry"]
- ret._docker_username = dct["docker"]["username"]
- ret._container_repository = dct["container_repository"]
+ docker_config = dct["docker"]
+ ret._docker_registry = docker_config.get("registry")
+ ret._docker_username = docker_config.get("username")
+ ret._container_repository = dct.get("container_repository")
Likely invalid or redundant comment.
sebs/azure/azure.py (3)
295-304
: Changes look good!
The implementation correctly handles container deployment parameters and raises NotImplementedError for unsupported functionality.
341-350
: Changes look good!
The implementation correctly handles container deployment parameters and raises NotImplementedError for unsupported functionality.
530-530
: LGTM!
The update_function call correctly passes False for container_deployment and empty string for container_uri, which is consistent with Azure not supporting container deployment.
sebs/aws/aws.py (3)
13-13
: LGTM: Clean ECR client integration.
The ECR client integration follows the existing pattern of client initialization in the AWS class.
Also applies to: 76-78
400-401
: LGTM: Clean function name update for container deployments.
The addition of the "-docker" suffix for container deployments is consistent with the existing naming pattern.
253-253
:
Use context manager for file operations.
Use a context manager to ensure proper file handling:
- package_body = open(package, "rb").read()
+ with open(package, "rb") as f:
+ package_body = f.read()
Likely invalid or redundant comment.
🧰 Tools
🪛 Ruff
253-253: Use a context manager for opening files
(SIM115)
sebs/faas/container.py (2)
11-19
: Well-defined constructor initializes class attributes correctly
The __init__
method properly sets up the Docker client, experimental manifest flag, and system configuration for the DockerContainer
class.
69-87
: push_image
method handles image pushing effectively
The push_image
method correctly manages the image push process, including progress tracking and error handling with proper logging.
sebs/aws/container.py (1)
66-66
: 🛠️ Refactor suggestion
Enhance exception handling by chaining exceptions
To preserve the original exception traceback and provide better error context, use exception chaining by adding from e
when raising a new exception in an except
block.
Apply this diff to improve exception handling:
- raise RuntimeError("Couldn't push to Docker registry")
+ raise RuntimeError("Couldn't push to Docker registry") from e
Likely invalid or redundant comment.
sebs/gcp/gcp.py (4)
141-145
: Method signature updated to include container deployment parameters.
The package_code
method now accepts architecture
and container_deployment
parameters, aligning with the changes made for container support. This update ensures consistency across different cloud providers.
149-150
: Proper handling of unsupported container deployment in GCP.
Raising a NotImplementedError
when container_deployment
is True
correctly indicates that container deployment is not supported in GCP, preventing unintended usage.
202-208
: Method signature updated to include container deployment parameters.
The create_function
method now includes container_deployment
and container_uri
parameters, ensuring consistency with the overall interface and preparing for potential future support.
210-211
: Correctly raising an error for unsupported container deployment.
The NotImplementedError
provides clear feedback that container deployment is not yet implemented for GCP, which helps maintain correct usage.
… OpenWhisk support
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 15
🧹 Outside diff range and nitpick comments (20)
sebs/openwhisk/container.py (3)
1-9
: LGTM with minor style suggestion.The imports and class declaration are well-organized with proper type hints.
Remove the extra blank line at line 8 to maintain consistent spacing.
10-17
: Add docstrings to static methods.While the methods are clear and concise, adding docstrings would improve code documentation and maintainability.
Consider adding docstrings:
@staticmethod def name() -> str: + """Return the identifier for OpenWhisk container. + + Returns: + str: The string 'openwhisk' + """ return "openwhisk" @staticmethod def typename() -> str: + """Return the full type name for OpenWhisk container. + + Returns: + str: The string 'OpenWhisk.Container' + """ return "OpenWhisk.Container"
18-27
: Add constructor documentation and parameter validation.The constructor implementation is correct but could benefit from additional documentation and validation.
Consider these improvements:
def __init__( self, system_config: SeBSConfig, config: OpenWhiskConfig, docker_client: docker.client, experimental_manifest: bool, ): + """Initialize OpenWhisk container manager. + + Args: + system_config (SeBSConfig): System-wide configuration + config (OpenWhiskConfig): OpenWhisk-specific configuration + docker_client (docker.client): Docker client instance + experimental_manifest (bool): Flag for experimental features + + Raises: + ValueError: If required configurations are missing + """ + if not system_config: + raise ValueError("system_config is required") + if not config: + raise ValueError("config is required") + if not docker_client: + raise ValueError("docker_client is required") super().__init__(system_config, docker_client, experimental_manifest) self.config = configsebs/aws/container.py (2)
14-21
: Add return type annotation toname()
method.For consistency with
typename()
, add a return type annotation to thename()
method.@staticmethod - def name(): + def name() -> str: return "aws"
38-53
: Add docstring to document the registry_name method.The method logic is correct, but adding a docstring would improve code documentation by explaining the purpose and return values.
def registry_name( self, benchmark: str, language_name: str, language_version: str, architecture: str ) -> Tuple[str, str, str, str]: + """ + Constructs AWS ECR registry details for a given benchmark and runtime. + + Args: + benchmark: Name of the benchmark + language_name: Programming language name + language_version: Language version + architecture: Target architecture + + Returns: + Tuple containing (registry_name, repository_name, image_tag, image_uri) + """config/systems.json (1)
Line range hint
1-285
: Consider adding JSON schema validation.To ensure configuration consistency and catch potential issues early, consider:
- Creating a JSON schema for this configuration file
- Adding validation in the code that reads this configuration
- Including schema documentation
This would help prevent configuration errors and improve maintainability.
sebs/utils.py (1)
Line range hint
103-138
: Remove commented-out logging code.The commented-out
configure_logging
implementation should be removed to maintain code cleanliness. If this alternative implementation is needed in the future, it can be retrieved from version control history.sebs/faas/container.py (4)
29-41
: Add parameter validation in constructor.Consider adding validation for
docker_client
to ensure it's a valid Docker client instance, as its methods are used throughout the class.def __init__( self, system_config: SeBSConfig, docker_client, experimental_manifest: bool = False, ): super().__init__() + if not hasattr(docker_client, 'images'): + raise ValueError("Invalid docker_client provided") self.docker_client = docker_client self.experimental_manifest = experimental_manifest self.system_config = system_config self._disable_rich_output = False
60-90
: Optimize code and improve error handling.Two suggestions for improvement:
- Use ternary operator for better readability
- Add more specific error handling for JSON parsing
- if isinstance(txt, str): - line = json.loads(txt) - else: - line = txt + try: + line = json.loads(txt) if isinstance(txt, str) else txt + except json.JSONDecodeError as e: + self.logging.error(f"Failed to parse progress update: {e}") + raise🧰 Tools
🪛 Ruff
62-65: Use ternary operator
line = json.loads(txt) if isinstance(txt, str) else txt
instead ofif
-else
-blockReplace
if
-else
-block withline = json.loads(txt) if isinstance(txt, str) else txt
(SIM108)
91-122
: Refactor to reduce code duplication.The image push logic is duplicated between the progress and non-progress paths. Consider extracting the common functionality.
def push_image(self, repository_uri, image_tag): + def handle_push_stream(stream): + for line in stream: + if "error" in line: + self.logging.error(f"Failed to push the image to registry {repository_uri}") + raise RuntimeError(line) + yield line + try: + self.logging.info(f"Pushing image {image_tag} to {repository_uri}") + ret = self.docker_client.images.push( + repository=repository_uri, tag=image_tag, stream=True, decode=True + ) if not self.disable_rich_output: - layer_tasks = {} with Progress() as progress: - - self.logging.info(f"Pushing image {image_tag} to {repository_uri}") - ret = self.docker_client.images.push( - repository=repository_uri, tag=image_tag, stream=True, decode=True - ) - for line in ret: + for line in handle_push_stream(ret): self.show_progress(line, progress, layer_tasks) - else: - self.logging.info(f"Pushing image {image_tag} to {repository_uri}") - ret = self.docker_client.images.push( - repository=repository_uri, tag=image_tag, stream=True, decode=True - ) - - for val in ret: - if "error" in val: - self.logging.error(f"Failed to push the image to registry {repository_uri}") - raise RuntimeError(val) + for _ in handle_push_stream(ret): + pass
1-211
: Consider adding timeout and retry mechanisms.For improved reliability, consider:
- Adding configurable timeouts for Docker operations
- Implementing retry mechanisms for transient failures in image push/pull operations
- Adding circuit breakers for Docker daemon communication
This would make the container operations more resilient in production environments.
🧰 Tools
🪛 Ruff
62-65: Use ternary operator
line = json.loads(txt) if isinstance(txt, str) else txt
instead ofif
-else
-blockReplace
if
-else
-block withline = json.loads(txt) if isinstance(txt, str) else txt
(SIM108)
sebs/openwhisk/openwhisk.py (1)
376-377
: Consider adding enable_rich_output method for symmetry.For better API design, consider adding a corresponding method to enable rich output.
def disable_rich_output(self): self.container_client.disable_rich_output = True + +def enable_rich_output(self): + self.container_client.disable_rich_output = Falsesebs/faas/system.py (2)
158-171
: Improve docstring accuracy and completeness.The docstring has a few minor issues:
- Architecture example uses 'x64' but the code uses 'x86_64' (line 162).
- Return tuple description is missing the second element (size of package).
- architecture: Target architecture (e.g., 'x64', 'arm64') + architecture: Target architecture (e.g., 'x86_64', 'arm64') - - Size of the package + - Size of the package in bytes
401-403
: Add docstring for disable_rich_output method.The purpose and usage of this method are unclear. Please add a docstring explaining:
- The method's purpose
- Whether it should be abstract
- Expected behavior in subclasses
def disable_rich_output(self): + """ + Disable rich output formatting in the FaaS system. + + This method can be overridden by subclasses to implement + system-specific rich output disabling behavior. + """ passsebs.py (3)
228-228
: Add validation for storage configuration.Consider adding validation to ensure the storage configuration file is properly formatted and contains required fields.
@click.option("--storage-configuration", default=None, type=str, help="JSON configuration of deployed storage.") +def validate_storage_config(ctx, param, value): + if value is None: + return None + try: + with open(value, 'r') as f: + config = json.load(f) + # Add validation for required fields + return value + except (json.JSONDecodeError, IOError) as e: + raise click.BadParameter(f"Invalid storage configuration: {str(e)}") + +@click.option("--storage-configuration", + default=None, + type=str, + callback=validate_storage_config, + help="JSON configuration of deployed storage.")
439-460
: Enhance documentation for measurement interval parameter.Consider adding more detailed documentation about:
- The purpose and impact of the measurement interval
- Recommended values for different use cases
- Performance implications of different intervals
644-655
: Fix unused loop variable and enhance error handling.
- Address the unused loop variable.
- Consider adding error handling for the resource group deletion operation.
- for idx, group in enumerate(groups): + for _idx, group in enumerate(groups): if len(prefix) > 0 and not group.startswith(prefix): continue sebs_client.logging.info(f"Removing resource group: {group}") - deployment_client.config.resources.delete_resource_group( - deployment_client.cli_instance, group, wait - ) + try: + deployment_client.config.resources.delete_resource_group( + deployment_client.cli_instance, group, wait + ) + except Exception as e: + sebs_client.logging.error(f"Failed to delete resource group {group}: {str(e)}") + if not wait: + sebs_client.logging.warning("Resource group deletion might still be in progress")🧰 Tools
🪛 Ruff
647-647: Loop control variable
idx
not used within loop bodyRename unused
idx
to_idx
(B007)
sebs/aws/aws.py (1)
238-278
: Consider extracting zip deployment logic.The create_function method is handling both container and zip deployments, making it quite long. Consider extracting the zip deployment logic into a separate method for better maintainability.
+def _create_zip_function(self, create_function_params: dict, package: str, code_size: int, + benchmark: str, architecture: str, language: str, + language_runtime: str) -> dict: + if code_size < 50 * 1024 * 1024: + with open(package, "rb") as f: + package_body = f.read() + create_function_params["Code"] = {"ZipFile": package_body} + else: + code_package_name = os.path.basename(package) + storage_client = self.get_storage() + code_bucket = storage_client.get_bucket(Resources.StorageBucketType.DEPLOYMENT) + code_prefix = os.path.join(benchmark, architecture, code_package_name) + storage_client.upload(code_bucket, package, code_prefix) + create_function_params["Code"] = { + "S3Bucket": code_bucket, + "S3Key": code_prefix, + } + create_function_params["Runtime"] = f"{language}{self._map_language_runtime(language, language_runtime)}" + create_function_params["Handler"] = "handler.handler" + return create_function_params def create_function(self, code_package: Benchmark, func_name: str, container_deployment: bool, container_uri: str) -> "LambdaFunction": # ... existing code ... if container_deployment: create_function_params["PackageType"] = "Image" create_function_params["Code"] = {"ImageUri": container_uri} else: create_function_params["PackageType"] = "Zip" - if code_size < 50 * 1024 * 1024: - # ... existing zip handling code ... + create_function_params = self._create_zip_function( + create_function_params, package, code_size, benchmark, + architecture, language, language_runtime + )🧰 Tools
🪛 Ruff
253-253: Use a context manager for opening files
(SIM115)
sebs/gcp/gcp.py (1)
141-150
: LGTM! Consider improving the error message.The changes to add container deployment support are well-structured. However, the error message could be more informative.
Consider updating the error message to be more specific:
- raise NotImplementedError("Container Deployment is not supported in GCP") + raise NotImplementedError("Container deployment is not supported in GCP. Use AWS for container-based deployments.")sebs/regression.py (1)
131-131
: Remove Unnecessary Commented CodeThe line:
# for trigger in triggers:
is commented out. If this code is no longer needed, consider removing it to keep the codebase clean. If it's temporarily commented out for debugging or development purposes, please add a comment explaining its intended future use.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (14)
config/systems.json
(8 hunks)sebs.py
(13 hunks)sebs/aws/aws.py
(9 hunks)sebs/aws/container.py
(1 hunks)sebs/benchmark.py
(9 hunks)sebs/config.py
(1 hunks)sebs/faas/container.py
(1 hunks)sebs/faas/system.py
(7 hunks)sebs/gcp/gcp.py
(4 hunks)sebs/openwhisk/container.py
(1 hunks)sebs/openwhisk/openwhisk.py
(9 hunks)sebs/regression.py
(11 hunks)sebs/sebs.py
(2 hunks)sebs/utils.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- sebs/benchmark.py
- sebs/sebs.py
🧰 Additional context used
🪛 Ruff
sebs.py
647-647: Loop control variable idx
not used within loop body
Rename unused idx
to _idx
(B007)
sebs/aws/aws.py
253-253: Use a context manager for opening files
(SIM115)
sebs/aws/container.py
79-79: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
sebs/faas/container.py
62-65: Use ternary operator line = json.loads(txt) if isinstance(txt, str) else txt
instead of if
-else
-block
Replace if
-else
-block with line = json.loads(txt) if isinstance(txt, str) else txt
(SIM108)
🔇 Additional comments (26)
sebs/openwhisk/container.py (1)
28-46
: 🛠️ Refactor suggestion
Add input validation and improve registry handling.
While the method correctly constructs registry information, it could benefit from additional validation and error handling.
Consider these improvements:
def registry_name(
self, benchmark: str, language_name: str, language_version: str, architecture: str
) -> Tuple[str, str, str, str]:
+ """Construct registry information for container image.
+
+ Args:
+ benchmark (str): Benchmark identifier
+ language_name (str): Programming language name
+ language_version (str): Language version
+ architecture (str): Target architecture
+
+ Returns:
+ Tuple[str, str, str, str]: Registry name, repository name, image tag, and URI
+
+ Raises:
+ ValueError: If input parameters are invalid
+ """
+ if not all([benchmark, language_name, language_version, architecture]):
+ raise ValueError("All parameters must be non-empty strings")
+
+ # Sanitize inputs to ensure valid Docker names
+ safe_benchmark = benchmark.lower().replace(' ', '-')
+ safe_lang_name = language_name.lower().replace(' ', '-')
registry_name = self.config.resources.docker_registry
# We need to retag created images when pushing to registry other
# than default
repository_name = self.system_config.docker_repository()
image_tag = self.system_config.benchmark_image_tag(
- self.name(), benchmark, language_name, language_version, architecture
+ self.name(), safe_benchmark, safe_lang_name, language_version, architecture
)
if registry_name is not None and registry_name != "":
+ # Validate registry name format
+ if not registry_name.lower().replace('.', '').replace('-', '').isalnum():
+ raise ValueError(f"Invalid registry name: {registry_name}")
repository_name = f"{registry_name}/{repository_name}"
else:
- registry_name = "Docker Hub"
+ registry_name = "docker.io" # Use official Docker Hub registry name
image_uri = f"{repository_name}:{image_tag}"
return registry_name, repository_name, image_tag, image_uri
Key improvements:
- Added input validation
- Added parameter sanitization for Docker compatibility
- Added registry name validation
- Used official Docker Hub registry name
Let's verify the Docker registry naming conventions:
sebs/aws/container.py (4)
1-13
: LGTM! Well-structured imports and class definition.
The imports are properly organized and the class inheritance is appropriate for implementing AWS ECR container support.
22-33
: LGTM! Well-structured constructor.
The constructor properly initializes the ECR client and maintains parent class functionality.
34-37
: LGTM! Well-typed client property.
54-65
: 🛠️ Refactor suggestion
Simplify the method and improve error handling.
- The method has a redundant
return False
at the end. - The past review comment about logging exceptions is still valid.
def find_image(self, repository_name, image_tag) -> bool:
try:
response = self.ecr_client.describe_images(
repositoryName=repository_name, imageIds=[{"imageTag": image_tag}]
)
if response["imageDetails"]:
return True
- except ClientError:
+ except ClientError as e:
+ self.logging.error(f"Error checking image existence: {e}")
return False
-
- return False
config/systems.json (6)
58-59
: LGTM: Local configuration changes are consistent.
The addition of architecture and deployment specifications for local development is well-structured.
113-114
: LGTM: AWS container deployment configuration is well-structured.
The changes properly support:
- Multiple architectures (x64, arm64)
- Multiple deployment methods (package, container)
This aligns well with the PR objectives for container deployment support.
120-126
: LGTM: Azure configuration changes maintain consistency.
The restructuring of base images and addition of architecture/deployment specifications align with the overall configuration structure.
Also applies to: 171-172
178-185
: LGTM: GCP configuration uses secure base images.
Good practice observed:
- Using latest Ubuntu LTS (22.04) for Python runtimes
- Consistent architecture specification
- Clear deployment method specification
Also applies to: 228-229
84-85
: Document the setup.py requirements and usage.
The addition of setup.py
to deployment files is a significant change that affects the packaging process. Please ensure this is documented in:
- Deployment documentation
- Configuration guide
- Any relevant README files
282-283
: Verify container-only deployment for OpenWhisk.
OpenWhisk is configured with "deployments": ["container"]
while other providers support both package and container deployments. Please verify if this is intentional and document the reasoning.
✅ Verification successful
Based on the extensive search results, I can now provide a definitive response about the container-only deployment configuration for OpenWhisk:
Container-only deployment is intentional and required for OpenWhisk
OpenWhisk is specifically configured to use container-only deployment due to its inherent 48MB code package size limitation. This is well-documented in the codebase:
- The container-only approach is explicitly mentioned in docs/platforms.md: "OpenWhisk has a very low code package size limit of only 48 megabytes. So, to circumvent this limit, we deploy functions using pre-built Docker images."
- The implementation in sebs/openwhisk/ consistently uses Docker containers for function deployment
- All official OpenWhisk base images are properly configured in config/systems.json
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if this limitation is documented
fd -t f "README|DEPLOY|CONFIGURATION" -X grep -l -A 5 "OpenWhisk.*container"
Length of output: 78
Script:
#!/bin/bash
# Let's check the systems.json file to understand deployment patterns across providers
cat config/systems.json | jq -r 'to_entries[] | select(.value.deployments != null) | "\(.key): \(.value.deployments | join(", "))"'
# Also search for any OpenWhisk related documentation or configuration
rg -i "openwhisk" -A 5 -B 5
Length of output: 69282
sebs/utils.py (1)
57-57
: Verify the impact of the condition change.
The change from if value:
to if value is not None:
allows falsy values (0, '', [], {}, False) to be stored in the nested dictionary. While this might be intentional for container configurations, please verify that this doesn't break existing functionality.
✅ Verification successful
Let me analyze the usage of update_nested_dict
further by checking the values being passed to it.
Based on the gathered information, I can now provide the final response:
The condition change is safe and intentional
The change from if value:
to if value is not None:
is actually a safer implementation because:
- All calls to
update_nested_dict
pass configuration values that should be explicitly checked forNone
rather than falsy values - The function is used to update configuration dictionaries where empty strings, zeros, or
False
values might be valid configuration options - The change aligns with the function's type hint
Optional[str]
, making the implementation more consistent with its interface
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Search for calls to update_nested_dict to understand the impact
# Test: Look for potential issues with falsy values being stored
# Find all direct calls to update_nested_dict
echo "Direct calls to update_nested_dict:"
rg "update_nested_dict\(" -A 2
# Find potential configuration files that might be affected
echo -e "\nPotential configuration files:"
fd -e json -e yaml -e yml
Length of output: 2993
Script:
#!/bin/bash
# Check the values being passed to update_nested_dict
echo "Checking update_nested_dict usage in sebs.py:"
rg -B 2 "update_nested_dict\(" sebs.py
# Check the implementation of update_nested_dict
echo -e "\nChecking update_nested_dict implementation:"
rg -B 5 -A 10 "def update_nested_dict" sebs/utils.py
# Check if there are any tests for update_nested_dict
echo -e "\nChecking for tests:"
fd -e py test
Length of output: 1949
sebs/faas/container.py (4)
1-14
: LGTM!
The imports are well-organized and the class hierarchy is appropriate.
15-28
: LGTM!
The abstract method and property implementations follow Python best practices.
123-128
: LGTM!
The abstract method is well-defined with clear parameters and return type annotation.
42-59
:
Security: Fix command injection vulnerability.
The execute()
call with f-string interpolation is vulnerable to command injection. This is still a valid issue from the previous review.
Apply this fix:
- execute(f"docker manifest inspect {repository_name}:{image_tag}")
+ execute(["docker", "manifest", "inspect", f"{repository_name}:{image_tag}"])
sebs/openwhisk/openwhisk.py (2)
35-37
: LGTM: Container client initialization is well-structured.
The OpenWhiskContainer is properly initialized with all necessary parameters.
114-116
: Verify container image building across architectures.
The image building process needs to be tested across different architectures to ensure compatibility.
sebs/faas/system.py (2)
188-212
: LGTM!
The method signature and docstring are well-structured and complete. The documentation properly describes the new container-related parameters and potential errors.
220-238
: LGTM!
The method signature and docstring are well-structured and complete. The documentation properly describes the new container-related parameters and potential errors.
sebs.py (2)
100-104
: LGTM! Well-documented CLI option.
The --container-deployment
flag is well-implemented with clear, descriptive help text that properly indicates AWS-only support.
133-133
: Verify container deployment configuration usage.
The container deployment configuration is properly added. Let's verify its usage across the codebase.
Also applies to: 157-157
✅ Verification successful
Based on the shell script results, I can generate the final response:
Container deployment configuration is properly integrated and verified
The container deployment configuration is correctly used across the codebase:
- Properly propagated through function parameters in deployment systems
- Correctly handled in AWS Lambda with container/zip package type switching
- Appropriately rejected with clear error messages in unsupported platforms (Azure, GCP, Local)
- Well-integrated with the configuration system and cache validation
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify container deployment configuration usage
# Test: Search for container_deployment usage in deployment implementations
rg -A 5 "container_deployment"
# Test: Search for container deployment related configuration
ast-grep --pattern 'container_deployment = $_'
Length of output: 19665
sebs/aws/aws.py (2)
13-13
: LGTM: Clean ECR client integration.
The ECR client integration follows the existing pattern of client initialization in the AWS class.
Also applies to: 76-78
400-401
: LGTM: Clean helper method implementations.
The changes to default_function_name
and the addition of disable_rich_output
are straightforward and well-implemented.
Also applies to: 629-630
sebs/gcp/gcp.py (2)
202-211
: LGTM! The changes are consistent.
The method signature update and error handling for container deployment are consistent with the package_code method.
345-354
: LGTM! The changes maintain consistency.
The method signature update and error handling for container deployment are consistent with other methods in the class.
def push_image(self, repository_uri, image_tag): | ||
|
||
username, password, registry_url = self.config.resources.ecr_repository_authorization( | ||
self.client | ||
) | ||
|
||
try: | ||
self.docker_client.login(username=username, password=password, registry=registry_url) | ||
super().push_image(repository_uri, image_tag) | ||
self.logging.info(f"Successfully pushed the image to registry {repository_uri}.") | ||
except docker.errors.APIError as e: | ||
self.logging.error(f"Failed to push the image to registry {repository_uri}.") | ||
self.logging.error(f"Error: {str(e)}") | ||
raise RuntimeError("Couldn't push to Docker registry") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Implement proper exception chaining in error handling.
The error handling is good, but the exception should be chained to preserve the original error context.
try:
self.docker_client.login(username=username, password=password, registry=registry_url)
super().push_image(repository_uri, image_tag)
self.logging.info(f"Successfully pushed the image to registry {repository_uri}.")
except docker.errors.APIError as e:
self.logging.error(f"Failed to push the image to registry {repository_uri}.")
self.logging.error(f"Error: {str(e)}")
- raise RuntimeError("Couldn't push to Docker registry")
+ raise RuntimeError("Couldn't push to Docker registry") from e
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def push_image(self, repository_uri, image_tag): | |
username, password, registry_url = self.config.resources.ecr_repository_authorization( | |
self.client | |
) | |
try: | |
self.docker_client.login(username=username, password=password, registry=registry_url) | |
super().push_image(repository_uri, image_tag) | |
self.logging.info(f"Successfully pushed the image to registry {repository_uri}.") | |
except docker.errors.APIError as e: | |
self.logging.error(f"Failed to push the image to registry {repository_uri}.") | |
self.logging.error(f"Error: {str(e)}") | |
raise RuntimeError("Couldn't push to Docker registry") | |
def push_image(self, repository_uri, image_tag): | |
username, password, registry_url = self.config.resources.ecr_repository_authorization( | |
self.client | |
) | |
try: | |
self.docker_client.login(username=username, password=password, registry=registry_url) | |
super().push_image(repository_uri, image_tag) | |
self.logging.info(f"Successfully pushed the image to registry {repository_uri}.") | |
except docker.errors.APIError as e: | |
self.logging.error(f"Failed to push the image to registry {repository_uri}.") | |
self.logging.error(f"Error: {str(e)}") | |
raise RuntimeError("Couldn't push to Docker registry") from e |
🧰 Tools
🪛 Ruff
79-79: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
def benchmark_base_images( | ||
self, deployment_name: str, language_name: str, architecture: str | ||
) -> Dict[str, str]: | ||
return self._system_config[deployment_name]["languages"][language_name]["base_images"][ | ||
architecture | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add parameter validation for benchmark_base_images.
Similar to supported_language_versions, this method should validate its parameters.
Consider this implementation:
def benchmark_base_images(
self, deployment_name: str, language_name: str, architecture: str
) -> Dict[str, str]:
+ if deployment_name not in self._system_config:
+ raise ValueError(f"Unknown deployment: {deployment_name}")
+ if language_name not in self._system_config[deployment_name]["languages"]:
+ raise ValueError(f"Unsupported language: {language_name}")
+ if architecture not in self.supported_architecture(deployment_name):
+ raise ValueError(f"Unsupported architecture: {architecture}")
return self._system_config[deployment_name]["languages"][language_name]["base_images"][
architecture
]
Committable suggestion skipped: line range outside the PR's diff.
def supported_package_deployment(self, deployment_name: str) -> bool: | ||
return "package" in self._system_config[deployment_name]["deployments"] | ||
|
||
def supported_container_deployment(self, deployment_name: str) -> bool: | ||
return "container" in self._system_config[deployment_name]["deployments"] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add deployment name validation.
Both methods should validate the deployment_name parameter to prevent KeyError exceptions.
Consider this implementation for both methods:
def supported_package_deployment(self, deployment_name: str) -> bool:
+ if deployment_name not in self._system_config:
+ raise ValueError(f"Unknown deployment: {deployment_name}")
return "package" in self._system_config[deployment_name]["deployments"]
def supported_container_deployment(self, deployment_name: str) -> bool:
+ if deployment_name not in self._system_config:
+ raise ValueError(f"Unknown deployment: {deployment_name}")
return "container" in self._system_config[deployment_name]["deployments"]
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def supported_package_deployment(self, deployment_name: str) -> bool: | |
return "package" in self._system_config[deployment_name]["deployments"] | |
def supported_container_deployment(self, deployment_name: str) -> bool: | |
return "container" in self._system_config[deployment_name]["deployments"] | |
def supported_package_deployment(self, deployment_name: str) -> bool: | |
if deployment_name not in self._system_config: | |
raise ValueError(f"Unknown deployment: {deployment_name}") | |
return "package" in self._system_config[deployment_name]["deployments"] | |
def supported_container_deployment(self, deployment_name: str) -> bool: | |
if deployment_name not in self._system_config: | |
raise ValueError(f"Unknown deployment: {deployment_name}") | |
return "container" in self._system_config[deployment_name]["deployments"] |
def supported_language_versions( | ||
self, deployment_name: str, language_name: str, architecture: str | ||
) -> List[str]: | ||
return self._system_config[deployment_name]["languages"][language_name]["base_images"][ | ||
architecture |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add architecture parameter validation and error handling.
The method should validate the architecture parameter and handle missing architectures gracefully to prevent runtime errors.
Consider this implementation:
def supported_language_versions(
self, deployment_name: str, language_name: str, architecture: str
) -> List[str]:
+ if architecture not in self.supported_architecture(deployment_name):
+ raise ValueError(f"Unsupported architecture: {architecture}")
return self._system_config[deployment_name]["languages"][language_name]["base_images"][
architecture
].keys()
Committable suggestion skipped: line range outside the PR's diff.
tag = self.benchmark_image_tag( | ||
system, benchmark, language_name, language_version, architecture | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Based on the investigation of the codebase, I can now provide the final response:
Parameter validation is needed, but tag format is Docker-compliant
The current implementation of benchmark_image_tag
follows Docker's tag naming conventions correctly:
- Uses only allowed characters (alphanumeric, dots, dashes)
- Has a consistent format:
function.<system>.<benchmark>.<language>-<version>-<architecture>[-prefix]
However, there's no parameter validation, which could lead to runtime issues. The suggested improvements in the review comment are valid and would enhance code reliability.
- Add parameter validation in
benchmark_image_tag
method - Add docstring to document the tag format
- Consider adding test cases for the validation
🔗 Analysis chain
Verify image tag format and add parameter validation.
The tag format looks good, but consider adding parameter validation and documenting the format.
Let's verify the tag format meets Docker requirements:
Consider adding validation and documentation:
def benchmark_image_tag(
self,
system: str,
benchmark: str,
language_name: str,
language_version: str,
architecture: str,
) -> str:
+ """Generate a Docker image tag following the format: function.<system>.<benchmark>.<language>-<version>-<arch>[-prefix]
+
+ Args:
+ system: The deployment system (e.g., 'aws')
+ benchmark: The benchmark name
+ language_name: The programming language
+ language_version: The language version
+ architecture: The target architecture
+
+ Returns:
+ str: A valid Docker image tag
+ """
+ if not all([system, benchmark, language_name, language_version, architecture]):
+ raise ValueError("All parameters must be non-empty strings")
tag = f"function.{system}.{benchmark}.{language_name}-{language_version}-{architecture}"
if self.image_tag_prefix:
tag = f"{tag}-{self.image_tag_prefix}"
return tag
Also applies to: 87-87
Let me gather more information about the image tag format by checking the actual implementation and test cases.
Let me check for any test cases that might validate the image tag format and any existing validation in the config class.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify image tag format in the codebase
# Test: Search for any potential invalid characters in image tags
rg -g '*.py' 'tag\s*=.*[^a-zA-Z0-9_.-]'
Length of output: 998
Script:
#!/bin/bash
# Check the implementation of benchmark_image_tag and its test cases
rg -g '*.py' -A 10 'def benchmark_image_tag'
# Check for any existing parameter validation
rg -g '*.py' 'raise ValueError.*tag'
# Look for test cases related to image tags
rg -g '*test*.py' 'test.*image.*tag'
Length of output: 659
Script:
#!/bin/bash
# Check for test files related to config
fd test_config.py
# Check the content of test files for image tag testing
rg -g '*test*.py' -B 2 -A 5 'benchmark_image_tag'
# Check for any validation in Config class
rg -g '*.py' -B 2 -A 5 'class Config'
Length of output: 820
architecture: str, | ||
benchmark: str, | ||
is_cached: bool, | ||
) -> Tuple[str, int]: | ||
container_deployment: bool, | ||
) -> Tuple[str, int, str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix return type annotation inconsistency.
The return type annotation doesn't match the docstring. The tuple should include the container URI as the fourth element.
- ) -> Tuple[str, int, str]:
+ ) -> Tuple[str, int, bool, str]:
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
architecture: str, | |
benchmark: str, | |
is_cached: bool, | |
) -> Tuple[str, int]: | |
container_deployment: bool, | |
) -> Tuple[str, int, str]: | |
architecture: str, | |
benchmark: str, | |
is_cached: bool, | |
container_deployment: bool, | |
) -> Tuple[str, int, bool, str]: |
Runtime="{}{}".format( | ||
create_function_params["PackageType"] = "Zip" | ||
if code_size < 50 * 1024 * 1024: | ||
package_body = open(package, "rb").read() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use context manager for file operations.
Use a context manager to ensure proper file cleanup.
-package_body = open(package, "rb").read()
+with open(package, "rb") as f:
+ package_body = f.read()
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
package_body = open(package, "rb").read() | |
with open(package, "rb") as f: | |
package_body = f.read() |
🧰 Tools
🪛 Ruff
253-253: Use a context manager for opening files
(SIM115)
if container_deployment: | ||
self.client.update_function_code(FunctionName=name, ImageUri=container_uri) | ||
else: | ||
code_size = code_package.code_size | ||
package = code_package.code_location | ||
benchmark = code_package.benchmark | ||
|
||
function_cfg = FunctionConfig.from_benchmark(code_package) | ||
architecture = function_cfg.architecture.value | ||
|
||
# Run AWS update | ||
# AWS Lambda limit on zip deployment | ||
if code_size < 50 * 1024 * 1024: | ||
with open(package, "rb") as code_body: | ||
self.client.update_function_code( | ||
FunctionName=name, | ||
ZipFile=code_body.read(), | ||
Architectures=[self._map_architecture(architecture)], | ||
) | ||
# Upload code package to S3, then update | ||
else: | ||
code_package_name = os.path.basename(package) | ||
|
||
storage = cast(S3, self.get_storage()) | ||
bucket = function.code_bucket(code_package.benchmark, storage) | ||
code_prefix = os.path.join(benchmark, architecture, code_package_name) | ||
storage.upload(bucket, package, code_prefix) | ||
|
||
# Run AWS update | ||
# AWS Lambda limit on zip deployment | ||
if code_size < 50 * 1024 * 1024: | ||
with open(package, "rb") as code_body: | ||
self.client.update_function_code( | ||
FunctionName=name, | ||
ZipFile=code_body.read(), | ||
S3Bucket=bucket, | ||
S3Key=code_prefix, | ||
Architectures=[self._map_architecture(architecture)], | ||
) | ||
# Upload code package to S3, then update | ||
else: | ||
code_package_name = os.path.basename(package) | ||
|
||
storage = cast(S3, self.get_storage()) | ||
bucket = function.code_bucket(code_package.benchmark, storage) | ||
code_prefix = os.path.join(benchmark, architecture, code_package_name) | ||
storage.upload(bucket, package, code_prefix) | ||
|
||
self.client.update_function_code( | ||
FunctionName=name, | ||
S3Bucket=bucket, | ||
S3Key=code_prefix, | ||
Architectures=[self._map_architecture(architecture)], | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Reduce code duplication in deployment logic.
The zip deployment logic is duplicated between create_function
and update_function
. Consider extracting common code into a shared helper method.
+def _handle_zip_deployment(self, package: str, code_size: int, benchmark: str,
+ architecture: str) -> Tuple[str, str, str]:
+ if code_size < 50 * 1024 * 1024:
+ return None, None, None
+ code_package_name = os.path.basename(package)
+ storage = cast(S3, self.get_storage())
+ bucket = storage.get_bucket(Resources.StorageBucketType.DEPLOYMENT)
+ code_prefix = os.path.join(benchmark, architecture, code_package_name)
+ storage.upload(bucket, package, code_prefix)
+ return bucket, code_prefix, code_package_name
def update_function(self, function: Function, code_package: Benchmark,
container_deployment: bool, container_uri: str):
# ... existing code ...
if container_deployment:
self.client.update_function_code(FunctionName=name, ImageUri=container_uri)
else:
- # ... existing zip handling code ...
+ bucket, code_prefix, _ = self._handle_zip_deployment(
+ package, code_size, benchmark, architecture
+ )
+ if bucket is None:
+ with open(package, "rb") as code_body:
+ self.client.update_function_code(
+ FunctionName=name,
+ ZipFile=code_body.read(),
+ Architectures=[self._map_architecture(architecture)],
+ )
+ else:
+ self.client.update_function_code(
+ FunctionName=name,
+ S3Bucket=bucket,
+ S3Key=code_prefix,
+ Architectures=[self._map_architecture(architecture)],
+ )
Committable suggestion skipped: line range outside the PR's diff.
# if the containerized deployment is set to True | ||
if container_deployment: | ||
# build base image and upload to ECR | ||
_, container_uri = self.ecr_client.build_base_image( | ||
directory, language_name, language_version, architecture, benchmark, is_cached | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
🛠️ Refactor suggestion
Based on the gathered information, I can now provide the final response. The codebase shows consistent error handling patterns in container-related operations (in container.py) and other AWS services, but the specific build_base_image
call is not wrapped in error handling. This confirms the original review comment's concern.
Add error handling for container image building
The container image building process should follow the established error handling patterns found in the codebase, particularly since other ECR and container operations already implement proper error handling (as seen in container.py with docker.errors.APIError handling).
if container_deployment:
- # build base image and upload to ECR
- _, container_uri = self.ecr_client.build_base_image(
- directory, language_name, language_version, architecture, benchmark, is_cached
- )
+ try:
+ # build base image and upload to ECR
+ _, container_uri = self.ecr_client.build_base_image(
+ directory, language_name, language_version, architecture, benchmark, is_cached
+ )
+ except Exception as e:
+ self.logging.error(f"Failed to build container image: {e}")
+ raise RuntimeError("Container image build failed") from e
🔗 Analysis chain
Add error handling for container image building.
The container image building process could fail for various reasons (network issues, invalid Dockerfile, etc.). Consider adding error handling to gracefully handle failures.
if container_deployment:
- # build base image and upload to ECR
- _, container_uri = self.ecr_client.build_base_image(
- directory, language_name, language_version, architecture, benchmark, is_cached
- )
+ try:
+ # build base image and upload to ECR
+ _, container_uri = self.ecr_client.build_base_image(
+ directory, language_name, language_version, architecture, benchmark, is_cached
+ )
+ except Exception as e:
+ self.logging.error(f"Failed to build container image: {e}")
+ raise RuntimeError("Container image build failed") from e
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if there are any error handling patterns in the codebase
rg -A 2 "try:.*?(build|push).*?except"
Length of output: 40
Script:
#!/bin/bash
# Let's check for any existing error handling patterns in the ECR client and related files
rg -A 5 "try:" sebs/aws/
# Also check for any error handling in the aws.py file specifically
rg -A 5 "except" sebs/aws/aws.py
# Look for the ECR client implementation to understand current error handling
ast-grep --pattern 'class ECRClient { $$$ }'
Length of output: 5853
Script:
#!/bin/bash
# Let's check the ECR client's build_base_image method implementation
ast-grep --pattern 'build_base_image($$$) { $$$ }'
# Also check for any existing error handling in container.py since it seems to handle ECR operations
rg -A 10 "build_base_image" sebs/aws/
Length of output: 770
bench, arch, deployment_type = kwargs["test_id"].split("_")[-3:None] | ||
test_name = f"{bench}, {arch}, {deployment_type}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct Slice Notation in Test ID Parsing
The current code:
bench, arch, deployment_type = kwargs["test_id"].split("_")[-3:None]
uses incorrect slice notation. In Python, slicing to the end of a list should omit the stop index or use None
without brackets. The [-3:None]
slice does not work as intended. The correct way is to use [-3:]
.
Apply this diff to fix the slice notation:
-bench, arch, deployment_type = kwargs["test_id"].split("_")[-3:None]
+bench, arch, deployment_type = kwargs["test_id"].split("_")[-3:]
18cc75b
to
9df1aa0
Compare
Added Support for container deployment in AWS.
Summary by CodeRabbit
Release Notes
New Features
ECRContainer
class for managing Docker images in AWS Elastic Container Registry.Bug Fixes
Documentation
Chores