-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[IR] Improve external data handling #2020
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
onnxscript/ir/_external_data_test.py:352
- The
tobytes
method should raise aTypeError
instead of returning it.
return TypeError
❌ 32 Tests Failed:
View the top 2 failed tests by shortest run time
View the full list of 1 ❄️ flaky tests
To view more test analytics, go to the Test Analytics Dashboard |
Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (2)
onnxscript/ir/_external_data.py:174
- The variable name 'base_path' should be renamed to 'base_dir' for consistency.
base_path: str | os.PathLike,
onnxscript/ir/_external_data.py:252
- The word 'unneccesarry' should be corrected to 'unnecessary'.
# Sort all tensors based on tensor sizes, in order to avoid unneccesarry alignment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 5 out of 9 changed files in this pull request and generated 1 comment.
Files not reviewed (4)
- onnxscript/ir/_external_data.py: Evaluated as low risk
- onnxscript/ir/external_data_test.py: Evaluated as low risk
- onnxscript/ir/init.py: Evaluated as low risk
- onnxscript/_framework_apis/torch_2_5.py: Evaluated as low risk
Comments suppressed due to low confidence (8)
onnxscript/ir/external_data.py:383
- The 'strict=True' argument in the 'zip' function is only available in Python 3.10 and later. This could cause compatibility issues with earlier versions of Python.
for value, external_tensor in zip(initializers_to_become_external, external_tensors, strict=True)
onnxscript/ir/external_data.py:387
- The 'strict=True' argument in the 'zip' function is only available in Python 3.10 and later. This could cause compatibility issues with earlier versions of Python.
for value, memory_tensor in zip(initializers_to_load_to_memory, memory_tensors, strict=True)
onnxscript/ir/_io_test.py:131
- Move the
with self.assertRaisesRegex(ValueError, "is invalidated")
statement to directly wrap the_io.save
call to ensure the expected exception is raised by the correct code.
with self.assertRaisesRegex(ValueError, "is invalidated"):
onnxscript/ir/_io_test.py:139
- In
test_save_with_external_data_invalidates_obsolete_external_tensors
, check that the new initializer is correctly saved and loaded.
_io.save(loaded_model, path, external_data=external_data_file, size_threshold_bytes=0)
onnxscript/ir/_core.py:614
- Ensure that the tensor is valid before loading its data.
self._check_validity()
onnxscript/ir/_core.py:653
- Ensure that the tensor is valid before converting it to a NumPy array.
self._check_validity()
onnxscript/ir/_core.py:682
- Ensure that the tensor is valid before returning its NumPy representation.
self._check_validity()
onnxscript/ir/_core.py:693
- Ensure that the tensor is valid before returning its byte representation.
self._check_validity()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 5 out of 9 changed files in this pull request and generated no comments.
Files not reviewed (4)
- onnxscript/ir/_external_data.py: Evaluated as low risk
- onnxscript/ir/external_data_test.py: Evaluated as low risk
- onnxscript/ir/init.py: Evaluated as low risk
- onnxscript/_framework_apis/torch_2_5.py: Evaluated as low risk
Comments suppressed due to low confidence (5)
onnxscript/ir/_io.py:92
- The use of the 'strict' argument in the 'zip' function is only available in Python 3.10 and later. Ensure that the polyfill for older versions works correctly.
for initializer, tensor in zip(initializer_values, tensors, strict=True):
onnxscript/ir/_core.py:614
- Ensure that the tensor's validity is checked before loading its data.
self._check_validity()
onnxscript/ir/_core.py:653
- Ensure that the tensor's validity is checked before converting it to a numpy array.
self._check_validity()
onnxscript/ir/_core.py:682
- Ensure that the tensor's validity is checked before accessing its numpy representation.
self._check_validity()
onnxscript/ir/_core.py:693
- Ensure that the tensor's validity is checked before converting it to bytes.
self._check_validity()
external_data
option toir.save
. This will save initializers as external tensors. It is robust against data loss when overwriting, and is idempotent when the current model does not contain external tensors already referencing the same path.ir.external_data
module as a public module users can use to manipulate external data.3. It defines the following methods
py [ "set_base_dir", "unload_from_model", "load_to_model", "convert_tensors_to_external", "convert_tensors_from_external", ]
I renamed
to_external_data
tounload_from_model
for clarity. Reviewers please let me know if the naming sounds good.size_threshold_bytes
to control which tensors are offloaded.ir.save
method.zip()
function's strict mode to support Python<=3.9Note
We do not need to add external data options to
ir.load
. The external data is always loaded lazily in the IR. If users want to transfer the data to memory at loading, they can useir.external_data.load_to_model()
.Example usage