Neuron Compiler version 1.24.0.0+d58fa6134
HWM version 1.17.6.0-fbcd6c853
NEFF version Dynamic
TVM version 1.19.6.0+0
NumPy version 1.23.4
MXNet not available
TF not available
Log Output from Neuron Compiler
(aws_neuron_venv_pytorch_1_13_inf1) root@ip-10-104-110-148:/var/snap/amazon-ssm-agent/6312/ultralytics# ipython
Python 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.28.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from ultralytics import NeuronYOLO
...: model = NeuronYOLO("yolov8x_person_face.pt")
...: model.export(format = "neuron")
...:
Ultralytics YOLOv8.2.48 🚀 Python-3.10.12 torch-1.13.1+cu117 CPU (Intel Xeon Platinum 8275CL 3.00GHz)
Model summary (fused): 268 layers, 68125494 parameters, 0 gradients, 257.4 GFLOPs
PyTorch: starting from 'yolov8x_person_face.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 6, 8400) (130.4 MB)
AWS Neuron: starting export with torch 1.13.1.2.11.7.0...
INFO:Neuron:All operators are compiled by neuron-cc (this does not guarantee that neuron-cc will successfully compile)
INFO:Neuron:Number of arithmetic operators (pre-compilation) before = 278, fused = 278, percent fused = 100.0%
/opt/aws_neuron_venv_pytorch_1_13_inf1/lib/python3.10/site-packages/dask/dataframe/__init__.py:42: FutureWarning:
Dask dataframe query planning is disabled because dask-expr is not installed.
You can install it with `pip install dask[dataframe]` or `conda install dask`.
This will raise in a future version.
warnings.warn(msg, FutureWarning)
INFO:Neuron:Compiling function_NeuronGraph$1070 with neuron-cc
INFO:Neuron:Compiling with command line: '/opt/aws_neuron_venv_pytorch_1_13_inf1/bin/neuron-cc compile /tmp/tmp5ldpdpcf/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmp5ldpdpcf/graph_def.neff --io-config {"inputs": {"0:0": [[1, 3, 640, 640], "float32"]}, "outputs": ["Detect_74/aten_cat_5/concat:0"]} --verbose 35'
............................WARNING:Neuron:The neuron-cc (neuron compiler) process was killed (SIG_KILL). This typically happens when there is insufficient memory to compile and the linux Out Of Memory (OOM) killer terminates the compiler. Consider trying compilation on an instance with more memory
WARNING:Neuron:torch.neuron.trace failed on _NeuronGraph$1070; falling back to native python function call
ERROR:Neuron:neuron-cc failed with the following command line call:
/opt/aws_neuron_venv_pytorch_1_13_inf1/bin/neuron-cc compile /tmp/tmp5ldpdpcf/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmp5ldpdpcf/graph_def.neff --io-config '{"inputs": {"0:0": [[1, 3, 640, 640], "float32"]}, "outputs": ["Detect_74/aten_cat_5/concat:0"]}' --verbose 35
Traceback (most recent call last):
File "/opt/aws_neuron_venv_pytorch_1_13_inf1/lib/python3.10/site-packages/torch_neuron/convert.py", line 413, in op_converter
neuron_function = self.subgraph_compiler(
File "/opt/aws_neuron_venv_pytorch_1_13_inf1/lib/python3.10/site-packages/torch_neuron/decorators.py", line 263, in trace
raise subprocess.SubprocessError(
subprocess.SubprocessError: neuron-cc failed with the following command line call:
/opt/aws_neuron_venv_pytorch_1_13_inf1/bin/neuron-cc compile /tmp/tmp5ldpdpcf/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmp5ldpdpcf/graph_def.neff --io-config '{"inputs": {"0:0": [[1, 3, 640, 640], "float32"]}, "outputs": ["Detect_74/aten_cat_5/concat:0"]}' --verbose 35
INFO:Neuron:Number of arithmetic operators (post-compilation) before = 278, compiled = 0, percent compiled = 0.0%
INFO:Neuron:The neuron partitioner created 1 sub-graphs
INFO:Neuron:Neuron successfully compiled 0 sub-graphs, Total fused subgraphs = 1, Percent of model sub-graphs successfully compiled = 0.0%
INFO:Neuron:Compiled these operators (and operator counts) to Neuron:
INFO:Neuron:Not compiled operators (and operator counts) to Neuron:
INFO:Neuron: => aten::Int: 7 [supported]
INFO:Neuron: => aten::_convolution: 104 [supported]
INFO:Neuron: => aten::add: 20 [supported]
INFO:Neuron: => aten::cat: 19 [supported]
INFO:Neuron: => aten::chunk: 1 [supported]
INFO:Neuron: => aten::div: 1 [supported]
INFO:Neuron: => aten::max_pool2d: 3 [supported]
INFO:Neuron: => aten::mul: 1 [supported]
INFO:Neuron: => aten::sigmoid: 1 [supported]
INFO:Neuron: => aten::silu_: 97 [supported]
INFO:Neuron: => aten::size: 3 [supported]
INFO:Neuron: => aten::softmax: 1 [supported]
INFO:Neuron: => aten::split_with_sizes: 9 [supported]
INFO:Neuron: => aten::sub: 2 [supported]
INFO:Neuron: => aten::transpose: 1 [supported]
INFO:Neuron: => aten::unsqueeze: 1 [supported]
INFO:Neuron: => aten::upsample_nearest2d: 2 [supported]
INFO:Neuron: => aten::view: 5 [supported]
AWS Neuron: export failure ❌ 644.2s: No operations were successfully partitioned and compiled to neuron for this model - aborting trace!
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[1], line 3
1 from ultralytics import NeuronYOLO
2 model = NeuronYOLO("yolov8x_person_face.pt")
----> 3 model.export(format = "neuron")
File /var/snap/amazon-ssm-agent/6312/ultralytics/ultralytics/engine/neuron_model.py:55, in NeuronModel.export(self, **kwargs)
43 custom = {
44 "imgsz": self.model.args["imgsz"],
45 "batch": 1,
46 "data": None,
47 "verbose": False,
48 } # method defaults
49 args = {
50 **self.overrides,
51 **custom,
52 **kwargs,
53 "mode": "export",
54 } # highest priority args on the right
---> 55 return NeuronExporter(overrides=args, _callbacks=self.callbacks)(model=self.model)
File /opt/aws_neuron_venv_pytorch_1_13_inf1/lib/python3.10/site-packages/torch/autograd/grad_mode.py:27, in _DecoratorContextManager.__call__.<locals>.decorate_context(*args, **kwargs)
24 @functools.wraps(func)
25 def decorate_context(*args, **kwargs):
26 with self.clone():
---> 27 return func(*args, **kwargs)
File /var/snap/amazon-ssm-agent/6312/ultralytics/ultralytics/engine/neuron_exporter.py:319, in NeuronExporter.__call__(self, model)
317 f[12], _ = self.export_neuronx()
318 if neuron: # Neuron
--> 319 f[13], _ = self.export_neuron()
321 # Finish
322 f = [str(x) for x in f if x] # filter out '' and None
File /var/snap/amazon-ssm-agent/6312/ultralytics/ultralytics/engine/neuron_exporter.py:130, in try_export.<locals>.outer_func(*args, **kwargs)
128 except Exception as e:
129 LOGGER.info(f"{prefix} export failure ❌ {dt.t:.1f}s: {e}")
--> 130 raise e
File /var/snap/amazon-ssm-agent/6312/ultralytics/ultralytics/engine/neuron_exporter.py:125, in try_export.<locals>.outer_func(*args, **kwargs)
123 try:
124 with Profile() as dt:
--> 125 f, model = inner_func(*args, **kwargs)
126 LOGGER.info(f"{prefix} export success ✅ {dt.t:.1f}s, saved as '{f}' ({file_size(f):.1f} MB)")
127 return f, model
File /var/snap/amazon-ssm-agent/6312/ultralytics/ultralytics/engine/neuron_exporter.py:372, in NeuronExporter.export_neuron(self, prefix)
370 LOGGER.info(f"\n{prefix} starting export with torch {torch_neuron.__version__}...")
371 f = self.file.with_suffix(".neuron")
--> 372 ts = torch_neuron.trace(self.model, self.im, strict=False)
373 extra_files = {"config.txt": json.dumps(self.metadata)}
374 ts.save(str(f), _extra_files=extra_files)
File /opt/aws_neuron_venv_pytorch_1_13_inf1/lib/python3.10/site-packages/torch_neuron/convert.py:217, in trace(func, example_inputs, fallback, op_whitelist, minimum_segment_size, subgraph_builder_function, subgraph_inputs_pruning, skip_compiler, debug_must_trace, allow_no_ops_on_neuron, compiler_workdir, dynamic_batch_size, compiler_timeout, single_fusion_ratio_threshold, _neuron_trace, compiler_args, optimizations, separate_weights, verbose, **kwargs)
215 logger.debug("skip_inference_context - trace with fallback at {}".format(get_file_and_line()))
216 neuron_graph = cu.compile_fused_operators(neuron_graph, **compile_kwargs)
--> 217 cu.stats_post_compiler(neuron_graph)
219 # Wrap the compiled version of the model in a script module. Note that this is
220 # necessary for torch==1.8.1 due to the usage of `torch.classes.model.Model`. The
221 # custom class must be a submodule of the traced graph.
222 neuron_graph = AwsNeuronGraphModule(neuron_graph)
File /opt/aws_neuron_venv_pytorch_1_13_inf1/lib/python3.10/site-packages/torch_neuron/convert.py:530, in CompilationUnit.stats_post_compiler(self, neuron_graph)
526 logger.info(' => {}: {} {}'.format(
527 name, remaining_count, supported_string))
529 if succesful_compilations == 0 and not self.allow_no_ops_on_neuron:
--> 530 raise RuntimeError(
531 "No operations were successfully partitioned and compiled to neuron for this model - aborting trace!")
533 if percent_operations_compiled < 50.0:
534 logger.warning(
535 "torch.neuron.trace was unable to compile > 50% of the operators in the compiled model!")
RuntimeError: No operations were successfully partitioned and compiled to neuron for this model - aborting trace!
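For anyone triaging a similar log, the two lines that matter most are the SIG_KILL warning and the post-compilation operator summary. Below is a minimal sketch that pulls those out of captured output. The function name `summarize_neuron_log` and the returned keys are hypothetical, and the regular expression only assumes the message format shown in the log above; other torch-neuron versions may word these lines differently.

```python
import re

# Matches the post-compilation summary line as it appears in the log above.
SUMMARY_RE = re.compile(
    r"Number of arithmetic operators \(post-compilation\) "
    r"before = (\d+), compiled = (\d+), percent compiled = ([\d.]+)%"
)


def summarize_neuron_log(log_text: str) -> dict:
    """Extract a few headline facts from captured torch-neuron trace output.

    Hypothetical helper: it only recognises the message wording seen in
    this issue, so treat missing values (None) as "not found", not "zero".
    """
    summary = {
        # The compiler reports an OOM kill with the token SIG_KILL.
        "oom_killed": "SIG_KILL" in log_text,
        "operators_before": None,
        "operators_compiled": None,
        "percent_compiled": None,
    }
    match = SUMMARY_RE.search(log_text)
    if match:
        summary["operators_before"] = int(match.group(1))
        summary["operators_compiled"] = int(match.group(2))
        summary["percent_compiled"] = float(match.group(3))
    return summary


# Two lines copied (shortened) from the log in this issue.
sample = (
    "WARNING:Neuron:The neuron-cc (neuron compiler) process was killed (SIG_KILL).\n"
    "INFO:Neuron:Number of arithmetic operators (post-compilation) "
    "before = 278, compiled = 0, percent compiled = 0.0%\n"
)
print(summarize_neuron_log(sample))
```

Here every operator is marked [supported] yet none compiled, which together with the SIG_KILL warning points at the compiler being killed for lack of memory rather than at an unsupported op.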
How to Reproduce
Start EC2 instance c5.2xlarge with AMI: ami-09c4564a5c7fa27d5
@takipipo this model is a 68B model, and it takes twice that much memory to compile in Neuron V1. Can you try with a larger instance that has at least 192GB of memory?
Additionally, you may encounter issues running a model this size on Inf1. We recommend that you upgrade to Inferentia2 and the latest version of the Neuron SDK (which includes neuronx_cc and torch_neuronx).
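Before moving to a larger instance, it can help to compare the size of the weights with the RAM the instance actually has. The sketch below takes the parameter count from the "Model summary" line in the log above (68125494 parameters) and reads `MemTotal` from `/proc/meminfo` on Linux. The helper names `weight_bytes` and `mem_total_gib` are hypothetical. This is only a sanity check: the peak memory of neuron-cc during compilation is an assumption-free unknown here and cannot be derived from the weight size alone.

```python
from pathlib import Path


def weight_bytes(num_params: int, bytes_per_param: int = 4) -> int:
    """Size of the raw weights, assuming float32 (4 bytes) by default."""
    return num_params * bytes_per_param


def mem_total_gib(meminfo_text: str) -> float:
    """Parse the MemTotal line (reported in kB) from /proc/meminfo content."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])
            return kib / (1024 ** 2)
    raise ValueError("MemTotal not found in meminfo text")


# Parameter count taken from the "Model summary (fused)" line in the log.
params = 68_125_494
print(f"float32 weights: {weight_bytes(params) / 1e6:.1f} MB")

# Only available on Linux; skipped elsewhere.
meminfo = Path("/proc/meminfo")
if meminfo.exists():
    print(f"instance RAM: {mem_total_gib(meminfo.read_text()):.1f} GiB")
```

If the weights are a small fraction of the available RAM and the compiler is still killed, the pressure is coming from the compiler's working set on the single fused sub-graph, which is consistent with the advice above to use more memory or move to Inferentia2.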
Description
I am able to compile the pretrained COCO detection weights from ultralytics (e.g. yolov8l.pt, yolov8x.pt). However, when I load the weights from https://github.com/WildChlamydia/MiVOLO?tab=readme-ov-file#demo (under Download), I cannot compile the model to Neuron because the compiler runs out of memory (OOM).
Environments
pip list
neuron-cc -V
What I've Tried
Tried an instance with 64GB of memory, but compilation still failed.