
"RuntimeError: No available kernel. Aborting execution." with cuda LayerMask: SAM2Ultra #473

Open
ProtoBelisarius opened this issue Jan 7, 2025 · 2 comments

@ProtoBelisarius

SAM2 Ultra works with CPU only, though only at fp32; with fp16 I get this error:

RuntimeError: Input type (float) and bias type (c10::Half) should be the same

That might be due to other nodes I have beforehand, I'm not sure; I just wanted to mention it.

Neither bf16, fp16, nor fp32 works with the SAM2 Ultra node when cuda is set.

Other nodes from LayerStyle that need cuda, like lama or Joycaption, work well, so I'm somewhat confused about where this comes from. Normal t2i or i2i generation works as well; it's only this node that hits the issue.

I'm on Linux running AMD (a 6800 XT), so that is likely related, but as mentioned, everything else works, so I wonder whether this is a bug or whether it needs a certain version of a certain package.

I have tried different PyTorch versions (rocm5.7, 6.1, 6.2, and now back to nightly 6.2.4) in case there are any bugs, but none changed the error message.

Error message:

got prompt
Starting image processing
</s><s><s><s>(tree:1.5,(bush:<loc_812><loc_936><loc_998><loc_993></s>
match index: 0 in mask_indexes: ['0']
# 😺dzNodes: LayerStyle -> Object Detector MASK found 1 object(s)
# 😺dzNodes: LayerStyle -> SAM2 Ultra: Using model config: /home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam2/sam2_configs/sam2.1_hiera_l.yaml
Processing Images:   0%|                                                                                                                                                                                                                                                                                                       | 0/1 [00:00<?, ?it/s]!!! Exception during processing !!! No available kernel. Aborting execution.
Traceback (most recent call last):
  File "/home/user/Git/ComfyUI/execution.py", line 327, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/execution.py", line 202, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/execution.py", line 174, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/home/user/Git/ComfyUI/execution.py", line 163, in process_inputs
    results.append(getattr(obj, func)(**inputs))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam_2_ultra.py", line 373, in sam2_ultra
    out_masks, scores, logits = model.predict(
                                ^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam2/sam2_image_predictor.py", line 271, in predict
    masks, iou_predictions, low_res_masks = self._predict(
                                            ^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam2/sam2_image_predictor.py", line 400, in _predict
    low_res_masks, iou_predictions, _, _ = self.model.sam_mask_decoder(
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam2/modeling/sam/mask_decoder.py", line 136, in forward
    masks, iou_pred, mask_tokens_out, object_score_logits = self.predict_masks(
                                                            ^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam2/modeling/sam/mask_decoder.py", line 213, in predict_masks
    hs, src = self.transformer(src, pos_src, tokens)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam2/modeling/sam/transformer.py", line 113, in forward
    queries, keys = layer(
                    ^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam2/modeling/sam/transformer.py", line 179, in forward
    queries = self.self_attn(q=queries, k=queries, v=queries)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam2/modeling/sam/transformer.py", line 265, in forward
    out = F.scaled_dot_product_attention(q, k, v, dropout_p=dropout_p)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: No available kernel. Aborting execution.
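When "No available kernel" comes from `F.scaled_dot_product_attention`, it usually means every enabled SDPA backend (flash, memory-efficient, math) rejected the inputs on this hardware. The currently enabled backends can be inspected with a small diagnostic (these flags exist on CPU-only builds of recent PyTorch too):

```python
import torch

# Global flags controlling which scaled_dot_product_attention backends
# PyTorch is allowed to dispatch to; reading them is safe on any build.
print("flash:        ", torch.backends.cuda.flash_sdp_enabled())
print("mem_efficient:", torch.backends.cuda.mem_efficient_sdp_enabled())
print("math:         ", torch.backends.cuda.math_sdp_enabled())
```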
@chflame163
Owner

Sorry, I don't know the answer to this question and cannot provide any further assistance.

@ProtoBelisarius
Author

The following made it work for me:

I'm running the nightly torch version, though I didn't test whether my fix works with other versions.
pytorch version: 2.7.0.dev20250105+rocm6.2.4

I modified the following function in ComfyUI_LayerStyle_Advance/py/sam2/modeling/sam/transformer.py, adding AMD-specific handling:

def forward(self, q: Tensor, k: Tensor, v: Tensor) -> Tensor:
    q = self.q_proj(q)
    k = self.k_proj(k)
    v = self.v_proj(v)

    # Separate into heads
    q = self._separate_heads(q, self.num_heads)
    k = self._separate_heads(k, self.num_heads)
    v = self._separate_heads(v, self.num_heads)

    dropout_p = self.dropout_p if self.training else 0.0

    # Fallback for AMD: disable the flash and memory-efficient kernels so
    # scaled_dot_product_attention falls back to the portable math
    # implementation. Note that sdp_kernel's flags all default to True,
    # so the other backends must be disabled explicitly.
    with torch.backends.cuda.sdp_kernel(
        enable_flash=False, enable_math=True, enable_mem_efficient=False
    ):
        out = F.scaled_dot_product_attention(q, k, v, dropout_p=dropout_p)

    out = self._recombine_heads(out)
    out = self.out_proj(out)
    return out

I also set these environment variables. I'm not sure they're necessary, but I use them for something else as well, so they're always active for me.

export PYTORCH_HIP_ALLOC_CONF=garbage_collection_threshold:0.8,max_split_size_mb:512

export HSA_OVERRIDE_GFX_VERSION=10.3.0
