
"RuntimeError: No available kernel. Aborting execution." with cuda LayerMask: SAM2Ultra #473

Open
ProtoBelisarius opened this issue Jan 7, 2025 · 2 comments

@ProtoBelisarius

SAM2 Ultra works with CPU only, though only at fp32; with fp16 I get this error:

RuntimeError: Input type (float) and bias type (c10::Half) should be the same

That might be due to other nodes I have beforehand, I'm not sure; I just wanted to mention it.

Neither bf16, fp16, nor fp32 works with the SAM2 Ultra node when cuda is set.

Other nodes from LayerStyle that need cuda, like lama or Joycaption, work well, so I'm somewhat confused about where this comes from. Normal t2i or i2i generation works as well; it's only this node that hits the issue.

I'm on Linux running AMD (a 6800 XT), so that is likely related, but as mentioned, everything else works, so I wonder whether this is a bug or whether it needs a certain version of a certain package.

I have tried different PyTorch versions (rocm5.7, 6.1, 6.2, and now back to nightly 6.2.4) in case there are any bugs, but none changed the error message.

Error message:

got prompt
Starting image processing
</s><s><s><s>(tree:1.5,(bush:<loc_812><loc_936><loc_998><loc_993></s>
match index: 0 in mask_indexes: ['0']
# 😺dzNodes: LayerStyle -> Object Detector MASK found 1 object(s)
# 😺dzNodes: LayerStyle -> SAM2 Ultra: Using model config: /home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam2/sam2_configs/sam2.1_hiera_l.yaml
Processing Images:   0%|                                                                                                                                                                                                                                                                                                       | 0/1 [00:00<?, ?it/s]!!! Exception during processing !!! No available kernel. Aborting execution.
Traceback (most recent call last):
  File "/home/user/Git/ComfyUI/execution.py", line 327, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/execution.py", line 202, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/execution.py", line 174, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/home/user/Git/ComfyUI/execution.py", line 163, in process_inputs
    results.append(getattr(obj, func)(**inputs))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam_2_ultra.py", line 373, in sam2_ultra
    out_masks, scores, logits = model.predict(
                                ^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam2/sam2_image_predictor.py", line 271, in predict
    masks, iou_predictions, low_res_masks = self._predict(
                                            ^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam2/sam2_image_predictor.py", line 400, in _predict
    low_res_masks, iou_predictions, _, _ = self.model.sam_mask_decoder(
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam2/modeling/sam/mask_decoder.py", line 136, in forward
    masks, iou_pred, mask_tokens_out, object_score_logits = self.predict_masks(
                                                            ^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam2/modeling/sam/mask_decoder.py", line 213, in predict_masks
    hs, src = self.transformer(src, pos_src, tokens)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam2/modeling/sam/transformer.py", line 113, in forward
    queries, keys = layer(
                    ^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam2/modeling/sam/transformer.py", line 179, in forward
    queries = self.self_attn(q=queries, k=queries, v=queries)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Git/ComfyUI/custom_nodes/ComfyUI_LayerStyle_Advance/py/sam2/modeling/sam/transformer.py", line 265, in forward
    out = F.scaled_dot_product_attention(q, k, v, dropout_p=dropout_p)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: No available kernel. Aborting execution.
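When "No available kernel" comes from `F.scaled_dot_product_attention`, it usually means every enabled SDPA backend (flash, memory-efficient, math) rejected the inputs on this hardware. The currently enabled backends can be inspected with a small diagnostic (these flags exist on CPU-only builds of recent PyTorch too):

```python
import torch

# Global flags controlling which scaled_dot_product_attention backends
# PyTorch is allowed to dispatch to; reading them is safe on any build.
print("flash:        ", torch.backends.cuda.flash_sdp_enabled())
print("mem_efficient:", torch.backends.cuda.mem_efficient_sdp_enabled())
print("math:         ", torch.backends.cuda.math_sdp_enabled())
```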
@chflame163
Owner

Sorry, I don't know the answer to this question and cannot provide any further assistance.

@ProtoBelisarius
Author

The following made it work for me:

I'm running the nightly torch version, though I didn't test whether my fix works with other versions.
pytorch version: 2.7.0.dev20250105+rocm6.2.4

I modified the following function in ComfyUI_LayerStyle_Advance/py/sam2/modeling/sam/transformer.py, adding AMD-specific handling:

def forward(self, q: Tensor, k: Tensor, v: Tensor) -> Tensor:
    q = self.q_proj(q)
    k = self.k_proj(k)
    v = self.v_proj(v)

    # Separate into heads
    q = self._separate_heads(q, self.num_heads)
    k = self._separate_heads(k, self.num_heads)
    v = self._separate_heads(v, self.num_heads)

    dropout_p = self.dropout_p if self.training else 0.0

    # Fallback for AMD: disable the flash and memory-efficient kernels so
    # scaled_dot_product_attention falls back to the portable math
    # implementation. Note that sdp_kernel's flags all default to True,
    # so the other backends must be disabled explicitly.
    with torch.backends.cuda.sdp_kernel(
        enable_flash=False, enable_math=True, enable_mem_efficient=False
    ):
        out = F.scaled_dot_product_attention(q, k, v, dropout_p=dropout_p)

    out = self._recombine_heads(out)
    out = self.out_proj(out)
    return out

I also set these environment variables. I'm not sure they're necessary, but I use them for something else as well, so they're always active for me.

export PYTORCH_HIP_ALLOC_CONF=garbage_collection_threshold:0.8,max_split_size_mb:512

export HSA_OVERRIDE_GFX_VERSION=10.3.0
