Error: Unexpected MMA layout version found #149

Open · Chinesenovels opened this issue Apr 25, 2023 · 9 comments

Comments

@Chinesenovels

python: /project/lib/Analysis/Utility.cpp:136: bool mlir::supportMMA(mlir::Value, int): Assertion `(version == 1 || version == 2) && "Unexpected MMA layout version found"' failed.

@mapledxf

Same question here.
1080 Ti

@ajz34 commented Apr 25, 2023

Hitting the same problem here on a Titan X.
Could this be because int8/int4 goes through Triton, and Triton most likely does not yet support 8-/4-bit on Pascal or older cards?
qwopqwop200/GPTQ-for-LLaMa#142
triton-lang/triton#1505 (comment)

@luokai0223

Same problem on a P40. It originates in Triton: the matmul_248_kernel function fails when it reaches `c = accumulator.to(tl.float16)`. The compute architecture is probably too old; everyone hitting this seems to be on NVIDIA cards below compute capability 7.0. Is there a way around it?
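
For context, the failing pattern looks roughly like this (a minimal sketch in the spirit of a Triton matmul tile, not the real matmul_248_kernel; pointer math and masking are simplified):
'''
import triton
import triton.language as tl

# Sketch only: tl.dot on fp16 tiles selects an MMA (tensor-core) layout,
# which Triton only implements for compute capability >= 7.0. On Pascal
# cards the compiler aborts with "Unexpected MMA layout version found",
# with the traceback pointing at the final fp16 cast.
@triton.jit
def matmul_fp16_sketch(a_ptr, b_ptr, c_ptr, M, N, K,
                       BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr,
                       BLOCK_K: tl.constexpr):
    pid_m = tl.program_id(0)
    pid_n = tl.program_id(1)
    offs_m = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)
    offs_n = pid_n * BLOCK_N + tl.arange(0, BLOCK_N)
    accumulator = tl.zeros([BLOCK_M, BLOCK_N], dtype=tl.float32)
    for k in range(0, K, BLOCK_K):
        offs_k = k + tl.arange(0, BLOCK_K)
        a = tl.load(a_ptr + offs_m[:, None] * K + offs_k[None, :])  # fp16 tile of A
        b = tl.load(b_ptr + offs_k[:, None] * N + offs_n[None, :])  # fp16 tile of B
        accumulator += tl.dot(a, b)  # tensor-core path, needs sm >= 70
    c = accumulator.to(tl.float16)   # the cast the reporter's traceback points at
    tl.store(c_ptr + offs_m[:, None] * N + offs_n[None, :], c)
'''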

@lieh1203

Same question here. I have the quantized version running, but it throws this error as soon as I submit a prompt.
[screenshot]
I'm on a rented cloud GPU server.
[screenshot]

@slang98 commented Apr 26, 2023

According to yesterday's update, https://github.com/openai/triton/pull/1505/files, python/triton/language/semantic.py now states that cards with compute capability below 70 do not support Float8 and Float16:

'''
if torch.version.hip is None:
    device = triton.runtime.jit.get_current_device()
    capability = triton.runtime.jit.get_device_capability(device)
    capability = capability[0] * 10 + capability[1]
    if capability < 70:
        assert (
            not rhs.dtype.is_fp16() and not rhs.dtype.is_fp8()
        ), "Float8 and Float16 types are not supported for compute capability < 70 (use Float32 or above)"
'''

The P100 and P40 are both compute capability 6.x, so for now they can only use Float32, but then there isn't enough VRAM. This urgently needs a fix.
NVIDIA V100, NVIDIA TITAN V, and newer cards are supported.
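
A quick way to check which side of the cutoff a card falls on, using PyTorch's standard API (assuming CUDA is available):
'''
import torch

# Triton's fp16 path needs compute capability >= 7.0 (Volta and newer).
# P100 reports (6, 0); P40, 1080 Ti, and Titan X (Pascal) report (6, 1);
# V100 reports (7, 0).
for i in range(torch.cuda.device_count()):
    major, minor = torch.cuda.get_device_capability(i)
    print(f"GPU {i}: {torch.cuda.get_device_name(i)} -> sm_{major}{minor}, "
          f"Triton fp16 ok: {(major, minor) >= (7, 0)}")
'''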

@lieh1203

> According to yesterday's update, https://github.com/openai/triton/pull/1505/files, cards with compute capability below 70 do not support Float8 and Float16 ... P100 and P40 are both compute capability 6.x, so for now they can only use Float32, but then there isn't enough VRAM. — @slang98

Hi, is this the right way to change the P100 to Float32?
[screenshot]
It still errors out. I have 16 GB of VRAM.

@jhj8888 commented Apr 27, 2023

> Cards with compute capability below 70 do not support Float8 and Float16 ... P100 and P40 can only use Float32, but then there isn't enough VRAM. — @slang98
> Is this the right way to change the P100 to Float32? It still errors out; I have 16 GB of VRAM. — @lieh1203

Same problem on a P100. Does MOSS just not support the P100?

@slang98 commented Apr 27, 2023

> Cards with compute capability below 70 do not support Float8 and Float16 ... — @slang98
> Is this the right way to change the P100 to Float32? It still errors out. — @lieh1203
> Same problem on a P100. Does MOSS just not support the P100? — @jhj8888

Tested today: after switching to float32, the P100/P40 either runs out of VRAM or still hits "Unexpected MMA layout version found".

The Triton project says its support for fp16 quantized models is still incomplete, so older cards like the P100/P40 all throw the error above; we need to wait for them to add support for more older cards.

Also, verified that a V100 32GB can run the int4 quantized model.

(https://github.com/OpenLMLab/MOSS/issues/%E5%8F%8CP100%E6%98%BE%E5%AD%98%E4%B8%8D%E5%A4%9F — "dual P100s, not enough VRAM")

@slang98 commented Apr 28, 2023

A solution has been found:
per the latest commit #175, switch from triton to auto-gptq, which bypasses the Triton check entirely.

Tested successfully: the int4 quantized version runs on a single P40 (24 GB).

Steps:
'''
git clone https://github.com/PanQiWei/AutoGPTQ
conda create -n moss python==3.10
cd MOSS
python setup_env.py --install_auto_gptq
'''

Edit MOSS\moss_cli_demo.py line 31, changing
'''
model = load_checkpoint_and_dispatch(
    raw_model, model_path, device_map="auto",
    no_split_module_classes=["MossBlock"], dtype=torch.float16
)
'''
to
'''
model = MossForCausalLM.from_pretrained(model_path, trust_remote_code=True).half().cuda()
'''

Then run:
'''
python moss_cli_demo.py
'''
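
For reference, AutoGPTQ can also load a pre-quantized checkpoint directly; a rough sketch (the path is a placeholder, not an actual MOSS checkpoint name, and this is not the exact code in #175):
'''
from auto_gptq import AutoGPTQForCausalLM

# Sketch: load a local int4 GPTQ checkpoint without going through triton.
model = AutoGPTQForCausalLM.from_quantized(
    "path/to/moss-int4",   # placeholder for your local quantized checkpoint
    device="cuda:0",
    use_triton=False,      # stay on the CUDA kernels, avoiding the sm < 70 assert
)
'''
Keeping use_triton=False is the point of the workaround: inference stays on AutoGPTQ's CUDA kernels, so Triton's compute-capability check never runs.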
