How is GGML different from ONNX? #3022
I am looking to create an exhaustive pros-and-cons list for ONNX vs. GGML, and would appreciate it if someone could describe or give pointers on how GGML differs from ONNX.
Currently I am aware that GGML supports 4-bit quantization and follows a no-dependency approach (as mentioned here), and that the format in which it builds the computation graph and stores the weights (with any optimizations) is different.
Apart from this, what are the differentiating factors?
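On the 4-bit point: in ggml, quantization formats are block layouts baked into the tensor types themselves rather than an external detail of the file format. Below is a minimal sketch, assuming ggml's public header, that queries the Q4_0 layout (32 weights per block: 16 bytes of packed 4-bit values plus a 2-byte fp16 scale):

```c
#include <stdio.h>
#include "ggml.h"

int main(void) {
    // Q4_0 stores 32 weights per block: 16 bytes of packed 4-bit
    // quants plus a 2-byte fp16 scale, i.e. 18 bytes per block.
    const size_t bytes_per_block = ggml_type_size(GGML_TYPE_Q4_0);
    const int    elems_per_block = (int) ggml_blck_size(GGML_TYPE_Q4_0);

    printf("Q4_0: %zu bytes / %d weights = %.4f bytes per weight (F32 uses 4)\n",
           bytes_per_block, elems_per_block,
           (double) bytes_per_block / elems_per_block);
    return 0;
}
```

That works out to 0.5625 bytes per weight, which is where most of GGML's memory savings over an unquantized ONNX graph come from.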
Comments
I also want to ask this question.
I think the question is a little ambiguous. GGML could mean the machine learning library itself, the file format (now called GGUF), or maybe even an implementation based on GGML that can do things like run inference on models (llama.cpp). On the GGML-as-a-library side, there isn't really a "format" for the graph; there's an API you can use to construct the graph. Likewise for the weights: they don't have to come from a GGML/GGUF format file at all. Just as an example, my little Rust RWKV implementation over here actually only loads models from PyTorch or SafeTensors format files and dynamically quantizes the tensors.
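To make the "no serialized graph format, just an API" point concrete, here is a minimal sketch of building and evaluating a graph directly with ggml's C API. The graph-construction function names follow current ggml headers and have shifted between versions, and the tensor contents are placeholders standing in for weights that could come from any source:

```c
#include <stdio.h>
#include "ggml.h"

int main(void) {
    // A ggml context owns all tensor metadata and data in one arena.
    struct ggml_init_params params = {
        .mem_size   = 16 * 1024 * 1024,   // 16 MiB arena
        .mem_buffer = NULL,
        .no_alloc   = false,
    };
    struct ggml_context * ctx = ggml_init(params);

    // Weights can come from anywhere: a GGUF file, a PyTorch
    // checkpoint, or (as here) filled in programmatically.
    struct ggml_tensor * a = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4, 4);
    struct ggml_tensor * b = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4, 4);
    ggml_set_f32(a, 1.0f);
    ggml_set_f32(b, 2.0f);

    // The "graph" is just composed ops; nothing is serialized.
    struct ggml_tensor * c = ggml_mul_mat(ctx, a, b);

    struct ggml_cgraph * gf = ggml_new_graph(ctx);
    ggml_build_forward_expand(gf, c);
    ggml_graph_compute_with_ctx(ctx, gf, /*n_threads=*/4);

    printf("c[0] = %f\n", ggml_get_f32_1d(c, 0));  // dot of 4 ones with 4 twos = 8
    ggml_free(ctx);
    return 0;
}
```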
Glancing through the ONNX GitHub readme, from what I understand ONNX is just a "model container" format without any specific associated inference engine, whereas GGML/GGUF are part of an inference ecosystem together with ggml/llama.cpp. So the difference would be roughly similar to a generic 3D model versus an Unreal Engine asset.
@staviq sorry for not being clear, but for inference ONNX can use ONNX Runtime, which supports multiple backends and optimizations.
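For context, this is roughly what "ONNX Runtime with pluggable backends" looks like from its C API. A sketch only: "model.onnx" is a placeholder path, input/output handling is elided, and a real backend (CUDA, TensorRT, DirectML, ...) would be appended to the session options before the session is created:

```c
#include <stdio.h>
#include <stdlib.h>
#include <onnxruntime_c_api.h>

static const OrtApi * g_ort;

// Abort with the runtime's message if a call fails.
static void check(OrtStatus * st) {
    if (st != NULL) {
        fprintf(stderr, "error: %s\n", g_ort->GetErrorMessage(st));
        g_ort->ReleaseStatus(st);
        exit(1);
    }
}

int main(void) {
    g_ort = OrtGetApiBase()->GetApi(ORT_API_VERSION);

    OrtEnv * env;
    check(g_ort->CreateEnv(ORT_LOGGING_LEVEL_WARNING, "demo", &env));

    OrtSessionOptions * opts;
    check(g_ort->CreateSessionOptions(&opts));
    // Execution providers would be appended to `opts` here;
    // with none configured, the built-in CPU provider is used.

    OrtSession * session;  // the path is ORTCHAR_T*: char* on Linux
    check(g_ort->CreateSession(env, "model.onnx", opts, &session));

    // ... build OrtValue inputs and call g_ort->Run(session, ...) ...

    g_ort->ReleaseSession(session);
    g_ort->ReleaseSessionOptions(opts);
    g_ort->ReleaseEnv(env);
    return 0;
}
```

The same .onnx file runs unchanged on whichever provider is registered, which is the separation of model format from engine that GGML/GGUF does not attempt.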
I see, thank you for the clarification.
This issue was closed because it has been inactive for 14 days since being marked as stale.