Is there a way to UN-gguf a model? #23
ColumbusAI started this conversation in General.

I'm curious if this functionality could be added. That way, users wouldn't have to keep multiple quants of a model: they could try a GGML/GGUF quant, undo it, and compare against other quantization types to see which works best for their setup.

Replies: 1 comment

-
You can dequantize GGUFs, but the precision lost during quantization can't be recovered. For example, bf16 (safetensors) -> bf16 (GGUF) -> Q4_K_M (GGUF) -> Q6_K (GGUF) still only has Q4_K_M-level precision: re-quantizing at a higher bit width can't restore information the Q4_K_M step threw away (the toy simulation after the script illustrates this). To try multiple quantization types, you would normally keep the bf16 GGUF and quantize each variant from that for maximum precision. I hacked together a quick Python script to test dequantizing GGUF models, although it doesn't recover precision:

```python
import json

import numpy as np
import torch
from safetensors.torch import save_file

import gguf


def gguf_to_safetensors(gguf_path, safetensors_path, dequantize=False, metadata_path=None):
    """
    Converts a GGUF file to SafeTensors format.

    Args:
        gguf_path: Path to the GGUF file.
        safetensors_path: Path to save the SafeTensors file.
        dequantize: If True, dequantize quantized tensors to float32.
        metadata_path: Optional path to save metadata as a JSON file.
    """
    reader = gguf.GGUFReader(gguf_path)
    tensors = {}
    metadata = {}

    for tensor in reader.tensors:
        # GGUF stores dimensions in reverse order relative to PyTorch.
        shape = tuple(reversed(tensor.shape))
        if dequantize and tensor.tensor_type != gguf.GGMLQuantizationType.F32:
            # gguf-py's pure-Python dequantizer (requires a reasonably recent
            # gguf package) decodes the quantized blocks back to float32.
            # The values are approximate -- the precision lost at quantization
            # time is gone for good.
            data = gguf.quants.dequantize(tensor.data, tensor.tensor_type)
            tensors[tensor.name] = torch.from_numpy(data).reshape(shape)
        else:
            # np.array(...) makes a writable copy, which torch.from_numpy needs.
            tensors[tensor.name] = torch.from_numpy(np.array(tensor.data)).reshape(shape)

    for field_name, field in reader.fields.items():
        if field.data:
            metadata[field_name] = field.parts[field.data[0]].tolist()

    save_file(tensors, safetensors_path)

    decoded_metadata = {}
    for key, value in metadata.items():
        # String fields come back as lists of byte values; decode printable ASCII.
        if isinstance(value, list) and all(isinstance(item, int) for item in value):
            decoded_value = ""
            for item in value:
                if 32 <= item <= 126:  # printable ASCII (digits included)
                    decoded_value += chr(item)
                else:
                    decoded_value += str(item)  # keep the raw byte value otherwise
            decoded_metadata[key] = decoded_value
        else:
            decoded_metadata[key] = value

    if metadata_path:
        with open(metadata_path, "w") as f:
            json.dump(decoded_metadata, f, indent=4)


# Example usage
gguf_file = "quantized.gguf"
safetensors_file = "dequantized_gguf.safetensors"
metadata_file = "config.gguf.json"

gguf_to_safetensors(gguf_file, safetensors_file, dequantize=True, metadata_path=metadata_file)
```
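To make the precision argument concrete, here is a minimal sketch of the round trip above. It uses a toy per-tensor absmax quantizer, not llama.cpp's actual block-wise K-quant schemes, so the numbers are only illustrative; `fake_quant` and the bit widths are assumptions for the demo:

```python
import numpy as np

def fake_quant(x, bits):
    """Toy symmetric absmax quantize/dequantize round trip at `bits` bits.
    Stands in for real GGUF quant types, which use per-block scales."""
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / levels
    return np.round(x / scale) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)  # a pretend weight tensor

w_q4 = fake_quant(w, 4)        # bf16 -> "Q4"
w_q4_q6 = fake_quant(w_q4, 6)  # "Q4" -> dequantize -> "Q6"
w_q6 = fake_quant(w, 6)        # bf16 -> "Q6" directly

print("mean abs error, Q4:         ", np.abs(w - w_q4).mean())
print("mean abs error, Q4 then Q6: ", np.abs(w - w_q4_q6).mean())  # stuck near Q4 error
print("mean abs error, direct Q6:  ", np.abs(w - w_q6).mean())     # clearly smaller
```

Going through the 4-bit step first leaves the error at roughly the 4-bit level no matter how many bits you re-quantize at afterwards, which is why you want to branch every quant off the bf16 file.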
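If you still have the original checkpoint, you can measure what the round trip cost by diffing it against the script's output. A sketch, assuming placeholder file names; note that GGUF conversion usually renames tensors (e.g. model.layers.0.self_attn.q_proj.weight becomes blk.0.attn_q.weight), so in practice you'd need a name mapping between the two files:

```python
from safetensors.torch import load_file

# Placeholder paths: the pre-GGUF checkpoint and this script's output.
original = load_file("model.safetensors")
recovered = load_file("dequantized_gguf.safetensors")

# Compares only tensors whose names match in both files; tensors renamed
# by the GGUF conversion would need to be mapped back first.
for name in sorted(original.keys() & recovered.keys()):
    diff = (original[name].float() - recovered[name].float()).abs()
    print(f"{name}: max err {diff.max().item():.6g}, mean err {diff.mean().item():.6g}")
```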