Hello,

I quantized an (uncensored) QLoRA merge of a Llama 2 model.
The 13B quant came out perfect (8.0 bpw).
But the 70B quant came out totally censored and nothing like it was supposed to be (no gibberish, just completely censored).
Can you help with two things?
For --cal_dataset, I merged the QLoRA's uncensored dataset into a single .parquet file (see the sketch after these two questions for roughly how I built it). That seems to work really well for 13B. Did I do that incorrectly and just get lucky? Should I be calibrating with wikitext-test.parquet instead?
For 13B I used the default params at 8.0 bpw. For 70B, which came out censored, I used the default params at 5.0 bpw. Do you recommend different values for the other args, like measurement_length and the rest?
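For reference, this is roughly how I built the calibration parquet. I'm assuming it just needs a single text column like wikitext-test.parquet has; the JSONL path and field names below are placeholders for my dataset, so treat this as a sketch rather than the exact script:

```python
# Rough sketch: flatten the QLoRA training data into a single-column parquet
# for --cal_dataset. The "text" column name mirrors wikitext-test.parquet;
# the input path and JSON field names are placeholders for my dataset.
import json
import pandas as pd

rows = []
with open("uncensored_dataset.jsonl", "r", encoding="utf-8") as f:
    for line in f:
        sample = json.loads(line)
        # Concatenate prompt and response into one plain-text calibration sample.
        rows.append(sample.get("instruction", "") + "\n" + sample.get("output", ""))

df = pd.DataFrame({"text": rows})
df.to_parquet("cal_dataset.parquet", index=False)
```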
I ran both quantized models in Oobabooga on the same machine.
I'm also redoing my work from scratch to make sure I didn't mix anything up, but I'd really appreciate some input here.
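While I redo it, I'm also spot-checking the unquantized merged 70B with something like the snippet below, on the theory that if the FP16 merge is already censored, the quantization step isn't the culprit. The model path, prompt template, and prompt are placeholders for my setup:

```python
# Quick sanity check on the unquantized merged 70B before re-quantizing.
# Model path, prompt template, and prompt are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "path/to/merged-70b-fp16"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, torch_dtype=torch.float16, device_map="auto"
)

prompt = "### Instruction:\n<a prompt the censored base model refuses>\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200, do_sample=False)

# Print only the newly generated tokens.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```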
Thanks very much!