1 parent 3df890a commit 9733104
README.md
@@ -155,8 +155,8 @@ python3 -m pip install torch numpy sentencepiece
 # convert the 7B model to ggml FP16 format
 python3 convert-pth-to-ggml.py models/7B/ 1
 
-# quantize the model to 4-bits
-python3 quantize.py 7B
+# quantize the model to 4-bits (using method 2 = q4_0)
+./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin 2
 
 # run the inference
 ./main -m ./models/7B/ggml-model-q4_0.bin -n 128
quantize.py
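For context on what the new `./quantize` step does, here is a toy sketch of symmetric 4-bit block quantization, the idea behind the q4_0 method. The helper names, block size handling, and scaling are illustrative only; ggml's actual q4_0 format packs two 4-bit values per byte and stores its scale differently.

```python
def quantize_q4(xs, block=32):
    """Quantize a list of floats to 4-bit integers, one shared scale per block.

    A sketch of the concept only, not ggml's actual q4_0 bit packing.
    """
    out = []
    for i in range(0, len(xs), block):
        chunk = xs[i:i + block]
        amax = max(abs(v) for v in chunk)
        d = amax / 7.0 if amax else 1.0           # scale maps values into [-7, 7]
        q = [max(-7, min(7, round(v / d))) for v in chunk]
        out.append((d, q))                        # one float scale + 4-bit ints per block
    return out

def dequantize_q4(blocks):
    """Recover approximate floats: each quantized value times its block scale."""
    return [qi * d for d, q in blocks for qi in q]
```

Each block trades 32 floats (128 bytes) for 32 nibbles plus one scale, which is where the roughly 4x size reduction of the quantized model file comes from; the cost is a bounded per-value rounding error of half the block scale.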