Commit 9733104 (1 parent: 3df890a)

drop quantize.py (now that models are using a single file)

File tree: 2 files changed (+2, −133 lines)


README.md

Lines changed: 2 additions & 2 deletions

@@ -155,8 +155,8 @@ python3 -m pip install torch numpy sentencepiece
 # convert the 7B model to ggml FP16 format
 python3 convert-pth-to-ggml.py models/7B/ 1
 
-# quantize the model to 4-bits
-python3 quantize.py 7B
+# quantize the model to 4-bits (using method 2 = q4_0)
+./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin 2
 
 # run the inference
 ./main -m ./models/7B/ggml-model-q4_0.bin -n 128

quantize.py

Lines changed: 0 additions & 131 deletions
This file was deleted.

0 commit comments
