Closed
Description
Used this model: https://huggingface.co/meta-llama/Llama-2-70b
Used these commands:
$ convert-pth-to-ggml.py models/LLaMa2-70B-meta 1
$ ./quantize ./models/LLaMa2-70B-meta/ggml-model-f16.bin ./models/LLaMa2-70B-meta/ggml-model-q4_0.bin 2
The 7B and 13B models work without any problems; the error occurs only with the 70B model.
error loading model: llama.cpp: tensor 'layers.0.attention.wk.weight' has wrong shape; expected 8192 x 8192, got 8192 x 1024
llama.cpp: loading model from /Users/xyz/Desktop/llama.cpp/models/LLaMa2-70B-meta/ggml-model-q4_0.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 8192
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 64
llama_model_load_internal: n_layer = 80
llama_model_load_internal: n_rot = 128
llama_model_load_internal: freq_base = 10000.0
llama_model_load_internal: freq_scale = 1
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 22016
llama_model_load_internal: model size = 65B
llama_model_load_internal: ggml ctx size = 0.19 MB
error loading model: llama.cpp: tensor 'layers.0.attention.wk.weight' has wrong shape; expected 8192 x 8192, got 8192 x 1024
llama_load_model_from_file: failed to load model
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
Cell In[2], line 1
----> 1 llm = Llama(model_path="/Users/xyz/Desktop/llama.cpp/models/LLaMa2-70B-meta/ggml-model-q4_0.bin", n_ctx=512, seed=43, n_threads=8, n_gpu_layers=1)
File /opt/homebrew/Caskroom/miniforge/base/envs/tensorflow_m1/lib/python3.11/site-packages/llama_cpp/llama.py:305, in Llama.__init__(self, model_path, n_ctx, n_parts, n_gpu_layers, seed, f16_kv, logits_all, vocab_only, use_mmap, use_mlock, embedding, n_threads, n_batch, last_n_tokens_size, lora_base, lora_path, low_vram, tensor_split, rope_freq_base, rope_freq_scale, verbose)
300 raise ValueError(f"Model path does not exist: {model_path}")
302 self.model = llama_cpp.llama_load_model_from_file(
303 self.model_path.encode("utf-8"), self.params
304 )
--> 305 assert self.model is not None
307 self.ctx = llama_cpp.llama_new_context_with_model(self.model, self.params)
309 assert self.ctx is not None
AssertionError:
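For context on the shape mismatch: the published Llama-2-70B config uses grouped-query attention (GQA) with 64 query heads but only 8 key/value heads, so the K/V projections map 8192 down to 1024, while a llama.cpp build without GQA support expects a square 8192 x 8192 projection. A minimal sketch of that arithmetic, using the values reported in the log above (n_embd, n_head, n_rot) plus the assumed 8-KV-head GQA config:

```python
# Sketch of the dimension arithmetic behind the error message.
# Assumes Llama-2-70B's published config: 64 query heads, 8 KV heads (GQA).
n_embd = 8192                 # hidden size (n_embd in the log)
n_head = 64                   # query heads (n_head in the log)
n_head_kv = 8                 # KV heads -- the GQA grouping factor (assumed)
head_dim = n_embd // n_head   # 128, matching n_rot in the log

wq_shape = (n_embd, n_head * head_dim)     # query projection: (8192, 8192)
wk_shape = (n_embd, n_head_kv * head_dim)  # key projection:   (8192, 1024)

print(wq_shape, wk_shape)  # -> (8192, 8192) (8192, 1024)
```

This matches the error exactly: the loader expects 8192 x 8192 (no GQA), while the file actually contains the 8192 x 1024 tensor.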