Closed
Description
I can't run 3B models (7B and 13B models work without issue). I get the following error:
GGML_ASSERT: I:\llama-cpp\llama.cpp\ggml-cuda.cu:6709: ne00 == n_dims && "ne00 != n_dims is not implemented for CUDA yet"
I tried rocket-3b.Q5_K_M and akins-3b.Q6_K, both from TheBloke.
Tested on Windows 10 with this command:
main.exe -ngl 35 -m rocket-3b.Q5_K_M.gguf -p Hello