Commit d7e71a5

llama: rwkv6: Add quantization tensor exclusion

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
1 parent 41d2a35 commit d7e71a5

File tree

1 file changed: +3 −0 lines changed


src/llama.cpp

Lines changed: 3 additions & 0 deletions

@@ -16942,6 +16942,9 @@ static void llama_model_quantize_internal(const std::string & fname_inp, const s
         quantize &= name.find("ssm_x.weight") == std::string::npos;
         quantize &= name.find("ssm_dt.weight") == std::string::npos;
 
+        // do not quantize RWKV's time_mix_first tensors
+        quantize &= name.find("time_mix_first.weight") == std::string::npos;
+
         // do not quantize relative position bias (T5)
         quantize &= name.find("attn_rel_b.weight") == std::string::npos;
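The hunk above follows the exclusion pattern used throughout llama_model_quantize_internal: a tensor remains quantizable only if its name contains none of the excluded substrings. A minimal standalone sketch of that logic, using a hypothetical should_quantize helper (not part of llama.cpp):

```cpp
#include <cassert>
#include <string>

// Hypothetical helper illustrating the name-based exclusion pattern from
// llama_model_quantize_internal: each check ANDs the running `quantize`
// flag with "name does not contain this substring".
static bool should_quantize(const std::string & name) {
    bool quantize = true;
    quantize &= name.find("ssm_x.weight")          == std::string::npos;
    quantize &= name.find("ssm_dt.weight")         == std::string::npos;
    // do not quantize RWKV's time_mix_first tensors (this commit's addition)
    quantize &= name.find("time_mix_first.weight") == std::string::npos;
    // do not quantize relative position bias (T5)
    quantize &= name.find("attn_rel_b.weight")     == std::string::npos;
    return quantize;
}
```

With this commit applied, a tensor such as blk.0.time_mix_first.weight is skipped by quantization and kept at its original precision, while ordinary weight tensors still pass the filter.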
