Commit 97df181

fix: Cleaner (maybe more correct?) splitting for gate/up

Branch: GraniteMoEShared
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
1 parent: 0ee167e

File tree: 1 file changed (+1, -1 lines)


convert_hf_to_gguf.py

Lines changed: 1 addition & 1 deletion
@@ -5685,7 +5685,7 @@ def modify_tensors(self, data_torch: Tensor, name: str, bid: int | None) -> Iter
         if name.endswith("shared_mlp.input_linear.weight"):
             ffn_dim = self.hparams["shared_intermediate_size"]
             assert data_torch.shape[-2] == 2 * ffn_dim, "Merged FFN tensor size must be 2 * shared_intermediate_size"
-            gate, up = data_torch[..., :ffn_dim, :], data_torch[..., ffn_dim:, :]
+            gate, up = data_torch.split(ffn_dim, dim=-2)
             return [
                 (self.format_tensor_name(gguf.MODEL_TENSOR.FFN_GATE_SHEXP, bid), gate),
                 (self.format_tensor_name(gguf.MODEL_TENSOR.FFN_UP_SHEXP, bid), up),
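The change swaps two explicit slices along the second-to-last axis for a single `split` call, which yields the same two halves (and, in PyTorch, still returns views rather than copies). A minimal sketch of the equivalence, using NumPy as a stand-in for the tensor library and a hypothetical small `ffn_dim` (note that `numpy.split` takes split *indices* where `torch.split` takes a *chunk size*):

```python
import numpy as np

# Hypothetical small dimensions for illustration; the real tensor is
# the merged shared-expert weight of shape (..., 2 * ffn_dim, hidden).
ffn_dim = 3
data = np.arange(2 * ffn_dim * 4).reshape(2 * ffn_dim, 4)

# Old approach: two explicit slices along dim -2.
gate_a, up_a = data[..., :ffn_dim, :], data[..., ffn_dim:, :]

# New approach: one split at index ffn_dim along axis -2.
# (torch.split(ffn_dim, dim=-2) expresses the same thing as a chunk size.)
gate_b, up_b = np.split(data, [ffn_dim], axis=-2)

assert (gate_a == gate_b).all()
assert (up_a == up_b).all()
assert gate_b.shape == up_b.shape == (ffn_dim, 4)
```

The `split` form also makes the intent (two equal halves) explicit, and the preceding assertion on `data_torch.shape[-2]` guarantees the tensor divides evenly.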
