@@ -56,17 +56,17 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi

## Model Supports

- | Model Name | FP16 | Q8_0 | Q4_0 |
+ | Model Name | FP16 | Q4_0 | Q8_0 |
| :----------------------------| :-----:| :----:| :----:|
| Llama-2 | √ | √ | √ |
| Llama-3 | √ | √ | √ |
| Mistral-7B | √ | √ | √ |
| Mistral MOE | √ | √ | √ |
- | DBRX | ? | ? | ? |
+ | DBRX | - | - | - |
| Falcon | √ | √ | √ |
| Chinese LLaMA/Alpaca | √ | √ | √ |
| Vigogne(French) | √ | √ | √ |
- | BERT | √ | √ | √ |
+ | BERT | x | x | x |
| Koala | √ | √ | √ |
| Baichuan | √ | √ | √ |
| Aquila 1 & 2 | √ | √ | √ |
@@ -80,7 +80,7 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi
| Qwen models | √ | √ | √ |
| PLaMo-13B | √ | √ | √ |
| Phi models | √ | √ | √ |
- | PhiMoE | ? | ? | ? |
+ | PhiMoE | √ | √ | √ |
| GPT-2 | √ | √ | √ |
| Orion | √ | √ | √ |
| InternlLM2 | √ | √ | √ |
@@ -89,45 +89,45 @@ The llama.cpp CANN backend is designed to support Ascend NPU. It utilize the abi
| Mamba | √ | √ | √ |
| Xverse | √ | √ | √ |
| command-r models | √ | √ | √ |
- | Grok-1 | ? | ? | ? |
+ | Grok-1 | - | - | - |
| SEA-LION | √ | √ | √ |
| GritLM-7B | √ | √ | √ |
| OLMo | √ | √ | √ |
| OLMo 2 | √ | √ | √ |
- | OLMoE | ? | ? | ? |
+ | OLMoE | √ | √ | √ |
| Granite models | √ | √ | √ |
- | GPT-NeoX | ? | ? | ? |
+ | GPT-NeoX | √ | √ | √ |
| Pythia | √ | √ | √ |
- | Snowflake-Arctic MoE | ? | ? | ? |
+ | Snowflake-Arctic MoE | - | - | - |
| Smaug | √ | √ | √ |
| Poro 34B | √ | √ | √ |
| Bitnet b1.58 models | √ | x | x |
| Flan-T5 | √ | √ | √ |
- | Open Elm models | x | x | x |
+ | Open Elm models | x | √ | √ |
| chatGLM3-6B + ChatGLM4-9b + GLMEdge-1.5b + GLMEdge-4b | √ | √ | √ |
| GLM-4-0414 | √ | √ | √ |
| SmolLM | √ | √ | √ |
| EXAONE-3.0-7.8B-Instruct | √ | √ | √ |
| FalconMamba Models | √ | √ | √ |
- | Jais Models | ? | ? | ? |
+ | Jais Models | - | x | x |
| Bielik-11B-v2.3 | √ | √ | √ |
- | RWKV-6 | √ | √ | √ |
+ | RWKV-6 | - | √ | √ |
| QRWKV-6 | √ | √ | √ |
| GigaChat-20B-A3B | x | x | x |
| Trillion-7B-preview | √ | √ | √ |
| Ling models | √ | √ | √ |

**Multimodal**
- | LLaVA 1.5 models, LLaVA 1.6 models | ? | ? | ? |
- | BakLLaVA | ? | ? | ? |
- | Obsidian | ? | ? | ? |
- | ShareGPT4V | ? | ? | ? |
- | MobileVLM 1.7B/3B models | ? | ? | ? |
- | Yi-VL | ? | ? | ? |
+ | LLaVA 1.5 models, LLaVA 1.6 models | x | x | x |
+ | BakLLaVA | √ | √ | √ |
+ | Obsidian | √ | - | - |
+ | ShareGPT4V | x | - | - |
+ | MobileVLM 1.7B/3B models | - | - | - |
+ | Yi-VL | - | - | - |
| Mini CPM | √ | √ | √ |
| Moondream | √ | √ | √ |
- | Bunny | ? | ? | ? |
+ | Bunny | √ | - | - |
| GLM-EDGE | √ | √ | √ |
| Qwen2-VL | √ | √ | √ |
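Usage note: a model marked √ in a given column can be run with that quantization type on the CANN backend. A minimal sketch of such a run follows; the binary location, model path, and `-ngl` value are illustrative assumptions, not values taken from this table.

```sh
# Hypothetical example: run a model quantized to Q4_0 (any row marked √ under Q4_0)
# on an Ascend NPU through the CANN backend. Paths and the layer count are placeholders.
./build/bin/llama-cli \
    -m /path/to/model-Q4_0.gguf \
    -p "Building a website can be done in 10 simple steps:" \
    -ngl 32   # number of layers offloaded to the NPU
```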