@@ -245,6 +245,111 @@ Instructions for adding support for new models: [HOWTO-add-model.md](docs/develo
 | [CANN](docs/build.md#cann) | Ascend NPU |
 | [OpenCL](docs/backend/OPENCL.md) | Adreno GPU |
 
+## Software architecture
+
+```mermaid
+block-beta
+  columns 1
+
+  block:llamacpp
+    llamacpp["llama_cpp"]
+    style llamacpp fill:#3c3,color:#000,stroke:#000
+  end
+
+  block:ggml
+    ggml["GGML"]
+    style ggml fill:#3c3,color:#000,stroke:#000
+
+    ggml_cpu["ggml-cpu"]
+    ggml_metal["ggml-metal"]
+    ggml_sycl["ggml-sycl"]
+    ggml_cuda["ggml-cuda"]
+    ggml_hip["ggml-hip"]
+    ggml_vulkan["ggml-vulkan"]
+    ggml_cann["ggml-cann"]
+    ggml_opencl["ggml-opencl"]
+    ggml_qnn["ggml-qnn"]
+    ggml_nnpa["ggml-nnpa"]
+    ggml_ane["ggml-ane"]
+
+    style ggml_cpu fill:#888,color:#000,stroke:#000
+    style ggml_metal fill:#888,color:#000,stroke:#000
+    style ggml_sycl fill:#888,color:#000,stroke:#000
+    style ggml_cuda fill:#888,color:#000,stroke:#000
+    style ggml_hip fill:#888,color:#000,stroke:#000
+    style ggml_vulkan fill:#888,color:#000,stroke:#000
+    style ggml_cann fill:#888,color:#000,stroke:#000
+
+    style ggml_opencl fill:#cc3,color:#000,stroke:#000
+    style ggml_qnn fill:#cc3,color:#000,stroke:#000
+    style ggml_ane fill:#fff,color:#000,stroke:#f00,stroke-width:2,stroke-dasharray:5
+    style ggml_nnpa fill:#cc3,color:#000,stroke:#000
+  end
+
+  block:ggml_pal
+    ggml_pal["GGML Platform Abstraction Layer"]
+    style ggml_pal fill:#c33,color:#000,stroke:#000
+  end
+
+  block:OS
+    Windows
+    Linux
+    Android
+    QNX
+    IBM_z/OS
+  end
+
+  block:hardware_vendors
+    Intel
+    AMD
+    Apple
+    Nvidia
+    Huawei
+    Loongson
+    Qualcomm
+    IBM
+
+    ggml_metal --> Apple
+    ggml_cuda --> Nvidia
+    ggml_hip --> AMD
+    ggml_cann --> Huawei
+    ggml_sycl --> Intel
+    ggml_opencl --> Qualcomm
+    ggml_qnn --> Qualcomm
+    ggml_ane --> Apple
+    ggml_nnpa --> IBM
+  end
+
+  block:hardware_types
+    CPU
+    GPU
+    NPU
+  end
+
+  block:hardware_archs
+    x86
+    arm
+    risc
+    dsp
+    loongson
+  end
+```
+
+```mermaid
+%%{init: {"flowchart": {"htmlLabels": false, 'nodeSpacing': 30, 'rankSpacing': 30}} }%%
+flowchart LR
+  classDef EXIST fill:#888,color:#000,stroke:#000
+  classDef DONE fill:#3c3,color:#000,stroke:#000
+  classDef WIP fill:#cc3,color:#000,stroke:#000
+  classDef TODO fill:#c33,color:#000,stroke:#000
+  classDef NEW fill:#fff,color:#000,stroke:#f00,stroke-width:2,stroke-dasharray:5
+  subgraph Legend
+    direction LR
+    EXIST:::EXIST ~~~ TODO:::TODO ~~~ WIP:::WIP ~~~ DONE:::DONE ~~~ NEW:::NEW
+  end
+```
+
 ## Building the project
 
 The main product of this project is the `llama` library. Its C-style interface can be found in [include/llama.h](include/llama.h).