Commit 943d20b

musa : update doc (#9856)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
1 parent 9677640 commit 943d20b

2 files changed: +10 -2 lines changed

README.md

Lines changed: 2 additions & 2 deletions
@@ -31,7 +31,7 @@ variety of hardware - locally and in the cloud.
 - Apple silicon is a first-class citizen - optimized via ARM NEON, Accelerate and Metal frameworks
 - AVX, AVX2 and AVX512 support for x86 architectures
 - 1.5-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit integer quantization for faster inference and reduced memory use
-- Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD GPUs via HIP)
+- Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD GPUs via HIP and Moore Threads MTT GPUs via MUSA)
 - Vulkan and SYCL backend support
 - CPU+GPU hybrid inference to partially accelerate models larger than the total VRAM capacity

@@ -413,7 +413,7 @@ Please refer to [Build llama.cpp locally](./docs/build.md)
 | [BLAS](./docs/build.md#blas-build) | All |
 | [BLIS](./docs/backend/BLIS.md) | All |
 | [SYCL](./docs/backend/SYCL.md) | Intel and Nvidia GPU |
-| [MUSA](./docs/build.md#musa) | Moore Threads GPU |
+| [MUSA](./docs/build.md#musa) | Moore Threads MTT GPU |
 | [CUDA](./docs/build.md#cuda) | Nvidia GPU |
 | [hipBLAS](./docs/build.md#hipblas) | AMD GPU |
 | [Vulkan](./docs/build.md#vulkan) | GPU |

docs/build.md

Lines changed: 8 additions & 0 deletions
@@ -198,6 +198,8 @@ The following compilation options are also available to tweak performance:
 
 ### MUSA
 
+This provides GPU acceleration using the MUSA cores of your Moore Threads MTT GPU. Make sure to have the MUSA SDK installed. You can download it from here: [MUSA SDK](https://developer.mthreads.com/sdk/download/musa).
+
 - Using `make`:
 ```bash
 make GGML_MUSA=1
@@ -209,6 +211,12 @@ The following compilation options are also available to tweak performance:
 cmake --build build --config Release
 ```
 
+The environment variable [`MUSA_VISIBLE_DEVICES`](https://docs.mthreads.com/musa-sdk/musa-sdk-doc-online/programming_guide/Z%E9%99%84%E5%BD%95/) can be used to specify which GPU(s) will be used.
+
+The environment variable `GGML_CUDA_ENABLE_UNIFIED_MEMORY=1` can be used to enable unified memory in Linux. This allows swapping to system RAM instead of crashing when the GPU VRAM is exhausted.
+
+Most of the compilation options available for CUDA should also be available for MUSA, though they haven't been thoroughly tested yet.
+
 ### hipBLAS
 
 This provides BLAS acceleration on HIP-supported AMD GPUs.
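
The two environment variables added in the `docs/build.md` hunk can be combined on a single command line. The following is a minimal sketch under stated assumptions, not something this commit ships: `musa_env_cmd` is a hypothetical helper, the model path is a placeholder, and the device index `0` assumes a single MTT GPU. It only prints the command so the result can be inspected before running anything on real hardware.

```bash
#!/usr/bin/env bash
# Sketch only: musa_env_cmd is a hypothetical helper, not part of llama.cpp.
# It prints a command line with both documented variables set.
musa_env_cmd() {
  devices=$1   # GPU index to expose through MUSA_VISIBLE_DEVICES (assumed: "0")
  shift        # the remaining arguments form the command to run
  printf 'MUSA_VISIBLE_DEVICES=%s GGML_CUDA_ENABLE_UNIFIED_MEMORY=1 %s\n' "$devices" "$*"
}

# Placeholder binary and model path; adjust to your own build and model.
musa_env_cmd 0 ./llama-cli -m ./models/model.gguf -ngl 99
```

Running the printed line in a shell sets both variables for that one invocation only, which keeps the rest of the session unaffected.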
