Closed as not planned
Description
The LLAMAFILE SGEMM routines are currently called directly from within ggml-cpu.c based on compile-time conditionals.
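For illustration, the kind of coupling described above can be sketched roughly as follows. This is not the actual ggml-cpu.c code: the flag DEMO_USE_FAST_SGEMM and every demo_* function are hypothetical names made up for this sketch, and the "fast" kernel is only a stand-in that reuses the reference loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical build flag standing in for a compile-time switch. */
#ifndef DEMO_USE_FAST_SGEMM
#define DEMO_USE_FAST_SGEMM 1
#endif

/* Reference kernel: C = A * B, row-major, A is m x k, B is k x n. */
static void demo_sgemm_ref(int m, int n, int k,
                           const float *a, const float *b, float *c) {
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            float sum = 0.0f;
            for (int p = 0; p < k; p++) {
                sum += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}

#if DEMO_USE_FAST_SGEMM
/* Hypothetical optional kernel. It returns false when it declines a
 * shape, so the caller has to fall back to the reference path. */
static bool demo_sgemm_fast(int m, int n, int k,
                            const float *a, const float *b, float *c) {
    if (k % 4 != 0) {
        return false; /* arbitrary restriction, for illustration only */
    }
    demo_sgemm_ref(m, n, k, a, b, c); /* stand-in for an optimized path */
    return true;
}
#endif

/* The generic matmul has to know about the optional kernel directly.
 * Returns 1 if the optional kernel handled the call, 0 otherwise. */
static int demo_mul_mat(int m, int n, int k,
                        const float *a, const float *b, float *c) {
    assert(m > 0 && n > 0 && k > 0);
#if DEMO_USE_FAST_SGEMM
    if (demo_sgemm_fast(m, n, k, a, b, c)) {
        return 1;
    }
#endif
    demo_sgemm_ref(m, n, k, a, b, c);
    return 0;
}
```

The point of the sketch is that the preprocessor block sits inside the generic code path, so each additional optional kernel adds another conditional there.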
To simplify the logic and reduce the coupling between the different BLAS implementations, the LLAMAFILE code should be moved into a ggml backend, similar to the other BLAS implementations.
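As a rough sketch of what the decoupling could look like, the optional kernel can sit behind a small table of function pointers that the generic code iterates over. This does not use the real ggml backend API: the demo_backend struct, its supports/sgemm fields, and the "tiled" and "reference" entries are all assumptions invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical backend interface; not the actual ggml backend API. */
typedef struct {
    const char *name;
    bool (*supports)(int m, int n, int k);
    void (*sgemm)(int m, int n, int k,
                  const float *a, const float *b, float *c);
} demo_backend;

/* Reference kernel: C = A * B, row-major, A is m x k, B is k x n. */
static void demo_ref_sgemm(int m, int n, int k,
                           const float *a, const float *b, float *c) {
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            float sum = 0.0f;
            for (int p = 0; p < k; p++) {
                sum += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}

static bool demo_ref_supports(int m, int n, int k) {
    (void)m; (void)n; (void)k;
    return true; /* the reference path accepts every shape */
}

/* Hypothetical optional backend with an arbitrary shape restriction. */
static bool demo_tiled_supports(int m, int n, int k) {
    (void)m; (void)n;
    return k % 4 == 0;
}

static void demo_tiled_sgemm(int m, int n, int k,
                             const float *a, const float *b, float *c) {
    demo_ref_sgemm(m, n, k, a, b, c); /* stand-in for an optimized path */
}

/* Ordered by preference; the last entry accepts everything. */
static const demo_backend demo_backends[] = {
    { "tiled",     demo_tiled_supports, demo_tiled_sgemm },
    { "reference", demo_ref_supports,   demo_ref_sgemm   },
};

/* Generic dispatch: no knowledge of any specific kernel.
 * Returns the backend that ran the operation, or NULL if none did. */
static const demo_backend *demo_mul_mat(int m, int n, int k,
                                        const float *a, const float *b,
                                        float *c) {
    assert(m > 0 && n > 0 && k > 0);
    size_t count = sizeof(demo_backends) / sizeof(demo_backends[0]);
    for (size_t i = 0; i < count; i++) {
        if (demo_backends[i].supports(m, n, k)) {
            demo_backends[i].sgemm(m, n, k, a, b, c);
            return &demo_backends[i];
        }
    }
    return NULL;
}
```

With this shape, adding or removing a kernel only changes the table, and the dispatch function itself carries no preprocessor conditionals.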
Not sure whether it has to be a new backend or whether we can move it into the existing ggml-blas backend - TBD.