
ggml : move LLAMAFILE/tinyBLAS into a backend #10183

Closed as not planned

Description

@ggerganov

The LLAMAFILE SGEMM routines are currently called directly from within ggml-cpu.c based on compile-time conditionals:

https://github.com/ggerganov/llama.cpp/blob/a9e8a9a0306a8093eef93b0022d9f45510490072/ggml/src/ggml-cpu.c#L7454-L7481
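For context, a minimal sketch of the current coupling pattern (illustrative only - the names and arguments below are stand-ins, not the actual ggml-cpu.c code): the mul_mat path tries the tinyBLAS kernel behind a compile-time flag and falls back to the generic GEMM when the kernel declines.

```c
// Illustrative stand-in for the current dispatch in ggml-cpu.c:
// the accelerated kernel is wired in with a compile-time conditional
// instead of going through the ggml backend interface.
#include <stdbool.h>
#include <stdio.h>

// stand-in for llamafile_sgemm(): returns false when it cannot handle
// the requested shape/types, so the caller falls back to the generic path
static bool fast_sgemm(int m, int n, int k) {
    return m > 0 && n > 0 && k > 0;
}

static void generic_gemm(int m, int n, int k) {
    printf("generic GEMM %d x %d x %d\n", m, n, k);
}

static void mul_mat(int m, int n, int k) {
#if defined(GGML_USE_LLAMAFILE)
    // compile-time coupling: the BLAS-like kernel is called directly
    // from the CPU compute path
    if (fast_sgemm(m, n, k)) {
        return;
    }
#endif
    generic_gemm(m, n, k);
}

int main(void) {
    mul_mat(64, 64, 64);
    return 0;
}
```

Building this sketch with and without `-DGGML_USE_LLAMAFILE` shows the two paths, which is exactly the kind of branching the refactor would remove from the CPU code.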

In order to simplify the logic and reduce the coupling between the different BLAS implementations, the LLAMAFILE code should be moved into a ggml backend, similar to the other BLAS implementations.

Not sure if it has to be a new backend, or if we can move it into the existing ggml-blas backend - TBD.
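For comparison, a rough sketch of how the same decision could look once the code lives behind a backend, assuming a setup analogous to ggml-blas. Everything here is hypothetical: `tinyblas_supports_op`, `tinyblas_compute` and the stand-in `op` struct are not real ggml API, and the actual wiring through the ggml-backend interface structs is omitted.

```c
// Hypothetical sketch: tinyBLAS as a backend-style component.
// Stand-in types are used so the sketch compiles on its own; a real
// implementation would register itself through the ggml-backend
// interface, the way ggml-blas does.
#include <stdbool.h>
#include <stddef.h>

enum op_type { OP_MUL_MAT, OP_OTHER };

struct op {              // stand-in for a ggml op/tensor description
    enum op_type type;
    int m, n, k;
};

// runtime check replacing the compile-time #if: the scheduler asks the
// backend whether it wants to take the op
static bool tinyblas_supports_op(const struct op * o) {
    return o->type == OP_MUL_MAT && o->m >= 32 && o->n >= 32;  // illustrative threshold
}

// the SGEMM routines would run here instead of inside ggml-cpu.c
static void tinyblas_compute(const struct op * o) {
    (void) o;
}

// simplified scheduling loop: ops the backend declines stay on the CPU backend
void schedule(const struct op * ops, size_t n_ops) {
    for (size_t i = 0; i < n_ops; ++i) {
        if (tinyblas_supports_op(&ops[i])) {
            tinyblas_compute(&ops[i]);
        }
        // otherwise the default CPU backend handles the op
    }
}
```

Whether that supports-op/compute pair ends up in a new backend or gets folded into the existing ggml-blas backend is exactly the open question above.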
