Open
Description
As per recent discussions (e.g. #10144 (review)), we should split the large ggml-cpu.c
implementation into smaller modules - similar to how the CUDA backend is organized. We should use C++ to reduce code duplication.
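
To illustrate the kind of duplication C++ could remove, here is a minimal sketch, not taken from the actual ggml code. It assumes a set of element-wise unary ops that currently each need their own hand-written loop in C. All names (`op_sqr`, `op_sqrt`, `op_neg`, `vec_unary_f32`) are hypothetical and only serve the example.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical functors, one per operation (illustrative names, not ggml API).
struct op_sqr  { float operator()(float x) const { return x * x; } };
struct op_sqrt { float operator()(float x) const { return std::sqrt(x); } };
struct op_neg  { float operator()(float x) const { return -x; } };

// A single templated loop stands in for one hand-written C function per op.
// The compiler instantiates a specialized copy for each Op type.
template <typename Op>
void vec_unary_f32(std::size_t n, float * dst, const float * src) {
    const Op op{};
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = op(src[i]);
    }
}
```

A caller would pick the op at compile time, e.g. `vec_unary_f32<op_sqr>(n, dst, src);`. Adding a new op then only requires a new functor, not another copy of the loop.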