Closed
Description
There is functionality around llama_sampling_context currently part of common. We should move it into llama. Pretty much the entire API from common/sampling.h except llama_sampling_params and llama_sampling_sample can be integrated into the library.
This would probably also require merging the grammar parser into the llama lib implementation.
The llama_sampling_params and llama_sampling_sample will stay in common since they are very example-specific and not general-purpose enough to be merged.