Issues: ggml-org/llama.cpp
server : pad small embedding batches
examples
server
#13692
opened May 21, 2025 by
ggerganov
GGML_ASSERT(seq_id < n_tokens && "seq_id cannot be larger than n_tokens with pooling_type == MEAN") failed
bug
Something isn't working
embeddings
embedding related topics
server
#13689
opened May 21, 2025 by
slaren
webui: Allow editing file attachments when editing messages.
examples
server
#13645
opened May 20, 2025 by
nauful
mtmd : add ultravox audio input
documentation
Improvements or additions to documentation
examples
python
python script changes
server
#13623
opened May 18, 2025 by
ngxson
server : separate the notion of position and KV tokens, remove prompt truncation
breaking change
Changes that break ABIs, APIs, file formats, or other forms of backwards compatibility.
examples
python
python script changes
server
#13576
opened May 15, 2025 by
ngxson
webui: Add editing assistant messages (#11849)
examples
server
#13522
opened May 14, 2025 by
lr1729
feat(server): Add tool call support to WebUI (LLama Server)
examples
server
#13501
opened May 13, 2025 by
samolego
Break down main function in llama-server
examples
server
#13425
opened May 10, 2025 by
ericcurtin
Support start strings, the opposite of stop tokens.
examples
python
python script changes
server
#13214
opened Apr 30, 2025 by
matteoserva
Draft
Support jinja extra template kwargs (Qwen3 enable_thinking feature), from command line and from client
examples
server
#13196
opened Apr 29, 2025 by
matteoserva
set b = ub when b > ub with embedding
examples
server
#12940
opened Apr 14, 2025 by
ahmedshakill
server : crash when -b > -ub with embeddings
bug
Something isn't working
embeddings
embedding related topics
good first issue
Good for newcomers
server
#12836
opened Apr 8, 2025 by
ggerganov
WIP: Add support for CogAgent
examples
python
python script changes
server
#12679
opened Mar 31, 2025 by
Tianyue-Zhao
Draft
llama-server : implement universal assisted decoding
examples
server
#12635
opened Mar 28, 2025 by
g2mt
server: streaming of tool calls and thoughts when --jinja is on
documentation
tool-call: Phi-4 support
android
#12288
opened Mar 9, 2025 by
jpohhhh
server webui easy config selection
demo
Demonstrate some concept or idea, not intended to be merged
examples
server
#12031
opened Feb 22, 2025 by
poulphunter
server (webui): Fix issue with multiple <think> tags in response
examples
server
#11779
opened Feb 10, 2025 by
stduhpf