Skip to content

Eval bug: llama-cli, Qwen3 jinja template will break CLI multiturn conversation #13404

Open
@matteoserva

Description

@matteoserva

Name and Version

611aa91

Operating systems

Linux

GGML backends

CUDA

Hardware

NVIDIA

Models

Qwen3

Problem description & steps to reproduce

I don't know if it's too soon but I'm opening this to keep track of the issue.
The original qwen3 template is not supported but the bug can be tested by using a modified template

The qwen3 template contains the following check (stripped down to the relevant section):

{%- if loop.index0 > ns.last_query_index %} 
{%- if loop.last or (not loop.last and reasoning_content) %} 
 KEEP REASONING TOKENS

Meaning that in the common case the tokens are kept when the last role is assistant and the tokens are discarded when the last role is user.

The problem is that at the start of the turn, the following pseudo-code is executed:

- messages.append(user_message)
- fmt_past_msg = apply_chat_template(messages)
- messages.append(assistant_messages)
- fmt_new_msg = apply_chat_template(messages)
- diff = fmt_new_msg - fmt_past_msg

The diff is not computed correctly since the assistant message used in v1 has the thinking tokens preserved and the assistant message in v2 has the thinking tokens removed.

Relevant section of the code:

ss << fmt_new_msg.substr(fmt_past_msg.size(), fmt_new_msg.size() - fmt_past_msg.size());

First Bad Commit

No response

Relevant log output

std::string common_chat_format_single(...) {
[...]
fmt_past_msg = common_chat_templates_apply(tmpls, inputs).prompt;
[...]
inputs.messages.push_back(new_msg);
[...]
auto fmt_new_msg = common_chat_templates_apply(tmpls, inputs).prompt;
// get the diff part
ss << fmt_new_msg.substr(fmt_past_msg.size(), fmt_new_msg.size() - fmt_past_msg.size());

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions