
llama.android: add field formatChat to control whether to parse special tokens when sending a message #11270


Merged: 1 commit into ggml-org:master on Jan 17, 2025

Conversation

@codezjx (Contributor) commented Jan 17, 2025

Fix issue #11264
If the user uses a chat template to send a message, the message needs to be tokenized with parse_special = true in common_tokenize(). This PR adds a formatChat field to control whether special tokens are parsed when sending a message via LLamaAndroid.send(). The app can set formatChat according to its actual situation.

Sample Code:

val msg = "<|im_start|>system\n" +
    "You are a helpful AI assistant<|im_end|>\n" +
    "<|im_start|>user\n" +
    "$text<|im_end|>\n" +
    "<|im_start|>assistant\n"
viewModelScope.launch {
    // Explicitly set formatChat = true
    llamaAndroid.send(msg, true)
        .catch {
            Log.e(tag, "send() failed", it)
            messages += it.message!!
        }
        .collect { messages = messages.dropLast(1) + (messages.last() + it) }
}
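
For reference, the library-side change looks roughly like the following. This is a condensed sketch of LLamaAndroid.send(), not the exact diff; in particular, the position of formatChat in the completion_init() JNI signature is an assumption based on the description above.

// Condensed sketch: formatChat defaults to false so existing callers keep the
// old behavior.
fun send(message: String, formatChat: Boolean = false): Flow<String> = flow {
    when (val state = threadLocalState.get()) {
        is State.Loaded -> {
            // formatChat is forwarded to the native completion_init(), which uses it
            // as parse_special when tokenizing the prompt with common_tokenize().
            val ncur = IntVar(completion_init(state.context, state.batch, message, formatChat, nlen))
            while (ncur.value <= nlen) {
                val str = completion_loop(state.context, state.batch, nlen, ncur) ?: break
                emit(str)
            }
            kv_cache_clear(state.context)
        }
        else -> {}
    }
}.flowOn(runLoop)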

Before: (screenshot)

After: (screenshot)

@ggerganov (Member) left a comment


For this basic example this is OK, but keep in mind that applying parse_special = true to the full context is susceptible to special token injection from the user. The correct way is to use llama_chat_apply_template.
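
One way to picture the safer pattern: the app passes structured messages, and a native binding around llama_chat_apply_template builds the prompt from the model's own template, instead of the app hard-coding ChatML. A hypothetical Kotlin-side sketch; the applyChatTemplate binding below does not exist in LLamaAndroid and stands in for a JNI call into llama_chat_apply_template:

// Hypothetical sketch: a structured-message API instead of a hand-built prompt.
// applyChatTemplate is an assumed JNI wrapper around llama_chat_apply_template.
data class ChatMessage(val role: String, val content: String)

val messages = listOf(
    ChatMessage("system", "You are a helpful AI assistant"),
    ChatMessage("user", text), // raw user input; the app never wraps it in special tokens
)
val prompt = llamaAndroid.applyChatTemplate(messages)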

@codezjx (Contributor, Author) commented Jan 17, 2025

@ggerganov Thanks for your reminder!

@ggerganov merged commit 3edfa7d into ggml-org:master on Jan 17, 2025
1 check passed
@ggerganov (Member) commented

Yup, I initially thought, incorrectly, that we handle this special token injection in the existing chat examples, but we don't. It's something to fix in the future.

@codezjx (Contributor, Author) commented Jan 17, 2025

@ggerganov Is the special token injection you just mentioned similar to the following behavior? I tested llama-cli in conversation mode, and the user-input special token <|endoftext|> is encoded as token 0. It seems that the user's input needs to be tokenized separately (with parse_special = false). 🤔

waiting for user input

> special token <|endoftext|> injection test
buffer: 'special token <|endoftext|> injection test
'
formatted: '
<|im_start|>user
special token <|endoftext|> injection test
<|im_end|>
<|im_start|>assistant
'
input tokens: [ '':198, '<|im_start|>':1, 'user':4093, '':198, 'special':19159, ' token':9624, ' ':216, '<|endoftext|>':0, ' injection':12852, ' test':1028, '':198, '<|im_end|>':2, '':198, '<|im_start|>':1, 'ass':520, 'istant':9531, '':198 ]
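
The separate-tokenization fix the comment sketches would look roughly like this. A hypothetical sketch only: tokenize(text, parseSpecial) is an assumed binding around common_tokenize(), not part of the current LLamaAndroid API.

// Hypothetical sketch: parse special tokens only in the segments the app controls,
// so a pasted "<|endoftext|>" stays plain text instead of becoming token 0.
val tokens =
    tokenize("<|im_start|>user\n", parseSpecial = true) +
    tokenize(userInput, parseSpecial = false) +
    tokenize("<|im_end|>\n<|im_start|>assistant\n", parseSpecial = true)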

tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Feb 13, 2025
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025
mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request Mar 8, 2025
Labels: android (Issues specific to Android examples)