
Dynamically interrupt token generation #599

Open
@Bartvelp

Description


Is your feature request related to a problem? Please describe.
During token generation with stream=True, I would like to stop when some condition that changes at runtime is met.
E.g. I would like to stop generation after 5 lines have been generated.

Describe the solution you'd like
I would like a method on the llm object, called stop() or interrupt(), that forces the model to stop after the next token is generated, similar to pressing CTRL+C in the regular llama.cpp CLI.
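Roughly what I have in mind, as a sketch (interrupt() is hypothetical and does not exist in llama-cpp-python today; the model path is a placeholder):

```python
import threading

from llama_cpp import Llama

llm = Llama(model_path="./models/model.gguf")  # placeholder path

# Hypothetical: llm.interrupt() is the requested method, not a real API.
# Calling it from any thread would stop generation after the next token,
# much like CTRL+C in the llama.cpp CLI. Here a timer triggers it after 10 s.
threading.Timer(10.0, llm.interrupt).start()

for chunk in llm("Write a long story:", max_tokens=2048, stream=True):
    print(chunk["choices"][0]["text"], end="", flush=True)
```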

Describe alternatives you've considered
I have considered adding a newline as a stop token, but I don't think that is performant. Another option I can think of is changing the stop list after passing it to the generation method, but that feels hacky.
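For reference, a workaround that seems to cover the 5-line example today: with stream=True the call returns a Python generator, and each token is only sampled when the generator is advanced, so simply breaking out of the loop stops generation. A minimal sketch (model path is a placeholder):

```python
from llama_cpp import Llama

llm = Llama(model_path="./models/model.gguf")  # placeholder path

newline_count = 0
parts = []

# stream=True yields completion chunks lazily; tokens are only produced
# as the loop advances the generator, so `break` halts generation.
for chunk in llm("Write a poem:\n", max_tokens=512, stream=True):
    text = chunk["choices"][0]["text"]
    parts.append(text)
    newline_count += text.count("\n")
    if newline_count >= 5:  # runtime condition: stop after 5 lines
        break

print("".join(parts))
```

This handles conditions that can be checked between tokens, but it does not cover interrupting generation from outside the loop (e.g. from another thread), which is what stop()/interrupt() would add.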

Metadata

Assignees

No one assigned

Labels

enhancement (New feature or request)

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
