llama : create llamax library #5215

Open
@ggerganov

Depends on: #5214

The llamax library will wrap llama and expose common high-level functionality. The main goal is to ease the integration of llama.cpp into third-party projects. Ideally, most projects would interface through the llamax API for all common use cases, while still having the option to use the low-level llama API for less common applications that require finer control of the state.

A simple way to think about llamax is that it will simplify all of the existing examples in llama.cpp by hiding the low-level details, such as managing the KV cache and batching requests.

Roughly, llamax will require its own state object and a run-loop function.
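
To make the "state object plus run-loop" idea concrete, here is a minimal sketch of what such a shape could look like. Everything in it is an assumption for discussion: `llamax_request`, `llamax_state`, `llamax_submit`, `llamax_step` and `llamax_run` are hypothetical names, not an existing or agreed API, and `stub_generate` is a placeholder that stands in for real decoding through the low-level llama API.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical request: a prompt plus a cap on the number of generated tokens.
struct llamax_request {
    int         id;
    std::string prompt;
    int         max_tokens;
};

// Hypothetical state object. Here it only holds a request queue and results;
// a real one would presumably also own the model, the context, the KV cache
// bookkeeping and the batch, so that callers never touch them directly.
struct llamax_state {
    std::deque<llamax_request>               pending;
    std::vector<std::pair<int, std::string>> finished; // (request id, output)
    int                                      next_id = 0;
};

// Queue a request and return its id.
int llamax_submit(llamax_state & st, const std::string & prompt, int max_tokens) {
    assert(max_tokens > 0);
    st.pending.push_back({st.next_id, prompt, max_tokens});
    return st.next_id++;
}

// Placeholder for decoding: emits one "<tok>" marker per requested token.
// This is NOT how llama.cpp generates text; it only keeps the sketch runnable.
static std::string stub_generate(const llamax_request & req) {
    std::string out;
    for (int i = 0; i < req.max_tokens; ++i) {
        out += "<tok>";
    }
    return out;
}

// One iteration of the run loop: process a single pending request.
// Returns false when there is nothing left to do.
bool llamax_step(llamax_state & st) {
    if (st.pending.empty()) {
        return false;
    }
    llamax_request req = st.pending.front();
    st.pending.pop_front();
    st.finished.emplace_back(req.id, stub_generate(req));
    return true;
}

// Run loop: drain the queue and return the number of requests processed.
int llamax_run(llamax_state & st) {
    int n = 0;
    while (llamax_step(st)) {
        ++n;
    }
    return n;
}
```

With this shape, a caller would create one `llamax_state`, call `llamax_submit` for each prompt, and then either call `llamax_run` to drain the queue or drive `llamax_step` from its own event loop. Exposing both a single step and a full loop is only one possible design choice; it is shown here because it lets an application decide who owns the loop.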

The specifics of the API are yet to be determined - suggestions are welcome.

Labels: refactoring, roadmap