Clean up server code #5762

Closed
@ngxson

Motivation

As seen in #4216, one of the important tasks is to refactor and clean up the server code so that it's easier to maintain. However, without a detailed plan, I personally feel it's unlikely to be achieved.

This issue was created so that we can discuss how to refactor or clean up the code.

The goal is to help existing and new contributors easily find where to work in the code base.

Current architecture

The current server implementation has 2 threads: one for the HTTP part and one for inference.

  • Communication from the HTTP thread to the inference thread is done via llama_server_queue.post(task)
  • Communication from the inference thread back to the HTTP thread is done via llama_server_response.send(result)
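
To make the two directions concrete, here is a minimal, self-contained sketch of the pattern. It is not the actual server code: the types `task` and `result`, the classes `task_queue` and `response_queue`, and the functions `wait_pop`, `recv`, `terminate` and `run_demo` are all hypothetical names chosen for illustration, and the "inference" step is just an echo. Only the `post(task)` and `send(result)` method names mirror the calls mentioned above.

```cpp
#include <condition_variable>
#include <deque>
#include <map>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical payload types (assumptions, not the real server structs).
struct task   { int id; std::string prompt; };
struct result { int id; std::string text;   };

// HTTP ==> inference: the HTTP thread posts tasks, the inference thread pops them.
class task_queue {
public:
    void post(task t) {
        {
            std::lock_guard<std::mutex> lock(mtx);
            tasks.push_back(std::move(t));
        }
        cv.notify_one();
    }
    // Ask the consumer to stop once the queue is drained.
    void terminate() {
        {
            std::lock_guard<std::mutex> lock(mtx);
            running = false;
        }
        cv.notify_all();
    }
    // Blocks until a task is available; returns false after terminate()
    // has been called and no tasks remain.
    bool wait_pop(task & out) {
        std::unique_lock<std::mutex> lock(mtx);
        cv.wait(lock, [this] { return !tasks.empty() || !running; });
        if (tasks.empty()) {
            return false;
        }
        out = std::move(tasks.front());
        tasks.pop_front();
        return true;
    }
private:
    std::mutex mtx;
    std::condition_variable cv;
    std::deque<task> tasks;
    bool running = true;
};

// inference ==> HTTP: the inference thread sends results keyed by task id,
// and the HTTP thread blocks until the result for its task arrives.
class response_queue {
public:
    void send(result r) {
        {
            std::lock_guard<std::mutex> lock(mtx);
            const int id = r.id;
            results[id] = std::move(r);
        }
        cv.notify_all();
    }
    result recv(int id) {
        std::unique_lock<std::mutex> lock(mtx);
        cv.wait(lock, [this, id] { return results.count(id) > 0; });
        result r = std::move(results[id]);
        results.erase(id);
        return r;
    }
private:
    std::mutex mtx;
    std::condition_variable cv;
    std::map<int, result> results;
};

// Plays the role of one HTTP request: post a task, wait for its result.
std::string run_demo(const std::string & prompt) {
    task_queue     queue;
    response_queue responses;

    // Stand-in for the inference thread: echoes the prompt back.
    std::thread inference([&] {
        task t;
        while (queue.wait_pop(t)) {
            responses.send({t.id, "echo: " + t.prompt});
        }
    });

    queue.post({1, prompt});          // HTTP ==> inference
    result r = responses.recv(1);     // inference ==> HTTP

    queue.terminate();
    inference.join();
    return r.text;
}
```

Calling `run_demo("hello")` from a `main` function posts one task, lets the worker thread process it, and returns the text of the matching result. Keying results by task id is one possible design choice here; it lets several HTTP handlers wait concurrently without consuming each other's results.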

Ideas

Feel free to suggest any ideas that you find helpful (please keep in mind that we are not introducing new features here, only rewriting the code):
