llama : refactor the llm.build_xxx functions

Now that we support a large number of architectures, we can clearly see the recurring patterns in how the compute graphs are constructed, e.g. optional biases, different norm types, QKV vs Q+K+V.

We should deduplicate the copy-paste portions in functions such as `llm.build_llama()`, `llm.build_falcon()`, etc.

The advantage of the current code is that it is easy to look into the graph of a specific architecture. When we refactor this, we will lose this convenience to some extent. So we should do the refactoring in a way that does not completely obscure which parts of the graph belong to which architectures.
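One possible direction, as a rough sketch only: keep one small entry point per architecture, but move the repeated branches (norm type, fused vs separate QKV, optional bias) into shared helpers driven by a per-architecture traits struct. Every name below (`arch_traits`, `build_norm`, `build_proj`, `build_attn_inputs`, `build_arch_a`, `build_arch_b`) is hypothetical and does not exist in the codebase, and the "graph" is just a list of op names rather than real ggml tensors, so this illustrates the structure and nothing more.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical names throughout; this is a toy model, not ggml code.
enum class norm_type { layer_norm, rms_norm };

// The per-architecture differences, collected in one place.
struct arch_traits {
    norm_type norm;      // which normalization the block uses
    bool      fused_qkv; // single QKV projection vs separate Q, K, V
    bool      has_bias;  // optional bias after each projection
};

// Toy "graph": an ordered list of op names.
using graph = std::vector<std::string>;

static void build_norm(graph & g, norm_type t) {
    g.push_back(t == norm_type::rms_norm ? "rms_norm" : "layer_norm");
}

static void build_proj(graph & g, const std::string & name, bool has_bias) {
    assert(!name.empty());
    g.push_back("mul_mat(" + name + ")");
    if (has_bias) {
        g.push_back("add(" + name + "_bias)");
    }
}

// Shared helper: the copy-pasted branches live here exactly once.
static void build_attn_inputs(graph & g, const arch_traits & a) {
    build_norm(g, a.norm);
    if (a.fused_qkv) {
        build_proj(g, "wqkv", a.has_bias);
        g.push_back("split_qkv");
    } else {
        build_proj(g, "wq", a.has_bias);
        build_proj(g, "wk", a.has_bias);
        build_proj(g, "wv", a.has_bias);
    }
}

// Per-architecture entry points stay, so each graph is still easy to find.
static graph build_arch_a() {
    const arch_traits traits = { norm_type::rms_norm, /*fused_qkv=*/false, /*has_bias=*/false };
    graph g;
    build_attn_inputs(g, traits);
    return g;
}

static graph build_arch_b() {
    const arch_traits traits = { norm_type::layer_norm, /*fused_qkv=*/true, /*has_bias=*/true };
    graph g;
    build_attn_inputs(g, traits);
    return g;
}
```

With this shape, reading `build_arch_a()` or `build_arch_b()` still tells you which options an architecture uses, while the branching logic is written once. The trade-off is that the exact op sequence is now one hop away in the helper instead of inline.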

Open to ideas and suggestions on how best to do this.

llama : refactor the llm.build_xxx functions #5239

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

llama : refactor the llm.build_xxx functions #5239

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions