llama : switch to floating-point token positions #5679


Draft · ggerganov wants to merge 4 commits into master

Conversation

ggerganov
Member

Change llama_pos from int32_t to float

This change might seem unnecessary at first, since we are used to thinking about token positions as integers, but technically nothing prevents them from being floats. Also, I have some ideas for KV cache compression / context extension tricks where float positions could turn out to be useful.
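A minimal sketch of the core type change in llama.h (the surrounding declarations are assumed to stay unchanged):

```cpp
// llama.h (sketch of the core change in this draft)
// before: token positions are 32-bit integers
// typedef int32_t llama_pos;

// after: token positions are single-precision floats,
// so a position can take non-integer values
typedef float llama_pos;
```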

Still contemplating whether we should merge this, so for now it remains a draft.

@ngxson
Collaborator

ngxson commented Feb 23, 2024

+1 for this. I'm wondering if it helps simplify the code for group attention (self-extend).

@ggerganov
Member Author

Not sure if it will become simpler, but one of the things I want to investigate is applying floating-point division in llama_kv_cache_seq_div() instead of the current integer division. Intuitively, I expect it to improve recall quality.
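For illustration, a hedged sketch of what the division over cached positions could look like once positions are floats. The cell layout and names below are simplified placeholders, not the actual llama.cpp internals:

```cpp
#include <vector>

// Hypothetical, simplified KV cell for illustration only.
struct kv_cell {
    float pos;   // with llama_pos = float, positions can hold fractional values
};

// Divide positions in the range [p0, p1) by d, as done for self-extend / group attention.
// With int32_t positions this truncates (e.g. 7 / 4 -> 1);
// with float positions the relative spacing is preserved (7.0f / 4 -> 1.75f).
static void kv_cache_seq_div_sketch(std::vector<kv_cell> & cells, float p0, float p1, int d) {
    for (auto & cell : cells) {
        if (cell.pos >= p0 && cell.pos < p1) {
            cell.pos /= d;   // floating-point division, no truncation
        }
    }
}
```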

The other idea I want to explore is merging KV cells into one another by averaging both their positions and their KV values. I'm wondering if this can be used to compress the KV cache data into fewer cells.
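A rough sketch of that second idea, again with simplified, hypothetical structures: two cells are merged by averaging their positions and their K/V vectors, which only makes sense once positions can be non-integer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical cell holding a float position plus its K and V vectors.
struct kv_cell_full {
    float pos;
    std::vector<float> k;
    std::vector<float> v;
};

// Merge two cells into one by averaging position and KV values,
// halving the number of occupied cells for the merged pair.
static kv_cell_full kv_cell_merge_sketch(const kv_cell_full & a, const kv_cell_full & b) {
    kv_cell_full out;
    out.pos = 0.5f * (a.pos + b.pos);        // averaged position requires float positions
    out.k.resize(a.k.size());
    out.v.resize(a.v.size());
    for (size_t i = 0; i < a.k.size(); ++i) {
        out.k[i] = 0.5f * (a.k[i] + b.k[i]); // average keys
        out.v[i] = 0.5f * (a.v[i] + b.v[i]); // average values
    }
    return out;
}
```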

@mofosyne mofosyne added refactoring Refactoring Review Complexity : High Generally require indepth knowledge of LLMs or GPUs labels May 10, 2024
@ggerganov ggerganov added the demo Demonstrate some concept or idea, not intended to be merged label Jul 22, 2024
3 participants