mirror of
https://github.com/ollama/ollama.git
synced 2026-04-24 01:35:49 +02:00
Signature changes from Update(k, v) to Update(batch, k, v) returning (k, v, KVHistory). KVCache returns a real page table mapping positions to buffer slots. RecurrentCache returns empty KVHistory from Update. Replace Cache.Offset() with Offsets() returning per-sequence offsets. Add KVHistory type to mlx package.
14 lines
452 B
Go
package mlx

// KVHistory carries sequence metadata alongside K/V buffers for SDPA.
// Page table and seq lens travel together — SDPA always needs both.
type KVHistory struct {
	// PageTable maps (seqIdx, position) → slot index in the K/V buffer.
	// Shape: [numSeqs, maxSeqLen], int32. Unused entries are 0.
	PageTable *Array

	// SeqLens is the history length per sequence (number of valid
	// entries in each row of PageTable).
	SeqLens []int
}
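
The layout described in the comments above can be illustrated with a small standalone sketch. It is not code from the repository: `pageTable`, `buildPageTable`, and `slotFor` are hypothetical names, and plain Go slices stand in for the `*Array` tensor. It assumes only what the comments state: a `[numSeqs, maxSeqLen]` table of int32 slot indices, padded with 0, plus one length per sequence.

```go
package main

import "fmt"

// pageTable is a hypothetical plain-Go stand-in for KVHistory.
// rows has shape [numSeqs][maxSeqLen]; seqLens[i] is the number of
// valid entries in rows[i].
type pageTable struct {
	rows    [][]int32
	seqLens []int
}

// buildPageTable pads every sequence's slot list to the longest one.
// Padding entries are left at 0, following the "unused entries are 0"
// convention from the struct comment.
func buildPageTable(slots [][]int32) pageTable {
	maxLen := 0
	for _, s := range slots {
		if len(s) > maxLen {
			maxLen = len(s)
		}
	}
	pt := pageTable{
		rows:    make([][]int32, len(slots)),
		seqLens: make([]int, len(slots)),
	}
	for i, s := range slots {
		pt.rows[i] = make([]int32, maxLen) // zero-filled by make
		copy(pt.rows[i], s)
		pt.seqLens[i] = len(s)
	}
	return pt
}

// slotFor returns the buffer slot for (seq, pos). The second result is
// false when pos lies outside the sequence's valid history.
func (pt pageTable) slotFor(seq, pos int) (int32, bool) {
	if seq < 0 || seq >= len(pt.rows) || pos < 0 || pos >= pt.seqLens[seq] {
		return 0, false
	}
	return pt.rows[seq][pos], true
}

func main() {
	// Sequence 0 occupies slots 4, 5, 6; sequence 1 occupies slots 0, 1.
	pt := buildPageTable([][]int32{{4, 5, 6}, {0, 1}})
	fmt.Println(pt.rows, pt.seqLens)

	slot, ok := pt.slotFor(1, 0)
	fmt.Println(slot, ok) // a real entry whose slot index happens to be 0

	slot, ok = pt.slotFor(1, 2)
	fmt.Println(slot, ok) // padding: same stored value, but not valid
}
```

The last two lookups show why the lengths have to travel with the table: slot 0 is a legitimate buffer index, so a padded entry cannot be told apart from a real one by its value alone.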