ollama/x/mlxrunner/mlx/sdpa.go
Jesse Gross b7b2aa5d4e mlxrunner: Cache.Update takes ForwardBatch and returns KVHistory
Signature changes from Update(k, v) to Update(batch, k, v) returning
(k, v, KVHistory). KVCache returns a real page table mapping positions
to buffer slots. RecurrentCache returns empty KVHistory from Update.

Replace Cache.Offset() with Offsets() returning per-sequence offsets.
Add KVHistory type to mlx package.
2026-04-03 19:50:41 -07:00

package mlx

// KVHistory carries sequence metadata alongside K/V buffers for SDPA.
// The page table and sequence lengths travel together because SDPA
// always needs both.
type KVHistory struct {
	// PageTable maps (seqIdx, position) → slot index in the K/V buffer.
	// Shape: [numSeqs, maxSeqLen], int32. Unused entries are 0, so only
	// the first SeqLens[seqIdx] entries of a row are meaningful.
	PageTable *Array

	// SeqLens is the history length per sequence (number of valid
	// entries in each row of PageTable).
	SeqLens []int
}
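
The sketch below illustrates how the two fields are meant to be read together. It is not part of the mlx package: `history` and `slotsFor` are hypothetical names, and a plain `[][]int32` stands in for the `*Array` page table, whose API is not shown in this file. The point it makes is that a 0 entry can be either a valid slot or padding, and only the sequence length tells them apart.

```go
package main

import "fmt"

// history is a hypothetical plain-slice stand-in for KVHistory.
// The real PageTable is an *Array; a [][]int32 is assumed here only
// to keep the example self-contained.
type history struct {
	pageTable [][]int32 // [numSeqs][maxSeqLen]; unused entries are 0
	seqLens   []int     // number of valid entries in each row
}

// slotsFor returns the K/V buffer slots that hold the history of
// sequence seq, in position order. Entries past seqLens[seq] are
// padding and are dropped.
func (h history) slotsFor(seq int) []int32 {
	return h.pageTable[seq][:h.seqLens[seq]]
}

func main() {
	h := history{
		pageTable: [][]int32{
			{4, 5, 6}, // sequence 0 uses slots 4, 5, 6
			{0, 1, 0}, // sequence 1 uses slots 0, 1; the last 0 is padding
		},
		seqLens: []int{3, 2},
	}

	fmt.Println(h.slotsFor(0)) // prints [4 5 6]
	fmt.Println(h.slotsFor(1)) // prints [0 1]
}
```

In row 1 the leading 0 is a real slot index while the trailing 0 is padding, which is why the struct comment says SDPA always needs both fields.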