ollama

starred/ollama


mirror of https://github.com/ollama/ollama.git synced 2026-04-23 17:29:54 +02:00

Commit Graph

Author	SHA1	Message	Date
Jesse Gross	24e038d56a	mlxrunner: add logprobs support. Match the ollamarunner and OpenAI semantics: raw, full-vocab log-softmax with the top-K ranked by probability. Skipped on the GPU when the request doesn't ask for logprobs, so decode doesn't pay for it otherwise.	2026-04-20 17:43:00 -07:00
Daniel Hiltgen	ff23dd343f	mlx: apply repeat penalties in sampler (#15631)	2026-04-18 07:49:38 -07:00
Patrick Devine	d126467d5d	x/mlxrunner: replace sampler interface chain with single stateful Sampler (#14652). Collapse MLX sampling state into a single sample.Sampler struct (options + history); replace the interface-based sampler chain (TopP, TopK, penalty, etc.) with function-based transforms; update request/pipeline wiring to use *sample.Sampler, seed history from prompt tokens, and append generated tokens each step; implement top_p, min_p, repeat_penalty, and frequency_penalty.	2026-03-07 17:50:57 -08:00
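
Commit 24e038d56a describes logprobs as a raw, full-vocabulary log-softmax with the top-K entries ranked by probability. The following is a minimal CPU-side sketch of that idea in plain Go, not the MLX runner's code: the type `TokenLogprob` and the functions `logSoftmax` and `topKLogprobs` are hypothetical names chosen for illustration, and the real implementation runs on MLX arrays rather than Go slices.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// TokenLogprob pairs a token ID with its log-probability.
// Hypothetical type for illustration; not taken from the ollama source.
type TokenLogprob struct {
	Token   int
	Logprob float64
}

// logSoftmax returns log(softmax(logits)) over the whole vocabulary.
// Subtracting the maximum logit first keeps math.Exp from overflowing.
func logSoftmax(logits []float64) []float64 {
	out := make([]float64, len(logits))
	if len(logits) == 0 {
		return out
	}
	maxLogit := math.Inf(-1)
	for _, v := range logits {
		if v > maxLogit {
			maxLogit = v
		}
	}
	var sum float64
	for _, v := range logits {
		sum += math.Exp(v - maxLogit)
	}
	logZ := maxLogit + math.Log(sum)
	for i, v := range logits {
		out[i] = v - logZ
	}
	return out
}

// topKLogprobs computes full-vocab log-probabilities from the raw logits
// and returns the k most probable tokens, most probable first.
func topKLogprobs(logits []float64, k int) []TokenLogprob {
	lp := logSoftmax(logits)
	ranked := make([]TokenLogprob, len(lp))
	for i, v := range lp {
		ranked[i] = TokenLogprob{Token: i, Logprob: v}
	}
	sort.SliceStable(ranked, func(a, b int) bool {
		return ranked[a].Logprob > ranked[b].Logprob
	})
	if k < 0 {
		k = 0
	}
	if k > len(ranked) {
		k = len(ranked)
	}
	return ranked[:k]
}

func main() {
	for _, t := range topKLogprobs([]float64{1.0, 3.0, 2.0}, 2) {
		fmt.Printf("token=%d logprob=%.4f\n", t.Token, t.Logprob)
	}
}
```

In this sketch the log-softmax is taken over the unmodified logits, which is one reading of "raw" in the commit message. The commit also notes that the computation is skipped when the request does not ask for logprobs; a caller of this sketch would get the same effect by not calling `topKLogprobs` at all in that case.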

3 Commits
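
Commits d126467d5d and ff23dd343f describe replacing an interface-based sampler chain with function-based transforms that read a token history. The sketch below shows one way such transforms could compose in plain Go. Everything here is an assumption made for illustration: the `Transform` signature, the helper names, and the penalty formulas (the widely used convention of dividing positive logits and multiplying negative ones for `repeat_penalty`, a per-occurrence subtraction for `frequency_penalty`, and a threshold relative to the most likely token for `min_p`). The commit log does not show the actual `sample.Sampler` API or its exact arithmetic.

```go
package main

import (
	"fmt"
	"math"
)

// Transform maps logits (plus the token history) to new logits.
// Hypothetical signature; not the real sample.Sampler API.
type Transform func(logits []float64, history []int) []float64

// repeatPenalty penalizes each distinct token already in history once:
// positive logits are divided by penalty, others multiplied by it.
func repeatPenalty(penalty float64) Transform {
	return func(logits []float64, history []int) []float64 {
		out := append([]float64(nil), logits...)
		seen := make(map[int]bool)
		for _, tok := range history {
			if tok < 0 || tok >= len(out) || seen[tok] {
				continue
			}
			seen[tok] = true
			if out[tok] > 0 {
				out[tok] /= penalty
			} else {
				out[tok] *= penalty
			}
		}
		return out
	}
}

// frequencyPenalty subtracts penalty once per occurrence in history.
func frequencyPenalty(penalty float64) Transform {
	return func(logits []float64, history []int) []float64 {
		out := append([]float64(nil), logits...)
		for _, tok := range history {
			if tok >= 0 && tok < len(out) {
				out[tok] -= penalty
			}
		}
		return out
	}
}

// minP masks tokens whose probability is below p times the top
// probability; in logit space that is logit < max + log(p).
func minP(p float64) Transform {
	return func(logits []float64, _ []int) []float64 {
		out := append([]float64(nil), logits...)
		maxLogit := math.Inf(-1)
		for _, v := range out {
			if v > maxLogit {
				maxLogit = v
			}
		}
		threshold := maxLogit + math.Log(p)
		for i, v := range out {
			if v < threshold {
				out[i] = math.Inf(-1)
			}
		}
		return out
	}
}

// apply runs the transforms in order, feeding each the previous output.
func apply(logits []float64, history []int, ts ...Transform) []float64 {
	for _, t := range ts {
		logits = t(logits, history)
	}
	return logits
}

func main() {
	logits := []float64{2.0, 1.0, -1.0, 0.5}
	history := []int{0, 0, 2} // prompt tokens plus tokens generated so far
	fmt.Println(apply(logits, history,
		repeatPenalty(2.0), frequencyPenalty(0.1), minP(0.5)))
}
```

Keeping each step as a plain function means the history lives in one place (here the `history` slice, in the commit a single stateful Sampler) instead of being duplicated across chained sampler objects. The caller would append each newly generated token to `history` before the next decode step, as the commit message describes.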