mirror of
https://github.com/ollama/ollama.git
synced 2026-04-28 03:39:48 +02:00
Match the ollamarunner and OpenAI semantics: raw, full-vocabulary log-softmax, with the top-K entries ranked by probability. This step is skipped on the GPU when the request doesn't ask for logprobs, so decode pays for it only when logprobs are requested.
5.8 KiB