jmorganca 7449b539ab llm,server: route Ollama-format gemma3 blobs through llama/compat
Two tiny Go-side changes that let the llama/compat shim take over gemma3:

1. llm/llama_server.go: when the GGUF has embedded v.* tensors and no
   projector layer is declared, pass the model file itself as --mmproj.
   The in-process compat layer translates the same file into both a
   text-only view (for --model) and a clip-mmproj view (for --mmproj).

2. server/model_resolver.go: drop library/gemma3 from compatModelRedirects.
   The compat layer handles it directly, so no dhiltgen/ republish is
   needed. Other arches stay in the redirect list until they get their
   own handler in llama/compat/llama-ollama-compat.cpp.
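The `--mmproj` fallback in item 1 can be sketched as a small predicate plus the argument rewrite. This is a hypothetical illustration, not the actual llm/llama_server.go code: the function name `needsEmbeddedProjector`, its signature, and the example paths are invented for clarity; only the underlying rule (embedded `v.*` tensors and no declared projector layer) comes from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// needsEmbeddedProjector reports whether a GGUF's tensor names suggest the
// vision projector is embedded in the main model file. Hypothetical sketch
// of the check described above; the real llm/llama_server.go differs in
// naming and detail.
func needsEmbeddedProjector(tensorNames []string, hasProjectorLayer bool) bool {
	if hasProjectorLayer {
		// A separate projector blob is declared; no fallback needed.
		return false
	}
	for _, name := range tensorNames {
		// gemma3-style GGUFs carry clip/vision tensors under the "v." prefix.
		if strings.HasPrefix(name, "v.") {
			return true
		}
	}
	return false
}

func main() {
	tensors := []string{"token_embd.weight", "v.blk.0.attn_q.weight"}
	modelPath := "/path/to/model.gguf" // placeholder path
	args := []string{"--model", modelPath}
	if needsEmbeddedProjector(tensors, false) {
		// Pass the same file as --mmproj; the in-process compat layer
		// splits it into text-only and clip-mmproj views.
		args = append(args, "--mmproj", modelPath)
	}
	fmt.Println(strings.Join(args, " "))
}
```

With the placeholder inputs above, the model file is appended a second time as `--mmproj`, matching the behavior described in item 1.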

End-to-end verified: `ollama run gemma3` answers text and image prompts
against the existing library/gemma3 blob with no re-download.
2026-04-20 09:29:34 -07:00