ollama

mirror of https://github.com/ollama/ollama.git synced 2026-04-21 16:25:42 +02:00

Files

Daniel Hiltgen ea3c6a3cbe gemma4: add Gemma 4 GGML model support

Add full Gemma 4 model family support (E2B, E4B, 26B MoE, 31B Dense)
for the GGML backend including text, vision, converter, parser, and
renderer.

Text model features:
- Sliding window + full attention with per-layer patterns
- KV sharing across layers with donor map
- Per-layer embeddings (PLE) with learned projections
- MoE routing with RMSNorm + learned scale
- Proportional RoPE with freq_factors for global attention
- Final logit softcapping
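
As an illustration of the last item, final logit softcapping is commonly written as limit * tanh(x / limit), which squashes every logit into the open interval (-limit, limit) while staying close to the identity near zero. The sketch below is a standalone example and is not taken from this commit; the function name softcap and the limit value 30 are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// softcap returns a new slice in which every logit x is replaced by
// limit * tanh(x / limit). The name and signature are hypothetical and
// do not come from the ollama code base.
func softcap(logits []float64, limit float64) []float64 {
	out := make([]float64, len(logits))
	for i, x := range logits {
		out[i] = limit * math.Tanh(x/limit)
	}
	return out
}

func main() {
	// Large magnitudes saturate near +/-limit; small ones pass almost unchanged.
	fmt.Println(softcap([]float64{-1000, 0, 1, 1000}, 30))
}
```

Because tanh is monotonic, the ranking of the logits is preserved; only their magnitude is bounded.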

Vision model features:
- SigLIP vision encoder with 2D RoPE
- ClippableLinear with input/output clamping via packed v.clamp_data
- Adaptive average pooling with nMerge kernel
- Multi-modal projection with unweighted RMSNorm
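
A minimal sketch of what a linear layer with input/output clamping might look like, assuming the bounds are four scalars (input min/max, output min/max) and the weight is a plain row-major matrix. The type clippableLinear, its fields, and the forward method are hypothetical names for illustration; how the commit actually unpacks v.clamp_data is not shown here.

```go
package main

import "fmt"

// clamp restricts x to the closed interval [lo, hi].
func clamp(x, lo, hi float32) float32 {
	if x < lo {
		return lo
	}
	if x > hi {
		return hi
	}
	return x
}

// clippableLinear is a hypothetical stand-in for a linear layer whose
// inputs and outputs are clamped to fixed bounds.
type clippableLinear struct {
	weight         [][]float32 // shape [out][in]
	inMin, inMax   float32     // assumed input bounds
	outMin, outMax float32     // assumed output bounds
}

// forward clamps each input element, applies the matrix-vector product,
// then clamps each output element.
func (l clippableLinear) forward(x []float32) []float32 {
	y := make([]float32, len(l.weight))
	for o, row := range l.weight {
		var sum float32
		for i, w := range row {
			sum += w * clamp(x[i], l.inMin, l.inMax)
		}
		y[o] = clamp(sum, l.outMin, l.outMax)
	}
	return y
}

func main() {
	l := clippableLinear{
		weight: [][]float32{{1, 1}},
		inMin:  -1, inMax: 1,
		outMin: -1.5, outMax: 1.5,
	}
	fmt.Println(l.forward([]float32{5, 5}))
}
```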

Converter:
- Safetensors to GGUF with vision tensor renaming
- Fused MoE gate_up_proj splitting
- Vision patch embedding reshape (HF to Conv2D layout)
- Packed clamp data tensor for ClippableLinear bounds
- Proportional RoPE freq_factors generation
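
To illustrate the fused gate_up_proj splitting step, the sketch below splits one expert's fused weight into separate gate and up halves. It assumes a row-major layout in which the first half of the rows belongs to the gate projection and the second half to the up projection; the layout actually used by the checkpoint and the converter may differ, and splitGateUp is a hypothetical helper, not a function from this commit.

```go
package main

import "fmt"

// splitGateUp splits a fused, row-major [rows x cols] weight into its
// gate and up halves. Assumed layout: rows [0, rows/2) are the gate
// projection and rows [rows/2, rows) are the up projection.
// The returned slices alias the input rather than copying it.
func splitGateUp(fused []float32, rows, cols int) (gate, up []float32, err error) {
	if rows <= 0 || cols <= 0 || rows%2 != 0 {
		return nil, nil, fmt.Errorf("invalid fused shape %dx%d", rows, cols)
	}
	if len(fused) != rows*cols {
		return nil, nil, fmt.Errorf("got %d elements, want %d", len(fused), rows*cols)
	}
	half := rows / 2 * cols
	return fused[:half], fused[half:], nil
}

func main() {
	// A 4x2 fused tensor: two gate rows followed by two up rows.
	fused := []float32{1, 2, 3, 4, 5, 6, 7, 8}
	gate, up, err := splitGateUp(fused, 4, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(gate, up)
}
```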

Also includes:
- BackendGet() on ml.Tensor for reading weight tensor data
- Q6_K CUDA get_rows kernel support
- MoE-aware ffn_down quantization layer counting
- Gemma4 parser with tool calling and thinking support
- Gemma4 renderer with structured tool format
- Architecture-based auto-detection of renderer/parser/stop tokens
- Integration test gemma4 model list additions

2026-04-01 15:23:10 -07:00

ggml

gemma4: add Gemma 4 GGML model support

2026-04-01 15:23:10 -07:00

gguf

Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)

2025-06-20 11:11:40 -07:00

util/bufioutil

next ollama runner (#7913)

2025-02-13 16:31:21 -08:00

config.go

Add experimental MLX backend and engine with imagegen support (#13648)

2026-01-08 16:18:59 -08:00