Files
ollama-ollama/ml
Daniel Hiltgen e823bff873 gemma4: enable flash attention (#15378)
Backport GGML kernels so we can enable flash attention for the gemma 4 model on Metal and CUDA.
2026-04-07 08:12:36 -07:00