ollama-ollama/llm/server.go
Daniel Hiltgen dde09129d1 gemma4: Disable FA on older GPUs where it doesn't work (#15403)
GPUs with a CUDA compute capability older than 7.5 lack the support needed to enable flash attention for this model.
2026-04-07 14:54:25 -07:00
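
A minimal sketch of the gating logic the commit message describes, written in Go to match the file shown. The `CudaComputeCapability` type, the `supportsFlashAttention` helper, and the major/minor comparison are assumptions for illustration only, not ollama's actual API in llm/server.go; only the 7.5 threshold comes from the commit itself.

```go
// Hypothetical sketch: disable flash attention (FA) on GPUs whose CUDA
// compute capability is older than 7.5, per the commit description.
// Names and types here are illustrative, not ollama's real implementation.
package main

import "fmt"

// CudaComputeCapability is an assumed representation of a GPU's CUDA
// compute capability (e.g. 7.5 -> Major=7, Minor=5).
type CudaComputeCapability struct {
	Major int
	Minor int
}

// supportsFlashAttention reports whether flash attention can be enabled:
// per the commit, anything older than compute capability 7.5 cannot.
func supportsFlashAttention(cc CudaComputeCapability) bool {
	if cc.Major != 7 {
		return cc.Major > 7
	}
	return cc.Minor >= 5
}

func main() {
	// Example capabilities: Pascal (6.1), Volta (7.0), Turing (7.5), Ampere (8.6).
	for _, cc := range []CudaComputeCapability{{6, 1}, {7, 0}, {7, 5}, {8, 6}} {
		fmt.Printf("compute capability %d.%d: flash attention enabled=%v\n",
			cc.Major, cc.Minor, supportsFlashAttention(cc))
	}
}
```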
