ollama

mirror of https://github.com/ollama/ollama.git synced 2026-04-17 21:54:08 +02:00

Files

Jesse Gross e1e3cec8d0 models: fuse MLP activation functions via mlx_compile

Converts SiLU/GELUApprox to compiled kernels and adds SwiGLU,
matching upstream mlx/mlx_lm's activations pattern. Routes llama,
qwen3, qwen3_5 (dense + MoE), and glm4_moe_lite MLP paths through
mlx.SwiGLU so each MLP invocation runs as one fused Metal/CUDA
kernel rather than a chain of per-op launches.

2026-04-14 16:38:32 -07:00

agent

x/cmd: enable web search and web fetch with flag (#13690)

2026-01-12 13:59:40 -08:00

cmd

Reapply "don't require pulling stubs for cloud models" again (#14608)

2026-03-06 14:27:47 -08:00

create

Gemma4 on MLX (#15244)

2026-04-13 16:36:51 -07:00

imagegen

pull/push: refine safetensors (#14946)