Add MLX runner with GLM4-MoE-Lite model support (#14185)

This change adds a new MLX-based runner, which includes:

  * Method-based MLX bindings
  * Subprocess-based MLX runner (x/mlxrunner)
  * KV cache with tree management
  * A basic sampler

The GLM4-MoE-Lite model has been ported to use the new bindings.

---------

Co-authored-by: Michael Yang <git@mxy.ng>
Patrick Devine
2026-02-10 14:57:57 -08:00
committed by GitHub
parent db493d6e5e
commit 44bdd9a2ef
42 changed files with 14900 additions and 9 deletions

x/mlxrunner/mlx/.gitignore (new file, 3 additions)

@@ -0,0 +1,3 @@
_deps
build
dist