Commit Graph

3 Commits

Author SHA1 Message Date
Jeffrey Morgan
da70c3222e model: support for qwen3.5 architecture (#14378) 2026-02-24 20:08:05 -08:00
Jeffrey Morgan
255579aaa7 qwen3next: fix issue in delta net (#14075)
gDiffExp was being broadcast across the wrong axis when multiplied with k. This fix reshapes gDiffExp to [1, chunkSize, nChunks, ...].
2026-02-04 13:40:38 -08:00
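
The kind of bug described in the message above can be illustrated with a minimal sketch. It is not the actual qwen3next code. It assumes, purely for illustration, that k holds one row per position in a chunk and that the gate holds one factor per position. The function names, shapes, and values are hypothetical. When the two dimensions happen to be equal, scaling along the wrong axis raises no error and silently produces different numbers:

```python
# Hypothetical illustration of broadcasting a per-position factor along
# the wrong axis. Shapes and names are assumptions, not taken from the commit.

def scale_per_position(k, g):
    """Multiply every element of row t by g[t] (one factor per position)."""
    return [[g[t] * x for x in row] for t, row in enumerate(k)]

def scale_per_feature(k, g):
    """Multiply every element of column d by g[d] (the wrong axis here)."""
    return [[g[d] * x for d, x in enumerate(row)] for row in k]

# 2 positions x 2 features: equal sizes, so both variants run without error.
k = [[1, 2],
     [3, 4]]
g = [10, 100]

intended = scale_per_position(k, g)
mistaken = scale_per_feature(k, g)

print(intended)
print(mistaken)
```

Inserting an explicit size-1 dimension, as the reshape in the commit message does, is the usual way to state which axis a factor should broadcast along.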
Jeffrey Morgan
77eb2ca619 model: add qwen3-next architecture (#14051) 2026-02-03 23:27:21 -08:00