Jeffrey Morgan
|
da70c3222e
|
model: support for qwen3.5 architecture (#14378)
|
2026-02-24 20:08:05 -08:00 |
|
Michael Yang
|
f1373193dc
|
move tokenizers to separate package (#13825)
|
2026-02-05 17:44:11 -08:00 |
|
Michael Yang
|
603ceefaa6
|
refactor rope
change to a flatter directory structure and group the options with the function
update models to call rope in one place
|
2025-12-08 14:42:22 -08:00 |
|
Michael Yang
|
333203d871
|
chore: update models to use slice/chunk/chunksections (#12934)
* use slice/chunks
* bert
* llama4
* gemma3n
* gptoss
* mistral3
* qwen3vl
* qwen25vl
* deepseek2
* remove unused ops
|
2025-11-13 15:20:12 -08:00 |
|
Daniel Hiltgen
|
544b6739dd
|
ggml update to b6840 (#12791)
|
2025-11-06 10:19:22 -08:00 |
|
Michael Yang
|
f67a6df110
|
interleaved mrope (#12807)
* ml(ggml): mrope
* interleave mrope
|
2025-10-30 11:29:00 -07:00 |
|
Michael Yang
|
d432ade714
|
fix: qwen2.5vl, qwen3vl composite image (#12841)
this change fixes images with an alpha channel by overlaying the image
onto a white background
|
2025-10-30 10:33:19 -07:00 |
|
Michael Yang
|
7d25b9e194
|
feat(model): add qwen3vl (#12665)
|
2025-10-28 17:39:47 -07:00 |
|