Align Qwen parser behavior with Transformers serve by allowing <tool_call> parsing while the parser is still collecting thinking content.
Changes:
- qwen3vl: detect <tool_call> before </think> in thinking state and transition to tool parsing
- qwen3: same thinking-state tool detection and partial-tag overlap handling
- tests: update qwen3vl thinking/tool interleaving expectations
- tests: add qwen3 cases for tool call before </think> and split <tool_call> streaming
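The thinking-state tool detection and partial-tag overlap handling above can be sketched roughly as follows. This is a simplified model, not the actual parser: the state names (`collectingThinking`, etc.) and the `step`/`overlap` helpers are invented for illustration, and the real implementation tracks more state.

```go
package main

import "strings"

// Parser states (hypothetical names; the real parser's states differ).
type state int

const (
	collectingThinking state = iota
	collectingToolCall
	collectingContent
)

// overlap returns the length of the longest suffix of s that is a
// prefix of tag, so a split "<tool_" at the end of a streamed chunk
// is withheld until the next chunk resolves it.
func overlap(s, tag string) int {
	n := len(tag) - 1
	if n > len(s) {
		n = len(s)
	}
	for ; n > 0; n-- {
		if strings.HasPrefix(tag, s[len(s)-n:]) {
			return n
		}
	}
	return 0
}

// step consumes buffered model output while in the thinking state and
// returns emitted thinking text, the remainder to keep buffered, and
// the next state. A "<tool_call>" seen before "</think>" transitions
// straight to tool parsing, matching Transformers serve.
func step(buf string) (thinking, rest string, next state) {
	if i := strings.Index(buf, "<tool_call>"); i >= 0 {
		if j := strings.Index(buf, "</think>"); j < 0 || i < j {
			return strings.TrimRight(buf[:i], " \n"), buf[i:], collectingToolCall
		}
	}
	if j := strings.Index(buf, "</think>"); j >= 0 {
		return strings.TrimRight(buf[:j], " \n"), buf[j+len("</think>"):], collectingContent
	}
	// Hold back any trailing partial "<tool_call>" or "</think>" tag.
	n := overlap(buf, "<tool_call>")
	if m := overlap(buf, "</think>"); m > n {
		n = m
	}
	return buf[:len(buf)-n], buf[len(buf)-n:], collectingThinking
}
```

The key behavior is the first branch: the tool-call tag wins whenever it appears before the closing think tag, instead of being treated as ordinary thinking text.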
This change:
* fixes rope scaling in the mistral converter
* updates ministral to include llama4 scaling
* includes a new ministral parser for parsing reasoning and tool calling
---------
Co-authored-by: jmorganca <jmorganca@gmail.com>
* Change the initial status to take prefill into consideration
* Add separate string builders for content and thinking
* thinking tests
* Trim whitespace from the string before the closing </think> tag
* Tool calls and tools working (except tool calls come out in the wrong order)
* Tests pass, except for image tags (tests do not go through the server) and tools (not in the correct order, but the contents are the same)
* Add tests for the qwen3vl parser; the tool parser is working
* Change the JSON tool parser to wrap the ToolCallFunction in a ToolCall object
* Working parser for thinking models: assumes an initial thinking state, emits unambiguous content while thinking, and does not parse tool calls while thinking
* Change the parser to start by collecting content
* thinking prefill
* add hasThinkingSupport parameter to parser
* qwen3-vl -> qwen3-vl-instruct for renderer/parser
* Add hasThinkingSupport=false to QwenVLParser
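The JSON tool parser change mentioned above (wrapping the ToolCallFunction in a ToolCall object) can be sketched like this. The struct shapes here are simplified stand-ins for Ollama's api types, and `parseToolCall` is a hypothetical helper name:

```go
package main

import "encoding/json"

// Simplified shapes mirroring Ollama's api types (an assumption;
// the real types carry more fields).
type ToolCallFunction struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

type ToolCall struct {
	Function ToolCallFunction `json:"function"`
}

// parseToolCall decodes the JSON payload between <tool_call> tags,
// which the model emits as a bare function object, and wraps it in a
// ToolCall so downstream code sees one consistent shape.
func parseToolCall(payload []byte) (ToolCall, error) {
	var fn ToolCallFunction
	if err := json.Unmarshal(payload, &fn); err != nil {
		return ToolCall{}, err
	}
	return ToolCall{Function: fn}, nil
}
```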
---------
Co-authored-by: Devon Rifkin <drifkin@drifkin.net>