Mirror of https://github.com/ollama/ollama.git, synced 2026-04-23 09:15:44 +02:00
Implement dual-limit tool output truncation to prevent context overflow:

- 4k tokens (~16k chars) for local models on local servers
- 10k tokens (~40k chars) for cloud models or remote servers

This preserves the smaller context windows of local models while still allowing larger tool outputs for cloud services.