Replace custom normalizePath function with stdlib path.Clean.
Use path.IsAbs and path.Dir for cleaner, more robust code.
Add sibling escape detection to prevent traversal attacks like
"tools/a/b/../../../etc" which normalizes to "etc" (a sibling).
When server returns a 500 error (often due to tool parsing failures),
instead of failing, send the error message and the model's response
back to the model so it can learn and retry.
- Includes both error message and model's failed response
- Limits to 3 consecutive retries to prevent infinite loops
- Resets retry counter on successful responses
Add a -y/--yolo flag that skips all interactive tool approval prompts.
Dangerous command patterns (rm -rf, sudo, etc.) are still blocked.
Usage: ollama run model --experimental -y
When client.Chat() returns a 401 AuthorizationError, prompt the user
to sign in instead of just showing "Error: 401 Unauthorized".
This handles the case where users need to authenticate to use cloud
models, not just web search.
Reject any path containing ".." from creating allowlist prefixes.
This prevents attacks where approving "cat tools/file.go" would allow
"cat tools/../../etc/passwd" via the hierarchical prefix matching.
Commands with ".." now require individual approval each time.
Also reject absolute paths from prefix creation.
Added tests for path traversal scenarios.
Implement dual-limit tool output truncation to prevent context overflow:
- 4k tokens (~16k chars) for local models on local servers
- 10k tokens (~40k chars) for cloud models or remote servers
This helps preserve context window for local models with smaller
context windows while allowing larger outputs for cloud services.
Normalize backslashes to forward slashes in extractBashPrefix to ensure
consistent cross-platform behavior. Use string-based path splitting
instead of filepath.Dir to avoid platform-specific behavior.
Add cross-platform test for Windows-style backslash paths.
- Add tests for OLLAMA_AGENT_DISABLE_WEBSEARCH/BASH env vars
- Add tests for ErrWebSearchAuthRequired error type
- Add tests for isLocalModel, isLocalServer, truncateToolOutputForLocalModel
- Add Ctrl+O toggle to expand/collapse tool output inline
- Show tools available in grey text at startup
- Add interactive signin flow when web search returns 401:
prompts user, shows signin URL, polls until auth completes
- Truncate tool output for local models to prevent context overflow
- Update help text with Ctrl+O keyboard shortcut
- Add hierarchical prefix matching for bash commands: if "cat:tools/"
is approved, subdirectories like "cat:tools/subdir/" are also allowed
- Show "Uses internet via ollama.com" notice in web_search approval popup
- Add PromptYesNo function for interactive yes/no prompts
- Add tests for hierarchical prefix matching
* preserve tool definition and call JSON ordering
This is another iteration of
<https://github.com/ollama/ollama/pull/12518>, but this time we've
simplified things by relaxing the competing requirements of being
compatible AND order-preserving with templates (vs. renderers). We
maintain backwards compatibility at the cost of not guaranteeing order
for templates. We plan on moving more and more models to renderers,
which have been updated to use these new data types, and additionally
we could add an opt-in way of templates getting an order-preserved list
(e.g., via sibling template vars)
* orderedmap_test: remove testify
The normalize function now checks for NaN and Inf values in the
embedding vector before processing. This prevents JSON encoding
failures when models produce invalid floating-point values.
Fixes#13572
Signed-off-by: majiayu000 <1835304752@qq.com>
The tool calling example used "get_temperature" for tool_calls but
defined the tool as "get_weather". Also removed trailing commas that
made the JSON invalid.
Fixes#13031
On the llama engine, when we compute the memory layout, we reserve
a buffer to allow for some flexibility for incorrect estimates.
This is subtracted from GPU free memory and on GPUs with limited
memory, it may underflow.
Fixes#13494
* Revert "add support for NVIDIA Nemotron 3 Nano"
This reverts commit e7d2ae9d69.
* GGML update to 380b4c984
Remove MaskBatchPadding as GGML_KQ_MASK_PAD is no longer present (no
padding required)
* update to c45f89d55
* ec98e2002
solar pro needed more adjusting - needs verification
* review comments
Refactored the ConfigV2 and RootFS types from server/images.go to a new types/model/config.go file under the model package. Updated all references to use model.ConfigV2 and model.RootFS. This allows for use in other projects without worrying about compiling the c code in the llama package.
The ggml/src/CMakeLists.txt uses GGML_VERSION_MAJOR for the shared
library SOVERSION property, but these variables were not defined when
building from ollama's CMakeLists.txt.
This caused libggml-base.so to be named with a literal "SOVERSION"
suffix (libggml-base.so.SOVERSION) instead of the actual version
number (libggml-base.so.0).
The fix adds the required GGML_VERSION_* variables before including
the ggml subdirectory.
Fixes#13436
* flash attn: add auto mode for llama engine
If the user does not specify fa in the environment, use auto-mode.
* review comments
* ensure kv cache quantized types have FA explicitly enabled
additional review comments
This changes the default behavior to use the Ollama engine for supported
models, while retaining the ability to disable the Ollama engine and
fall back to the Llama engine. Models in the OllamaEngineRequired list
will always run on the Ollama engine.