Commit Graph

4931 Commits

Author SHA1 Message Date
ParthSareen
44179b7e53 x/agent: use stdlib path package for path normalization
Replace custom normalizePath function with stdlib path.Clean.
Use path.IsAbs and path.Dir for cleaner, more robust code.
Add sibling escape detection to prevent traversal attacks like
"tools/a/b/../../../etc" which normalizes to "etc" (a sibling).
2026-01-06 18:09:10 -08:00
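
The sibling escape described in this commit can be reproduced with the stdlib alone. The sketch below is illustrative, not the code from x/agent: `staysUnder` and its `base` parameter are hypothetical names, and only `path.Clean` and `path.IsAbs` come from the commit message.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// staysUnder reports whether p, once cleaned, is still inside base.
// Hypothetical helper: the real check in x/agent may differ.
func staysUnder(base, p string) bool {
	if path.IsAbs(p) {
		return false // absolute paths are never treated as children of base
	}
	base = path.Clean(base)
	cleaned := path.Clean(p)
	return cleaned == base || strings.HasPrefix(cleaned, base+"/")
}

func main() {
	// The example from the commit message collapses to a sibling of "tools".
	fmt.Println(path.Clean("tools/a/b/../../../etc"))          // etc
	fmt.Println(staysUnder("tools", "tools/a/b/../../../etc")) // false
	fmt.Println(staysUnder("tools", "tools/a/../b"))           // true
}
```

Comparing against `base + "/"` rather than `base` alone keeps a directory such as "toolsets" from matching the "tools" prefix.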
ParthSareen
359be5b658 x/cmd: handle 500 errors by informing model and retrying
When server returns a 500 error (often due to tool parsing failures),
instead of failing, send the error message and the model's response
back to the model so it can learn and retry.

- Includes both error message and model's failed response
- Limits to 3 consecutive retries to prevent infinite loops
- Resets retry counter on successful responses
2026-01-06 16:55:08 -08:00
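
A minimal sketch of the retry bookkeeping listed above, assuming a counter that is checked on every response. `retryTracker` and `shouldRetry` are hypothetical names; only the limit of 3 and the reset-on-success rule come from the commit message.

```go
package main

import "fmt"

// maxConsecutiveRetries mirrors the limit of 3 named in the commit message.
const maxConsecutiveRetries = 3

// retryTracker counts consecutive 500 responses (hypothetical helper).
type retryTracker struct {
	consecutive int
}

// shouldRetry records one response status and reports whether to retry.
func (r *retryTracker) shouldRetry(status int) bool {
	if status != 500 {
		// Simplification: any non-500 response resets the counter.
		r.consecutive = 0
		return false
	}
	if r.consecutive >= maxConsecutiveRetries {
		return false // give up to avoid an infinite loop
	}
	r.consecutive++
	return true
}

func main() {
	var r retryTracker
	for _, status := range []int{500, 500, 500, 500, 200, 500} {
		fmt.Println(status, r.shouldRetry(status))
	}
}
```

The fourth consecutive 500 is not retried, and the 500 that follows the 200 is retried again because the counter was reset.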
ParthSareen
820e51e144 x/cmd: add --yolo/-y flag to skip tool approval prompts
Add a -y/--yolo flag that skips all interactive tool approval prompts.
Dangerous command patterns (rm -rf, sudo, etc.) are still blocked.

Usage: ollama run model --experimental -y
2026-01-06 16:47:26 -08:00
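
The interaction between the flag and the block list can be sketched as a single decision function. This is an assumption-laden illustration: `decide` and `dangerousPatterns` are hypothetical, the pattern list contains only the two examples the commit message names, and the substring match is deliberately naive.

```go
package main

import (
	"fmt"
	"strings"
)

// dangerousPatterns is illustrative; only "rm -rf" and "sudo" appear in the
// commit message, and the real list is not shown there.
var dangerousPatterns = []string{"rm -rf", "sudo"}

// decide returns "blocked", "run", or "ask" for a command.
func decide(command string, yolo bool) string {
	for _, p := range dangerousPatterns {
		// Naive substring match: a real check would need proper tokenizing.
		if strings.Contains(command, p) {
			return "blocked" // blocked even when yolo is set
		}
	}
	if yolo {
		return "run" // skip the interactive approval prompt
	}
	return "ask"
}

func main() {
	fmt.Println(decide("ls -la", false))       // ask
	fmt.Println(decide("ls -la", true))        // run
	fmt.Println(decide("sudo rm -rf /", true)) // blocked
}
```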
ParthSareen
8470c25fa9 x/cmd: handle 401 from Chat API with sign-in prompt
When client.Chat() returns a 401 AuthorizationError, prompt the user
to sign in instead of just showing "Error: 401 Unauthorized".

This handles the case where users need to authenticate to use cloud
models, not just web search.
2026-01-06 15:43:11 -08:00
ParthSareen
c8b599bd44 x/agent: fix path traversal vulnerability in hierarchical prefix matching
Reject any path containing ".." from creating allowlist prefixes.
This prevents attacks where approving "cat tools/file.go" would allow
"cat tools/../../etc/passwd" via the hierarchical prefix matching.

Commands with ".." now require individual approval each time.
Also reject absolute paths from prefix creation.

Added tests for path traversal scenarios.
2026-01-06 15:41:57 -08:00
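
The two rejection rules in this commit (no ".." segments, no absolute paths) might look like the following. `canCreatePrefix` is a hypothetical name, and the sketch assumes paths already use forward slashes.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// canCreatePrefix reports whether p may be used to derive an allowlist
// prefix. Hypothetical helper, not the function from x/agent.
func canCreatePrefix(p string) bool {
	if path.IsAbs(p) {
		return false // absolute paths never create prefixes
	}
	for _, seg := range strings.Split(p, "/") {
		if seg == ".." {
			return false // any parent reference requires individual approval
		}
	}
	return true
}

func main() {
	fmt.Println(canCreatePrefix("tools/file.go"))          // true
	fmt.Println(canCreatePrefix("tools/../../etc/passwd")) // false
	fmt.Println(canCreatePrefix("/etc/passwd"))            // false
}
```

Checking whole segments rather than the substring ".." avoids rejecting ordinary names such as "notes..txt".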
ParthSareen
59928c536b x/cmd: add context-aware tool output truncation for LLM
Implement dual-limit tool output truncation to prevent context overflow:
- 4k tokens (~16k chars) for local models on local servers
- 10k tokens (~40k chars) for cloud models or remote servers

This helps preserve context window for local models with smaller
context windows while allowing larger outputs for cloud services.
2026-01-06 15:36:03 -08:00
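
A rough sketch of the dual-limit truncation, assuming the 4 characters per token ratio implied by the figures above and reading "4k" and "10k" as 4000 and 10000. `truncateToolOutput`, the constants, and the truncation marker are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	charsPerToken   = 4     // ratio implied by "4k tokens (~16k chars)"
	localTokenLimit = 4000  // assumed reading of "4k tokens"
	cloudTokenLimit = 10000 // assumed reading of "10k tokens"
)

// truncateToolOutput cuts output to the character budget for the given
// model location and appends a marker when anything was dropped.
func truncateToolOutput(output string, local bool) string {
	limit := cloudTokenLimit * charsPerToken
	if local {
		limit = localTokenLimit * charsPerToken
	}
	runes := []rune(output) // count characters, not bytes
	if len(runes) <= limit {
		return output
	}
	return string(runes[:limit]) + "\n... (truncated)"
}

func main() {
	long := strings.Repeat("x", 50000)
	fmt.Println(len(truncateToolOutput(long, true)) < len(long))  // true
	fmt.Println(len(truncateToolOutput(long, false)) < len(long)) // true
	fmt.Println(truncateToolOutput("short", true))                // short
}
```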
ParthSareen
0b4850812f x/agent: fix hierarchical prefix matching for Windows paths
Normalize backslashes to forward slashes in extractBashPrefix to ensure
consistent cross-platform behavior. Use string-based path splitting
instead of filepath.Dir to avoid platform-specific behavior.

Add cross-platform test for Windows-style backslash paths.
2026-01-06 15:16:28 -08:00
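
The string-based approach described here can be sketched without `filepath`. `dirPrefix` is a hypothetical stand-in for the directory step inside extractBashPrefix, whose actual code is not shown in the log.

```go
package main

import (
	"fmt"
	"strings"
)

// dirPrefix returns the directory part of p with forward slashes, including
// the trailing slash, or "" if p has no directory part.
func dirPrefix(p string) string {
	p = strings.ReplaceAll(p, `\`, "/") // normalize Windows separators
	i := strings.LastIndex(p, "/")
	if i < 0 {
		return ""
	}
	return p[:i+1]
}

func main() {
	fmt.Println(dirPrefix(`tools\sub\file.go`)) // tools/sub/
	fmt.Println(dirPrefix("tools/file.go"))     // tools/
	fmt.Println(dirPrefix("file.go") == "")     // true
}
```

Because the function never calls into `filepath`, it returns the same result on every operating system.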
ParthSareen
9383082070 x: add tests for tool disabling, auth error, and helper functions
- Add tests for OLLAMA_AGENT_DISABLE_WEBSEARCH/BASH env vars
- Add tests for ErrWebSearchAuthRequired error type
- Add tests for isLocalModel, isLocalServer, truncateToolOutputForLocalModel
2026-01-06 14:51:27 -08:00
ParthSareen
85e48af46a x/cmd: add tool output toggle and interactive signin flow
- Add Ctrl+O toggle to expand/collapse tool output inline
- Show tools available in grey text at startup
- Add interactive signin flow when web search returns 401:
  prompts user, shows signin URL, polls until auth completes
- Truncate tool output for local models to prevent context overflow
- Update help text with Ctrl+O keyboard shortcut
2026-01-06 14:48:03 -08:00
ParthSareen
aa9a1477b3 x/agent: improve approval UX with hierarchical matching and signin prompt
- Add hierarchical prefix matching for bash commands: if "cat:tools/"
  is approved, subdirectories like "cat:tools/subdir/" are also allowed
- Show "Uses internet via ollama.com" notice in web_search approval popup
- Add PromptYesNo function for interactive yes/no prompts
- Add tests for hierarchical prefix matching
2026-01-06 14:47:22 -08:00
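
Hierarchical matching as described in the first bullet reduces to a prefix test over approved "command:directory/" keys. A minimal sketch, with `isApproved` as a hypothetical name and the key format taken from the "cat:tools/" example above:

```go
package main

import (
	"fmt"
	"strings"
)

// isApproved reports whether candidate falls under any approved prefix.
// Prefixes are expected to end in "/" so "cat:tools/" does not match
// "cat:toolsets/".
func isApproved(approved []string, candidate string) bool {
	for _, prefix := range approved {
		if strings.HasPrefix(candidate, prefix) {
			return true
		}
	}
	return false
}

func main() {
	approved := []string{"cat:tools/"}
	fmt.Println(isApproved(approved, "cat:tools/subdir/")) // true
	fmt.Println(isApproved(approved, "cat:toolsets/"))     // false
	fmt.Println(isApproved(approved, "ls:tools/"))         // false
}
```

A plain prefix test like this does not guard against ".." segments; commit c8b599bd44 in this log addresses that case.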
ParthSareen
aed714a676 x/tools: use Ollama key signing for web search authentication
Replace OLLAMA_API_KEY environment variable with Ollama's native key
signing mechanism (~/.ollama/id_ed25519). Add ErrWebSearchAuthRequired
error type for handling 401 responses.
2026-01-06 14:45:08 -08:00
ParthSareen
064c6a984e x/tools: add environment variables to disable tools
Add OLLAMA_AGENT_DISABLE_WEBSEARCH and OLLAMA_AGENT_DISABLE_BASH
environment variables to selectively disable tools in the agent loop.
2026-01-06 14:44:18 -08:00
ParthSareen
3aaa8d5564 readline: add Ctrl+O support for expanding tool output
Add CharCtrlO constant and ErrExpandOutput error to enable Ctrl+O
as a keyboard shortcut for expanding truncated tool output in the
agent loop.
2026-01-06 14:44:04 -08:00
Parth Sareen
76912c062a x: add experimental agent loop (#13628) 2026-01-05 23:38:40 -08:00
Devon Rifkin
6c3faafed2 olmo3: fix flaky test (#13629)
I introduced this in <https://github.com/ollama/ollama/pull/13525>
2026-01-05 22:37:20 -08:00
Devon Rifkin
e51dead636 preserve tool definition and call JSON ordering (#13525)
* preserve tool definition and call JSON ordering

This is another iteration of
<https://github.com/ollama/ollama/pull/12518>, but this time we've
simplified things by relaxing the competing requirements of being
compatible AND order-preserving with templates (vs. renderers). We
maintain backwards compatibility at the cost of not guaranteeing order
for templates. We plan on moving more and more models to renderers,
which have been updated to use these new data types, and additionally
we could add an opt-in way of templates getting an order-preserved list
(e.g., via sibling template vars)

* orderedmap_test: remove testify
2026-01-05 18:03:36 -08:00
Harry V. Kiselev
d087e46bd1 docs/capabilities/vision: fix curl related code snippet (#13615) 2026-01-03 17:27:46 -05:00
lif
37f6f3af24 server: return error when embedding contains NaN or Inf values (#13599)
The normalize function now checks for NaN and Inf values in the
embedding vector before processing. This prevents JSON encoding
failures when models produce invalid floating-point values.

Fixes #13572

Signed-off-by: majiayu000 <1835304752@qq.com>
2026-01-03 02:20:12 -05:00
Nhan Nguyen
e1bdc23dd2 docs: fix tool name mismatch and trailing commas in api.md example (#13559)
The tool calling example used "get_temperature" for tool_calls but
defined the tool as "get_weather". Also removed trailing commas that
made the JSON invalid.

Fixes #13031
2026-01-03 02:14:53 -05:00
lif
2e78653ff9 app/ui: add swift syntax highlighting support (#13574)
Fixes #13476

Signed-off-by: majiayu000 <1835304752@qq.com>
2026-01-03 02:12:08 -05:00
lif
f5f74e12c1 docs: add version note for /v1/responses API (#13596)
Signed-off-by: majiayu000 <1835304752@qq.com>
2026-01-03 01:58:20 -05:00
Vallabh Mahajan
18fdcc94e5 docs: fix broken .md links and render issues (#13550) 2025-12-23 12:44:55 -05:00
Daniel Hiltgen
7ad036992f amd: use GTT on iGPUs on linux (#13196)
On Linux, look at the GTT memory information for iGPUs.
2025-12-23 09:30:05 -08:00
Jesse Gross
172b5924af llm: Avoid integer underflow on llama engine memory layout
On the llama engine, when we compute the memory layout, we reserve
a buffer to allow for some flexibility for incorrect estimates.
This is subtracted from GPU free memory and on GPUs with limited
memory, it may underflow.

Fixes #13494
2025-12-19 15:48:15 -08:00
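
The failure mode is ordinary unsigned wraparound: subtracting a larger reserve from a smaller free-memory value yields a huge number rather than a negative one. A sketch of a guarded subtraction, with `usableMemory` as a hypothetical name and the sizes chosen only for illustration:

```go
package main

import "fmt"

// usableMemory subtracts a reserve from free memory without wrapping around.
func usableMemory(free, reserve uint64) uint64 {
	if reserve > free {
		return 0 // clamp instead of underflowing
	}
	return free - reserve
}

func main() {
	var free, reserve uint64 = 256, 512
	fmt.Println(free - reserve)              // wraps to a value near 2^64
	fmt.Println(usableMemory(free, reserve)) // 0
	fmt.Println(usableMemory(1024, 512))     // 512
}
```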
Jeffrey Morgan
8852220f59 add REQUIRES command to Modelfile (#13361) 2025-12-18 13:21:29 -08:00
Parth Sareen
7325791599 parsers/renderers: functiongemma (#13521) v0.13.5-rc1 v0.13.5 2025-12-18 07:55:37 -08:00
Grace
522c11a763 Revert "Omit args and params in tool function def and calls (#13516)" (#13518)
This reverts commit 0fadeffaee.
2025-12-17 19:06:56 -08:00
Grace
0fadeffaee Omit args and params in tool function def and calls (#13516) 2025-12-17 18:42:21 -08:00
Daniel Hiltgen
49a9c9ba6a GGML update to ec98e2002 (#13451)
* Revert "add support for NVIDIA Nemotron 3 Nano"

This reverts commit e7d2ae9d69.

* GGML update to 380b4c984

Remove MaskBatchPadding as GGML_KQ_MASK_PAD is no longer present (no
padding required)

* update to c45f89d55

* ec98e2002

solar pro needed more adjusting - needs verification

* review comments
v0.13.5-rc0
2025-12-17 13:13:55 -08:00
Parth Sareen
1c094038bc types: add nested property support for tool definitions (#13508) 2025-12-17 11:54:09 -08:00
Grace
a013693f80 DeepseekV3 Family Parser (#13484) 2025-12-16 18:56:30 -08:00
Michael Yang
f6a016f49d revert granite-embedding (#13505) 2025-12-16 15:44:52 -08:00
Bruce MacDonald
45c4739374 types: ConfigV2 and RootFS (#13504)
Refactored the ConfigV2 and RootFS types from server/images.go to a new types/model/config.go file under the model package. Updated all references to use model.ConfigV2 and model.RootFS. This allows other projects to use these types without having to compile the C code in the llama package.
2025-12-16 15:18:17 -08:00
Michael Yang
2dd029de12 remove unnecessary code (#13502)
slog is already lazily evaluated so this code is completely redundant
2025-12-16 15:11:26 -08:00
Michael Yang
903b1fc97f use ollama engine for bert models (#13501)
register bpe tokenizer which enables granite-embedding
2025-12-16 11:29:19 -08:00
Parth Sareen
89eb795293 parsers/renderers: use think from user for nemotron (#13492) v0.13.4-rc2 v0.13.4 2025-12-15 18:55:17 -08:00
Parth Sareen
7e3ea813c1 llama/parsers/renderers: nemotron 3 nano (#13489)
---------

Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
2025-12-15 18:00:08 -08:00
Grace
7b95087b9d Adding tool definitions to DeepseekV3 renderer (#13491) 2025-12-15 17:57:06 -08:00
Michael Yang
971d62595a fix: qwen2.5 vl rope (#13486)
* qwen25vl: bump max pixels

* qwen25vl: mrope

fix qwen2.5vl window

* qwen25vl: vision rope
2025-12-15 17:30:33 -08:00
Parth Sareen
ffbe8e076d model: add olmo3 and olmo3.1 (#13415) 2025-12-15 15:20:04 -08:00
Grace
2c639431b1 DeepseekV3 family renderer (#13180) 2025-12-15 14:50:52 -08:00
Nhan Nguyen
aacd1cb394 fix: define GGML_VERSION variables for proper SOVERSION expansion (#13469)
The ggml/src/CMakeLists.txt uses GGML_VERSION_MAJOR for the shared
library SOVERSION property, but these variables were not defined when
building from ollama's CMakeLists.txt.

This caused libggml-base.so to be named with a literal "SOVERSION"
suffix (libggml-base.so.SOVERSION) instead of the actual version
number (libggml-base.so.0).

The fix adds the required GGML_VERSION_* variables before including
the ggml subdirectory.

Fixes #13436
2025-12-15 14:42:15 -08:00
Parth Sareen
e3731fb160 renderers: add olmo3.1 and olmo3 fixes (#13447) 2025-12-15 11:26:43 -08:00
Eva H
8dbc9e7b68 app/ui: handle unspecified bind addresses and wait for server in ollama proxy (#13159) 2025-12-15 13:33:09 -05:00
Daniel Hiltgen
abe67acf8a Revert "Enable Ollama engine by default" (#13481)
This reverts commit 56f754f46b.
2025-12-15 09:55:45 -08:00
Jeffrey Morgan
4ff8a691bc model: default gemma 3 rope scale to 1.0, apply corrections based on layer counts (#13453) v0.13.4-rc1 2025-12-12 17:51:56 -08:00
Jeffrey Morgan
1b308e1d2a model: fix global layer rope scale values for gemma 3 (#13452) v0.13.4-rc0 2025-12-12 16:29:01 -08:00
Daniel Hiltgen
bd6c1d6b49 flash attn: add auto mode for llama engine (#13052)
* flash attn: add auto mode for llama engine

If the user does not specify fa in the environment, use auto-mode.

* review comments

* ensure kv cache quantized types have FA explicitly enabled

additional review comments
2025-12-12 13:27:19 -08:00
Jeffrey Morgan
3af5d3b738 model: force rope factor 1.0 for Gemma 3 (#13445) 2025-12-12 13:27:08 -08:00
Daniel Hiltgen
7730895158 Enable Ollama engine by default (#13443)
This changes the default behavior to use the Ollama engine for supported
models, while retaining the ability to disable the Ollama engine and
fall back to the Llama engine.  Models in the OllamaEngineRequired list
will always run on the Ollama engine.
2025-12-12 11:48:43 -08:00