Compare commits

...

329 Commits

Author SHA1 Message Date
Eva Ho
7a3ed0a1b4 adding test 2026-04-14 15:28:40 -07:00
Eva Ho
03f9e57274 add test 2026-04-14 15:28:40 -07:00
Eva Ho
30d9100fff launch: add thinking capability detection to opencode 2026-04-14 15:28:40 -07:00
Eva H
698e04a14b launch: OpenCode inline config (#15586) 2026-04-14 15:08:42 -07:00
Eva H
1d9537bc33 launch/openclaw: fix --yes flag behaviour to skip channels configuration (#15589) 2026-04-14 13:57:35 -07:00
Eva H
120424d832 Revert "launch/opencode: use inline config (#15462)" (#15568) 2026-04-13 18:40:17 -07:00
Eva H
5818001610 launch: skip unchanged integration rewrite configration (#15491) 2026-04-13 17:18:56 -07:00
Daniel Hiltgen
2cba7756c5 Gemma4 on MLX (#15244)
* gemma4: implement Gemma 4 model for MLX (text-only runtime)

* gemma4: two MoE + SWA prefill perf fixes

Two performance optimizations in the gemma4 forward pass

1. Memoize the sliding-window prefill mask across layers.
2. Softmax only over the selected experts in Router.Forward.

* review comments
2026-04-13 16:36:51 -07:00
Devon Rifkin
bf2a421727 gemma4: restore e2b-style nothink prompt (#15560)
Gemma 4 prompts differ when thinking is disabled for different sized
models: 26b/31b emit an empty thought block, while e2b/e4b do not.

Before #15490, our shared Gemma 4 renderer effectively matched the
e2b behavior. #15490 changed it to always emit the empty thought block,
which regressed e2b/e4b nothink behavior and led to #15536 (and possibly

This change restores the previous shared behavior by removing the empty
trailing thought block. It also renames the checked-in upstream chat
templates so the e2b and 31b fixtures are tracked separately.

A follow-up will split Gemma 4 rendering by model size.

Fixes: #15536
2026-04-13 14:26:15 -07:00
Eva H
f3cf6b75fb launch/opencode: use inline config (#15462) 2026-04-13 13:41:31 -07:00
Devon Rifkin
5dfac387a6 Revert "gemma4: fix nothink case renderer (#15553)" (#15556)
This reverts commit 4d75f5da03.
2026-04-13 13:12:18 -07:00
Daniel Hiltgen
a99e5d9c22 mac: prevent generate on cross-compiles (#15120)
For some versions of Xcode, cmake builds are failing due to header problems in
cross-compiling during the generate phase.  Since generate is producing arch
independent generated output, we can skip this during cross-compiling.
2026-04-13 13:04:58 -07:00
Daniel Hiltgen
0abf3aca36 cgo: suppress deprecated warning to quiet down go build (#15438) 2026-04-13 13:04:11 -07:00
Devon Rifkin
ee0266462a Revert "gemma4: add nothink renderer tests (#15554)" (#15555)
This reverts commit 1b70bb8a10.
2026-04-13 13:00:59 -07:00
Daniel Hiltgen
c88fb286ec mlx: add op wrappers for Conv2d, Pad, activations, trig, and masked SDPA (#14913)
* mlx: add op wrappers for Conv2d, Pad, activations, trig, and masked SDPA

Add Conv2d, flexible Pad (with axes/mode), PadConstant, Maximum,
Minimum, Softplus, ReLU, GLU, Clamp, Sin, Cos, Clip,
ScaledDotProductAttentionMasked, and RoPEWithFreqs. Refactor
RoPEWithBase to delegate to RoPEWithFreqs.

* review comments

* mlx: fix ScaledDotProductAttentionMasked to consult the mask argument
2026-04-13 11:43:24 -07:00
Daniel Hiltgen
d3da29cbfc mlx: mixed-precision quant and capability detection improvements (#15409)
Improve the MLX model creation pipeline with several model-agnostic changes:

- Rewrite supportsVision to use vision_config instead of architecture name
- Add supportsAudio for audio encoder detection
- Add alignment checking (isAligned) for quantization group sizes
- Support per-projection mixed quantization in MoE expert packing
- Record per-tensor quant metadata in safetensors blobs
- Parse per-tensor quant metadata at model load time
- Validate quantize output is non-empty before storing
- Fix pin/unpin cleanup in expert group quantization
- Promote v_proj/k_proj/down_proj to INT8 for INT4 base quant
- Add MetalIsAvailable() utility
- Skip audio encoder tensors from quantization
2026-04-13 11:43:07 -07:00
Devon Rifkin
1b70bb8a10 gemma4: add nothink renderer tests (#15554)
Meant to include in #15553
2026-04-13 11:38:19 -07:00
Daniel Hiltgen
ec29ce4ce3 gemma4: fix compiler error on metal (#15550)
On some systems, the metal runtime compiler is failing due to an
uninitialized variable from #15378.

Fixes #15548
2026-04-13 11:32:00 -07:00
Devon Rifkin
4d75f5da03 gemma4: fix nothink case renderer (#15553)
Regressed in #15490

Fixes: #15536
2026-04-13 11:23:19 -07:00
saman-amd
798fd09bfe Update to ROCm 7.2.1 (#15483)
Co-authored-by: Samiii777 <58442200+Samiii777@users.noreply.github.com>
2026-04-12 12:11:58 -07:00
Devon Rifkin
9330bb9120 gemma4: be less strict about whitespace before bare keys (#15494) 2026-04-11 16:30:27 -07:00
Devon Rifkin
40a1317dfd gemma4: update renderer to match new jinja template (#15490)
* gemma4: update renderer to match new jinja template

Google has updated their jinja template for gemma4, and so this change
gives us parity with the new template. The parsing also slightly changed
upstream, so we make a small change to our parser as well.

I've also corrected a few probably existing edge cases, especially
around type unions. The upstream output format is weird (a stringified
array), but in practice the models seem to understand it well.

* gemma4: special case simple `AnyOf`s

The upstream template doesn't handle `AnyOf`s, but since in the previous
commit we saw type unions work reasonably well, I'm now treating very
simple `AnyOf`s as type unions to help in cases where they might be used

* fix lint

* gemma4: prefer empty instead of `None`

We can't currently distinguish between a result being not-present vs.
empty. The empty case seems more important (e.g., a legitimately empty
tool call)

* gemma4: be more careful for tool results with missing IDs
2026-04-10 15:45:27 -07:00
Devon Rifkin
fdfe9cec98 model/parsers: fix missing parallel tool call indices (#15467)
We were missing setting the function index for several models that can
make parallel tool calls.

In the future we may want to consider putting some sort of post-parse
hook and relieve the parsers of this duty.

Fixes: #15457
2026-04-10 15:23:21 -07:00
Matteo Celani
9517864603 app/ui: re-validate image attachments when selected model changes (#15272) 2026-04-10 14:03:51 -07:00
Bruce MacDonald
8e6d86dbe3 docs: add hermes agent integration guide (#15488)
Update cloud and local model recommendations to match current
models.go: add qwen3.5:cloud and glm-5.1:cloud, replace glm-4.7-flash
with gemma4 and qwen3.5 as local options.

Add documentation for Hermes Agent by Nous Research, covering
installation, Ollama setup via custom endpoint, messaging configuration,
and recommended models.
2026-04-10 13:13:36 -07:00
Parth Sareen
80d3744c5d launch: update openclaw channel message (#15463) 2026-04-09 15:20:30 -07:00
Eva H
2a94f03823 launch: add re-run hint to dependency error message (#15439) 2026-04-09 09:51:34 -07:00
Patrick Devine
eb97274e5c modelfiles: fix /save command and add shortname for safetensors based models (#15413)
This change fixes two issues with Modelfiles:

  1. If a user uses `ollama show --modelfile` to show a safetensors based
     model, the Model would leave the "FROM" field blank which won't allow
     a user to recreate the model. This change adds the model's current
     canonical short name to the FROM field.
  2. If a user uses the `/save` command in the CLI any messages which were
     saved in a previous model wouldn't get saved (only the set of messages
     from the current session).
2026-04-08 21:05:39 -07:00
Daniel Hiltgen
6b5db12aa2 mlx: remove stale x86 libmlx library (#15443)
Fixes #15433
2026-04-08 20:51:47 -07:00
7. Sun
612f0a17d3 fix: improve error message for unknown input item type in responses API (#15424)
The default branch in unmarshalResponsesInputItem had two issues:
- It referenced typeField.Type instead of itemType; these differ when the
  shorthand role-based format promotes an empty type to "message", meaning
  an unhandled type would show the wrong value in the error string.
- It used %s formatting, so an empty type field produced the unhelpful
  message "unknown input item type: " with no indication what was missing.

Fix by using itemType (the resolved value) with %q quoting, and add a
dedicated message when itemType is empty (both type and role absent):
"input item missing required 'type' field".

Tests added for the empty-type and missing-type cases.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 17:41:12 -07:00
Parth Sareen
673726fa0e app: restore launch default and refine launch sidebar open for app (#15437) 2026-04-08 16:59:21 -07:00
Daniel Hiltgen
b5918f9785 pull/push: refine safetensors (#14946)
* pull: refine safetensors pull

 - Body drain in resolve() — drain response body before close so Go's HTTP
   client can reuse TCP connections instead of opening a new one per blob
   (1,075 extra TCP+TLS handshakes eliminated)
 - Skip speed recording for tiny blobs (<100KB) — prevents
   HTTP-overhead-dominated transfer times from poisoning the median, which the
   stall detector uses to cancel "too slow" downloads
 - Resume support for large blobs (>=64MB) — on failure, preserves partial .tmp
   files; on retry, re-hashes existing datak and sends Range header to download
   only remaining bytes; gracefully falls back to full download if server returns
   200 instead of 206; SHA256 verification catches corrupt partials

* harden push

- Prevents killing TCP connections after every request.
- Stronger backoff to handle server back-pressure and rate limiting
- Larger buffered reads for improve safetensor upload performance
- Better error message handling from server
- Handle 201 if server says blob exists
- Fix progress reporting on already uploaded blobs
- Trace logging to help troubleshoot and tune going forward

* review comments

* review comments
2026-04-08 14:15:39 -07:00
Eva H
d17f482d50 launch/opencode: detect curl installed opencode at ~/.opencode/bin (#15197) 2026-04-08 13:54:51 -07:00
Parth Sareen
4e16f562c0 launch: add openclaw channels setup (#15407) 2026-04-08 13:25:27 -07:00
Parth Sareen
55308f1421 launch: update ctx length for glm-5.1 and gemma4 (#15411)
Also adds glm-5.1 in recommended models
2026-04-08 12:11:50 -07:00
Eva H
d64812eb5d cmd: improve multi-select sorting and selection status (#15200) 2026-04-08 10:39:18 -07:00
Devon Rifkin
f86a969f27 responses: add support for fn call output arrays (#15406)
In addition to strings (which we already supported), OpenResponses
supports arrays of text content, image content, or file content (see
<https://www.openresponses.org/reference#object-FunctionCallOutput-title>).
We were missing support for these arrays, which caused unmarshal errors
like

```
json: cannot unmarshal array into Go struct field ResponsesFunctionCallOutput.output of type string
```

This change adds support for text content and image content, as those
are more straightforwardly mappable to Ollama message formats (though
image and text interleaving is lost), but it's less clear what to do for
files. In the future we can partially support this by inlining
reasonably sized text files, but wanted to get this change out first.

Fixes: #15250
2026-04-07 16:47:30 -07:00
Matteo Celani
9fa80a1660 app/ui: fix lint errors for unused vars, prefer-const, and empty catch (#15282) 2026-04-07 16:28:36 -07:00
Daniel Hiltgen
dde09129d1 gemma4: Disable FA on older GPUs where it doesn't work (#15403)
CUDA older than 7.5 lack the support to enable flash attention for the model.
2026-04-07 14:54:25 -07:00
Patrick Devine
780556c4d0 mlx: use default http client (#15405) 2026-04-07 14:53:23 -07:00
Daniel Hiltgen
dfae363b5b gemma4: add missing file (#15394)
File accidentally omitted from #15378
2026-04-07 09:18:01 -07:00
Daniel Hiltgen
30fdd229a4 create: Clean up experimental paths, fix create from existing safetensor model (#14679)
* create:  Clean up experimental paths

This cleans up the experimental features, and adds both unit and integration test coverage to verify no regressions.

* create: preserve config and layer names when creating from safetensors models

When creating a model FROM an existing safetensors model, ModelFormat,
Capabilities, and layer Name fields were lost. ModelFormat stayed empty
because it's only set from GGML layers (which safetensors models lack),
and layer names weren't copied in parseFromModel. This caused derived
models to fail loading ("config.json not found in manifest").

* review comments
2026-04-07 08:12:57 -07:00
Daniel Hiltgen
e823bff873 gemma4: enable flash attention (#15378)
Backport GGML kernels so we can enable flash attention for the gemma 4 model on
Metal and CUDA.
2026-04-07 08:12:36 -07:00
Daniel Hiltgen
8968740836 mlx: Improve M5 performance with NAX (#15345)
* mlx: Improve M5 performance with NAX

This modifies the Mac release to now have 2 builds of MLX for broader
compatibility while supporting the latest M5 hardware features.  NAX requires
building with xcode 26.2 and targetting support only for OS v26 and up.  Since
we want to support older MacOS versions as well, we now need 2 different MLX
builds and runtime detection logic to select the optimal version.  The newer
build will detect NAX missing at runtime, so it is safe to run on pre M5 macs.

* mac: prevent generate on cross-compiles

For some versions of Xcode, cmake builds are failing due to header problems in
cross-compiling during the generate phase.  Since generate is producing arch
independent generated output, we can skip this during cross-compiling.
2026-04-07 08:12:24 -07:00
Devon Rifkin
8c8f8f3450 model/parsers: add gemma4 tool call repair (#15374)
The existing strict gemma4 tool parser is still the primary path, but if
this fails, we try to repair by fixing some of the most commonly seen
mistakes these models seem to make in practice.

We repair by building up a set of candidates, and use the first candidate
that parses.

Repairs cover:

- missing Gemma string delimiters
- single-quoted string values, including a dangling Gemma delimiter
- raw terminal string values (if the corresponding tool schema indicates
  it should be a string)
- missing object close only after a concrete repair

Add regression coverage for malformed tool calls from issue #15315 and
focused unit tests for the individual repair helpers and candidate
pipeline.
2026-04-06 18:47:17 -07:00
Parth Sareen
82f0139587 launch/openclaw: patch approvedScopes baseline for TUI pairing (#15375) 2026-04-06 18:00:12 -07:00
Bruce MacDonald
26a58b294c app: update featured models (#15373)
Featured models in the app are out of date. Update them to a more recent list of models.
2026-04-06 16:35:35 -07:00
Devon Rifkin
34a790a2e6 model/parsers: suppress extra gemma4 closing tool tags (#15370)
We've observed Gemma 4 occasionally emitting extra <tool_call|> tags
after a valid tool call. We suppress leading close tags in this
immediate post-tool-call state so the extra close tags do not leak into
assistant content. The tradeoff is that if the model intentionally
begins its next content span with the literal string "<tool_call|>", we
will erroneously treat it as noise and drop it.
2026-04-06 12:41:33 -07:00
Jeffrey Morgan
4589fa2cf5 app: default app home view to new chat instead of launch (#15312) 2026-04-03 21:50:55 -07:00
Daniel Hiltgen
4bc2728047 Revert "enable flash attention for gemma4 (#15296)" (#15311)
This reverts commit c8e0878814.
2026-04-03 17:44:44 -07:00
Devon Rifkin
49d5fd5a3e model/parsers: rework gemma4 tool call handling (#15306)
Replace the custom Gemma4 argument normalizer with a stricter
reference-style conversion: preserve Gemma-quoted strings, quote bare
keys, and then unmarshal the result as JSON.

This keeps quoted scalars as strings, preserves typed unquoted values,
and adds test coverage for malformed raw-quoted inputs that the
reference implementation rejects.
2026-04-03 14:35:00 -07:00
Jesse Gross
3cd2b03a5e ggml: fix ROCm build for cublasGemmBatchedEx reserve wrapper
Add missing cublasGemmAlgo_t to hipblasGemmAlgo_t type mapping and
cast away const qualifiers that hipblasGemmBatchedEx doesn't accept.
2026-04-03 14:22:46 -07:00
Daniel Hiltgen
c8e0878814 enable flash attention for gemma4 (#15296) 2026-04-03 12:46:18 -07:00
Jesse Gross
bb0c58e134 ggml: skip cublasGemmBatchedEx during graph reservation
cublasGemmBatchedEx fails during graph capture when pool allocations
return fake pointers. This is triggered when NUM_PARALLEL is greater
than 1 for models like gemma4 that use batched matmuls. Skip it
during reservation since the memory tracking is already handled by
the pool allocations.

Fixes #15249
2026-04-03 12:41:09 -07:00
Devon Rifkin
036ed1b9b5 model/parsers: fix gemma4 arg parsing when quoted strings contain " (#15254)
* model/parsers: fix gemma4 arg parsing when quoted strings contain "

Fixes: #15241

* add more tests, be careful about what we escape

We want Windows-style paths to not get misinterpreted

* fix backslash-quote case, it really should be a literal backslash

h/t to @chathaway-codes for pointing this out!

Co-Authored-By: Charles H <2773397+chathaway-codes@users.noreply.github.com>

---------

Co-authored-by: Charles H <2773397+chathaway-codes@users.noreply.github.com>
2026-04-02 22:52:51 -07:00
Daniel Hiltgen
3536ef58f6 bench: add prompt calibration, context size flag, and NumCtx reporting (#15158)
Add --num-ctx flag to set context size, and report NumCtx in model info
header. Calibrate tokens-per-word ratio during warmup using actual
tokenization metrics from the model, replacing the fixed 1.3 heuristic.
This produces more accurate prompt token counts for --prompt-tokens.

Also add fetchContextLength() to query running model context via /api/ps.
2026-04-02 14:23:53 -07:00
Daniel Hiltgen
de9673ac3f tokenizer: add byte fallback for SentencePiece BPE encoding (#15232)
* tokenizer: add byte fallback for SentencePiece BPE encoding

When BPE merging produces tokens not in the vocabulary, fall back to
encoding each UTF-8 byte as <0xHH> byte tokens instead of silently
dropping the character. Also teach Decode to convert <0xHH> tokens
back to raw bytes.

Fixes #15229, fixes #15231

* tokenizer fixes
2026-04-02 13:04:45 -07:00
Daniel Hiltgen
96b202d34b Add support for gemma4 (#15214)
* bench: add prompt calibration, context size flag, and NumCtx reporting

Add --num-ctx flag to set context size, and report NumCtx in model info
header. Calibrate tokens-per-word ratio during warmup using actual
tokenization metrics from the model, replacing the fixed 1.3 heuristic.
This produces more accurate prompt token counts for --prompt-tokens.

Also add fetchContextLength() to query running model context via /api/ps.

* integration: improve vision test robustness and add thinking tests

Add skipIfNoVisionOverride() to skip vision tests when OLLAMA_TEST_MODEL
is set to a non-vision model. Add Think:false to context exhaustion test
to prevent thinking models from using all context before the test can
measure it. Add third test image (ollama homepage) and replace OCR test
with ImageDescription test using it. Relax match strings for broader
model compatibility. Add TestThinkingEnabled and TestThinkingSuppressed
to verify thinking output and channel tag handling.

* gemma4: add Gemma 4 GGML model support

Add full Gemma 4 model family support (E2B, E4B, 26B MoE, 31B Dense)
for the GGML backend including text, vision, converter, parser, and
renderer.

Text model features:
- Sliding window + full attention with per-layer patterns
- KV sharing across layers with donor map
- Per-layer embeddings (PLE) with learned projections
- MoE routing with RMSNorm + learned scale
- Proportional RoPE with freq_factors for global attention
- Final logit softcapping

Vision model features:
- SigLIP vision encoder with 2D RoPE
- ClippableLinear with input/output clamping via packed v.clamp_data
- Adaptive average pooling with nMerge kernel
- Multi-modal projection with unweighted RMSNorm

Converter:
- Safetensors to GGUF with vision tensor renaming
- Fused MoE gate_up_proj splitting
- Vision patch embedding reshape (HF to Conv2D layout)
- Packed clamp data tensor for ClippableLinear bounds
- Proportional RoPE freq_factors generation

Also includes:
- BackendGet() on ml.Tensor for reading weight tensor data
- Q6_K CUDA get_rows kernel support
- MoE-aware ffn_down quantization layer counting
- Gemma4 parser with tool calling and thinking support
- Gemma4 renderer with structured tool format
- Architecture-based auto-detection of renderer/parser/stop tokens
- Integration test gemma4 model list additions

* gemma4: add audio support with USM conformer encoder

Add audio encoding for Gemma 4 using the USM conformer architecture:
- Converter: audio tensor mapping, SSCP/conformer/embedder name replacements,
  softplus repacker for per_dim_scale, F32 enforcement for conv weights
- GGML backend: Conv1DDW and PadExt tensor ops
- Audio encoder: SSCP Conv2D, 12 conformer blocks (FFW + block-local
  attention with relative position embeddings + LightConv1d + FFW),
  output projection, audio-to-text embedding projector
- Audio preprocessing: WAV decode, mel spectrogram, FFT (pure Go)
- Model wiring: WAV detection, audio token handling, unified PostTokenize

Correctly transcribes "why is the sky blue" from test audio.

* integration: add gemma4 audio tests including OpenAI API coverage

Test audio transcription and response via the Ollama native API, plus
two new tests exercising the OpenAI-compatible endpoints:
- /v1/audio/transcriptions (multipart form upload)
- /v1/chat/completions with input_audio content type

All tests use capability checks and skip models without audio support.

* gemma4: add OpenAI audio API support and capability detection

- Add CapabilityAudio and detect from audio.block_count in GGUF
- Add /v1/audio/transcriptions endpoint with TranscriptionMiddleware
- Add input_audio content type support in /v1/chat/completions
- Add TranscriptionRequest/Response types in openai package

* gemma4: add audio input support for run command

- /audio toggle in interactive mode for voice chat
- Platform-specific microphone recording (AVFoundation on macOS,
  PulseAudio/ALSA on Linux, WASAPI on Windows)
- Space to start/stop recording, automatic chunking for long audio

* gemma4: add transcribe command (ollama transcribe MODEL)

- Interactive mode with readline prompt and slash commands
- Non-interactive mode for piped audio or record-until-Ctrl+C
- Chunked streaming transcription for long recordings
- Word-wrapped output matching run command style

* gemma4: add parser, renderer, and integration test plumbing

* gemma4: fix renderer to emit BOS token

* gemma4: add OpenAI audio transcription API and input_audio support

* gemma4: update converter for new weight drop naming

* gemma4: add per_expert_scale to MoE router and fix moe_intermediate_size config

* gemma4: rewrite renderer to match HF Jinja2 template exactly

Fix 8 bugs found by building 55 reference tests verified against the
HF Jinja2 chat template (VERIFY_JINJA2=1 shells out to Python):

- Tool responses use separate <|turn>tool turns (not inline tags)
- Tool calls emitted before content in assistant messages
- Thinking content stripped from assistant history (strip_thinking)
- User, tool, and system content trimmed (template does | trim)
- Empty system message still emits system turn (check role, not content)
- Nested object properties rendered recursively with required field
- Array items specification rendered for array-type properties
- OBJECT/ARRAY type-specific rendering comma logic matches template

Also adds Required field to api.ToolProperty for nested object schemas,
replaces old gemma4_test.go with comprehensive gemma4_reference_test.go,
and commits the Jinja2 template as testdata for verification.

* gemma4: fix MoE fused gate_up split and multiline tool-call arg parsing

- Text MoE: split `ffn_gate_up_exps` into contiguous `[gate|up]` halves instead of stride-2 slices.
- Parser: escape control characters in `<|"|>...<|"|>` string literals when converting tool-call args to JSON.
- Fixes warnings like `invalid character '\n' in string literal` for multiline tool arguments.
- Add Gemma4 parser regressions for multiline tool-call args and `gemma4ArgsToJSON`.

* cmd: simplify audio input to dropped file attachments

* gemma4: use full SWA memory for better cache reuse

* gemma4: initialize clamps after backend load

* convert: align gemma4 audio tensor renames with llama.cpp

* Remove redundant comments in gemma4 vision model

* Format Gemma4 MoE block field alignment

* use 4096 kvcache.NewSWAMemCache

* convert: support new Gemma4 audio_tower tensor naming (#15221)

Co-authored-by: jmorganca <jmorganca@gmail.com>

* fix integration test defaults for audio

* review comments and lint fixes

* remove unused audio/video files

---------

Co-authored-by: jmorganca <jmorganca@gmail.com>
2026-04-02 11:33:33 -07:00
Devon Rifkin
79865e6c5a app: use the same client for inference and other requests (#15204)
Previously we were accidentally using different clients/UAs depending on
whether it was an inference call or a different call. This change makes
them consistent, other than the timeout being different.
2026-04-02 11:07:50 -07:00
Parth Sareen
5ab10d347a app: add launch page for a simple way to launch integrations (#15182) 2026-04-02 10:31:19 -07:00
Eva H
a8292dd85f launch: replace deprecated OPENAI_BASE_URL with config.toml profile for codex (#15041) 2026-04-01 11:43:23 -04:00
Daniel Hiltgen
cb0033598e tokenizer: add SentencePiece-style BPE support (#15162)
* tokenizer: add SentencePiece-style BPE support

Add WithSentencePieceNormalizer option to BytePairEncoding for models
that use BPE with SentencePiece-style space markers (space to/from
U+2581).

NewBytePairEncoding is unchanged; the new NewBytePairEncodingWithOptions
constructor accepts BPEOption functions. Decoding handles the reverse
mapping of U+2581 back to spaces.

* review comments
2026-03-31 17:00:36 -07:00
Daniel Hiltgen
4d14b0ff92 mlx: respect tokenizer add_bos_token setting in pipeline (#15185)
Replace hardcoded Encode(prompt, true) with
Encode(prompt, r.Tokenizer.AddBOS()) so the pipeline respects each
model's tokenizer configuration.

Models with add_bos_token=true (gemma3, llama): unchanged, tokenizer
still prepends BOS.

Models with bos_token=null (qwen3, qwen3.5): unchanged, the BOS
guard (vocab.BOS >= 0) already prevented prepending regardless of
the flag.

This aligns the pipeline with the /v1/tokenize endpoint which already
uses Tokenizer.AddBOS().
2026-03-31 16:46:30 -07:00
Parth Sareen
d9cb70c270 docs: update pi docs (#15152) 2026-03-31 16:37:55 -07:00
Jeffrey Morgan
31f968fe1f cmd: set OpenCode default model in config (#15127) 2026-03-29 12:11:36 -07:00
Jeffrey Morgan
b7bda92d52 model: add qwen3-next compatibility for legacy ssm_in projections (#15133) 2026-03-29 11:50:47 -07:00
Parth Sareen
8e54823fd3 revert context length warnings change (#15121) 2026-03-28 16:43:59 -07:00
Parth Sareen
7c8da5679e launch: improve multi-select for already added models (#15113) 2026-03-28 13:44:40 -07:00
Parth Sareen
6214103e66 launch: auto-install pi and manage web-search lifecycle (#15118) 2026-03-28 13:06:20 -07:00
Patrick Devine
9e7cb9697e mlx: fix vision capability + min version (#15106) 2026-03-27 17:09:28 -07:00
Bruce MacDonald
3824e380a8 server: preserve raw manifest bytes during pull (#15104)
pullModelManifest unmarshals the registry response into a Go struct
then re-marshals with json.Marshal before writing to disk. When the
registry's JSON formatting or field ordering differs from Go's
output, the local SHA256 won't match the registry's
Ollama-Content-Digest header, causing false "out of date" warnings.

Preserve the raw bytes from the registry response and write them
directly to disk so the local manifest is byte-for-byte identical
to what the registry serves.
2026-03-27 15:42:31 -07:00
Devon Rifkin
c9b2dcfc52 anthropic: fix empty inputs in content blocks (#15105)
* anthropic: fix empty inputs in content blocks

When we switched to `api.ToolCallFunctionArguments`, `omitempty` stopped
doing what we were relying on it for before. This would cause non-tool
content blocks to have an `"input": {}` field, which doesn't match our
old behavior.

* use omitzero instead
2026-03-27 15:41:27 -07:00
Parth Sareen
b00bd1dfd4 launch: skip context length warning for MLX models and show model name (#15102) 2026-03-27 15:01:33 -07:00
Jesse Gross
ac83ac20c4 anthropic: fix KV cache reuse degraded by tool call argument reordering
Use typed structs for tool call arguments instead of map[string]any to
preserve JSON key order, which Go maps do not guarantee.
2026-03-27 14:30:16 -07:00
Bruce MacDonald
e7ccc129ea app: fix false "out of date" model warnings (#15101)
The staleness check compared the local manifest digest (SHA256 of the
file on disk) against the registry's Ollama-Content-Digest header.
These never matched because PullModel re-serializes the manifest JSON
before writing, producing different bytes than the registry's original.

The fallback comparison (local modified_at vs upstream push time) was
also broken: the generated TypeScript Time class discards the actual
timestamp value, so Date parsing always produced NaN.

Fix by moving the staleness comparison server-side where we have
reliable access to both the local manifest file mtime and the upstream
push time. The /api/v1/model/upstream endpoint now returns a simple
`stale` boolean instead of raw digests for the frontend to compare.

Also adds User-Agent to the CORS allowed headers for dev mode.
2026-03-27 14:15:10 -07:00
Jeffrey Morgan
69ed0c2729 parsers: qwen3.5 streaming tool-call parsing and add regression test (#15098) 2026-03-27 14:04:14 -07:00
Alfredo Matas
1cefa749aa model/parsers: close think block if tool block starts in Qwen3.5 (#15022) 2026-03-27 11:28:34 -07:00
Daniel Hiltgen
aec2fef95d ci: harden cuda include path handling (#15093)
On windows we can get multiple include dirs, so find where the headers are then
copy from that location.
2026-03-27 07:57:07 -07:00
Eva H
366625a831 launch: warn when server context length is below 64k for local models (#15044)
A stop-gap for now to guide users better. We'll add more in-depth recommendations per integration as well.

---------

Co-authored-by: Parth Sareen <parth.sareen@ollama.com>
2026-03-27 00:15:53 -07:00
Daniel Hiltgen
516ebd8548 ci: include mlx jit headers on linux (#15083)
* ci: include mlx jit headers on linux

* handle CUDA JIT headers
2026-03-26 23:10:07 -07:00
Parth Sareen
f567abc63f tui: update chat title (#15082) 2026-03-26 18:06:53 -07:00
Eva H
1adfc27f04 launch/vscode: prefer known vs code paths over code on PATH (#15073) 2026-03-26 18:06:28 -04:00
Parth Sareen
4a2b9f9dbc launch: hide cline integration (#15080) 2026-03-26 14:33:43 -07:00
Parth Sareen
e46b67a6cc launch: hide vs code (#15076) 2026-03-26 13:52:50 -07:00
Eva H
c000afe76c doc: update vscode doc (#15064)
---------

Co-authored-by: ParthSareen <parth.sareen@ollama.com>
2026-03-26 13:45:48 -07:00
Jesse Gross
9d7b18f81e mlxrunner: combine setStateRaw and setStateDetached into setState 2026-03-26 13:32:11 -07:00
Jesse Gross
4f5999fd3f mlxrunner: schedule periodic snapshots during prefill
Add periodic snapshots every 8k tokens and near the end of the prompt
so that long prompts can be partially restored and thinking/generation
can be retried without full reprocessing.
2026-03-26 13:32:11 -07:00
Jesse Gross
ac5f0dbb6a mlxrunner: improve eviction and LRU tracking
Update LRU last used time just on the nodes that actually used
during processing rather than all snapshots along the path. This
allows eviction to remove nodes more accurately so we can avoid
other heuristics to auto-merge nodes.
2026-03-26 13:32:11 -07:00
Jesse Gross
d1151e18a1 mlx: fix KV cache snapshot memory leak
mlx.Copy shares the backing buffer with its source (via
copy_shared_buffer) rather than allocating independent storage.
When used to snapshot a slice of the KV cache, the snapshot array
holds the entire original cache buffer alive through the shared
data pointer — even after eval detaches the computation graph.

Replace Copy with Contiguous in Snapshot and Split. Contiguous
allocates a compact buffer when the source buffer is significantly
larger than the logical slice (Contiguous::eval checks
buffer_size > nbytes + 16384), which is always the case for KV
cache slices.
2026-03-25 17:26:34 -07:00
rick
ebbce136c7 ggml: force flash attention off for grok 2026-03-25 16:15:49 -07:00
Devon Rifkin
26b9f53f8e api/show: overwrite basename for copilot chat (#15062)
Copilot Chat prefers to use `general.basename` in the built-in Ollama
integration, but this name isn't usually shown directly to users (and
there may be many models that share this name). Instead we pass back
`req.Model`, which for this extension is the value that we return from
`/api/tags`
2026-03-25 14:02:22 -07:00
Eva H
7575438366 cmd: ollama launch vscode (#15060)
Co-authored-by: Parth Sareen <parth.sareen@ollama.com>
2026-03-25 16:37:02 -04:00
Eva H
7d7c90d702 tui: add left arrow back navigation in model selector (#14940) 2026-03-25 11:53:48 -07:00
Daniel Hiltgen
4fda69809a ci: fix windows cgo compiler error (#15046) 2026-03-24 16:45:36 -07:00
Daniel Hiltgen
c9b5da6b0c integration: improve ability to test individual models (#14948)
* integration: improve ability to test individual models

Add OLLAMA_TEST_MODEL env var to run integration tests against a
single model.

Enhance vision tests: multi-turn chat with cached image tokens, object
counting, spatial reasoning, detail recognition, scene understanding, OCR, and
multi-image comparison.

Add tool calling stress tests with complex agent-style prompts, large
system messages, and multi-turn tool response handling.

* review comments
2026-03-24 14:28:23 -07:00
Patrick Devine
de5cb7311f mlx: add mxfp4/mxfp8/nvfp4 importing (#15015)
This change allows importing bf16 and converting to mxfp4/mxfp8/nvfp4
and also importing fp8 and converting directly to mxfp8.
2026-03-24 13:45:44 -07:00
Jesse Gross
95ee7fbd29 mlxrunner: panic on double unpin 2026-03-23 17:44:19 -07:00
Jesse Gross
ec55536734 mlxrunner: show time since last used in cache dump tree 2026-03-23 17:44:19 -07:00
Jesse Gross
77491439c2 mlxrunner: support partial match on pure transformer caches
Previously, a partial match within a node's edge would truncate the path
to the parent snapshot - effectively making all cache types behave as
recurrent caches. Caches with only transformer layers can rewind to
arbitrary boundary so this restores this capability to improve cache
hits
2026-03-23 17:44:19 -07:00
Parth Sareen
b166b36cd2 docs: update Claude Code with Telegram guide (#15026) 2026-03-23 16:31:21 -07:00
Daniel Hiltgen
c2b0bb7a52 mlx: update as of 3/23 (#14789)
* mlx: update to HEAD on 3/23

Also fixes a few misc vendoring bugs uncovered with this first update.
This also renames the version files to make them clearer.

* CUDA Fast Gated Delta kernel

* mlx: detect eval errors and panic

On model errors or missing kernels, don't mask the error, bubble it up.
2026-03-23 11:28:44 -07:00
Bruce MacDonald
22c2bdbd8a docs: nemoclaw integration (#14962)
---------

Co-authored-by: ParthSareen <parth.sareen@ollama.com>
2026-03-20 15:27:37 -07:00
Bruce MacDonald
6df6d097d9 launch: skip openclaw gateway health check when no daemon install (#14984) 2026-03-20 15:20:14 -07:00
Jesse Gross
d7c176ab91 llm, mlxrunner: fix done channel value consumed by first receiver
Receiving from a buffered chan error consumes the value, so only the
first caller (WaitUntilRunning, HasExited, or Close) sees the signal.
Subsequent receivers block or take the wrong branch. Replace with a
closed chan struct{} which can be received from any number of times,
and store the error in a separate field.
2026-03-19 17:44:28 -07:00
Jesse Gross
0ff7d724ff mlx: fix subprocess log deadlock
The stderr reader used bufio.Scanner which has a 64KB max line size.
If the subprocess wrote a line exceeding this limit, the scanner would
stop reading, the OS pipe buffer would fill, and the subprocess would
deadlock.

Replace the scanner with a statusWriter that wraps io.Copy. The writer
forwards all stderr to os.Stderr while capturing the last short line
(≤256 bytes) for error reporting, avoiding both the deadlock and the
need to buffer arbitrarily long lines.
2026-03-19 17:44:28 -07:00
Devon Rifkin
46cb7795e1 add ability to turn on debug request logging (#14106)
If `OLLAMA_DEBUG_LOG_REQUESTS` is set, then on server startup a temp
folder will be created. Upon any inference request, the body will be
logged to a file in this folder, as well as a small shell script to
"replay" the request using cURL.

This is just intended for debugging scenarios, not as something to turn
on normally.
2026-03-19 17:08:17 -07:00
Bruce MacDonald
126d8db7f3 parsers: robust xml tool repair (#14961)
Previous xml repair for glm was a good start, but we need to go further and repair any incorrect open or closing tags

Co-authored-by: Dongluo Chen <dongluo.chen@gmail.com>
2026-03-19 11:24:48 -07:00
Eva H
3f3a24b418 app: fix desktop app stuck loading when OLLAMA_HOST is an unspecified bind address (#14885) 2026-03-19 12:57:57 -04:00
Jesse Gross
96e36c0d90 mlxrunner: share KV cache across conversations with common prefixes
Enable multiple conversations to reuse cached computations when they
share token prefixes (e.g. the same system prompt). A prefix trie
tracks shared regions so switching between conversations only
recomputes tokens that diverge. Inactive conversation state is paged
from active GPU memory to other memory and restored on demand, with LRU
eviction to keep memory usage bounded.
2026-03-18 16:06:33 -07:00
Jesse Gross
6f8ddbb26b mlxrunner: fix Slice(0, 0) returning full dimension instead of empty
Slice used cmp.Or to resolve a zero stop value to the dimension size,
intended to support open-ended slices like a[i:]. This made Slice(0, 0)
indistinguishable from Slice(), so any slice with a zero stop would
silently include the entire dimension instead of being empty.

Replace cmp.Or with an explicit End sentinel and resolve negative
indices against the dimension size, matching Python/PyTorch semantics.
2026-03-18 16:06:33 -07:00
Eva H
b5e7888414 cmd/launch: skip redundant config writes when model unchanged (#14941) 2026-03-18 17:36:52 -04:00
Parth Sareen
eab4d22269 docs: update claude code and openclaw for web search (#14922) 2026-03-18 14:18:49 -07:00
Bruce MacDonald
5759c2d2d2 launch: fix openclaw not picking up newly selected model (#14943)
Sessions with a stale model field were not updated when the primary
changed, so the old model continued to be used.
2026-03-18 13:20:10 -07:00
Bruce MacDonald
42b1c2642b docs: update minimax-m2.5 references to m2.7 (#14942) 2026-03-18 12:59:28 -07:00
Bruce MacDonald
727d69ddf3 tui: fix signin on headless Linux systems (#14627)
Defensively handle environments without a display server to ensure signin remains usable on headless VMs and SSH sessions.

- Skip calling xdg-open when neither DISPLAY nor WAYLAND_DISPLAY is set, preventing silent failures or unexpected browser handlers
- Render the signin URL as plain text instead of wrapping it in OSC 8 hyperlink escape sequences, which can be garbled or hidden by terminals that don't support them
2026-03-18 11:11:17 -07:00
Jesse Gross
f622b0c5fc launch: disable claude attribution header to preserve KV cache
Claude Code sends an x-anthropic-billing-header that changes on every
request. This is embedded in the system prompt and consequently
breaks the KV cache for every request. Given the size of the prompts
that Claude Code usees, this has significant performance impact.
2026-03-17 20:48:03 -07:00
Bruce MacDonald
5d0000634c cmd/launch: check for both npm and git before installing OpenClaw (#14888)
The OpenClaw installer requires git in addition to npm. Update the
dependency check to detect both and provide specific install guidance
for whichever dependencies are missing.
2026-03-17 18:20:05 -07:00
Parth Sareen
676d9845ba launch: register websearch for openclaw (#14914) 2026-03-17 15:03:15 -07:00
Devon Rifkin
e37a9b4c01 cloud_proxy: for the web_search legacy path, flush on newlines (#14897)
`WebSearchAnthropicWriter` expects a single object per write. The new
transparent proxy will instead send it whatever bytes it sees. This
cloud-model + local-orchestration + cloud-search is a temporary code
path, so instead of making the web search code more robust to this, I
put an adapter in the middle that will flush line-by-line to preserve
the old behavior.
2026-03-17 13:30:17 -07:00
Patrick Devine
d727aacd04 mlx: quantized embeddings, fast SwiGLU, and runtime fixes (#14884)
Add QuantizedEmbedding and EmbeddingLayer interface so models can
use quantized embedding weights and expose tied output projections.
This change updates gemma3, glm4_moe_lite, llama, qwen3, and qwen3_5
to use the new interface.
2026-03-17 11:21:38 -07:00
Patrick Devine
fa69b833cd mlx: add prequantized tensor packing + changes for qwen35 (#14878)
This change adds a tensorImportTransform interface for model-specific
tensor transformations during safetensors import. This allows importing
and modifying the standard HF based weights as well as the mlx-community
derived pre-quantized safetensors repos to be directly
imported into `ollama create`. Right now this only works with Qwen3.5
importing which does tensor renaming, norm weight shifting (it
adds +1 to each value of the norm vectors), conv1d transposition,
and casts to BF16s for F32 based vectors.
2026-03-17 11:21:18 -07:00
Jesse Gross
bbbad97686 sched: Model eviction for MLX
MLX runners (image generation and LLM) previously bypassed the
scheduler's standard load path via a separate loadMLX method. This meant
they skipped VRAM fitting checks and couldn't participate in model
eviction.

Now all model types flow through the same load function. Model eviction
for MLX is based on weights as KV cache and compute graph are dynamic.
This means that eviction does not take into account the worst case
memory and models can still compete for memory but it is a significant
improvement.
2026-03-16 17:40:29 -07:00
Parth Sareen
bcf6d55b54 launch: fix web search, add web fetch, and enable both for local (#14886) 2026-03-16 16:26:19 -07:00
easonysliu
810d4f9c22 runner: fix swallowed error in allocModel graph reservation
In allocModel(), the first call to reserveWorstCaseGraph(true) had its
error silently discarded — `return nil` was used instead of `return err`.

This meant that if the prompt-sized graph reservation failed (e.g. due
to insufficient memory), the error was swallowed, allocModel reported
success, and the model appeared to load correctly. Subsequent inference
would then fail in unexpected ways because the worst-case graph was
never properly reserved.

Fix: return the actual error so the caller can handle the failure
(retry with reduced parallelism, report OOM, etc.).

Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
2026-03-16 15:48:45 -07:00
Bruce MacDonald
856c047a6c cmd/launch: skip --install-daemon when systemd is unavailable (#14883)
In container environments without systemd, `openclaw onboard
--install-daemon` exits non-zero because it cannot create a systemd
user service. This causes `ollama launch openclaw` to abort even
though the gateway can be started as a foreground child process.

Only pass --install-daemon when systemd user services are reachable
(Linux with /run/systemd/system present and XDG_RUNTIME_DIR set).
On all other platforms the flag is still included by default.
2026-03-16 13:50:04 -07:00
Daniel Hiltgen
79c1e93c00 bench: improve benchmarking tool (#14240)
New features:
- Warmup phase to eliminate cold-start outliers
- time-to-first-token measured in each epoch
- VRAM/memory tracking to identify CPU spillover
- Controlled prompt length
- Defaults to 6 epochs and 200 tokens max

Benchstat fixes:
- ns/request instead of ns/op — non-standard unit created a separate group instead of grouping with timing metrics
- Token count as the N field — benchstat interprets N as iteration count for statistical weighting, not as a token count
2026-03-15 11:47:31 -07:00
Parth Sareen
f8b657c967 cmd/launch: add guards for headless mode (#14837) 2026-03-14 00:10:02 -07:00
Bruce MacDonald
10fefe0d57 config: use native OpenClaw Ollama onboarding (#14829)
OpenClaw now accepts the Ollama onboarding flags directly upstream, so rely on its wizard state instead of the legacy integration onboarding flag.

Update first-run setup to pass the Ollama auth and model flags during onboarding, perform a best-effort update before onboarding when needed, and drop the stale test that asserted persistence of the old onboarding flag.
2026-03-13 16:28:40 -07:00
Daniel Hiltgen
2f9a68f9e9 rocm: doc driver constraints (#14833) 2026-03-13 15:53:35 -07:00
Bruce MacDonald
3980c0217d server: decompress zstd request bodies in cloud passthrough middleware (#14827)
When a zstd-compressed request (e.g. from Codex CLI) hits /v1/responses
with a cloud model the request failed.

Fix by decompressing zstd bodies before
model extraction, so cloud models are detected and proxied directly
without the writer being wrapped.
2026-03-13 15:06:47 -07:00
Parth Sareen
870599f5da launch: remove warning for default policy (#14830) 2026-03-13 15:01:38 -07:00
Bruce MacDonald
abf8e8e9c8 middleware: handle non-JSON error responses gracefully (#14828)
writeError in both OpenAI and Anthropic middleware writers would return
a raw json.SyntaxError when the error payload wasn't valid JSON (e.g.
"invalid character 'e' looking for beginning of value"). Fall back to
using the raw bytes as the error message instead.

Also use the actual HTTP status code rather than hardcoding 500, so
error types map correctly
2026-03-13 14:50:49 -07:00
Shivam Tiwari
f3f31a8192 anthropic: close thinking block before tool_use when no text in between (#14825)
Root cause: StreamConverter.Process() only incremented contentIndex when
closing a thinking block if text content was present. When a model emitted
thinking followed directly by a tool_use block (no text in between),
thinkingDone was never set and contentIndex was not incremented, causing the
tool_use content_block_start to reuse index 0. Clients expecting sequential
indices would then fail to find the tool content block.

Fix: In the tool call loop, close any open thinking block (thinkingStarted &&
!thinkingDone) and increment contentIndex before opening the tool_use block,
mirroring the existing logic that closes an open text block.

Fixes #14816
2026-03-13 13:12:05 -07:00
Devon Rifkin
9e7ba835da cmd: still populate ollama ls when using ollama run <model:cloud> (#14824)
This is temporary until `api/tags` supports cloud natively
2026-03-13 12:24:45 -07:00
Parth Sareen
347f17b8d1 launch: add compact window for claude code (#14823) 2026-03-13 12:09:23 -07:00
Devon Rifkin
081b9eb423 api/create: always propagate :cloud source for cloud models (#14822)
Otherwise, using `/save` would try to run the local model instead
2026-03-13 11:58:00 -07:00
Parth Sareen
bb867c6fdb launch: fix headless --yes integration flow and policy scoping (#14815) 2026-03-13 11:45:36 -07:00
Cadu
81f4506a61 docs: document reasoning_effort support in OpenAI-compatible API (#14821)
Add reasoning_effort and reasoning to the supported features and
request fields for /v1/chat/completions. These fields control
thinking on thinking-capable models but were previously undocumented.

Closes #14820
2026-03-13 10:57:14 -07:00
Parth Sareen
76925f1284 cmd: TUI model ordering (#14814) 2026-03-13 10:19:22 -07:00
Devon Rifkin
f676231de9 server: remove experimental aliases support (#14810) 2026-03-12 20:27:24 -07:00
Parth Sareen
af5f7c0a9e cmd: refactor tui and launch (#14609) 2026-03-12 18:39:06 -07:00
Daniel Hiltgen
a6b27d776b ci: fix missing windows zip file (#14807)
Use 7z compression (better compression rate) if found in path.  That
alone isn't sufficient to get us under 2G, so MLX is now split out as a
discrete download.  Fix CI so it will fail if artifacts fail to upload.
2026-03-12 16:14:00 -07:00
Daniel Hiltgen
539741199e mlx: perf improvements (#14768)
* mlx: perf improvements

Fix nn.go to call mlx_fast_layer_norm instead of manually implementing (mean,
subtract, variance, rsqrt, multiply, add — 6 ops)

Fix llama.go, gemma3.go to remove RepeatKV to tile K/V tensors to match the Q
head count, since scaled_dot_product_attention natively handles GQA (it just
requires n_q_heads % n_kv_heads == 0)

* review comments
2026-03-12 12:01:28 -07:00
Eva H
8f45236d09 middleware: enable local tool model for web search (#14787) 2026-03-11 17:51:39 -04:00
Parth Sareen
97013a190c openai: split mixed thinking stream chunks via ToChunks (#14648) 2026-03-11 14:21:29 -07:00
Daniel Hiltgen
c222735c02 mlx: only log load errors when MLX is needed (#14764)
This suppresses irrelevant/noisy errors in the GGML runner.
2026-03-11 10:31:31 -07:00
Daniel Hiltgen
87d21c7fc0 MLX: harden for init failures (#14777)
The CLI now links to the lazy-load MLX code, but that still happens in
init functions.  On internal MLX errors, the CLI exits before it has a
chance to start.  This change re-wires the MLX error handling so it
doesn't exit by default.  The MLX based runners currently expect exits
on failure, so they re-initialize the default error handling.  We can
refine error handling for better go stack traces in the future.
2026-03-10 22:52:23 -07:00
Jeffrey Morgan
54e05172a0 Revert "runner: add token history sampling parameters to ollama runner (#14537)" (#14776)
This reverts commit 86513cb697.
2026-03-10 21:07:52 -07:00
Parth Sareen
464186e995 config: qwen3.5 recommendations (#14758) 2026-03-10 18:04:57 -07:00
Devon Rifkin
8c4d5d6c2f cloud_proxy: send ollama client version (#14769)
This was previously included in the user agent, and we've made use of it
in the past to hotpatch bugs server-side for particular Ollama versions.
2026-03-10 15:53:25 -07:00
Parth Sareen
bc72b14016 docs: update claude code docs (#14770) 2026-03-10 15:52:41 -07:00
Parth Sareen
61086083eb server: add experimental web search and web fetch routes (#14753) 2026-03-09 21:52:12 -07:00
Daniel Hiltgen
62d1f01ab4 ci: Fix windows build (#14754)
Instead of relying on sh for wildcard, do it in Go for better windows
compatibility.
2026-03-09 19:27:59 -07:00
Daniel Hiltgen
10e51c5177 MLX: add header vendoring and remove go build tag (#14642)
* prefer rocm v6 on windows

Avoid building with v7 - more changes are needed

* MLX: add header vendoring and remove go build tag

This switches to using a vendoring approach for the mlx-c headers so that Go
can build without requiring a cmake first.  This enables building the new MLX
based code by default.  Every time cmake runs, the headers are refreshed, so we
can easily keep them in sync when we bump mlx versions.  Basic Windows
and Linux support are verified.

* ci: harden for flaky choco repo servers

CI sometimes fails due to choco not actually installing cache.  Since it just speeds up the build, we can proceed without.

* review comments
2026-03-09 17:24:45 -07:00
Patrick Devine
3e06bde643 mlx: get parameters from modelfile during model creation (#14747) 2026-03-09 15:33:24 -07:00
Eva H
6be2de8214 app: auto update should be enabled when reset to defaults (#14741) 2026-03-09 15:02:36 -04:00
Daniel Hiltgen
ebb1b9ec14 rocm: update linux to v7.2 (#14391)
* rocm: update linux to v7.2

* review comments
2026-03-09 08:26:55 -07:00
Patrick Devine
d126467d5d x/mlxrunner: replace sampler interface chain with single stateful Sampler (#14652)
- Collapse MLX sampling state into a single sample.Sampler struct (options + history).
- Replace interface-based sampler chain (TopP, TopK, penalty, etc.) with function-based transforms.
- Update request/pipeline wiring to use *sample.Sampler, seed history from prompt tokens, and append generated tokens each step.
- Implement top_p, min_p, repeat_penalty, and frequency_penalty
2026-03-07 17:50:57 -08:00
Devon Rifkin
afb4c62fbf cloud_proxy: handle stream disconnects gracefully (#14685)
Previously we were printing out bad errors for expected cases like
clients disconnecting. Now we only debug log when that happens (which
still might help in cases where we're figuring out why an integration
isn't working). For other errors, we print out a proper warning now
2026-03-06 19:18:52 -08:00
Patrick Devine
e790dc435b mlx: int4 groupsize 64 (#14682)
Change affine 4bit integers to use groupsize 64
2026-03-06 16:39:47 -08:00
Daniel Hiltgen
288077c3a3 build: smarter docker parallelism (#14653)
Our Dockerfile leverages parallel stages for more efficient builds.  However,
our old parallel settings were naive and lead to under/over utilization
depending on the capabilities of your build system.

This change switches to using Ninja for all our docker cmake builds to leverage
its smarter parallel logic.  We tell Ninja to target a load of nproc so each of
the build stages will share the load on the system aiming for full CPU use
without oversaturation.

The GPU parallelism settings are also adjusted to 4 to avoid a long-tail for
the last few GPU targets as they work through the long list of GPU
architectures.

This also fixes the Dockerfile to move Vulkan install to just the stage that
needs it instead of blocking most other GPU installs.  This should speed up CI
which always has a clean build cache.
2026-03-06 16:36:22 -08:00
Daniel Hiltgen
4425c54eda create: fix localhost handling (#14681) 2026-03-06 16:35:58 -08:00
Michael Yang
778899a5d2 docs: format compat docs (#14678) 2026-03-06 14:53:17 -08:00
Jeffrey Morgan
4eab60c1e2 Reapply "don't require pulling stubs for cloud models" again (#14608)
* Revert "Revert "Reapply "don't require pulling stubs for cloud models"" (#14606)"

This reverts commit 39982a954e.

* fix test + do cloud lookup only when seeing cloud models

---------

Co-authored-by: ParthSareen <parth.sareen@ollama.com>
2026-03-06 14:27:47 -08:00
Bruce MacDonald
1af850e6e3 parsers: repair unclosed arg_value tags in GLM tool calls (#14656)
GLM models sometimes omits </arg_value> closing tags in tool call XML, causing xml.Unmarshal to fail with "element <arg_value> closed by </tool_call>".

This is a known issue across the GLM family.

Sanitize the input to fix closing arg_key values so encoding/xml can handle it.
2026-03-06 14:08:34 -08:00
Parth Sareen
9b0c7cc7b9 cmd: override stale entries for context window pi (#14655) 2026-03-05 16:30:24 -08:00
Daniel Hiltgen
6928630601 mlx: prevent remote creation mismatch (#14651)
If the user is pointing at a remote OLLAMA_HOST, fail experimental safetensor
based create operations as we only support local creation currently.
2026-03-05 14:59:00 -08:00
Parth Sareen
9896e3627f cmd/config: fix cloud model limit lookups in integrations (#14650) 2026-03-05 13:57:28 -08:00
Bruce MacDonald
15732f0ea7 cmd: use native Ollama API endpoint for OpenClaw (#14649)
Remove the /v1 suffix from the OpenClaw provider baseUrl so it uses
the native Ollama API instead of the OpenAI-compatible endpoint. The
/v1 endpoint my break tool calling in OpenClaw.
2026-03-05 13:29:17 -08:00
Parth Sareen
562c76d7cc cmd: add qwen3.5 context length for launch (#14626) 2026-03-04 14:10:52 -08:00
Parth Sareen
122c68c151 server: loosen thinking level constraint (#14625) 2026-03-04 13:42:18 -08:00
Jeffrey Morgan
82848a7806 model: fix renderer and parser for qwen3.5 (#14605) 2026-03-03 20:58:29 -08:00
Jeffrey Morgan
39982a954e Revert "Reapply "don't require pulling stubs for cloud models"" (#14606)
This reverts commit 799e51d419.
2026-03-03 20:56:10 -08:00
Patrick Devine
e9f6ea232f Add qwen3.5-next-moe support to MLX runner and models (#14417)
This change adds support for qwen3.5-next-moe models (qwen3-next/qwen3.5-next/qwen3-coder) to the MLX runner. It also:

* introduces recurrent cache support and related MLX ops
* updates pipeline/runner integration and adds tests
* properly quantizes stacked expert tensors
* a Gated Delta Metal kernel for fast SSM inference
* adds new MLX calls for Conv1d, DepthwideConv1d, Contiguous, Exp, Log, SoftmaxAxis
2026-03-03 16:39:22 -08:00
Patrick Devine
110eff01a9 chore: remove old imagegen LLMs models (#14597)
These models are implemented in the x/mlxrunner instead.
2026-03-03 13:23:40 -08:00
Jeffrey Morgan
799e51d419 Reapply "don't require pulling stubs for cloud models"
This reverts commit 97d2f05a6d.
2026-03-03 13:17:10 -08:00
Victor-Quqi
e8fcb29586 model/renderers: fix glm-ocr image tags in renderer prompts (#14584) 2026-03-03 12:51:34 -08:00
Jeffrey Morgan
97d2f05a6d Revert "don't require pulling stubs for cloud models (#14574)" (#14596)
This reverts commit 8207e55ec7.
2026-03-03 12:51:23 -08:00
Devon Rifkin
8207e55ec7 don't require pulling stubs for cloud models (#14574)
* don't require pulling stubs for cloud models

This is a first in a series of PRs that will better integrate Ollama's
cloud into the API and CLI. Previously we used to have a layer of
indirection where you'd first have to pull a "stub" model that contains
a reference to a cloud model. With this change, you don't have to pull
first, you can just use a cloud model in various routes like `/api/chat`
and `/api/show`. This change respects
<https://github.com/ollama/ollama/pull/14221>, so if cloud is disabled,
these models won't be accessible.

There's also a new, simpler pass-through proxy that doesn't convert the
requests ahead of hitting the cloud models, which they themselves
already support various formats (e.g., `v1/chat/completions` or Open
Responses, etc.). This will help prevent issues caused by double
converting (e.g., `v1/chat/completions` converted to `api/chat` on the
client, then calling cloud and converting back to a
`v1/chat/completions` response instead of the cloud model handling the
original `v1/chat/completions` request first).

There's now a notion of "source tags", which can be mixed with existing
tags. So instead of having different formats like`gpt-oss:20b-cloud` vs.
`kimi-k2.5:cloud` (`-cloud` suffix vs. `:cloud`), you can now specify
cloud by simply appending `:cloud`. This PR doesn't change model
resolution yet, but sets us up to allow for things like omitting the
non-source tag, which would make something like `ollama run
gpt-oss:cloud` work the same way that `ollama run gpt-oss` already works
today.

More detailed changes:

- Added a shared model selector parser in `types/modelselector`:
  - supports `:cloud` and `:local`
  - accepts source tags in any position
  - supports legacy `:<tag>-cloud`
  - rejects conflicting source tags
- Integrated selector handling across server inference/show routes:
  - `GenerateHandler`, `ChatHandler`, `EmbedHandler`,
    `EmbeddingsHandler`, `ShowHandler`
- Added explicit-cloud passthrough proxy for ollama.com:
  - same-endpoint forwarding for `/api/*`, `/v1/*`, and `/v1/messages`
  - normalizes `model` (and `name` for `/api/show`) before forwarding
  - forwards request headers except hop-by-hop/proxy-managed headers
  - uses bounded response-header timeout
  - handles auth failures in a friendly way
- Preserved cloud-disable behavior (`OLLAMA_NO_CLOUD`)
- Updated create flow to support `FROM ...:cloud` model sources (though
  this flow uses the legacy proxy still, supporting Modelfile overrides
  is more complicated with the direct proxy approach)
- Updated CLI/TUI/config cloud detection to use shared selector logic
- Updated CLI preflight behavior so explicit cloud requests do not
  auto-pull local stubs

What's next?

- Cloud discovery/listing and cache-backed `ollama ls` / `/api/tags`
- Modelfile overlay support for virtual cloud models on OpenAI/Anthropic
  request families
- Recommender/default-selection behavior for ambiguous model families
- Fully remove the legacy flow

Fixes: https://github.com/ollama/ollama/issues/13801

* consolidate pull logic into confirmAndPull helper

pullIfNeeded and ShowOrPull shared identical confirm-and-pull logic.
Extract confirmAndPull to eliminate the duplication.

* skip local existence checks for cloud models

ModelExists and the TUI's modelExists both check the local model list,
which causes cloud models to appear missing. Return true early for
explicit cloud models so the TUI displays them beside the integration
name and skips re-prompting the model picker on relaunch.

* support optionally pulling stubs for newly-style names

We now normalize names like `<family>:<size>:cloud` into legacy-style
names like `<family>:<size>-cloud` for pulling and deleting (this also
supports stripping `:local`). Support for pulling cloud models is
temporary, once we integrate properly into `/api/tags` we won't need
this anymore.

* Fix server alias syncing

* Update cmd/cmd.go

Co-authored-by: Parth Sareen <parth.sareen@ollama.com>

* address comments

* improve some naming

---------

Co-authored-by: ParthSareen <parth.sareen@ollama.com>
2026-03-03 10:46:33 -08:00
Jesse Gross
ad16bffc7d mlx: Remove peak memory from the API
This is still in flux so it is better to just log it for now.
2026-03-02 15:56:18 -08:00
Jesse Gross
c1e3ef4bcc mlxrunner: Refcount pinned tensors
Otherwise, it is error prone to manage multiple components working
with the same tensor.
2026-03-02 15:56:06 -08:00
Parth Sareen
a3093cd5e5 cmd/opencode: rename provider from "Ollama (local)" to "Ollama" (#14566)
The "(local)" qualifier is unnecessary since there's only one Ollama
provider. Existing configs with the old name are migrated automatically;
custom names are left unchanged.
2026-03-02 14:17:18 -08:00
Bruce MacDonald
23d4cad1a2 server: verify digest is not empty on create (#14555)
An empty digest is not a valid digest for an incoming create request. Reject empty digests at the api level.
2026-03-02 13:43:35 -08:00
Jeffrey Morgan
86513cb697 runner: add token history sampling parameters to ollama runner (#14537) 2026-03-01 19:16:07 -08:00
Jeffrey Morgan
3490e9590b model/qwen3next: avoid crash in in DeltaNet when offloading (#14541)
Co-authored-by: Yossi Ovadia <jabadia@gmail.com>
2026-03-01 18:44:04 -08:00
Jeffrey Morgan
8da09b1e7e qwen3next: add compatibility with imported GGUF models (#14517) 2026-02-28 14:21:42 -08:00
Jesse Gross
a60b9adcce mlxrunner: Fix prompt eval timing and count metrics
Only the last token's processing time is included in prompt processing,
giving an artificially high rate. In addition, the number of tokens
only included the tokens that miss the cache, instead of our historic
total tokens.
2026-02-27 17:29:47 -08:00
Jesse Gross
a16f96658b mlxrunner: Enforce model context limit
Currently, context length is unbounded - the cache will keep
growing forever independent of the model's trained context
length. This caps it and enforces semantics similar to most
cloud services:
 - Long prompts will result in an error, not truncation.
 - Generation that exceeds the context will be stopped
2026-02-27 17:29:47 -08:00
Jesse Gross
18ab09b431 mlxrunner: Propagate pipeline errors to client via api.StatusError
Errors that occur during pipeline processing are currently only
logged but not sent back to the client. Rather than using HTTP
status codes as we have historically done, this serializes errors
as messages to allow sending them at any time during the stream.
2026-02-27 17:29:47 -08:00
Jesse Gross
638faeac54 mlxrunner: Report actual memory usage from runner
The MLX runner previously reported a static VRAM estimate that was
computed at load time and consisted only of the weights. This is
strictly less than the actual memory usage, as it does not include
the KV cache or compute graph.
2026-02-27 17:29:47 -08:00
Jesse Gross
dd5eb6337d mlxrunner: Fix panic on full KV cache hit
When the entire prompt was already cached (e.g. repeated prompt),
findRemaining returned an empty slice, causing FromValues to panic
on an index-out-of-range accessing a zero-length byte slice.

Fix by always keeping at least one token to re-evaluate so the
pipeline can seed token generation. Also reject empty prompts
early rather than panicking.
2026-02-27 11:07:03 -08:00
Patrick Devine
79917cf80b show peak memory usage (#14485) 2026-02-26 18:38:27 -08:00
Parth Sareen
cc90a035a0 model/parsers: add stable tool call indexing for glm47 and qwen3 parsers (#14484) 2026-02-26 18:14:29 -08:00
Jeffrey Morgan
d98dda4676 model: fix qwen3 tool calling in thinking (#14477)
Align Qwen parser behavior with Transformers serve by allowing <tool_call> parsing while still in thinking collection.

Changes:

- qwen3vl: detect <tool_call> before </think> in thinking state and transition to tool parsing

- qwen3: same thinking-state tool detection and partial-tag overlap handling

- tests: update qwen3vl thinking/tool interleaving expectations

- tests: add qwen3 cases for tool call before </think> and split <tool_call> streaming
2026-02-26 16:13:18 -08:00
Eva H
d69ddc1edc fix: window app crash on startup when update is pending (#14451) 2026-02-26 16:47:12 -05:00
Eva H
9bf41969f0 app: fix first update check delayed by 1 hour (#14427) 2026-02-25 18:29:55 -05:00
Jesse Gross
0f23b7bff5 mlxrunner: Cancel in-flight requests when the client disconnects
Currently, a canceled request can result in computation continuing
in the background to completion. It can also trigger a deadlock
when there is nobody to read the output tokens and the pipeline
cannot continue to the next request.
2026-02-25 14:00:42 -08:00
Jesse Gross
4e57d2094e mlxrunner: Simplify pipeline memory and cache management
Particularly in error cases, it can be difficult to ensure that
all pinned memory is unpinned, MLX buffers are released and cache
state is consistent. This encapsulates those pieces and sets up
proper deferrals so that this happens automatically on exit.
2026-02-25 14:00:42 -08:00
Jeffrey Morgan
7f9efd53df model: add support for qwen3.5-27b model (#14415) 2026-02-25 01:09:58 -08:00
Jeffrey Morgan
da70c3222e model: support for qwen3.5 architecture (#14378) 2026-02-24 20:08:05 -08:00
Bruce MacDonald
9d902d63ce ggml: ensure tensor size is valid (#14406)
When quantizing tensors during model creation validate that the resulting sizes match what is expected based on the shape.
2026-02-24 21:52:44 -04:00
Daniel Hiltgen
f4f0a4a471 update mlx-c bindings to 0.5.0 (#14380)
* chore: update mlx-c bindings to 0.5.0 (#14303)

* linux: use gcc 11

---------

Co-authored-by: Patrick Devine <patrick@infrahq.com>
2026-02-23 16:44:29 -08:00
Eva H
3323c1d319 app: add upgrade configuration to settings page (#13512) 2026-02-23 18:08:52 -05:00
Jesse Gross
f20dc6b698 mlx: don't default to affine quantization for unquantized models
Otherwise the BF16 version of models trigger segfaults when they
call into quantized kernels.
2026-02-23 15:03:53 -08:00
Jeffrey Morgan
4b2ac1f369 model: improvements to LFM architectures (#14368) 2026-02-23 14:38:10 -08:00
Jesse Gross
8daf47fb3a mlxrunner: Fix duplicate log prefixes and reduce log noise
Pass subprocess stdout/stderr through to the parent's stderr directly
instead of re-wrapping each line with slog. The subprocess already
writes structured slog output, so the re-wrapping produced nested
timestamps, levels, and message fields that were hard to read.

Also downgrade verbose KV cache debug logs to trace level.
2026-02-23 14:09:20 -08:00
Eva H
6c980579cd ui: use capability-based detection for web search (#14336) 2026-02-23 15:00:09 -05:00
Jesse Gross
5c73c4e2ee mlxrunner: Simplify KV cache to single-entry prefix matching
The KV cache previously used a tree structure which could
store multiple divergent sequences, which is good for cache
reuse. However, this is typically used in conjunction with
paged attention so each node in the tree can store just a
chunk of the KV cache and they can be stitched together later.
We don't currently do this, so the cache was storing copies of
the full cache for each past sequence.

This redundancy plus the lack of resource limits, caused significant
memory use as a conversation grew. Instead, this changes to store
a single entry for the cache, which can be prefix matched. Although
it is less ideal for multiple users, it largely matches Ollama's
current behavior. It can be improved as additional pieces are fleshed
out.
2026-02-23 09:50:07 -08:00
Jesse Gross
5daf59cc66 mlxrunner: Fix memory leaks with pin/sweep lifecycle management
The previous approach tracked array lifecycles through reference
counting, where each array recorded its inputs and a reference count
that was decremented as dependents were freed. This is not really
necessary as MLX tracks references internally. It is also error
prone as it is easy to create new arrays and forget to free them
when the Go variable goes out of scope.

Instead, we can pin just the arrays we want (typically outputs and
specific intermediates, like the cache). All other arrays are freed
by default when we run sweep. This avoids most causes of memory leaks
while still giving the freedom to save what we want.
2026-02-23 09:50:07 -08:00
Jeffrey Morgan
0ade9205cc models: add nemotronh architecture support (#14356) 2026-02-22 15:09:14 -08:00
Parth Sareen
06edabdde1 cmd/config: install web search plugin to user-level extensions dir (#14362) 2026-02-22 02:17:03 -08:00
Jeffrey Morgan
8b4e5a82a8 mlx: remove noisy error output from dynamic library loading (#14346)
The recent change in #14322 added tryLoadByName() which attempts to
load libmlxc.dylib via rpath before searching directories. This is an
optimization for Homebrew installations where rpath is correctly set.

However, when rpath isn't set (which is the common case for app bundle
installations), dlopen fails and the CHECK macro prints an error to
stderr:

  ERROR - dynamic.c:21 - CHECK failed: handle->ctx != NULL

This error is misleading because it's an expected failure path - the
code correctly falls back to searching the executable directory and
loads the library successfully. The error message causes user confusion
and makes it appear that something is broken.

Replace the CHECK macro with a simple return code so the C code fails
silently. The Go code already handles error logging appropriately:
tryLoadByName() fails silently (intentional fallback), while
tryLoadFromDir() logs via slog.Error() when explicit path loading fails.
2026-02-20 23:46:07 -08:00
Parth Sareen
3445223311 cmd: openclaw onboarding (#14344) 2026-02-20 19:08:38 -08:00
Jeffrey Morgan
fa6c0127e6 app: expose server's default context length to UI (#14037)
Parse the default_num_ctx from the server's "vram-based default context"
log line and expose it through the inference compute API. This eliminates
duplicate VRAM tier calculation logic in the frontend.

- Add InferenceInfo struct with Computes and DefaultContextLength
- Rename GetInferenceComputer to GetInferenceInfo
- Handle missing default context line gracefully (older servers)
- Add DefaultContextLength to InferenceComputeResponse
- Update Settings UI to use server's default, disable slider while loading
- Add disabled prop to Slider component (grays out + hides handle)
- Migrate existing users with context_length=4096 to 0 (auto mode)
2026-02-20 18:56:30 -08:00
Patrick Devine
97323d1c68 consolidate the tokenizer (#14327)
This change adds a new x/tokenizer package which includes:
  * New BPE and SentencePiece tokenizers
  * Removing the dependency on the imagegen tokenizers
  * Fixes to multibyte decoding in the pipeline
  * Various correctness and benchmark tests

Not included in this PR is the WordPiece tokenizer for BERT models which will be
added when we add embedding models. The imagegen tokenizers will also be removed in
a follow-up PR.
2026-02-19 15:55:45 -08:00
natl-set
458dd1b9d9 mlx: try loading library via rpath before searching directories (#14322)
The existing code manually searches directories for libmlxc.* and passes
full paths to dlopen, bypassing the binary's rpath. This means MLX
libraries installed via package managers (e.g., Homebrew) aren't found
even when rpath is correctly set at link time.

This change adds a fallback that tries loading via rpath first (using
just the library name), before falling back to the existing directory
search. This follows standard Unix/macOS conventions and works with any
installation that sets rpath.

Fixes library loading on macOS with Homebrew-installed mlx-c without
requiring OLLAMA_LIBRARY_PATH environment variable.

Co-authored-by: Natl <nat@MacBook-Pro.local>
2026-02-19 10:55:02 -08:00
Bruce MacDonald
9d02d1d767 install: prevent partial download script execution (#14311)
Wrap script in main function so that a truncated partial download doesn't end up executing half a script.
2026-02-18 18:32:45 -08:00
Bruce MacDonald
1a636fb47a cmd: set codex env vars on launch and handle zstd request bodies (#14122)
The Codex runner was not setting OPENAI_BASE_URL or OPENAI_API_KEY, this prevents Codex from sending requests to api.openai.com instead of the local Ollama server. This mirrors the approach used by the Claude runner.

Codex v0.98.0 sends zstd-compressed request bodies to the /v1/responses endpoint. Add decompression support in ResponsesMiddleware with an 8MB max decompressed size limit to prevent resource exhaustion.
2026-02-18 17:19:36 -08:00
Patrick Devine
0759fface9 Revert "chore: update mlx-c bindings to 0.5.0 (#14303)" (#14316)
This reverts commit f01a9a7859.
2026-02-18 17:01:25 -08:00
Parth Sareen
325b72bc31 cmd/tui: default to single-select for editor integrations (#14302) 2026-02-17 18:17:27 -08:00
Patrick Devine
f01a9a7859 chore: update mlx-c bindings to 0.5.0 (#14303) 2026-02-17 16:48:16 -08:00
Patrick Devine
9aefd2dfee model: add qwen3 support to mlxrunner (#14293) 2026-02-17 13:58:49 -08:00
Patrick Devine
d07e4a1dd3 bugfix: better mlx model scheduling (#14290)
This fixes a bug with current MLX based models which don't get loaded/unloaded correctly. The first model currently gets loaded and then subsequent model starts get shunted to the first runner which results in the wrong model being run.
2026-02-17 13:57:05 -08:00
Parth Sareen
8a257ec00a docs: make integrations more discoverable (#14301)
* docs: add Pi integration page

* docs: flatten integration sidebar with expanded subheadings

* docs: add OpenClaw and Claude Code to quickstart
2026-02-17 13:27:25 -08:00
Parth Sareen
2f4de1acf7 cmd: ollama launch always show model picker (#14299) 2026-02-17 12:02:14 -08:00
Parth Sareen
ec95c45f70 cmd/config: ollama launch cline CLI (#14294) 2026-02-17 11:37:53 -08:00
Patrick Devine
3a88f7eb20 bugfix: add missing linear layer factory (#14289) 2026-02-16 17:22:20 -08:00
Patrick Devine
0d5da826d4 bugfix: display the parameter count correctly in mlx for ollama show (#14285) 2026-02-16 13:03:34 -08:00
Patrick Devine
9b795698b8 model: add llama3 architecture to mlxrunner (#14277) 2026-02-15 23:06:28 -08:00
Patrick Devine
041fb77639 model: add gemma3 to the mlxrunner (#14276)
This change adds the gemma3 model to the mlxrunner and simplifies some of the quantization
code for loading weights.
2026-02-15 22:47:59 -08:00
Saumil Shah
8224cce583 readme: update download link for macOS (#1) (#14271) 2026-02-15 15:25:15 -08:00
Patrick Devine
d18dcd7775 mlxrunner fixes (#14247)
* load glm4_moe_lite from the mlxrunner

* fix loading diffusion models

* remove log lines

* fix --imagegen flag
2026-02-13 22:30:42 -08:00
Parth Sareen
5f5ef20131 anthropic: enable websearch (#14246) 2026-02-13 19:20:46 -08:00
Parth Sareen
f0a07a353b cmd/tui: fix powershell search (#14242) 2026-02-13 15:53:11 -08:00
Devon Rifkin
948de6bbd2 add ability to disable cloud (#14221)
* add ability to disable cloud

Users can now easily opt-out of cloud inference and web search by
setting

```
"disable_ollama_cloud": true
```

in their `~/.ollama/server.json` settings file. After a setting update,
the server must be restarted.

Alternatively, setting the environment variable `OLLAMA_NO_CLOUD=1` will
also disable cloud features. While users previously were able to avoid
cloud models by not pulling or `ollama run`ing them, this gives them an
easy way to enforce that decision. Any attempt to run a cloud model when
cloud is disabled will fail.

The app's old "airplane mode" setting, which did a similar thing for
hiding cloud models within the app is now unified with this new cloud
disabled mode. That setting has been replaced with a "Cloud" toggle,
which behind the scenes edits `server.json` and then restarts the
server.

* gate cloud models across TUI and launch flows when cloud is disabled

Block cloud models from being selected, launched, or written to
integration configs when cloud mode is turned off:

- TUI main menu: open model picker instead of launching with a
  disabled cloud model
- cmd.go: add IsCloudModelDisabled checks for all Selection* paths
- LaunchCmd: filter cloud models from saved Editor configs before
  launch, fall through to picker if none remain
- Editor Run() methods (droid, opencode, openclaw): filter cloud
  models before calling Edit() and persist the cleaned list
- Export SaveIntegration, remove SaveIntegrationModel wrapper that
  was accumulating models instead of replacing them

* rename saveIntegration to SaveIntegration in config.go and tests

* cmd/config: add --model guarding and empty model list fixes

* Update docs/faq.mdx

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update internal/cloud/policy.go

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update internal/cloud/policy.go

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update server/routes.go

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Revert "Update internal/cloud/policy.go"

This reverts commit 8bff8615f9.

Since this error shows up in other integrations, we want it to be
prefixed with Ollama

* rename cloud status

* more status renaming

* fix tests that weren't updated after rename

---------

Co-authored-by: ParthSareen <parth.sareen@ollama.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2026-02-12 15:47:00 -08:00
Parth Sareen
598b74d42c cmd/config: add minimax-m2.5 (#14223) 2026-02-12 14:29:50 -08:00
Jeffrey Morgan
935a48ed1a scripts: skip macOS symlink creation if already correct (#14142) 2026-02-12 12:44:42 -08:00
Daniel Hiltgen
de39e24bf7 win: progress reporting on install download (#14219)
* win: progress reporting on install download

Downloading Ollama...
  [####################################    ] 91%  1106.6 / 1204.2 MB

* review comments
2026-02-12 12:06:56 -08:00
Eva H
519b11eba1 site: update readme (#14217) 2026-02-12 12:14:13 -05:00
Eva H
379fd64fa8 Revert "update README (#14213)" (#14215) 2026-02-12 12:06:00 -05:00
frob
59c019a6fb x: configurable model load timeout (#14204)
Co-authored-by: rick <rick@frob.com.au>
2026-02-12 09:05:42 -08:00
Eva H
fad3bcccb2 update README (#14213) 2026-02-12 11:59:42 -05:00
Bruce MacDonald
bd6697ad95 docs: update quickstart for tui (#14208) 2026-02-12 08:44:33 -08:00
SamareshSingh
f8dc7c9f54 docs: fix openapi schema for /api/ps and /api/tags endpoints (#14210) 2026-02-11 17:37:40 -08:00
Patrick Devine
4a3741129d bug: fix loading non-mlx models when ollama is built with mlx support (#14211)
This change fixes an issue where GGML based models (for either the Ollama runner or
the legacy llama.cpp runner) would try to load the mlx library. That would panic
and the model fails to start.
2026-02-11 14:48:33 -08:00
Parth Sareen
77ba9404ac cmd/tui: improve model picker UX (#14209) 2026-02-11 14:36:54 -08:00
Patrick Devine
0aaf6119ec feature: add ctrl-g to allow users to use an editor to edit their prompt (#14197) 2026-02-11 13:04:41 -08:00
Parth Sareen
f08427c138 cmd: TUI UX improvements (#14198) 2026-02-11 10:18:41 -08:00
Maternion
2dbb000908 update context length format. 2026-02-10 17:06:05 -08:00
Maternion
c980e19995 Fix formatting of context length notes in documentation 2026-02-10 17:06:05 -08:00
Maternion
6162374ca9 Update context-length.mdx 2026-02-10 17:06:05 -08:00
Patrick Devine
44bdd9a2ef Add MLX runner with GLM4-MoE-Lite model support (#14185)
This change adds a new MLX based runner which includes:

  * Method-based MLX bindings
  * Subprocess-based MLX runner (x/mlxrunner)
  * KV cache with tree management
  * A basic sampler

The GLM4-MoE-Lite model has been ported to use the new bindings.

---------

Co-authored-by: Michael Yang <git@mxy.ng>
2026-02-10 14:57:57 -08:00
Michael
db493d6e5e docs: update broken links on FAQ and quick cleanup (#14194)
docs: update broken links on FAQ and quick cleanup
2026-02-10 16:52:20 -05:00
Bruce MacDonald
75695f16a5 docs: integration overview (#13831)
Group integrations into high-level types
2026-02-10 11:41:09 -08:00
Patrick Devine
a0407d07fa safetensors quantization for mlx (#14184)
This change includes:
  - changes to the safetensors metadata format
  - changes to the create command to properly create the blobs with the new format
  - changes to load the new format
  - fixes ollama show to properly show each tensor
2026-02-10 11:29:17 -08:00
Jeffrey Morgan
9ec733e527 cmd: make 'ollama login' and 'ollama logout' aliases for 'ollama signin' and 'ollama signout' respectively (#14144) 2026-02-09 19:12:42 -08:00
Parth Sareen
5ef04dab52 cmd: ollama launch pi (#14084) 2026-02-09 19:07:41 -08:00
Daniel Hiltgen
aea316f1e9 win: add curl-style install script (#14178)
This adds a new powershell install script suitable for running via

  irm https://ollama.com/install.ps1 | iex

If you download the script and run '-?' it reports basic usage
information, as well as usage examples for common customization
options.  The script is signed as part of the release process
to ensure it can run on a typically configured Windows system.

This does not include doc updates - we can merge those after a release
ships to avoid user confusion.
2026-02-09 15:28:11 -08:00
Patrick Devine
235ba3df5c cmd: ollama menu and launch improvements (#14038) 2026-02-09 11:30:16 -08:00
Jeffrey Morgan
099a0f18ef build: fix Dockerfile mlx directory (#14131) 2026-02-06 17:08:53 -08:00
Richard Lyons
fff696ee31 docs: increased RAM requirement for parallelism 2026-02-06 15:49:39 -08:00
Jeffrey Morgan
2e3ce6eab3 anthropic: do not count image tokens for now (#14127) 2026-02-06 15:33:18 -08:00
Parth Sareen
9e2003f88a cmd/config: offer to pull missing models instead of erroring (#14113) 2026-02-06 10:19:47 -08:00
Parth Sareen
42e1d49fbe cmd: fix context limits for droid and add qwen3-coder-next ctx (#14112) 2026-02-05 22:29:53 -08:00
Michael Yang
814630ca60 Revert "move tokenizers to separate package (#13825)" (#14111) 2026-02-05 20:49:08 -08:00
Parth Sareen
87cf187774 cmd: set claude code env vars on launch (#14109)
Set ANTHROPIC_DEFAULT_OPUS_MODEL, ANTHROPIC_DEFAULT_SONNET_MODEL,
ANTHROPIC_DEFAULT_HAIKU_MODEL, and CLAUDE_CODE_SUBAGENT_MODEL when
launching Claude Code so all model tiers route through Ollama.
2026-02-05 19:04:53 -08:00
Michael Yang
6ddd8862cd chore: move x/mlxrunner into x/imagegen (#14100) 2026-02-05 18:25:56 -08:00
Michael Yang
f1373193dc move tokenizers to separate package (#13825) 2026-02-05 17:44:11 -08:00
Parth Sareen
8a4b77f9da cmd: set context limits for cloud models in opencode (#14107) 2026-02-05 16:36:46 -08:00
Parth Sareen
5f53fe7884 cmd: ollama launch improvements (#14099) 2026-02-05 15:08:17 -08:00
Bruce MacDonald
7ab4ca0e7f scripts: add macOS support to install.sh (#14060)
Allow installing Ollama on MacOS directly from the command line. This is in line with other CLI tools and results in a more streamlined experience when the user is looking to use the CLI specifically.
2026-02-05 14:59:01 -08:00
Jeffrey Morgan
e36f389e82 scheduler: default parallel=1 for qwen3next/lfm (#14103) 2026-02-05 12:48:25 -08:00
Jesse Gross
c61023f554 ollamarunner: Fix off by one error with numPredict
When numPredict is set, the user will receive one less token
than the requested limit. In addition, the stats will incorrectly
show the number of tokens returned as the limit. In cases where
numPredict is not set, the number of tokens is reported correctly.

This occurs because numPredict is checked when setting up the next
batch but hitting the limit will terminate the current batch as well.
Instead, is is better to check the limit as we actually predict them.
2026-02-04 17:14:24 -08:00
Jeffrey Morgan
d25535c3f3 qwen3next: avoid inplace sigmoid for shared gate (#14077) 2026-02-04 15:50:02 -08:00
Bruce MacDonald
c323161f24 cmd: helpful error message for remote models (#14057)
When trying to use cloud model with OLLAMA_HOST="ollama.com" while not signed in a helpful error message is displayed when the user is not signed in telling them they must sign in to use cloud models. This should be the same experience for models which specify a remote instance.
2026-02-04 14:55:11 -08:00
Jeffrey Morgan
255579aaa7 qwen3next: fix issue in delta net (#14075)
gDiffExp was being broadcast across the wrong axis when multiplying with k. This fix reshapes gDiffExp to [1, chunkSize, nChunks, ...]
2026-02-04 13:40:38 -08:00
Jeffrey Morgan
f7102ba826 runner: discard compute results if sequence replaced mid-batch (#14072)
If a sequence is replaced in s.seqs while a batch is computing, the old logits can be decoded into the new sequence. This change rechecks the sequence pointer after compute and skips decoding for replaced entries, preventing stale results from being applied.
2026-02-04 13:19:48 -08:00
Jeffrey Morgan
cefabd79a8 Revert "cmd: claude launch improvements (#14064)" (#14071)
This reverts commit ee25219edd.
2026-02-04 09:10:37 -08:00
Jeffrey Morgan
df70249520 server: optimize chatPrompt to reduce tokenization calls (#14040)
Change the truncation algorithm to start with all messages and remove
from the front until it fits, rather than adding messages one at a time
from the back. This reduces tokenization calls from O(n) to O(1) in the
common case where all messages fit in context.
2026-02-04 01:21:31 -08:00
Jeffrey Morgan
77eb2ca619 model: add qwen3-next architecture (#14051) 2026-02-03 23:27:21 -08:00
Parth Sareen
ee25219edd cmd: claude launch improvements (#14064) 2026-02-03 19:33:58 -08:00
Jeffrey Morgan
b1fccabb34 Revert "Update vendored llama.cpp to b7847" (#14061) 2026-02-03 18:39:36 -08:00
Bruce MacDonald
a6355329bf cmd: open browser on ollama signin when available (#14055)
When a browser is available open it to the connect URL automatically when running the `ollama signin` command. Browser is not opened in any other unauthorized scenario.
2026-02-03 16:42:09 -08:00
Parth Sareen
0398b24b42 cmd: launch defaults (#14035) 2026-02-02 23:19:11 -08:00
Parth Sareen
75b1dddf91 cmd: launch extra params (#14039) 2026-02-03 02:03:33 -05:00
Parth Sareen
e1e80ffc3e cmd/config: move config location (#14034) 2026-02-02 22:48:51 -05:00
Aleksandr Vukmirovich
71896485fd anthropic: add InputTokens to streaming response (#13934)
---------

Co-authored-by: ParthSareen <parth.sareen@ollama.com>
2026-02-02 18:29:37 -08:00
Jeffrey Morgan
ef00199fb4 Update vendor ggml code to a5bb8ba4 (#13832)
Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
Co-authored-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: Shalini Salomi Bodapati <Shalini.Salomi.Bodapati@ibm.com>
2026-02-02 17:31:59 -08:00
Jeffrey Morgan
8f4a008139 Add GLM-OCR vision model support (#14024) 2026-02-02 15:39:18 -08:00
Patrick Devine
d8cc798c2b glm 4.7 flash support on experimental engine (#13838) 2026-02-02 15:22:11 -08:00
Richard Lyons
6582f6da5c llm: Make "do load request" error message more informative 2026-02-02 11:13:21 -08:00
Jesse Gross
0334ffa625 server: use tiered VRAM-based default context length
Replace binary low VRAM mode with tiered VRAM thresholds that set
default context lengths for all models:

- < 24 GiB VRAM: 4,096 context
- 24-48 GiB VRAM: 32,768 context
- >= 48 GiB VRAM: 262,144 context
2026-02-02 10:47:09 -08:00
Jesse Gross
d11fbd2c60 server: fix ollama ps showing configured instead of actual context length
When context length is clamped to the model's trained context length,
ollama ps now shows the actual clamped value instead of the originally
configured value.
2026-02-02 10:47:09 -08:00
Jeffrey Morgan
6a7c3f188e openclaw: run onboarding for fresh installs (#14006)
When launching OpenClaw without prior onboarding, run the onboarding
wizard instead of going straight to gateway. This ensures proper
gateway configuration (mode, token, etc.) before first use.

- Add onboarded() to check for wizard.lastRunAt marker in config
- Run onboard with --auth-choice skip --gateway-token ollama for fresh installs
- Existing installs (onboarding completed) run gateway directly
2026-02-01 13:46:45 -08:00
Jeffrey Morgan
427e2c962a docs: add redirect from clawdbot to openclaw (#14004) 2026-01-31 20:50:42 -08:00
Thanh Nguyen
27db7f806f cmd/config: rename integration to openclaw (#13979)
---------

Co-authored-by: ParthSareen <parth.sareen@ollama.com>
2026-01-31 18:31:13 -05:00
Dhiraj Lochib
3590fbfa76 runner: fix typo 'baackend' -> 'backend' in error messages (#13645)
Fix typo in three error messages where 'baackend' was written instead
of 'backend' in the /health endpoint handler when initializing the
dummy model load.
2026-01-31 13:26:20 -08:00
noureldin-azzab
cd0094f772 added stakpak to web & desktop (#13961) 2026-01-31 13:04:34 -08:00
Louis Beaumont
06bc8e6712 docs: add Screenpipe to Community Integrations (#13906)
Screenpipe is a 24/7 screen & mic recording tool that uses Ollama
for local LLM-powered search and AI features. 16k+ GitHub stars.
2026-01-31 12:49:52 -08:00
frob
fc5f9bb448 docs: remove unsupported quantizations (#13982) 2026-01-31 12:46:20 -08:00
frob
a0740f7ef7 docs: add GB10 to supported devices (#13987) 2026-01-31 12:45:27 -08:00
Parth Sareen
a0923cbdd0 cmd: ollama launch add placeholder text for selector (#13966) 2026-01-29 09:48:49 -08:00
Seokrin Taron Sung
f92e362b2e cmd: capitalize Ollama in serve command help text (#13965) 2026-01-29 09:47:53 -08:00
Tincho
aa23d8ecd2 docs: update installation command for OpenCode CLI (#13971) 2026-01-29 09:47:02 -08:00
Gabe Goodhart
7b62c41060 cmd/config: use envconfig.Host() for base API in launch config packages (#13937) 2026-01-27 13:30:00 -08:00
Parth Sareen
26acab64b7 docs: add clawdbot (#13925) 2026-01-26 18:32:54 -08:00
Gyungrai Wang
e0f03790b1 parsers/ministral: fix nested tool call parsing by counting brace nesting (#13905)
* parsers/ministral: fix nested tool call parsing by counting brace nesting

* fix lint error

* parsers: refactor ministral parser

The old one was very tied to expecting to see only one token at a time,
which I don't like to assume (who knows what the future might hold wrt
speculative decoding, etc). This new one follows a similar structure to
qwen3-coder's parser, which incidentally makes it easier to test as well
(since we can test the individual events that come out when given
particular inputs).

---------

Co-authored-by: Devon Rifkin <drifkin@drifkin.net>
2026-01-26 15:03:43 -08:00
Parth Sareen
3ab842b0f5 cmd: clawdbot config fixes (#13922) 2026-01-26 14:34:29 -08:00
Parth Sareen
b8e8ef8929 cmd: ollama launch clawdbot (#13921) 2026-01-26 13:40:59 -08:00
Parth Sareen
465d124183 cmd: fix opencode config (#13894) 2026-01-24 18:42:56 -08:00
Parth Sareen
d310e56fa3 cmd: add fallback for claude (#13892) 2026-01-24 18:26:01 -08:00
Jeffrey Morgan
a1ca428c90 glm4moelite: fix attention scale calculation (#13893)
Use the original key dimension (qkNopeHeadDim + qkRopeHeadDim = 256) for
the attention scale instead of the MLA absorbed dimension (kvLoraRank +
qkRopeHeadDim = 576).

MLA absorption is a mathematically equivalent reorganization of the
attention computation - it should not change the effective attention
scale. The scale should match training, which uses 1/sqrt(256).

This improves tool calling and model looping issues.
2026-01-24 17:48:09 -08:00
Jeffrey Morgan
16750865d1 glm4moelite: quantize more tensors to q8_0 and avoid double BOS token (#13891) 2026-01-24 16:33:54 -08:00
Jeffrey Morgan
f3b476c592 build: add -O3 optimization to CGO flags (#13877)
CGO_CFLAGS and CGO_CXXFLAGS were being set without optimization flags,
which overrides Go's default -O2 and results in unoptimized C++ code.

This caused significant performance degradation in release builds
compared to local `go build` which uses the default optimization.

- build_darwin.sh: add -O3 to CGO_CFLAGS and CGO_CXXFLAGS exports
- Dockerfile: preserve CGO_CFLAGS/CGO_CXXFLAGS from build args instead
  of overwriting them
- app/README.md: update documentation to include -O3
2026-01-24 10:55:38 -08:00
Parth Sareen
5267d31d56 docs: ollama launch (#13852) 2026-01-23 23:18:50 -08:00
Stillhart
b44f56319f README: Update the "Ollama for ruby" to the most popular and maintained ruby gem. (#13855)
* update README ruby link

the ollama-ai ruby gem is vastly less popular and seems unmaintained
https://rubygems.org/gems/ollama-ai

the defacto standard with the most downloads in the ruby ecosystem is ruby_llm
https://rubygems.org/gems/ruby_llm

I would link to that to avoid complication and guarantee feature compatibility with ollama.

* Update gem link ruby_llm from website to GitHub

ollama links mostly to github, not project websites, hence link to ruby_llm github.
2026-01-24 01:24:52 -05:00
Jeffrey Morgan
0209c268bb llama: fix CUDA MMA errors in release build (#13874) 2026-01-23 20:10:04 -08:00
Jeffrey Morgan
912d984346 llama: fix fattn-tile shared memory overflow on sm_50/52 (#13872)
Use nthreads=128 for ncols=4 configurations in flash attention tile
kernel to reduce shared memory usage below 48KB limit on Maxwell
architectures (sm_50/52).

With nthreads=256 and ncols=4, np=2 which caused shared memory to
exceed 48KB. With nthreads=128 and ncols=4, np=1 keeps shared memory
under the limit.
2026-01-23 19:22:32 -08:00
Parth Sareen
aae6ecbaff cmd: rename ollama config to ollama launch (#13871) 2026-01-23 18:40:40 -08:00
Jeffrey Morgan
64737330a4 Re-apply "model: add MLA absorption for glm4moelite" with fix (#13870)
The nvidia_fp32 config for (576, 512) head sizes had nbatch_fa=32,
which caused zero-sized arrays when computing array dimensions:
  nbatch_fa / (np * warp_size) = 32 / (2 * 32) = 0

This resulted in CUDA compilation failures on CUDA 12 (Windows and
Linux arm64):
- "static assertion failed with nbatch_fa % (np*warp_size) != 0"
- "the size of an array must be greater than zero"

Fix by changing nbatch_fa from 32 to 64 for all (576, 512) configs
in the nvidia_fp32 function, matching the nvidia_fp16 and AMD configs.
2026-01-23 18:40:28 -08:00
Jeffrey Morgan
2eda97f1c3 Revert "model: add MLA absorption for glm4moelite (#13810)" (#13869)
This reverts commit 1044b0419a.
2026-01-23 17:14:15 -08:00
Jeffrey Morgan
66831dcf70 x/imagegen: fix image editing support (#13866)
- Fix panic in ollama show for image gen models (safe type assertion)
- Add vision capability for Flux2KleinPipeline models at create time
- Flatten transparent PNG images onto white background for better results
2026-01-23 15:37:17 -08:00
Jeffrey Morgan
1044b0419a model: add MLA absorption for glm4moelite (#13810)
* model: add MLA absorption for glm4moelite

Split the combined KV_B tensor into separate K_B and V_B tensors
during conversion, enabling MLA (Multi-head Latent Attention)
absorption which compresses the KV cache for improved efficiency.

* ggml: enable MLA flash attention for GLM-4.7-flash

Add support for gqa_ratio 4 in MLA flash attention kernels. GLM-4.7-flash
uses head size 576 with gqa_ratio 4, which was previously only supported
for gqa_ratio 16 (DeepSeek).

Metal changes:
- Enable head size 576 for flash attention
- Increase simdgroups to 8 for large heads (>=512)
- Add case 8 kernel dispatch for 8 simdgroups

CUDA changes:
- Add gqa_ratio 4 support for head 576/512
- Add tile configs for (576, 512, 4) and (576, 512, 8)
- Add MMA config cases for ncols 4
- Add template instances for ncols2=4

* model: add compatibility validation for glm4moelite architecture
2026-01-23 14:47:42 -08:00
Parth Sareen
771d9280ec cmd: ollama config fix droid model name configuration (#13856) 2026-01-23 11:44:22 -08:00
Jeffrey Morgan
862bc0a3bf x/imagegen: respect stream=false in /api/generate (#13853)
When stream=false is set for image generation requests, return a single
JSON response instead of streaming multiple ndjson progress updates.
2026-01-22 22:16:39 -08:00
Jeffrey Morgan
c01608b6a1 x/imagegen: add image edit capabilities (#13846) 2026-01-22 20:35:08 -08:00
Parth Sareen
199c41e16e cmd: ollama config command to help configure integrations to use Ollama (#13712) 2026-01-22 20:17:11 -08:00
Jeffrey Morgan
3b3bf6c217 x/imagegen: replace memory estimation with actual weight size (#13848)
Remove static VRAM estimation (EstimateVRAM, CheckMemoryRequirements)
which wasn't helpful. Instead, report the actual tensor weight size
from the manifest for ollama ps.

- Remove memory estimation check from runner startup
- Remove EstimateVRAM, CheckMemoryRequirements, modelVRAMEstimates
- Add TotalTensorSize() to get actual weight size from manifest
- Use weight size for Server.vramSize instead of estimates

Note: This is better than showing 0 or inaccurate estimates, but the
weight size is a drastic underestimation of actual memory usage since
it doesn't account for activations, intermediate tensors, or MLX
overhead. Future work should query real-time memory from MLX
(e.g., MetalGetActiveMemory) for accurate reporting.
2026-01-22 18:32:41 -08:00
Parth Sareen
f52c21f457 fix: handle Enter key pressed during model loading (#13839) 2026-01-22 18:32:02 -08:00
658 changed files with 113000 additions and 16498 deletions

View File

@@ -27,7 +27,7 @@ jobs:
echo vendorsha=$(make -f Makefile.sync print-base) | tee -a $GITHUB_OUTPUT echo vendorsha=$(make -f Makefile.sync print-base) | tee -a $GITHUB_OUTPUT
darwin-build: darwin-build:
runs-on: macos-14-xlarge runs-on: macos-26-xlarge
environment: release environment: release
needs: setup-environment needs: setup-environment
env: env:
@@ -117,6 +117,25 @@ jobs:
install: https://sdk.lunarg.com/sdk/download/1.4.321.1/windows/vulkansdk-windows-X64-1.4.321.1.exe install: https://sdk.lunarg.com/sdk/download/1.4.321.1/windows/vulkansdk-windows-X64-1.4.321.1.exe
flags: '' flags: ''
runner_dir: 'vulkan' runner_dir: 'vulkan'
- os: windows
arch: amd64
preset: 'MLX CUDA 13'
install: https://developer.download.nvidia.com/compute/cuda/13.0.0/local_installers/cuda_13.0.0_windows.exe
cudnn-install: https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/windows-x86_64/cudnn-windows-x86_64-9.18.1.3_cuda13-archive.zip
cuda-components:
- '"cudart"'
- '"nvcc"'
- '"cublas"'
- '"cublas_dev"'
- '"cufft"'
- '"cufft_dev"'
- '"nvrtc"'
- '"nvrtc_dev"'
- '"crt"'
- '"nvvm"'
- '"nvptxcompiler"'
cuda-version: '13.0'
flags: ''
runs-on: ${{ matrix.arch == 'arm64' && format('{0}-{1}', matrix.os, matrix.arch) || matrix.os }} runs-on: ${{ matrix.arch == 'arm64' && format('{0}-{1}', matrix.os, matrix.arch) || matrix.os }}
environment: release environment: release
env: env:
@@ -125,8 +144,10 @@ jobs:
- name: Install system dependencies - name: Install system dependencies
run: | run: |
choco install -y --no-progress ccache ninja choco install -y --no-progress ccache ninja
if (Get-Command ccache -ErrorAction SilentlyContinue) {
ccache -o cache_dir=${{ github.workspace }}\.ccache ccache -o cache_dir=${{ github.workspace }}\.ccache
- if: startsWith(matrix.preset, 'CUDA ') || startsWith(matrix.preset, 'ROCm ') || startsWith(matrix.preset, 'Vulkan') }
- if: startsWith(matrix.preset, 'CUDA ') || startsWith(matrix.preset, 'ROCm ') || startsWith(matrix.preset, 'Vulkan') || startsWith(matrix.preset, 'MLX ')
id: cache-install id: cache-install
uses: actions/cache/restore@v4 uses: actions/cache/restore@v4
with: with:
@@ -134,8 +155,9 @@ jobs:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA
C:\Program Files\AMD\ROCm C:\Program Files\AMD\ROCm
C:\VulkanSDK C:\VulkanSDK
key: ${{ matrix.install }} C:\Program Files\NVIDIA\CUDNN
- if: startsWith(matrix.preset, 'CUDA ') key: ${{ matrix.install }}-${{ matrix.cudnn-install }}
- if: startsWith(matrix.preset, 'CUDA ') || startsWith(matrix.preset, 'MLX ')
name: Install CUDA ${{ matrix.cuda-version }} name: Install CUDA ${{ matrix.cuda-version }}
run: | run: |
$ErrorActionPreference = "Stop" $ErrorActionPreference = "Stop"
@@ -179,6 +201,23 @@ jobs:
run: | run: |
echo "CC=clang.exe" | Out-File -FilePath $env:GITHUB_ENV -Append echo "CC=clang.exe" | Out-File -FilePath $env:GITHUB_ENV -Append
echo "CXX=clang++.exe" | Out-File -FilePath $env:GITHUB_ENV -Append echo "CXX=clang++.exe" | Out-File -FilePath $env:GITHUB_ENV -Append
- if: startsWith(matrix.preset, 'MLX ')
name: Install cuDNN for MLX
run: |
$ErrorActionPreference = "Stop"
$cudnnRoot = "C:\Program Files\NVIDIA\CUDNN"
if ("${{ steps.cache-install.outputs.cache-hit }}" -ne 'true') {
Invoke-WebRequest -Uri "${{ matrix.cudnn-install }}" -OutFile "cudnn.zip"
Expand-Archive -Path cudnn.zip -DestinationPath cudnn-extracted
$cudnnDir = (Get-ChildItem -Path cudnn-extracted -Directory)[0].FullName
New-Item -ItemType Directory -Force -Path $cudnnRoot
Copy-Item -Path "$cudnnDir\*" -Destination "$cudnnRoot\" -Recurse
}
echo "CUDNN_ROOT_DIR=$cudnnRoot" | Out-File -FilePath $env:GITHUB_ENV -Append
echo "CUDNN_INCLUDE_PATH=$cudnnRoot\include" | Out-File -FilePath $env:GITHUB_ENV -Append
echo "CUDNN_LIBRARY_PATH=$cudnnRoot\lib\x64" | Out-File -FilePath $env:GITHUB_ENV -Append
echo "$cudnnRoot\bin\x64" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
- if: ${{ !cancelled() && steps.cache-install.outputs.cache-hit != 'true' }} - if: ${{ !cancelled() && steps.cache-install.outputs.cache-hit != 'true' }}
uses: actions/cache/save@v4 uses: actions/cache/save@v4
with: with:
@@ -186,7 +225,8 @@ jobs:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA
C:\Program Files\AMD\ROCm C:\Program Files\AMD\ROCm
C:\VulkanSDK C:\VulkanSDK
key: ${{ matrix.install }} C:\Program Files\NVIDIA\CUDNN
key: ${{ matrix.install }}-${{ matrix.cudnn-install }}
- uses: actions/checkout@v4 - uses: actions/checkout@v4
- uses: actions/cache@v4 - uses: actions/cache@v4
with: with:
@@ -198,7 +238,7 @@ jobs:
Enter-VsDevShell -VsInstallPath 'C:\Program Files\Microsoft Visual Studio\2022\Enterprise' -SkipAutomaticLocation -DevCmdArguments '-arch=x64 -no_logo' Enter-VsDevShell -VsInstallPath 'C:\Program Files\Microsoft Visual Studio\2022\Enterprise' -SkipAutomaticLocation -DevCmdArguments '-arch=x64 -no_logo'
cmake --preset "${{ matrix.preset }}" ${{ matrix.flags }} --install-prefix "$((pwd).Path)\dist\${{ matrix.os }}-${{ matrix.arch }}" cmake --preset "${{ matrix.preset }}" ${{ matrix.flags }} --install-prefix "$((pwd).Path)\dist\${{ matrix.os }}-${{ matrix.arch }}"
cmake --build --parallel ([Environment]::ProcessorCount) --preset "${{ matrix.preset }}" cmake --build --parallel ([Environment]::ProcessorCount) --preset "${{ matrix.preset }}"
cmake --install build --component "${{ startsWith(matrix.preset, 'CUDA ') && 'CUDA' || startsWith(matrix.preset, 'ROCm ') && 'HIP' || startsWith(matrix.preset, 'Vulkan') && 'Vulkan' || 'CPU' }}" --strip cmake --install build --component "${{ startsWith(matrix.preset, 'MLX ') && 'MLX' || startsWith(matrix.preset, 'CUDA ') && 'CUDA' || startsWith(matrix.preset, 'ROCm ') && 'HIP' || startsWith(matrix.preset, 'Vulkan') && 'Vulkan' || 'CPU' }}" --strip
Remove-Item -Path dist\lib\ollama\rocm\rocblas\library\*gfx906* -ErrorAction SilentlyContinue Remove-Item -Path dist\lib\ollama\rocm\rocblas\library\*gfx906* -ErrorAction SilentlyContinue
env: env:
CMAKE_GENERATOR: Ninja CMAKE_GENERATOR: Ninja
@@ -337,6 +377,7 @@ jobs:
name: bundles-windows name: bundles-windows
path: | path: |
dist/*.zip dist/*.zip
dist/*.ps1
dist/OllamaSetup.exe dist/OllamaSetup.exe
linux-build: linux-build:
@@ -383,6 +424,7 @@ jobs:
lib/ollama/cuda_v*) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}.tar.in ;; lib/ollama/cuda_v*) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}.tar.in ;;
lib/ollama/vulkan*) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}.tar.in ;; lib/ollama/vulkan*) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}.tar.in ;;
lib/ollama/mlx*) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}.tar.in ;; lib/ollama/mlx*) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}.tar.in ;;
lib/ollama/include*) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}.tar.in ;;
lib/ollama/cuda_jetpack5) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}-jetpack5.tar.in ;; lib/ollama/cuda_jetpack5) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}-jetpack5.tar.in ;;
lib/ollama/cuda_jetpack6) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}-jetpack6.tar.in ;; lib/ollama/cuda_jetpack6) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}-jetpack6.tar.in ;;
lib/ollama/rocm) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}-rocm.tar.in ;; lib/ollama/rocm) echo $COMPONENT >>ollama-${{ matrix.os }}-${{ matrix.arch }}-rocm.tar.in ;;
@@ -514,6 +556,9 @@ jobs:
- name: Log dist contents - name: Log dist contents
run: | run: |
ls -l dist/ ls -l dist/
- name: Copy install scripts to dist
run: |
cp scripts/install.sh dist/install.sh
- name: Generate checksum file - name: Generate checksum file
run: find . -type f -not -name 'sha256sum.txt' | xargs sha256sum | tee sha256sum.txt run: find . -type f -not -name 'sha256sum.txt' | xargs sha256sum | tee sha256sum.txt
working-directory: dist working-directory: dist
@@ -536,14 +581,22 @@ jobs:
- name: Upload release artifacts - name: Upload release artifacts
run: | run: |
pids=() pids=()
for payload in dist/*.txt dist/*.zip dist/*.tgz dist/*.tar.zst dist/*.exe dist/*.dmg ; do for payload in dist/*.txt dist/*.zip dist/*.tgz dist/*.tar.zst dist/*.exe dist/*.dmg dist/*.ps1 dist/*.sh ; do
echo "Uploading $payload" echo "Uploading $payload"
gh release upload ${GITHUB_REF_NAME} $payload --clobber & gh release upload ${GITHUB_REF_NAME} $payload --clobber &
pids[$!]=$! pids+=($!)
sleep 1 sleep 1
done done
echo "Waiting for uploads to complete" echo "Waiting for uploads to complete"
for pid in "${pids[*]}"; do failed=0
wait $pid for pid in "${pids[@]}"; do
if ! wait $pid; then
echo "::error::Upload failed (pid $pid)"
failed=1
fi
done done
if [ $failed -ne 0 ]; then
echo "One or more uploads failed"
exit 1
fi
echo "done" echo "done"

22
.github/workflows/test-install.yaml vendored Normal file
View File

@@ -0,0 +1,22 @@
name: test-install
on:
pull_request:
paths:
- 'scripts/install.sh'
- '.github/workflows/test-install.yaml'
jobs:
test:
strategy:
matrix:
os: [ubuntu-latest, macos-latest]
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
- name: Run install script
run: sh ./scripts/install.sh
env:
OLLAMA_NO_START: 1 # do not start app
- name: Verify ollama is available
run: ollama --version

View File

@@ -37,7 +37,7 @@ jobs:
| xargs python3 -c "import sys; from pathlib import Path; print(any(Path(x).match(glob) for x in sys.argv[1:] for glob in '$*'.split(' ')))" | xargs python3 -c "import sys; from pathlib import Path; print(any(Path(x).match(glob) for x in sys.argv[1:] for glob in '$*'.split(' ')))"
} }
echo changed=$(changed 'llama/llama.cpp/**/*' 'ml/backend/ggml/ggml/**/*') | tee -a $GITHUB_OUTPUT echo changed=$(changed 'llama/llama.cpp/**/*' 'ml/backend/ggml/ggml/**/*' '.github/**/*') | tee -a $GITHUB_OUTPUT
echo vendorsha=$(make -f Makefile.sync print-base) | tee -a $GITHUB_OUTPUT echo vendorsha=$(make -f Makefile.sync print-base) | tee -a $GITHUB_OUTPUT
linux: linux:
@@ -51,7 +51,7 @@ jobs:
container: nvidia/cuda:13.0.0-devel-ubuntu22.04 container: nvidia/cuda:13.0.0-devel-ubuntu22.04
flags: '-DCMAKE_CUDA_ARCHITECTURES=87' flags: '-DCMAKE_CUDA_ARCHITECTURES=87'
- preset: ROCm - preset: ROCm
container: rocm/dev-ubuntu-22.04:6.1.2 container: rocm/dev-ubuntu-22.04:7.2.1
extra-packages: rocm-libs extra-packages: rocm-libs
flags: '-DAMDGPU_TARGETS=gfx1010 -DCMAKE_PREFIX_PATH=/opt/rocm' flags: '-DAMDGPU_TARGETS=gfx1010 -DCMAKE_PREFIX_PATH=/opt/rocm'
- preset: Vulkan - preset: Vulkan
@@ -60,6 +60,11 @@ jobs:
mesa-vulkan-drivers vulkan-tools mesa-vulkan-drivers vulkan-tools
libvulkan1 libvulkan-dev libvulkan1 libvulkan-dev
vulkan-sdk cmake ccache g++ make vulkan-sdk cmake ccache g++ make
- preset: 'MLX CUDA 13'
container: nvidia/cuda:13.0.0-devel-ubuntu22.04
extra-packages: libcudnn9-dev-cuda-13 libopenblas-dev liblapack-dev liblapacke-dev git curl
flags: '-DCMAKE_CUDA_ARCHITECTURES=87 -DBLAS_INCLUDE_DIRS=/usr/include/x86_64-linux-gnu -DLAPACK_INCLUDE_DIRS=/usr/include/x86_64-linux-gnu'
install-go: true
runs-on: linux runs-on: linux
container: ${{ matrix.container }} container: ${{ matrix.container }}
steps: steps:
@@ -76,19 +81,29 @@ jobs:
$sudo apt-get update $sudo apt-get update
fi fi
$sudo apt-get install -y cmake ccache ${{ matrix.extra-packages }} $sudo apt-get install -y cmake ccache ${{ matrix.extra-packages }}
# MLX requires CMake 3.25+, install from official releases
if [ "${{ matrix.preset }}" = "MLX CUDA 13" ]; then
curl -fsSL https://github.com/Kitware/CMake/releases/download/v3.31.2/cmake-3.31.2-linux-$(uname -m).tar.gz | $sudo tar xz -C /usr/local --strip-components 1
fi
# Export VULKAN_SDK if provided by LunarG package (defensive) # Export VULKAN_SDK if provided by LunarG package (defensive)
if [ -d "/usr/lib/x86_64-linux-gnu/vulkan" ] && [ "${{ matrix.preset }}" = "Vulkan" ]; then if [ -d "/usr/lib/x86_64-linux-gnu/vulkan" ] && [ "${{ matrix.preset }}" = "Vulkan" ]; then
echo "VULKAN_SDK=/usr" >> $GITHUB_ENV echo "VULKAN_SDK=/usr" >> $GITHUB_ENV
fi fi
env: env:
DEBIAN_FRONTEND: noninteractive DEBIAN_FRONTEND: noninteractive
- if: matrix.install-go
name: Install Go
run: |
GO_VERSION=$(awk '/^go / { print $2 }' go.mod)
curl -fsSL "https://golang.org/dl/go${GO_VERSION}.linux-$(dpkg --print-architecture).tar.gz" | tar xz -C /usr/local
echo "/usr/local/go/bin" >> $GITHUB_PATH
- uses: actions/cache@v4 - uses: actions/cache@v4
with: with:
path: /github/home/.cache/ccache path: /github/home/.cache/ccache
key: ccache-${{ runner.os }}-${{ runner.arch }}-${{ matrix.preset }}-${{ needs.changes.outputs.vendorsha }} key: ccache-${{ runner.os }}-${{ runner.arch }}-${{ matrix.preset }}-${{ needs.changes.outputs.vendorsha }}
- run: | - run: |
cmake --preset ${{ matrix.preset }} ${{ matrix.flags }} cmake --preset "${{ matrix.preset }}" ${{ matrix.flags }}
cmake --build --preset ${{ matrix.preset }} --parallel cmake --build --preset "${{ matrix.preset }}" --parallel
windows: windows:
needs: [changes] needs: [changes]
@@ -114,12 +129,31 @@ jobs:
flags: '-DAMDGPU_TARGETS=gfx1010 -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_FLAGS="-parallel-jobs=4 -Wno-ignored-attributes -Wno-deprecated-pragma" -DCMAKE_CXX_FLAGS="-parallel-jobs=4 -Wno-ignored-attributes -Wno-deprecated-pragma"' flags: '-DAMDGPU_TARGETS=gfx1010 -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_FLAGS="-parallel-jobs=4 -Wno-ignored-attributes -Wno-deprecated-pragma" -DCMAKE_CXX_FLAGS="-parallel-jobs=4 -Wno-ignored-attributes -Wno-deprecated-pragma"'
- preset: Vulkan - preset: Vulkan
install: https://sdk.lunarg.com/sdk/download/1.4.321.1/windows/vulkansdk-windows-X64-1.4.321.1.exe install: https://sdk.lunarg.com/sdk/download/1.4.321.1/windows/vulkansdk-windows-X64-1.4.321.1.exe
- preset: 'MLX CUDA 13'
install: https://developer.download.nvidia.com/compute/cuda/13.0.0/local_installers/cuda_13.0.0_windows.exe
cudnn-install: https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/windows-x86_64/cudnn-windows-x86_64-9.18.1.3_cuda13-archive.zip
flags: '-DCMAKE_CUDA_ARCHITECTURES=80'
cuda-components:
- '"cudart"'
- '"nvcc"'
- '"cublas"'
- '"cublas_dev"'
- '"cufft"'
- '"cufft_dev"'
- '"nvrtc"'
- '"nvrtc_dev"'
- '"crt"'
- '"nvvm"'
- '"nvptxcompiler"'
cuda-version: '13.0'
runs-on: windows runs-on: windows
steps: steps:
- run: | - run: |
choco install -y --no-progress ccache ninja choco install -y --no-progress ccache ninja
if (Get-Command ccache -ErrorAction SilentlyContinue) {
ccache -o cache_dir=${{ github.workspace }}\.ccache ccache -o cache_dir=${{ github.workspace }}\.ccache
- if: matrix.preset == 'CUDA' || matrix.preset == 'ROCm' || matrix.preset == 'Vulkan' }
- if: matrix.preset == 'CUDA' || matrix.preset == 'ROCm' || matrix.preset == 'Vulkan' || matrix.preset == 'MLX CUDA 13'
id: cache-install id: cache-install
uses: actions/cache/restore@v4 uses: actions/cache/restore@v4
with: with:
@@ -127,8 +161,9 @@ jobs:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA
C:\Program Files\AMD\ROCm C:\Program Files\AMD\ROCm
C:\VulkanSDK C:\VulkanSDK
key: ${{ matrix.install }} C:\Program Files\NVIDIA\CUDNN
- if: matrix.preset == 'CUDA' key: ${{ matrix.install }}-${{ matrix.cudnn-install }}
- if: matrix.preset == 'CUDA' || matrix.preset == 'MLX CUDA 13'
name: Install CUDA ${{ matrix.cuda-version }} name: Install CUDA ${{ matrix.cuda-version }}
run: | run: |
$ErrorActionPreference = "Stop" $ErrorActionPreference = "Stop"
@@ -168,6 +203,23 @@ jobs:
$vulkanPath = (Resolve-Path "C:\VulkanSDK\*").path $vulkanPath = (Resolve-Path "C:\VulkanSDK\*").path
echo "$vulkanPath\bin" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append echo "$vulkanPath\bin" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
echo "VULKAN_SDK=$vulkanPath" >> $env:GITHUB_ENV echo "VULKAN_SDK=$vulkanPath" >> $env:GITHUB_ENV
- if: matrix.preset == 'MLX CUDA 13'
name: Install cuDNN for MLX
run: |
$ErrorActionPreference = "Stop"
$cudnnRoot = "C:\Program Files\NVIDIA\CUDNN"
if ("${{ steps.cache-install.outputs.cache-hit }}" -ne 'true') {
Invoke-WebRequest -Uri "${{ matrix.cudnn-install }}" -OutFile "cudnn.zip"
Expand-Archive -Path cudnn.zip -DestinationPath cudnn-extracted
$cudnnDir = (Get-ChildItem -Path cudnn-extracted -Directory)[0].FullName
New-Item -ItemType Directory -Force -Path $cudnnRoot
Copy-Item -Path "$cudnnDir\*" -Destination "$cudnnRoot\" -Recurse
}
echo "CUDNN_ROOT_DIR=$cudnnRoot" | Out-File -FilePath $env:GITHUB_ENV -Append
echo "CUDNN_INCLUDE_PATH=$cudnnRoot\include" | Out-File -FilePath $env:GITHUB_ENV -Append
echo "CUDNN_LIBRARY_PATH=$cudnnRoot\lib\x64" | Out-File -FilePath $env:GITHUB_ENV -Append
echo "$cudnnRoot\bin\x64" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
- if: ${{ !cancelled() && steps.cache-install.outputs.cache-hit != 'true' }} - if: ${{ !cancelled() && steps.cache-install.outputs.cache-hit != 'true' }}
uses: actions/cache/save@v4 uses: actions/cache/save@v4
with: with:
@@ -175,7 +227,8 @@ jobs:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA
C:\Program Files\AMD\ROCm C:\Program Files\AMD\ROCm
C:\VulkanSDK C:\VulkanSDK
key: ${{ matrix.install }} C:\Program Files\NVIDIA\CUDNN
key: ${{ matrix.install }}-${{ matrix.cudnn-install }}
- uses: actions/checkout@v4 - uses: actions/checkout@v4
- uses: actions/cache@v4 - uses: actions/cache@v4
with: with:

1
.gitignore vendored
View File

@@ -15,3 +15,4 @@ __debug_bin*
llama/build llama/build
llama/vendor llama/vendor
/ollama /ollama
integration/testdata/models/

View File

@@ -64,10 +64,15 @@ set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${OLLAMA_BUILD_DIR})
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY_DEBUG ${OLLAMA_BUILD_DIR}) set(CMAKE_LIBRARY_OUTPUT_DIRECTORY_DEBUG ${OLLAMA_BUILD_DIR})
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY_RELEASE ${OLLAMA_BUILD_DIR}) set(CMAKE_LIBRARY_OUTPUT_DIRECTORY_RELEASE ${OLLAMA_BUILD_DIR})
include_directories(${CMAKE_CURRENT_SOURCE_DIR}/ml/backend/ggml/ggml/src) # Store ggml include paths for use with target_include_directories later.
include_directories(${CMAKE_CURRENT_SOURCE_DIR}/ml/backend/ggml/ggml/src/include) # We avoid global include_directories() to prevent polluting the include path
include_directories(${CMAKE_CURRENT_SOURCE_DIR}/ml/backend/ggml/ggml/src/ggml-cpu) # for other projects like MLX (whose openblas dependency has its own common.h).
include_directories(${CMAKE_CURRENT_SOURCE_DIR}/ml/backend/ggml/ggml/src/ggml-cpu/amx) set(GGML_INCLUDE_DIRS
${CMAKE_CURRENT_SOURCE_DIR}/ml/backend/ggml/ggml/src
${CMAKE_CURRENT_SOURCE_DIR}/ml/backend/ggml/ggml/src/include
${CMAKE_CURRENT_SOURCE_DIR}/ml/backend/ggml/ggml/src/ggml-cpu
${CMAKE_CURRENT_SOURCE_DIR}/ml/backend/ggml/ggml/src/ggml-cpu/amx
)
add_compile_definitions(NDEBUG GGML_VERSION=0x0 GGML_COMMIT=0x0) add_compile_definitions(NDEBUG GGML_VERSION=0x0 GGML_COMMIT=0x0)
@@ -87,6 +92,14 @@ if(NOT CPU_VARIANTS)
set(CPU_VARIANTS "ggml-cpu") set(CPU_VARIANTS "ggml-cpu")
endif() endif()
# Apply ggml include directories to ggml targets only (not globally)
target_include_directories(ggml-base PRIVATE ${GGML_INCLUDE_DIRS})
foreach(variant ${CPU_VARIANTS})
if(TARGET ${variant})
target_include_directories(${variant} PRIVATE ${GGML_INCLUDE_DIRS})
endif()
endforeach()
install(TARGETS ggml-base ${CPU_VARIANTS} install(TARGETS ggml-base ${CPU_VARIANTS}
RUNTIME_DEPENDENCIES RUNTIME_DEPENDENCIES
PRE_EXCLUDE_REGEXES ".*" PRE_EXCLUDE_REGEXES ".*"
@@ -103,6 +116,7 @@ if(CMAKE_CUDA_COMPILER)
find_package(CUDAToolkit) find_package(CUDAToolkit)
add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/ml/backend/ggml/ggml/src/ggml-cuda) add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/ml/backend/ggml/ggml/src/ggml-cuda)
target_include_directories(ggml-cuda PRIVATE ${GGML_INCLUDE_DIRS})
install(TARGETS ggml-cuda install(TARGETS ggml-cuda
RUNTIME_DEPENDENCIES RUNTIME_DEPENDENCIES
DIRECTORIES ${CUDAToolkit_BIN_DIR} ${CUDAToolkit_BIN_DIR}/x64 ${CUDAToolkit_LIBRARY_DIR} DIRECTORIES ${CUDAToolkit_BIN_DIR} ${CUDAToolkit_BIN_DIR}/x64 ${CUDAToolkit_LIBRARY_DIR}
@@ -134,6 +148,7 @@ if(CMAKE_HIP_COMPILER)
if(AMDGPU_TARGETS) if(AMDGPU_TARGETS)
find_package(hip REQUIRED) find_package(hip REQUIRED)
add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/ml/backend/ggml/ggml/src/ggml-hip) add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/ml/backend/ggml/ggml/src/ggml-hip)
target_include_directories(ggml-hip PRIVATE ${GGML_INCLUDE_DIRS})
if (WIN32) if (WIN32)
target_compile_definitions(ggml-hip PRIVATE GGML_CUDA_NO_PEER_COPY) target_compile_definitions(ggml-hip PRIVATE GGML_CUDA_NO_PEER_COPY)
@@ -148,7 +163,7 @@ if(CMAKE_HIP_COMPILER)
) )
install(RUNTIME_DEPENDENCY_SET rocm install(RUNTIME_DEPENDENCY_SET rocm
DIRECTORIES ${HIP_BIN_INSTALL_DIR} ${HIP_LIB_INSTALL_DIR} DIRECTORIES ${HIP_BIN_INSTALL_DIR} ${HIP_LIB_INSTALL_DIR}
PRE_INCLUDE_REGEXES hipblas rocblas amdhip64 rocsolver amd_comgr hsa-runtime64 rocsparse tinfo rocprofiler-register drm drm_amdgpu numa elf PRE_INCLUDE_REGEXES hipblas rocblas amdhip64 rocsolver amd_comgr hsa-runtime64 rocsparse tinfo rocprofiler-register roctx64 rocroller drm drm_amdgpu numa elf
PRE_EXCLUDE_REGEXES ".*" PRE_EXCLUDE_REGEXES ".*"
POST_EXCLUDE_REGEXES "system32" POST_EXCLUDE_REGEXES "system32"
RUNTIME DESTINATION ${OLLAMA_INSTALL_DIR} COMPONENT HIP RUNTIME DESTINATION ${OLLAMA_INSTALL_DIR} COMPONENT HIP
@@ -168,6 +183,7 @@ if(NOT APPLE)
find_package(Vulkan) find_package(Vulkan)
if(Vulkan_FOUND) if(Vulkan_FOUND)
add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/ml/backend/ggml/ggml/src/ggml-vulkan) add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/ml/backend/ggml/ggml/src/ggml-vulkan)
target_include_directories(ggml-vulkan PRIVATE ${GGML_INCLUDE_DIRS})
install(TARGETS ggml-vulkan install(TARGETS ggml-vulkan
RUNTIME_DEPENDENCIES RUNTIME_DEPENDENCIES
PRE_INCLUDE_REGEXES vulkan PRE_INCLUDE_REGEXES vulkan
@@ -179,18 +195,43 @@ if(NOT APPLE)
endif() endif()
option(MLX_ENGINE "Enable MLX backend" OFF) option(MLX_ENGINE "Enable MLX backend" OFF)
if(MLX_ENGINE) if(MLX_ENGINE)
message(STATUS "Setting up MLX (this takes a while...)") message(STATUS "Setting up MLX (this takes a while...)")
add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/x/ml/backend/mlx) add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/x/imagegen/mlx)
# Find CUDA toolkit if MLX is built with CUDA support # Find CUDA toolkit if MLX is built with CUDA support
find_package(CUDAToolkit) find_package(CUDAToolkit)
# Build list of directories for runtime dependency resolution
set(MLX_RUNTIME_DIRS ${CUDAToolkit_BIN_DIR} ${CUDAToolkit_BIN_DIR}/x64 ${CUDAToolkit_LIBRARY_DIR})
# Add cuDNN bin paths for DLLs (Windows MLX CUDA builds)
# CUDNN_ROOT_DIR is the standard CMake variable for cuDNN location
if(DEFINED ENV{CUDNN_ROOT_DIR})
# cuDNN 9.x has versioned subdirectories under bin/ (e.g., bin/13.0/)
file(GLOB CUDNN_BIN_SUBDIRS "$ENV{CUDNN_ROOT_DIR}/bin/*")
list(APPEND MLX_RUNTIME_DIRS ${CUDNN_BIN_SUBDIRS})
endif()
# Add build output directory and MLX dependency build directories
list(APPEND MLX_RUNTIME_DIRS ${OLLAMA_BUILD_DIR})
# OpenBLAS DLL location (pre-built zip extracts into openblas-src/bin/)
list(APPEND MLX_RUNTIME_DIRS ${CMAKE_BINARY_DIR}/_deps/openblas-src/bin)
# NCCL: on Linux, if real NCCL is found, cmake bundles libnccl.so via the
# regex below. If NCCL is not found, MLX links a static stub (OBJECT lib)
# so there is no runtime dependency. This path covers the stub build dir
# for windows so we include the DLL in our dependencies.
list(APPEND MLX_RUNTIME_DIRS ${CMAKE_BINARY_DIR}/_deps/mlx-build/mlx/distributed/nccl/nccl_stub-prefix/src/nccl_stub-build/Release)
# Base regexes for runtime dependencies (cross-platform)
set(MLX_INCLUDE_REGEXES cublas cublasLt cudart cufft nvrtc nvrtc-builtins cudnn nccl openblas gfortran)
# On Windows, also include dl.dll (dlfcn-win32 POSIX emulation layer)
if(WIN32)
list(APPEND MLX_INCLUDE_REGEXES "^dl\\.dll$")
endif()
install(TARGETS mlx mlxc install(TARGETS mlx mlxc
RUNTIME_DEPENDENCIES RUNTIME_DEPENDENCIES
DIRECTORIES ${CUDAToolkit_BIN_DIR} ${CUDAToolkit_BIN_DIR}/x64 ${CUDAToolkit_LIBRARY_DIR} DIRECTORIES ${MLX_RUNTIME_DIRS}
PRE_INCLUDE_REGEXES cublas cublasLt cudart nvrtc nvrtc-builtins cudnn nccl openblas gfortran PRE_INCLUDE_REGEXES ${MLX_INCLUDE_REGEXES}
PRE_EXCLUDE_REGEXES ".*" PRE_EXCLUDE_REGEXES ".*"
RUNTIME DESTINATION ${OLLAMA_INSTALL_DIR} COMPONENT MLX RUNTIME DESTINATION ${OLLAMA_INSTALL_DIR} COMPONENT MLX
LIBRARY DESTINATION ${OLLAMA_INSTALL_DIR} COMPONENT MLX LIBRARY DESTINATION ${OLLAMA_INSTALL_DIR} COMPONENT MLX
@@ -205,13 +246,117 @@ if(MLX_ENGINE)
COMPONENT MLX) COMPONENT MLX)
endif() endif()
# Manually install cudart and cublas since they might not be picked up as direct dependencies # Install headers for NVRTC JIT compilation at runtime.
# MLX's own install rules use the default component so they get skipped by
# --component MLX. Headers are installed alongside libmlx in OLLAMA_INSTALL_DIR.
#
# Layout:
# ${OLLAMA_INSTALL_DIR}/include/cccl/{cuda,nv}/ — CCCL headers
# ${OLLAMA_INSTALL_DIR}/include/*.h — CUDA toolkit headers
#
# MLX's jit_module.cpp resolves CCCL via
# current_binary_dir()[.parent_path()] / "include" / "cccl"
# On Linux, MLX's jit_module.cpp resolves CCCL via
# current_binary_dir().parent_path() / "include" / "cccl", so we create a
# symlink from lib/ollama/include -> ${OLLAMA_RUNNER_DIR}/include
# This will need refinement if we add multiple CUDA versions for MLX in the future.
# CUDA runtime headers are found via CUDA_PATH env var (set by mlxrunner).
if(EXISTS ${CMAKE_BINARY_DIR}/_deps/cccl-src/include/cuda)
install(DIRECTORY ${CMAKE_BINARY_DIR}/_deps/cccl-src/include/cuda
DESTINATION ${OLLAMA_INSTALL_DIR}/include/cccl
COMPONENT MLX)
install(DIRECTORY ${CMAKE_BINARY_DIR}/_deps/cccl-src/include/nv
DESTINATION ${OLLAMA_INSTALL_DIR}/include/cccl
COMPONENT MLX)
if(NOT WIN32 AND NOT APPLE)
install(CODE "
set(_link \"${CMAKE_INSTALL_PREFIX}/lib/ollama/include\")
set(_target \"${OLLAMA_RUNNER_DIR}/include\")
if(NOT EXISTS \${_link})
execute_process(COMMAND \${CMAKE_COMMAND} -E create_symlink \${_target} \${_link})
endif()
" COMPONENT MLX)
endif()
endif()
# Install minimal CUDA toolkit headers needed by MLX JIT kernels.
# These are the transitive closure of includes from mlx/backend/cuda/device/*.cuh.
# The Go mlxrunner sets CUDA_PATH to OLLAMA_INSTALL_DIR so MLX finds them at
# $CUDA_PATH/include/*.h via NVRTC --include-path.
if(CUDAToolkit_FOUND) if(CUDAToolkit_FOUND)
file(GLOB CUDART_LIBS # CUDAToolkit_INCLUDE_DIRS may be a semicolon-separated list
# (e.g. ".../include;.../include/cccl"). Find the entry that
# contains the CUDA runtime headers we need.
set(_cuda_inc "")
foreach(_dir ${CUDAToolkit_INCLUDE_DIRS})
if(EXISTS "${_dir}/cuda_runtime_api.h")
set(_cuda_inc "${_dir}")
break()
endif()
endforeach()
if(NOT _cuda_inc)
message(WARNING "Could not find cuda_runtime_api.h in CUDAToolkit_INCLUDE_DIRS: ${CUDAToolkit_INCLUDE_DIRS}")
else()
set(_dst "${OLLAMA_INSTALL_DIR}/include")
set(_MLX_JIT_CUDA_HEADERS
builtin_types.h
cooperative_groups.h
cuda_bf16.h
cuda_bf16.hpp
cuda_device_runtime_api.h
cuda_fp16.h
cuda_fp16.hpp
cuda_fp8.h
cuda_fp8.hpp
cuda_runtime_api.h
device_types.h
driver_types.h
math_constants.h
surface_types.h
texture_types.h
vector_functions.h
vector_functions.hpp
vector_types.h
)
foreach(_hdr ${_MLX_JIT_CUDA_HEADERS})
install(FILES "${_cuda_inc}/${_hdr}"
DESTINATION ${_dst}
COMPONENT MLX)
endforeach()
# Subdirectory headers
install(DIRECTORY "${_cuda_inc}/cooperative_groups"
DESTINATION ${_dst}
COMPONENT MLX
FILES_MATCHING PATTERN "*.h")
install(FILES "${_cuda_inc}/crt/host_defines.h"
DESTINATION "${_dst}/crt"
COMPONENT MLX)
endif()
endif()
# On Windows, explicitly install dl.dll (dlfcn-win32 POSIX dlopen emulation)
# RUNTIME_DEPENDENCIES auto-excludes it via POST_EXCLUDE_FILES_STRICT because
# dlfcn-win32 is a known CMake target with its own install rules (which install
# to the wrong destination). We must install it explicitly here.
if(WIN32)
install(FILES ${OLLAMA_BUILD_DIR}/dl.dll
DESTINATION ${OLLAMA_INSTALL_DIR}
COMPONENT MLX)
endif()
# Manually install CUDA runtime libraries that MLX loads via dlopen
# (not detected by RUNTIME_DEPENDENCIES since they aren't link-time deps)
if(CUDAToolkit_FOUND)
file(GLOB MLX_CUDA_LIBS
"${CUDAToolkit_LIBRARY_DIR}/libcudart.so*" "${CUDAToolkit_LIBRARY_DIR}/libcudart.so*"
"${CUDAToolkit_LIBRARY_DIR}/libcublas.so*") "${CUDAToolkit_LIBRARY_DIR}/libcublas.so*"
if(CUDART_LIBS) "${CUDAToolkit_LIBRARY_DIR}/libcublasLt.so*"
install(FILES ${CUDART_LIBS} "${CUDAToolkit_LIBRARY_DIR}/libnvrtc.so*"
"${CUDAToolkit_LIBRARY_DIR}/libnvrtc-builtins.so*"
"${CUDAToolkit_LIBRARY_DIR}/libcufft.so*"
"${CUDAToolkit_LIBRARY_DIR}/libcudnn.so*")
if(MLX_CUDA_LIBS)
install(FILES ${MLX_CUDA_LIBS}
DESTINATION ${OLLAMA_INSTALL_DIR} DESTINATION ${OLLAMA_INSTALL_DIR}
COMPONENT MLX) COMPONENT MLX)
endif() endif()

View File

@@ -77,6 +77,15 @@
"OLLAMA_RUNNER_DIR": "rocm" "OLLAMA_RUNNER_DIR": "rocm"
} }
}, },
{
"name": "ROCm 7",
"inherits": [ "ROCm" ],
"cacheVariables": {
"CMAKE_HIP_FLAGS": "-parallel-jobs=4",
"AMDGPU_TARGETS": "gfx942;gfx950;gfx1010;gfx1012;gfx1030;gfx1100;gfx1101;gfx1102;gfx1103;gfx1150;gfx1151;gfx1200;gfx1201;gfx908:xnack-;gfx90a:xnack+;gfx90a:xnack-",
"OLLAMA_RUNNER_DIR": "rocm"
}
},
{ {
"name": "Vulkan", "name": "Vulkan",
"inherits": [ "Default" ], "inherits": [ "Default" ],
@@ -103,6 +112,7 @@
"name": "MLX CUDA 13", "name": "MLX CUDA 13",
"inherits": [ "MLX", "CUDA 13" ], "inherits": [ "MLX", "CUDA 13" ],
"cacheVariables": { "cacheVariables": {
"MLX_CUDA_ARCHITECTURES": "86;89;90;90a;100;103;75-virtual;80-virtual;110-virtual;120-virtual;121-virtual",
"OLLAMA_RUNNER_DIR": "mlx_cuda_v13" "OLLAMA_RUNNER_DIR": "mlx_cuda_v13"
} }
} }
@@ -158,6 +168,11 @@
"inherits": [ "ROCm" ], "inherits": [ "ROCm" ],
"configurePreset": "ROCm 6" "configurePreset": "ROCm 6"
}, },
{
"name": "ROCm 7",
"inherits": [ "ROCm" ],
"configurePreset": "ROCm 7"
},
{ {
"name": "Vulkan", "name": "Vulkan",
"targets": [ "ggml-vulkan" ], "targets": [ "ggml-vulkan" ],

View File

@@ -1,33 +1,23 @@
# vim: filetype=dockerfile # vim: filetype=dockerfile
ARG FLAVOR=${TARGETARCH} ARG FLAVOR=${TARGETARCH}
ARG PARALLEL=8
ARG ROCMVERSION=6.3.3 ARG ROCMVERSION=7.2.1
ARG JETPACK5VERSION=r35.4.1 ARG JETPACK5VERSION=r35.4.1
ARG JETPACK6VERSION=r36.4.0 ARG JETPACK6VERSION=r36.4.0
ARG CMAKEVERSION=3.31.2 ARG CMAKEVERSION=3.31.2
ARG NINJAVERSION=1.12.1
ARG VULKANVERSION=1.4.321.1 ARG VULKANVERSION=1.4.321.1
# We require gcc v10 minimum. v10.3 has regressions, so the rockylinux 8.5 AppStream has the latest compatible version # Default empty stages for local MLX source overrides.
# Override with: docker build --build-context local-mlx=../mlx --build-context local-mlx-c=../mlx-c
FROM scratch AS local-mlx
FROM scratch AS local-mlx-c
FROM --platform=linux/amd64 rocm/dev-almalinux-8:${ROCMVERSION}-complete AS base-amd64 FROM --platform=linux/amd64 rocm/dev-almalinux-8:${ROCMVERSION}-complete AS base-amd64
RUN yum install -y yum-utils \ RUN dnf install -y yum-utils ccache gcc-toolset-11-gcc gcc-toolset-11-gcc-c++ gcc-toolset-11-binutils \
&& yum-config-manager --add-repo https://dl.rockylinux.org/vault/rocky/8.5/AppStream/\$basearch/os/ \
&& rpm --import https://dl.rockylinux.org/pub/rocky/RPM-GPG-KEY-Rocky-8 \
&& dnf install -y yum-utils ccache gcc-toolset-10-gcc-10.2.1-8.2.el8 gcc-toolset-10-gcc-c++-10.2.1-8.2.el8 gcc-toolset-10-binutils-2.35-11.el8 \
&& dnf install -y ccache \
&& yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo && yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo
ENV PATH=/opt/rh/gcc-toolset-10/root/usr/bin:$PATH ENV PATH=/opt/rh/gcc-toolset-11/root/usr/bin:$PATH
ARG VULKANVERSION
RUN wget https://sdk.lunarg.com/sdk/download/${VULKANVERSION}/linux/vulkansdk-linux-x86_64-${VULKANVERSION}.tar.xz -O /tmp/vulkansdk-linux-x86_64-${VULKANVERSION}.tar.xz \
&& tar xvf /tmp/vulkansdk-linux-x86_64-${VULKANVERSION}.tar.xz \
&& dnf -y install ninja-build \
&& ln -s /usr/bin/python3 /usr/bin/python \
&& /${VULKANVERSION}/vulkansdk -j 8 vulkan-headers \
&& /${VULKANVERSION}/vulkansdk -j 8 shaderc
RUN cp -r /${VULKANVERSION}/x86_64/include/* /usr/local/include/ \
&& cp -r /${VULKANVERSION}/x86_64/lib/* /usr/local/lib
ENV PATH=/${VULKANVERSION}/x86_64/bin:$PATH
FROM --platform=linux/arm64 almalinux:8 AS base-arm64 FROM --platform=linux/arm64 almalinux:8 AS base-arm64
# install epel-release for ccache # install epel-release for ccache
@@ -38,100 +28,119 @@ ENV CC=clang CXX=clang++
FROM base-${TARGETARCH} AS base FROM base-${TARGETARCH} AS base
ARG CMAKEVERSION ARG CMAKEVERSION
ARG NINJAVERSION
RUN curl -fsSL https://github.com/Kitware/CMake/releases/download/v${CMAKEVERSION}/cmake-${CMAKEVERSION}-linux-$(uname -m).tar.gz | tar xz -C /usr/local --strip-components 1 RUN curl -fsSL https://github.com/Kitware/CMake/releases/download/v${CMAKEVERSION}/cmake-${CMAKEVERSION}-linux-$(uname -m).tar.gz | tar xz -C /usr/local --strip-components 1
RUN dnf install -y unzip \
&& curl -fsSL -o /tmp/ninja.zip https://github.com/ninja-build/ninja/releases/download/v${NINJAVERSION}/ninja-linux$([ "$(uname -m)" = "aarch64" ] && echo "-aarch64").zip \
&& unzip /tmp/ninja.zip -d /usr/local/bin \
&& rm /tmp/ninja.zip
ENV CMAKE_GENERATOR=Ninja
ENV LDFLAGS=-s ENV LDFLAGS=-s
FROM base AS cpu FROM base AS cpu
RUN dnf install -y gcc-toolset-11-gcc gcc-toolset-11-gcc-c++ RUN dnf install -y gcc-toolset-11-gcc gcc-toolset-11-gcc-c++
ENV PATH=/opt/rh/gcc-toolset-11/root/usr/bin:$PATH ENV PATH=/opt/rh/gcc-toolset-11/root/usr/bin:$PATH
ARG PARALLEL
COPY CMakeLists.txt CMakePresets.json . COPY CMakeLists.txt CMakePresets.json .
COPY ml/backend/ggml/ggml ml/backend/ggml/ggml COPY ml/backend/ggml/ggml ml/backend/ggml/ggml
RUN --mount=type=cache,target=/root/.ccache \ RUN --mount=type=cache,target=/root/.ccache \
cmake --preset 'CPU' \ cmake --preset 'CPU' \
&& cmake --build --parallel ${PARALLEL} --preset 'CPU' \ && cmake --build --preset 'CPU' -- -l $(nproc) \
&& cmake --install build --component CPU --strip --parallel ${PARALLEL} && cmake --install build --component CPU --strip
FROM base AS cuda-11 FROM base AS cuda-11
ARG CUDA11VERSION=11.8 ARG CUDA11VERSION=11.8
RUN dnf install -y cuda-toolkit-${CUDA11VERSION//./-} RUN dnf install -y cuda-toolkit-${CUDA11VERSION//./-}
ENV PATH=/usr/local/cuda-11/bin:$PATH ENV PATH=/usr/local/cuda-11/bin:$PATH
ARG PARALLEL
COPY CMakeLists.txt CMakePresets.json . COPY CMakeLists.txt CMakePresets.json .
COPY ml/backend/ggml/ggml ml/backend/ggml/ggml COPY ml/backend/ggml/ggml ml/backend/ggml/ggml
RUN --mount=type=cache,target=/root/.ccache \ RUN --mount=type=cache,target=/root/.ccache \
cmake --preset 'CUDA 11' \ cmake --preset 'CUDA 11' \
&& cmake --build --parallel ${PARALLEL} --preset 'CUDA 11' \ && cmake --build --preset 'CUDA 11' -- -l $(nproc) \
&& cmake --install build --component CUDA --strip --parallel ${PARALLEL} && cmake --install build --component CUDA --strip
FROM base AS cuda-12 FROM base AS cuda-12
ARG CUDA12VERSION=12.8 ARG CUDA12VERSION=12.8
RUN dnf install -y cuda-toolkit-${CUDA12VERSION//./-} RUN dnf install -y cuda-toolkit-${CUDA12VERSION//./-}
ENV PATH=/usr/local/cuda-12/bin:$PATH ENV PATH=/usr/local/cuda-12/bin:$PATH
ARG PARALLEL
COPY CMakeLists.txt CMakePresets.json . COPY CMakeLists.txt CMakePresets.json .
COPY ml/backend/ggml/ggml ml/backend/ggml/ggml COPY ml/backend/ggml/ggml ml/backend/ggml/ggml
RUN --mount=type=cache,target=/root/.ccache \ RUN --mount=type=cache,target=/root/.ccache \
cmake --preset 'CUDA 12' \ cmake --preset 'CUDA 12' \
&& cmake --build --parallel ${PARALLEL} --preset 'CUDA 12' \ && cmake --build --preset 'CUDA 12' -- -l $(nproc) \
&& cmake --install build --component CUDA --strip --parallel ${PARALLEL} && cmake --install build --component CUDA --strip
FROM base AS cuda-13 FROM base AS cuda-13
ARG CUDA13VERSION=13.0 ARG CUDA13VERSION=13.0
RUN dnf install -y cuda-toolkit-${CUDA13VERSION//./-} RUN dnf install -y cuda-toolkit-${CUDA13VERSION//./-}
ENV PATH=/usr/local/cuda-13/bin:$PATH ENV PATH=/usr/local/cuda-13/bin:$PATH
ARG PARALLEL
COPY CMakeLists.txt CMakePresets.json . COPY CMakeLists.txt CMakePresets.json .
COPY ml/backend/ggml/ggml ml/backend/ggml/ggml COPY ml/backend/ggml/ggml ml/backend/ggml/ggml
RUN --mount=type=cache,target=/root/.ccache \ RUN --mount=type=cache,target=/root/.ccache \
cmake --preset 'CUDA 13' \ cmake --preset 'CUDA 13' \
&& cmake --build --parallel ${PARALLEL} --preset 'CUDA 13' \ && cmake --build --preset 'CUDA 13' -- -l $(nproc) \
&& cmake --install build --component CUDA --strip --parallel ${PARALLEL} && cmake --install build --component CUDA --strip
FROM base AS rocm-6 FROM base AS rocm-7
ENV PATH=/opt/rocm/hcc/bin:/opt/rocm/hip/bin:/opt/rocm/bin:/opt/rocm/hcc/bin:$PATH ENV PATH=/opt/rocm/hcc/bin:/opt/rocm/hip/bin:/opt/rocm/bin:/opt/rocm/hcc/bin:$PATH
ARG PARALLEL
COPY CMakeLists.txt CMakePresets.json . COPY CMakeLists.txt CMakePresets.json .
COPY ml/backend/ggml/ggml ml/backend/ggml/ggml COPY ml/backend/ggml/ggml ml/backend/ggml/ggml
RUN --mount=type=cache,target=/root/.ccache \ RUN --mount=type=cache,target=/root/.ccache \
cmake --preset 'ROCm 6' \ cmake --preset 'ROCm 7' \
&& cmake --build --parallel ${PARALLEL} --preset 'ROCm 6' \ && cmake --build --preset 'ROCm 7' -- -l $(nproc) \
&& cmake --install build --component HIP --strip --parallel ${PARALLEL} && cmake --install build --component HIP --strip
RUN rm -f dist/lib/ollama/rocm/rocblas/library/*gfx90[06]* RUN rm -f dist/lib/ollama/rocm/rocblas/library/*gfx90[06]*
FROM --platform=linux/arm64 nvcr.io/nvidia/l4t-jetpack:${JETPACK5VERSION} AS jetpack-5 FROM --platform=linux/arm64 nvcr.io/nvidia/l4t-jetpack:${JETPACK5VERSION} AS jetpack-5
ARG CMAKEVERSION ARG CMAKEVERSION
RUN apt-get update && apt-get install -y curl ccache \ ARG NINJAVERSION
&& curl -fsSL https://github.com/Kitware/CMake/releases/download/v${CMAKEVERSION}/cmake-${CMAKEVERSION}-linux-$(uname -m).tar.gz | tar xz -C /usr/local --strip-components 1 RUN apt-get update && apt-get install -y curl ccache unzip \
&& curl -fsSL https://github.com/Kitware/CMake/releases/download/v${CMAKEVERSION}/cmake-${CMAKEVERSION}-linux-$(uname -m).tar.gz | tar xz -C /usr/local --strip-components 1 \
&& curl -fsSL -o /tmp/ninja.zip https://github.com/ninja-build/ninja/releases/download/v${NINJAVERSION}/ninja-linux-aarch64.zip \
&& unzip /tmp/ninja.zip -d /usr/local/bin \
&& rm /tmp/ninja.zip
ENV CMAKE_GENERATOR=Ninja
COPY CMakeLists.txt CMakePresets.json . COPY CMakeLists.txt CMakePresets.json .
COPY ml/backend/ggml/ggml ml/backend/ggml/ggml COPY ml/backend/ggml/ggml ml/backend/ggml/ggml
ARG PARALLEL
RUN --mount=type=cache,target=/root/.ccache \ RUN --mount=type=cache,target=/root/.ccache \
cmake --preset 'JetPack 5' \ cmake --preset 'JetPack 5' \
&& cmake --build --parallel ${PARALLEL} --preset 'JetPack 5' \ && cmake --build --preset 'JetPack 5' -- -l $(nproc) \
&& cmake --install build --component CUDA --strip --parallel ${PARALLEL} && cmake --install build --component CUDA --strip
FROM --platform=linux/arm64 nvcr.io/nvidia/l4t-jetpack:${JETPACK6VERSION} AS jetpack-6 FROM --platform=linux/arm64 nvcr.io/nvidia/l4t-jetpack:${JETPACK6VERSION} AS jetpack-6
ARG CMAKEVERSION ARG CMAKEVERSION
RUN apt-get update && apt-get install -y curl ccache \ ARG NINJAVERSION
&& curl -fsSL https://github.com/Kitware/CMake/releases/download/v${CMAKEVERSION}/cmake-${CMAKEVERSION}-linux-$(uname -m).tar.gz | tar xz -C /usr/local --strip-components 1 RUN apt-get update && apt-get install -y curl ccache unzip \
&& curl -fsSL https://github.com/Kitware/CMake/releases/download/v${CMAKEVERSION}/cmake-${CMAKEVERSION}-linux-$(uname -m).tar.gz | tar xz -C /usr/local --strip-components 1 \
&& curl -fsSL -o /tmp/ninja.zip https://github.com/ninja-build/ninja/releases/download/v${NINJAVERSION}/ninja-linux-aarch64.zip \
&& unzip /tmp/ninja.zip -d /usr/local/bin \
&& rm /tmp/ninja.zip
ENV CMAKE_GENERATOR=Ninja
COPY CMakeLists.txt CMakePresets.json . COPY CMakeLists.txt CMakePresets.json .
COPY ml/backend/ggml/ggml ml/backend/ggml/ggml COPY ml/backend/ggml/ggml ml/backend/ggml/ggml
ARG PARALLEL
RUN --mount=type=cache,target=/root/.ccache \ RUN --mount=type=cache,target=/root/.ccache \
cmake --preset 'JetPack 6' \ cmake --preset 'JetPack 6' \
&& cmake --build --parallel ${PARALLEL} --preset 'JetPack 6' \ && cmake --build --preset 'JetPack 6' -- -l $(nproc) \
&& cmake --install build --component CUDA --strip --parallel ${PARALLEL} && cmake --install build --component CUDA --strip
FROM base AS vulkan FROM base AS vulkan
ARG VULKANVERSION
RUN ln -s /usr/bin/python3 /usr/bin/python \
&& wget https://sdk.lunarg.com/sdk/download/${VULKANVERSION}/linux/vulkansdk-linux-x86_64-${VULKANVERSION}.tar.xz -O /tmp/vulkansdk.tar.xz \
&& tar xvf /tmp/vulkansdk.tar.xz -C /tmp \
&& /tmp/${VULKANVERSION}/vulkansdk -j 8 vulkan-headers \
&& /tmp/${VULKANVERSION}/vulkansdk -j 8 shaderc \
&& cp -r /tmp/${VULKANVERSION}/x86_64/include/* /usr/local/include/ \
&& cp -r /tmp/${VULKANVERSION}/x86_64/lib/* /usr/local/lib \
&& cp -r /tmp/${VULKANVERSION}/x86_64/bin/* /usr/local/bin/ \
&& rm -rf /tmp/${VULKANVERSION} /tmp/vulkansdk.tar.xz
COPY CMakeLists.txt CMakePresets.json . COPY CMakeLists.txt CMakePresets.json .
COPY ml/backend/ggml/ggml ml/backend/ggml/ggml COPY ml/backend/ggml/ggml ml/backend/ggml/ggml
RUN --mount=type=cache,target=/root/.ccache \ RUN --mount=type=cache,target=/root/.ccache \
cmake --preset 'Vulkan' \ cmake --preset 'Vulkan' \
&& cmake --build --parallel --preset 'Vulkan' \ && cmake --build --preset 'Vulkan' -- -l $(nproc) \
&& cmake --install build --component Vulkan --strip --parallel 8 && cmake --install build --component Vulkan --strip
FROM base AS mlx FROM base AS mlx
ARG CUDA13VERSION=13.0 ARG CUDA13VERSION=13.0
@@ -143,20 +152,27 @@ ENV PATH=/usr/local/cuda-13/bin:$PATH
ENV BLAS_INCLUDE_DIRS=/usr/include/openblas ENV BLAS_INCLUDE_DIRS=/usr/include/openblas
ENV LAPACK_INCLUDE_DIRS=/usr/include/openblas ENV LAPACK_INCLUDE_DIRS=/usr/include/openblas
ENV CGO_LDFLAGS="-L/usr/local/cuda-13/lib64 -L/usr/local/cuda-13/targets/x86_64-linux/lib/stubs" ENV CGO_LDFLAGS="-L/usr/local/cuda-13/lib64 -L/usr/local/cuda-13/targets/x86_64-linux/lib/stubs"
ARG PARALLEL
WORKDIR /go/src/github.com/ollama/ollama WORKDIR /go/src/github.com/ollama/ollama
COPY CMakeLists.txt CMakePresets.json . COPY CMakeLists.txt CMakePresets.json .
COPY ml/backend/ggml/ggml ml/backend/ggml/ggml COPY ml/backend/ggml/ggml ml/backend/ggml/ggml
COPY x/ml/backend/mlx x/ml/backend/mlx COPY x/imagegen/mlx x/imagegen/mlx
COPY go.mod go.sum . COPY go.mod go.sum .
COPY MLX_VERSION . COPY MLX_VERSION MLX_C_VERSION .
RUN curl -fsSL https://golang.org/dl/go$(awk '/^go/ { print $2 }' go.mod).linux-$(case $(uname -m) in x86_64) echo amd64 ;; aarch64) echo arm64 ;; esac).tar.gz | tar xz -C /usr/local RUN curl -fsSL https://golang.org/dl/go$(awk '/^go/ { print $2 }' go.mod).linux-$(case $(uname -m) in x86_64) echo amd64 ;; aarch64) echo arm64 ;; esac).tar.gz | tar xz -C /usr/local
ENV PATH=/usr/local/go/bin:$PATH ENV PATH=/usr/local/go/bin:$PATH
RUN go mod download RUN go mod download
RUN --mount=type=cache,target=/root/.ccache \ RUN --mount=type=cache,target=/root/.ccache \
cmake --preset 'MLX CUDA 13' -DBLAS_INCLUDE_DIRS=/usr/include/openblas -DLAPACK_INCLUDE_DIRS=/usr/include/openblas \ --mount=type=bind,from=local-mlx,target=/tmp/local-mlx \
&& cmake --build --parallel ${PARALLEL} --preset 'MLX CUDA 13' \ --mount=type=bind,from=local-mlx-c,target=/tmp/local-mlx-c \
&& cmake --install build --component MLX --strip --parallel ${PARALLEL} if [ -f /tmp/local-mlx/CMakeLists.txt ]; then \
export OLLAMA_MLX_SOURCE=/tmp/local-mlx; \
fi \
&& if [ -f /tmp/local-mlx-c/CMakeLists.txt ]; then \
export OLLAMA_MLX_C_SOURCE=/tmp/local-mlx-c; \
fi \
&& cmake --preset 'MLX CUDA 13' -DBLAS_INCLUDE_DIRS=/usr/include/openblas -DLAPACK_INCLUDE_DIRS=/usr/include/openblas \
&& cmake --build --preset 'MLX CUDA 13' -- -l $(nproc) \
&& cmake --install build --component MLX --strip
FROM base AS build FROM base AS build
WORKDIR /go/src/github.com/ollama/ollama WORKDIR /go/src/github.com/ollama/ollama
@@ -165,14 +181,14 @@ RUN curl -fsSL https://golang.org/dl/go$(awk '/^go/ { print $2 }' go.mod).linux-
ENV PATH=/usr/local/go/bin:$PATH ENV PATH=/usr/local/go/bin:$PATH
RUN go mod download RUN go mod download
COPY . . COPY . .
# Clone mlx-c headers for CGO (version from MLX_VERSION file)
RUN git clone --depth 1 --branch "$(cat MLX_VERSION)" https://github.com/ml-explore/mlx-c.git build/_deps/mlx-c-src
ARG GOFLAGS="'-ldflags=-w -s'" ARG GOFLAGS="'-ldflags=-w -s'"
ENV CGO_ENABLED=1 ENV CGO_ENABLED=1
ENV CGO_CFLAGS="-I/go/src/github.com/ollama/ollama/build/_deps/mlx-c-src" ARG CGO_CFLAGS
ARG CGO_CXXFLAGS ARG CGO_CXXFLAGS
ENV CGO_CFLAGS="${CGO_CFLAGS}"
ENV CGO_CXXFLAGS="${CGO_CXXFLAGS}"
RUN --mount=type=cache,target=/root/.cache/go-build \ RUN --mount=type=cache,target=/root/.cache/go-build \
go build -tags mlx -trimpath -buildmode=pie -o /bin/ollama . go build -trimpath -buildmode=pie -o /bin/ollama .
FROM --platform=linux/amd64 scratch AS amd64 FROM --platform=linux/amd64 scratch AS amd64
# COPY --from=cuda-11 dist/lib/ollama/ /lib/ollama/ # COPY --from=cuda-11 dist/lib/ollama/ /lib/ollama/
@@ -189,10 +205,9 @@ COPY --from=jetpack-5 dist/lib/ollama/ /lib/ollama/
COPY --from=jetpack-6 dist/lib/ollama/ /lib/ollama/ COPY --from=jetpack-6 dist/lib/ollama/ /lib/ollama/
FROM scratch AS rocm FROM scratch AS rocm
COPY --from=rocm-6 dist/lib/ollama /lib/ollama COPY --from=rocm-7 dist/lib/ollama /lib/ollama
FROM ${FLAVOR} AS archive FROM ${FLAVOR} AS archive
ARG VULKANVERSION
COPY --from=cpu dist/lib/ollama /lib/ollama COPY --from=cpu dist/lib/ollama /lib/ollama
COPY --from=build /bin/ollama /bin/ollama COPY --from=build /bin/ollama /bin/ollama

1
MLX_C_VERSION Normal file
View File

@@ -0,0 +1 @@
0726ca922fc902c4c61ef9c27d94132be418e945

View File

@@ -1 +1 @@
v0.4.1 38ad257088fb2193ad47e527cf6534a689f30943

910
README.md
View File

@@ -1,20 +1,30 @@
<div align="center"> <p align="center">
  <a href="https://ollama.com"> <a href="https://ollama.com">
<img alt="ollama" width="240" src="https://github.com/ollama/ollama/assets/3325447/0d0b44e2-8f4a-4e99-9b52-a5c1c741c8f7"> <img src="https://github.com/ollama/ollama/assets/3325447/0d0b44e2-8f4a-4e99-9b52-a5c1c741c8f7" alt="ollama" width="200"/>
</a> </a>
</div> </p>
# Ollama # Ollama
Get up and running with large language models. Start building with open models.
## Download
### macOS ### macOS
[Download](https://ollama.com/download/Ollama.dmg) ```shell
curl -fsSL https://ollama.com/install.sh | sh
```
or [download manually](https://ollama.com/download/Ollama.dmg)
### Windows ### Windows
[Download](https://ollama.com/download/OllamaSetup.exe) ```shell
irm https://ollama.com/install.ps1 | iex
```
or [download manually](https://ollama.com/download/OllamaSetup.exe)
### Linux ### Linux
@@ -36,647 +46,311 @@ The official [Ollama Docker image](https://hub.docker.com/r/ollama/ollama) `olla
### Community ### Community
- [Discord](https://discord.gg/ollama) - [Discord](https://discord.gg/ollama)
- [𝕏 (Twitter)](https://x.com/ollama)
- [Reddit](https://reddit.com/r/ollama) - [Reddit](https://reddit.com/r/ollama)
## Quickstart ## Get started
To run and chat with [Gemma 3](https://ollama.com/library/gemma3): ```
ollama
```
```shell You'll be prompted to run a model or connect Ollama to your existing agents or applications such as `claude`, `codex`, `openclaw` and more.
### Coding
To launch a specific integration:
```
ollama launch claude
```
Supported integrations include [Claude Code](https://docs.ollama.com/integrations/claude-code), [Codex](https://docs.ollama.com/integrations/codex), [Droid](https://docs.ollama.com/integrations/droid), and [OpenCode](https://docs.ollama.com/integrations/opencode).
### AI assistant
Use [OpenClaw](https://docs.ollama.com/integrations/openclaw) to turn Ollama into a personal AI assistant across WhatsApp, Telegram, Slack, Discord, and more:
```
ollama launch openclaw
```
### Chat with a model
Run and chat with [Gemma 3](https://ollama.com/library/gemma3):
```
ollama run gemma3 ollama run gemma3
``` ```
## Model library See [ollama.com/library](https://ollama.com/library) for the full list.
Ollama supports a list of models available on [ollama.com/library](https://ollama.com/library "ollama model library") See the [quickstart guide](https://docs.ollama.com/quickstart) for more details.
Here are some example models that can be downloaded:
| Model | Parameters | Size | Download |
| ------------------ | ---------- | ----- | -------------------------------- |
| Gemma 3 | 1B | 815MB | `ollama run gemma3:1b` |
| Gemma 3 | 4B | 3.3GB | `ollama run gemma3` |
| Gemma 3 | 12B | 8.1GB | `ollama run gemma3:12b` |
| Gemma 3 | 27B | 17GB | `ollama run gemma3:27b` |
| QwQ | 32B | 20GB | `ollama run qwq` |
| DeepSeek-R1 | 7B | 4.7GB | `ollama run deepseek-r1` |
| DeepSeek-R1 | 671B | 404GB | `ollama run deepseek-r1:671b` |
| Llama 4 | 109B | 67GB | `ollama run llama4:scout` |
| Llama 4 | 400B | 245GB | `ollama run llama4:maverick` |
| Llama 3.3 | 70B | 43GB | `ollama run llama3.3` |
| Llama 3.2 | 3B | 2.0GB | `ollama run llama3.2` |
| Llama 3.2 | 1B | 1.3GB | `ollama run llama3.2:1b` |
| Llama 3.2 Vision | 11B | 7.9GB | `ollama run llama3.2-vision` |
| Llama 3.2 Vision | 90B | 55GB | `ollama run llama3.2-vision:90b` |
| Llama 3.1 | 8B | 4.7GB | `ollama run llama3.1` |
| Llama 3.1 | 405B | 231GB | `ollama run llama3.1:405b` |
| Phi 4 | 14B | 9.1GB | `ollama run phi4` |
| Phi 4 Mini | 3.8B | 2.5GB | `ollama run phi4-mini` |
| Mistral | 7B | 4.1GB | `ollama run mistral` |
| Moondream 2 | 1.4B | 829MB | `ollama run moondream` |
| Neural Chat | 7B | 4.1GB | `ollama run neural-chat` |
| Starling | 7B | 4.1GB | `ollama run starling-lm` |
| Code Llama | 7B | 3.8GB | `ollama run codellama` |
| Llama 2 Uncensored | 7B | 3.8GB | `ollama run llama2-uncensored` |
| LLaVA | 7B | 4.5GB | `ollama run llava` |
| Granite-3.3 | 8B | 4.9GB | `ollama run granite3.3` |
> [!NOTE]
> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
## Customize a model
### Import from GGUF
Ollama supports importing GGUF models in the Modelfile:
1. Create a file named `Modelfile`, with a `FROM` instruction with the local filepath to the model you want to import.
```
FROM ./vicuna-33b.Q4_0.gguf
```
2. Create the model in Ollama
```shell
ollama create example -f Modelfile
```
3. Run the model
```shell
ollama run example
```
### Import from Safetensors
See the [guide](https://docs.ollama.com/import) on importing models for more information.
### Customize a prompt
Models from the Ollama library can be customized with a prompt. For example, to customize the `llama3.2` model:
```shell
ollama pull llama3.2
```
Create a `Modelfile`:
```
FROM llama3.2
# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1
# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
```
Next, create and run the model:
```
ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.
```
For more information on working with a Modelfile, see the [Modelfile](https://docs.ollama.com/modelfile) documentation.
## CLI Reference
### Create a model
`ollama create` is used to create a model from a Modelfile.
```shell
ollama create mymodel -f ./Modelfile
```
### Pull a model
```shell
ollama pull llama3.2
```
> This command can also be used to update a local model. Only the diff will be pulled.
### Remove a model
```shell
ollama rm llama3.2
```
### Copy a model
```shell
ollama cp llama3.2 my-model
```
### Multiline input
For multiline input, you can wrap text with `"""`:
```
>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.
```
### Multimodal models
```
ollama run llava "What's in this image? /Users/jmorgan/Desktop/smile.png"
```
> **Output**: The image features a yellow smiley face, which is likely the central focus of the picture.
### Pass the prompt as an argument
```shell
ollama run llama3.2 "Summarize this file: $(cat README.md)"
```
> **Output**: Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.
### Show model information
```shell
ollama show llama3.2
```
### List models on your computer
```shell
ollama list
```
### List which models are currently loaded
```shell
ollama ps
```
### Stop a model which is currently running
```shell
ollama stop llama3.2
```
### Generate embeddings from the CLI
```shell
ollama run embeddinggemma "Your text to embed"
```
You can also pipe text for scripted workflows:
```shell
echo "Your text to embed" | ollama run embeddinggemma
```
### Start Ollama
`ollama serve` is used when you want to start ollama without running the desktop application.
## Building
See the [developer guide](https://github.com/ollama/ollama/blob/main/docs/development.md)
### Running local builds
Next, start the server:
```shell
./ollama serve
```
Finally, in a separate shell, run a model:
```shell
./ollama run llama3.2
```
## Building with MLX (experimental)
First build the MLX libraries:
```shell
cmake --preset MLX
cmake --build --preset MLX --parallel
cmake --install build --component MLX
```
When building with the `-tags mlx` flag, the main `ollama` binary includes MLX support for experimental features like image generation:
```shell
go build -tags mlx .
```
Finally, start the server:
```
./ollama serve
```
### Building MLX with CUDA
When building with CUDA, use the preset "MLX CUDA 13" or "MLX CUDA 12" to enable CUDA with default architectures:
```shell
cmake --preset 'MLX CUDA 13'
cmake --build --preset 'MLX CUDA 13' --parallel
cmake --install build --component MLX
```
## REST API ## REST API
Ollama has a REST API for running and managing models. Ollama has a REST API for running and managing models.
### Generate a response
```shell
curl http://localhost:11434/api/generate -d '{
"model": "llama3.2",
"prompt":"Why is the sky blue?"
}'
``` ```
### Chat with a model
```shell
curl http://localhost:11434/api/chat -d '{ curl http://localhost:11434/api/chat -d '{
"model": "llama3.2", "model": "gemma3",
"messages": [ "messages": [{
{ "role": "user", "content": "why is the sky blue?" } "role": "user",
] "content": "Why is the sky blue?"
}],
"stream": false
}' }'
``` ```
See the [API documentation](./docs/api.md) for all endpoints. See the [API documentation](https://docs.ollama.com/api) for all endpoints.
### Python
```
pip install ollama
```
```python
from ollama import chat
response = chat(model='gemma3', messages=[
{
'role': 'user',
'content': 'Why is the sky blue?',
},
])
print(response.message.content)
```
### JavaScript
```
npm i ollama
```
```javascript
import ollama from "ollama";
const response = await ollama.chat({
model: "gemma3",
messages: [{ role: "user", content: "Why is the sky blue?" }],
});
console.log(response.message.content);
```
## Supported backends
- [llama.cpp](https://github.com/ggml-org/llama.cpp) project founded by Georgi Gerganov.
## Documentation
- [CLI reference](https://docs.ollama.com/cli)
- [REST API reference](https://docs.ollama.com/api)
- [Importing models](https://docs.ollama.com/import)
- [Modelfile reference](https://docs.ollama.com/modelfile)
- [Building from source](https://github.com/ollama/ollama/blob/main/docs/development.md)
## Community Integrations ## Community Integrations
### Web & Desktop > Want to add your project? Open a pull request.
- [Onyx](https://github.com/onyx-dot-app/onyx) ### Chat Interfaces
- [Open WebUI](https://github.com/open-webui/open-webui)
- [SwiftChat (macOS with ReactNative)](https://github.com/aws-samples/swift-chat)
- [Enchanted (macOS native)](https://github.com/AugustDev/enchanted)
- [Hollama](https://github.com/fmaclen/hollama)
- [Lollms WebUI (Single user)](https://github.com/ParisNeo/lollms-webui)
- [Lollms (Multi users)](https://github.com/ParisNeo/lollms)
- [LibreChat](https://github.com/danny-avila/LibreChat)
- [Bionic GPT](https://github.com/bionic-gpt/bionic-gpt)
- [HTML UI](https://github.com/rtcfirefly/ollama-ui)
- [AI-UI](https://github.com/bajahaw/ai-ui)
- [Saddle](https://github.com/jikkuatwork/saddle)
- [TagSpaces](https://www.tagspaces.org) (A platform for file-based apps, [utilizing Ollama](https://docs.tagspaces.org/ai/) for the generation of tags and descriptions)
- [Chatbot UI](https://github.com/ivanfioravanti/chatbot-ollama)
- [Chatbot UI v2](https://github.com/mckaywrigley/chatbot-ui)
- [Typescript UI](https://github.com/ollama-interface/Ollama-Gui?tab=readme-ov-file)
- [Minimalistic React UI for Ollama Models](https://github.com/richawo/minimal-llm-ui)
- [Ollamac](https://github.com/kevinhermawan/Ollamac)
- [big-AGI](https://github.com/enricoros/big-AGI)
- [Cheshire Cat assistant framework](https://github.com/cheshire-cat-ai/core)
- [Amica](https://github.com/semperai/amica)
- [chatd](https://github.com/BruceMacD/chatd)
- [Ollama-SwiftUI](https://github.com/kghandour/Ollama-SwiftUI)
- [Dify.AI](https://github.com/langgenius/dify)
- [MindMac](https://mindmac.app)
- [NextJS Web Interface for Ollama](https://github.com/jakobhoeg/nextjs-ollama-llm-ui)
- [Msty](https://msty.app)
- [Chatbox](https://github.com/Bin-Huang/Chatbox)
- [WinForm Ollama Copilot](https://github.com/tgraupmann/WinForm_Ollama_Copilot)
- [NextChat](https://github.com/ChatGPTNextWeb/ChatGPT-Next-Web) with [Get Started Doc](https://docs.nextchat.dev/models/ollama)
- [Alpaca WebUI](https://github.com/mmo80/alpaca-webui)
- [OllamaGUI](https://github.com/enoch1118/ollamaGUI)
- [OpenAOE](https://github.com/InternLM/OpenAOE)
- [Odin Runes](https://github.com/leonid20000/OdinRunes)
- [LLM-X](https://github.com/mrdjohnson/llm-x) (Progressive Web App)
- [AnythingLLM (Docker + MacOs/Windows/Linux native app)](https://github.com/Mintplex-Labs/anything-llm)
- [Ollama Basic Chat: Uses HyperDiv Reactive UI](https://github.com/rapidarchitect/ollama_basic_chat)
- [Ollama-chats RPG](https://github.com/drazdra/ollama-chats)
- [IntelliBar](https://intellibar.app/) (AI-powered assistant for macOS)
- [Jirapt](https://github.com/AliAhmedNada/jirapt) (Jira Integration to generate issues, tasks, epics)
- [ojira](https://github.com/AliAhmedNada/ojira) (Jira chrome plugin to easily generate descriptions for tasks)
- [QA-Pilot](https://github.com/reid41/QA-Pilot) (Interactive chat tool that can leverage Ollama models for rapid understanding and navigation of GitHub code repositories)
- [ChatOllama](https://github.com/sugarforever/chat-ollama) (Open Source Chatbot based on Ollama with Knowledge Bases)
- [CRAG Ollama Chat](https://github.com/Nagi-ovo/CRAG-Ollama-Chat) (Simple Web Search with Corrective RAG)
- [RAGFlow](https://github.com/infiniflow/ragflow) (Open-source Retrieval-Augmented Generation engine based on deep document understanding)
- [StreamDeploy](https://github.com/StreamDeploy-DevRel/streamdeploy-llm-app-scaffold) (LLM Application Scaffold)
- [chat](https://github.com/swuecho/chat) (chat web app for teams)
- [Lobe Chat](https://github.com/lobehub/lobe-chat) with [Integrating Doc](https://lobehub.com/docs/self-hosting/examples/ollama)
- [Ollama RAG Chatbot](https://github.com/datvodinh/rag-chatbot.git) (Local Chat with multiple PDFs using Ollama and RAG)
- [BrainSoup](https://www.nurgo-software.com/products/brainsoup) (Flexible native client with RAG & multi-agent automation)
- [macai](https://github.com/Renset/macai) (macOS client for Ollama, ChatGPT, and other compatible API back-ends)
- [RWKV-Runner](https://github.com/josStorer/RWKV-Runner) (RWKV offline LLM deployment tool, also usable as a client for ChatGPT and Ollama)
- [Ollama Grid Search](https://github.com/dezoito/ollama-grid-search) (app to evaluate and compare models)
- [Olpaka](https://github.com/Otacon/olpaka) (User-friendly Flutter Web App for Ollama)
- [Casibase](https://casibase.org) (An open source AI knowledge base and dialogue system combining the latest RAG, SSO, ollama support, and multiple large language models.)
- [OllamaSpring](https://github.com/CrazyNeil/OllamaSpring) (Ollama Client for macOS)
- [LLocal.in](https://github.com/kartikm7/llocal) (Easy to use Electron Desktop Client for Ollama)
- [Shinkai Desktop](https://github.com/dcSpark/shinkai-apps) (Two click install Local AI using Ollama + Files + RAG)
- [AiLama](https://github.com/zeyoyt/ailama) (A Discord User App that allows you to interact with Ollama anywhere in Discord)
- [Ollama with Google Mesop](https://github.com/rapidarchitect/ollama_mesop/) (Mesop Chat Client implementation with Ollama)
- [R2R](https://github.com/SciPhi-AI/R2R) (Open-source RAG engine)
- [Ollama-Kis](https://github.com/elearningshow/ollama-kis) (A simple easy-to-use GUI with sample custom LLM for Drivers Education)
- [OpenGPA](https://opengpa.org) (Open-source offline-first Enterprise Agentic Application)
- [Painting Droid](https://github.com/mateuszmigas/painting-droid) (Painting app with AI integrations)
- [Kerlig AI](https://www.kerlig.com/) (AI writing assistant for macOS)
- [AI Studio](https://github.com/MindWorkAI/AI-Studio)
- [Sidellama](https://github.com/gyopak/sidellama) (browser-based LLM client)
- [LLMStack](https://github.com/trypromptly/LLMStack) (No-code multi-agent framework to build LLM agents and workflows)
- [BoltAI for Mac](https://boltai.com) (AI Chat Client for Mac)
- [Harbor](https://github.com/av/harbor) (Containerized LLM Toolkit with Ollama as default backend)
- [PyGPT](https://github.com/szczyglis-dev/py-gpt) (AI desktop assistant for Linux, Windows, and Mac)
- [Alpaca](https://github.com/Jeffser/Alpaca) (An Ollama client application for Linux and macOS made with GTK4 and Adwaita)
- [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT/blob/master/docs/content/platform/ollama.md) (AutoGPT Ollama integration)
- [Go-CREW](https://www.jonathanhecl.com/go-crew/) (Powerful Offline RAG in Golang)
- [PartCAD](https://github.com/openvmp/partcad/) (CAD model generation with OpenSCAD and CadQuery)
- [Ollama4j Web UI](https://github.com/ollama4j/ollama4j-web-ui) - Java-based Web UI for Ollama built with Vaadin, Spring Boot, and Ollama4j
- [PyOllaMx](https://github.com/kspviswa/pyOllaMx) - macOS application capable of chatting with both Ollama and Apple MLX models.
- [Cline](https://github.com/cline/cline) - Formerly known as Claude Dev is a VS Code extension for multi-file/whole-repo coding
- [Void](https://github.com/voideditor/void) (Open source AI code editor and Cursor alternative)
- [Cherry Studio](https://github.com/kangfenmao/cherry-studio) (Desktop client with Ollama support)
- [ConfiChat](https://github.com/1runeberg/confichat) (Lightweight, standalone, multi-platform, and privacy-focused LLM chat interface with optional encryption)
- [Archyve](https://github.com/nickthecook/archyve) (RAG-enabling document library)
- [crewAI with Mesop](https://github.com/rapidarchitect/ollama-crew-mesop) (Mesop Web Interface to run crewAI with Ollama)
- [Tkinter-based client](https://github.com/chyok/ollama-gui) (Python tkinter-based Client for Ollama)
- [LLMChat](https://github.com/trendy-design/llmchat) (Privacy focused, 100% local, intuitive all-in-one chat interface)
- [Local Multimodal AI Chat](https://github.com/Leon-Sander/Local-Multimodal-AI-Chat) (Ollama-based LLM Chat with support for multiple features, including PDF RAG, voice chat, image-based interactions, and integration with OpenAI.)
- [ARGO](https://github.com/xark-argo/argo) (Locally download and run Ollama and Huggingface models with RAG and deep research on Mac/Windows/Linux)
- [OrionChat](https://github.com/EliasPereirah/OrionChat) - OrionChat is a web interface for chatting with different AI providers
- [G1](https://github.com/bklieger-groq/g1) (Prototype of using prompting strategies to improve the LLM's reasoning through o1-like reasoning chains.)
- [Web management](https://github.com/lemonit-eric-mao/ollama-web-management) (Web management page)
- [Promptery](https://github.com/promptery/promptery) (desktop client for Ollama.)
- [Ollama App](https://github.com/JHubi1/ollama-app) (Modern and easy-to-use multi-platform client for Ollama)
- [chat-ollama](https://github.com/annilq/chat-ollama) (a React Native client for Ollama)
- [SpaceLlama](https://github.com/tcsenpai/spacellama) (Firefox and Chrome extension to quickly summarize web pages with ollama in a sidebar)
- [YouLama](https://github.com/tcsenpai/youlama) (Webapp to quickly summarize any YouTube video, supporting Invidious as well)
- [DualMind](https://github.com/tcsenpai/dualmind) (Experimental app allowing two models to talk to each other in the terminal or in a web interface)
- [ollamarama-matrix](https://github.com/h1ddenpr0cess20/ollamarama-matrix) (Ollama chatbot for the Matrix chat protocol)
- [ollama-chat-app](https://github.com/anan1213095357/ollama-chat-app) (Flutter-based chat app)
- [Perfect Memory AI](https://www.perfectmemory.ai/) (Productivity AI assists personalized by what you have seen on your screen, heard, and said in the meetings)
- [Hexabot](https://github.com/hexastack/hexabot) (A conversational AI builder)
- [Reddit Rate](https://github.com/rapidarchitect/reddit_analyzer) (Search and Rate Reddit topics with a weighted summation)
- [OpenTalkGpt](https://github.com/adarshM84/OpenTalkGpt) (Chrome Extension to manage open-source models supported by Ollama, create custom models, and chat with models from a user-friendly UI)
- [VT](https://github.com/vinhnx/vt.ai) (A minimal multimodal AI chat app, with dynamic conversation routing. Supports local models via Ollama)
- [Nosia](https://github.com/nosia-ai/nosia) (Easy to install and use RAG platform based on Ollama)
- [Witsy](https://github.com/nbonamy/witsy) (An AI Desktop application available for Mac/Windows/Linux)
- [Abbey](https://github.com/US-Artificial-Intelligence/abbey) (A configurable AI interface server with notebooks, document storage, and YouTube support)
- [Minima](https://github.com/dmayboroda/minima) (RAG with on-premises or fully local workflow)
- [aidful-ollama-model-delete](https://github.com/AidfulAI/aidful-ollama-model-delete) (User interface for simplified model cleanup)
- [Perplexica](https://github.com/ItzCrazyKns/Perplexica) (An AI-powered search engine & an open-source alternative to Perplexity AI)
- [Ollama Chat WebUI for Docker ](https://github.com/oslook/ollama-webui) (Support for local docker deployment, lightweight ollama webui)
- [AI Toolkit for Visual Studio Code](https://aka.ms/ai-tooklit/ollama-docs) (Microsoft-official VS Code extension to chat, test, evaluate models with Ollama support, and use them in your AI applications.)
- [MinimalNextOllamaChat](https://github.com/anilkay/MinimalNextOllamaChat) (Minimal Web UI for Chat and Model Control)
- [Chipper](https://github.com/TilmanGriesel/chipper) AI interface for tinkerers (Ollama, Haystack RAG, Python)
- [ChibiChat](https://github.com/CosmicEventHorizon/ChibiChat) (Kotlin-based Android app to chat with Ollama and Koboldcpp API endpoints)
- [LocalLLM](https://github.com/qusaismael/localllm) (Minimal Web-App to run ollama models on it with a GUI)
- [Ollamazing](https://github.com/buiducnhat/ollamazing) (Web extension to run Ollama models)
- [OpenDeepResearcher-via-searxng](https://github.com/benhaotang/OpenDeepResearcher-via-searxng) (A Deep Research equivalent endpoint with Ollama support for running locally)
- [AntSK](https://github.com/AIDotNet/AntSK) (Out-of-the-box & Adaptable RAG Chatbot)
- [MaxKB](https://github.com/1Panel-dev/MaxKB/) (Ready-to-use & flexible RAG Chatbot)
- [yla](https://github.com/danielekp/yla) (Web interface to freely interact with your customized models)
- [LangBot](https://github.com/RockChinQ/LangBot) (LLM-based instant messaging bots platform, with Agents, RAG features, supports multiple platforms)
- [1Panel](https://github.com/1Panel-dev/1Panel/) (Web-based Linux Server Management Tool)
- [AstrBot](https://github.com/Soulter/AstrBot/) (User-friendly LLM-based multi-platform chatbot with a WebUI, supporting RAG, LLM agents, and plugins integration)
- [Reins](https://github.com/ibrahimcetin/reins) (Easily tweak parameters, customize system prompts per chat, and enhance your AI experiments with reasoning model support.)
- [Flufy](https://github.com/Aharon-Bensadoun/Flufy) (A beautiful chat interface for interacting with Ollama's API. Built with React, TypeScript, and Material-UI.)
- [Ellama](https://github.com/zeozeozeo/ellama) (Friendly native app to chat with an Ollama instance)
- [screenpipe](https://github.com/mediar-ai/screenpipe) Build agents powered by your screen history
- [Ollamb](https://github.com/hengkysteen/ollamb) (Simple yet rich in features, cross-platform built with Flutter and designed for Ollama. Try the [web demo](https://hengkysteen.github.io/demo/ollamb/).)
- [Writeopia](https://github.com/Writeopia/Writeopia) (Text editor with integration with Ollama)
- [AppFlowy](https://github.com/AppFlowy-IO/AppFlowy) (AI collaborative workspace with Ollama, cross-platform and self-hostable)
- [Lumina](https://github.com/cushydigit/lumina.git) (A lightweight, minimal React.js frontend for interacting with Ollama servers)
- [Tiny Notepad](https://pypi.org/project/tiny-notepad) (A lightweight, notepad-like interface to chat with ollama available on PyPI)
- [macLlama (macOS native)](https://github.com/hellotunamayo/macLlama) (A native macOS GUI application for interacting with Ollama models, featuring a chat interface.)
- [GPTranslate](https://github.com/philberndt/GPTranslate) (A fast and lightweight, AI powered desktop translation application written with Rust and Tauri. Features real-time translation with OpenAI/Azure/Ollama.)
- [ollama launcher](https://github.com/NGC13009/ollama-launcher) (A launcher for Ollama, aiming to provide users with convenient functions such as ollama server launching, management, or configuration.)
- [ai-hub](https://github.com/Aj-Seven/ai-hub) (AI Hub supports multiple models via API keys and Chat support via Ollama API.)
- [Mayan EDMS](https://gitlab.com/mayan-edms/mayan-edms) (Open source document management system to organize, tag, search, and automate your files with powerful Ollama driven workflows.)
- [Serene Pub](https://github.com/doolijb/serene-pub) (Beginner friendly, open source AI Roleplaying App for Windows, Mac OS and Linux. Search, download and use models with Ollama all inside the app.)
- [Andes](https://github.com/aqerd/andes) (A Visual Studio Code extension that provides a local UI interface for Ollama models)
- [KDeps](https://github.com/kdeps/kdeps) (Kdeps is an offline-first AI framework for building Dockerized full-stack AI applications declaratively using Apple PKL and integrates APIs with Ollama on the backend.)
- [Clueless](https://github.com/KashyapTan/clueless) (Open Source & Local Cluely: A desktop application LLM assistant to help you talk to anything on your screen using locally served Ollama models. Also undetectable to screenshare)
- [ollama-co2](https://github.com/carbonatedWaterOrg/ollama-co2) (FastAPI web interface for monitoring and managing local and remote Ollama servers with real-time model monitoring and concurrent downloads)
- [Hillnote](https://hillnote.com) (A Markdown-first workspace designed to supercharge your AI workflow. Create documents ready to integrate with Claude, ChatGPT, Gemini, Cursor, and more - all while keeping your work on your device.)
### Cloud #### Web
- [Open WebUI](https://github.com/open-webui/open-webui) - Extensible, self-hosted AI interface
- [Onyx](https://github.com/onyx-dot-app/onyx) - Connected AI workspace
- [LibreChat](https://github.com/danny-avila/LibreChat) - Enhanced ChatGPT clone with multi-provider support
- [Lobe Chat](https://github.com/lobehub/lobe-chat) - Modern chat framework with plugin ecosystem ([docs](https://lobehub.com/docs/self-hosting/examples/ollama))
- [NextChat](https://github.com/ChatGPTNextWeb/ChatGPT-Next-Web) - Cross-platform ChatGPT UI ([docs](https://docs.nextchat.dev/models/ollama))
- [Perplexica](https://github.com/ItzCrazyKns/Perplexica) - AI-powered search engine, open-source Perplexity alternative
- [big-AGI](https://github.com/enricoros/big-AGI) - AI suite for professionals
- [Lollms WebUI](https://github.com/ParisNeo/lollms-webui) - Multi-model web interface
- [ChatOllama](https://github.com/sugarforever/chat-ollama) - Chatbot with knowledge bases
- [Bionic GPT](https://github.com/bionic-gpt/bionic-gpt) - On-premise AI platform
- [Chatbot UI](https://github.com/ivanfioravanti/chatbot-ollama) - ChatGPT-style web interface
- [Hollama](https://github.com/fmaclen/hollama) - Minimal web interface
- [Chatbox](https://github.com/Bin-Huang/Chatbox) - Desktop and web AI client
- [chat](https://github.com/swuecho/chat) - Chat web app for teams
- [Ollama RAG Chatbot](https://github.com/datvodinh/rag-chatbot.git) - Chat with multiple PDFs using RAG
- [Tkinter-based client](https://github.com/chyok/ollama-gui) - Python desktop client
#### Desktop
- [Dify.AI](https://github.com/langgenius/dify) - LLM app development platform
- [AnythingLLM](https://github.com/Mintplex-Labs/anything-llm) - All-in-one AI app for Mac, Windows, and Linux
- [Maid](https://github.com/Mobile-Artificial-Intelligence/maid) - Cross-platform mobile and desktop client
- [Witsy](https://github.com/nbonamy/witsy) - AI desktop app for Mac, Windows, and Linux
- [Cherry Studio](https://github.com/kangfenmao/cherry-studio) - Multi-provider desktop client
- [Ollama App](https://github.com/JHubi1/ollama-app) - Multi-platform client for desktop and mobile
- [PyGPT](https://github.com/szczyglis-dev/py-gpt) - AI desktop assistant for Linux, Windows, and Mac
- [Alpaca](https://github.com/Jeffser/Alpaca) - GTK4 client for Linux and macOS
- [SwiftChat](https://github.com/aws-samples/swift-chat) - Cross-platform including iOS, Android, and Apple Vision Pro
- [Enchanted](https://github.com/AugustDev/enchanted) - Native macOS and iOS client
- [RWKV-Runner](https://github.com/josStorer/RWKV-Runner) - Multi-model desktop runner
- [Ollama Grid Search](https://github.com/dezoito/ollama-grid-search) - Evaluate and compare models
- [macai](https://github.com/Renset/macai) - macOS client for Ollama and ChatGPT
- [AI Studio](https://github.com/MindWorkAI/AI-Studio) - Multi-provider desktop IDE
- [Reins](https://github.com/ibrahimcetin/reins) - Parameter tuning and reasoning model support
- [ConfiChat](https://github.com/1runeberg/confichat) - Privacy-focused with optional encryption
- [LLocal.in](https://github.com/kartikm7/llocal) - Electron desktop client
- [MindMac](https://mindmac.app) - AI chat client for Mac
- [Msty](https://msty.app) - Multi-model desktop client
- [BoltAI for Mac](https://boltai.com) - AI chat client for Mac
- [IntelliBar](https://intellibar.app/) - AI-powered assistant for macOS
- [Kerlig AI](https://www.kerlig.com/) - AI writing assistant for macOS
- [Hillnote](https://hillnote.com) - Markdown-first AI workspace
- [Perfect Memory AI](https://www.perfectmemory.ai/) - Productivity AI personalized by screen and meeting history
#### Mobile
- [Ollama Android Chat](https://github.com/sunshine0523/OllamaServer) - One-click Ollama on Android
> SwiftChat, Enchanted, Maid, Ollama App, Reins, and ConfiChat listed above also support mobile platforms.
### Code Editors & Development
- [Cline](https://github.com/cline/cline) - VS Code extension for multi-file/whole-repo coding
- [Continue](https://github.com/continuedev/continue) - Open-source AI code assistant for any IDE
- [Void](https://github.com/voideditor/void) - Open source AI code editor, Cursor alternative
- [Copilot for Obsidian](https://github.com/logancyang/obsidian-copilot) - AI assistant for Obsidian
- [twinny](https://github.com/rjmacarthy/twinny) - Copilot and Copilot chat alternative
- [gptel Emacs client](https://github.com/karthink/gptel) - LLM client for Emacs
- [Ollama Copilot](https://github.com/bernardo-bruning/ollama-copilot) - Use Ollama as GitHub Copilot
- [Obsidian Local GPT](https://github.com/pfrankov/obsidian-local-gpt) - Local AI for Obsidian
- [Ellama Emacs client](https://github.com/s-kostyaev/ellama) - LLM tool for Emacs
- [orbiton](https://github.com/xyproto/orbiton) - Config-free text editor with Ollama tab completion
- [AI ST Completion](https://github.com/yaroslavyaroslav/OpenAI-sublime-text) - Sublime Text 4 AI assistant
- [VT Code](https://github.com/vinhnx/vtcode) - Rust-based terminal coding agent with Tree-sitter
- [QodeAssist](https://github.com/Palm1r/QodeAssist) - AI coding assistant for Qt Creator
- [AI Toolkit for VS Code](https://aka.ms/ai-tooklit/ollama-docs) - Microsoft-official VS Code extension
- [Open Interpreter](https://docs.openinterpreter.com/language-model-setup/local-models/ollama) - Natural language interface for computers
### Libraries & SDKs
- [LiteLLM](https://github.com/BerriAI/litellm) - Unified API for 100+ LLM providers
- [Semantic Kernel](https://github.com/microsoft/semantic-kernel/tree/main/python/semantic_kernel/connectors/ai/ollama) - Microsoft AI orchestration SDK
- [LangChain4j](https://github.com/langchain4j/langchain4j) - Java LangChain ([example](https://github.com/langchain4j/langchain4j-examples/tree/main/ollama-examples/src/main/java))
- [LangChainGo](https://github.com/tmc/langchaingo/) - Go LangChain ([example](https://github.com/tmc/langchaingo/tree/main/examples/ollama-completion-example))
- [Spring AI](https://github.com/spring-projects/spring-ai) - Spring framework AI support ([docs](https://docs.spring.io/spring-ai/reference/api/chat/ollama-chat.html))
- [LangChain](https://python.langchain.com/docs/integrations/chat/ollama/) and [LangChain.js](https://js.langchain.com/docs/integrations/chat/ollama/) with [example](https://js.langchain.com/docs/tutorials/local_rag/)
- [Ollama for Ruby](https://github.com/crmne/ruby_llm) - Ruby LLM library
- [any-llm](https://github.com/mozilla-ai/any-llm) - Unified LLM interface by Mozilla
- [OllamaSharp for .NET](https://github.com/awaescher/OllamaSharp) - .NET SDK
- [LangChainRust](https://github.com/Abraxas-365/langchain-rust) - Rust LangChain ([example](https://github.com/Abraxas-365/langchain-rust/blob/main/examples/llm_ollama.rs))
- [Agents-Flex for Java](https://github.com/agents-flex/agents-flex) - Java agent framework ([example](https://github.com/agents-flex/agents-flex/tree/main/agents-flex-llm/agents-flex-llm-ollama/src/test/java/com/agentsflex/llm/ollama))
- [Elixir LangChain](https://github.com/brainlid/langchain) - Elixir LangChain
- [Ollama-rs for Rust](https://github.com/pepperoni21/ollama-rs) - Rust SDK
- [LangChain for .NET](https://github.com/tryAGI/LangChain) - .NET LangChain ([example](https://github.com/tryAGI/LangChain/blob/main/examples/LangChain.Samples.OpenAI/Program.cs))
- [chromem-go](https://github.com/philippgille/chromem-go) - Go vector database with Ollama embeddings ([example](https://github.com/philippgille/chromem-go/tree/v0.5.0/examples/rag-wikipedia-ollama))
- [LangChainDart](https://github.com/davidmigloz/langchain_dart) - Dart LangChain
- [LlmTornado](https://github.com/lofcz/llmtornado) - Unified C# interface for multiple inference APIs
- [Ollama4j for Java](https://github.com/ollama4j/ollama4j) - Java SDK
- [Ollama for Laravel](https://github.com/cloudstudio/ollama-laravel) - Laravel integration
- [Ollama for Swift](https://github.com/mattt/ollama-swift) - Swift SDK
- [LlamaIndex](https://docs.llamaindex.ai/en/stable/examples/llm/ollama/) and [LlamaIndexTS](https://ts.llamaindex.ai/modules/llms/available_llms/ollama) - Data framework for LLM apps
- [Haystack](https://github.com/deepset-ai/haystack-integrations/blob/main/integrations/ollama.md) - AI pipeline framework
- [Firebase Genkit](https://firebase.google.com/docs/genkit/plugins/ollama) - Google AI framework
- [Ollama-hpp for C++](https://github.com/jmont-dev/ollama-hpp) - C++ SDK
- [PromptingTools.jl](https://github.com/svilupp/PromptingTools.jl) - Julia LLM toolkit ([example](https://svilupp.github.io/PromptingTools.jl/dev/examples/working_with_ollama))
- [Ollama for R - rollama](https://github.com/JBGruber/rollama) - R SDK
- [Portkey](https://portkey.ai/docs/welcome/integration-guides/ollama) - AI gateway
- [Testcontainers](https://testcontainers.com/modules/ollama/) - Container-based testing
- [LLPhant](https://github.com/theodo-group/LLPhant?tab=readme-ov-file#ollama) - PHP AI framework
### Frameworks & Agents
- [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT/blob/master/docs/content/platform/ollama.md) - Autonomous AI agent platform
- [crewAI](https://github.com/crewAIInc/crewAI) - Multi-agent orchestration framework
- [Strands Agents](https://github.com/strands-agents/sdk-python) - Model-driven agent building by AWS
- [Cheshire Cat](https://github.com/cheshire-cat-ai/core) - AI assistant framework
- [any-agent](https://github.com/mozilla-ai/any-agent) - Unified agent framework interface by Mozilla
- [Stakpak](https://github.com/stakpak/agent) - Open source DevOps agent
- [Hexabot](https://github.com/hexastack/hexabot) - Conversational AI builder
- [Neuro SAN](https://github.com/cognizant-ai-lab/neuro-san-studio) - Multi-agent orchestration ([docs](https://github.com/cognizant-ai-lab/neuro-san-studio/blob/main/docs/user_guide.md#ollama))
### RAG & Knowledge Bases
- [RAGFlow](https://github.com/infiniflow/ragflow) - RAG engine based on deep document understanding
- [R2R](https://github.com/SciPhi-AI/R2R) - Open-source RAG engine
- [MaxKB](https://github.com/1Panel-dev/MaxKB/) - Ready-to-use RAG chatbot
- [Minima](https://github.com/dmayboroda/minima) - On-premises or fully local RAG
- [Chipper](https://github.com/TilmanGriesel/chipper) - AI interface with Haystack RAG
- [ARGO](https://github.com/xark-argo/argo) - RAG and deep research on Mac/Windows/Linux
- [Archyve](https://github.com/nickthecook/archyve) - RAG-enabling document library
- [Casibase](https://casibase.org) - AI knowledge base with RAG and SSO
- [BrainSoup](https://www.nurgo-software.com/products/brainsoup) - Native client with RAG and multi-agent automation
### Bots & Messaging
- [LangBot](https://github.com/RockChinQ/LangBot) - Multi-platform messaging bots with agents and RAG
- [AstrBot](https://github.com/Soulter/AstrBot/) - Multi-platform chatbot with RAG and plugins
- [Discord-Ollama Chat Bot](https://github.com/kevinthedang/discord-ollama) - TypeScript Discord bot
- [Ollama Telegram Bot](https://github.com/ruecat/ollama-telegram) - Telegram bot
- [LLM Telegram Bot](https://github.com/innightwolfsleep/llm_telegram_bot) - Telegram bot for roleplay
### Terminal & CLI
- [aichat](https://github.com/sigoden/aichat) - All-in-one LLM CLI with Shell Assistant, RAG, and AI tools
- [oterm](https://github.com/ggozad/oterm) - Terminal client for Ollama
- [gollama](https://github.com/sammcj/gollama) - Go-based model manager for Ollama
- [tlm](https://github.com/yusufcanb/tlm) - Local shell copilot
- [tenere](https://github.com/pythops/tenere) - TUI for LLMs
- [ParLlama](https://github.com/paulrobello/parllama) - TUI for Ollama
- [llm-ollama](https://github.com/taketwo/llm-ollama) - Plugin for [Datasette's LLM CLI](https://llm.datasette.io/en/stable/)
- [ShellOracle](https://github.com/djcopley/ShellOracle) - Shell command suggestions
- [LLM-X](https://github.com/mrdjohnson/llm-x) - Progressive web app for LLMs
- [cmdh](https://github.com/pgibler/cmdh) - Natural language to shell commands
- [VT](https://github.com/vinhnx/vt.ai) - Minimal multimodal AI chat app
### Productivity & Apps
- [AppFlowy](https://github.com/AppFlowy-IO/AppFlowy) - AI collaborative workspace, self-hostable Notion alternative
- [Screenpipe](https://github.com/mediar-ai/screenpipe) - 24/7 screen and mic recording with AI-powered search
- [Vibe](https://github.com/thewh1teagle/vibe) - Transcribe and analyze meetings
- [Page Assist](https://github.com/n4ze3m/page-assist) - Chrome extension for AI-powered browsing
- [NativeMind](https://github.com/NativeMindBrowser/NativeMindExtension) - Private, on-device browser AI assistant
- [Ollama Fortress](https://github.com/ParisNeo/ollama_proxy_server) - Security proxy for Ollama
- [1Panel](https://github.com/1Panel-dev/1Panel/) - Web-based Linux server management
- [Writeopia](https://github.com/Writeopia/Writeopia) - Text editor with Ollama integration
- [QA-Pilot](https://github.com/reid41/QA-Pilot) - GitHub code repository understanding
- [Raycast extension](https://github.com/MassimilianoPasquini97/raycast_ollama) - Ollama in Raycast
- [Painting Droid](https://github.com/mateuszmigas/painting-droid) - Painting app with AI integrations
- [Serene Pub](https://github.com/doolijb/serene-pub) - AI roleplaying app
- [Mayan EDMS](https://gitlab.com/mayan-edms/mayan-edms) - Document management with Ollama workflows
- [TagSpaces](https://www.tagspaces.org) - File management with [AI tagging](https://docs.tagspaces.org/ai/)
### Observability & Monitoring
- [Opik](https://www.comet.com/docs/opik/cookbook/ollama) - Debug, evaluate, and monitor LLM applications
- [OpenLIT](https://github.com/openlit/openlit) - OpenTelemetry-native monitoring for Ollama and GPUs
- [Lunary](https://lunary.ai/docs/integrations/ollama) - LLM observability with analytics and PII masking
- [Langfuse](https://langfuse.com/docs/integrations/ollama) - Open source LLM observability
- [HoneyHive](https://docs.honeyhive.ai/integrations/ollama) - AI observability and evaluation for agents
- [MLflow Tracing](https://mlflow.org/docs/latest/llms/tracing/index.html#automatic-tracing) - Open source LLM observability
### Database & Embeddings
- [pgai](https://github.com/timescale/pgai) - PostgreSQL as a vector database ([guide](https://github.com/timescale/pgai/blob/main/docs/vectorizer-quick-start.md))
- [MindsDB](https://github.com/mindsdb/mindsdb/blob/staging/mindsdb/integrations/handlers/ollama_handler/README.md) - Connect Ollama with 200+ data platforms
- [chromem-go](https://github.com/philippgille/chromem-go/blob/v0.5.0/embed_ollama.go) - Embeddable vector database for Go ([example](https://github.com/philippgille/chromem-go/tree/v0.5.0/examples/rag-wikipedia-ollama))
- [Kangaroo](https://github.com/dbkangaroo/kangaroo) - AI-powered SQL client
### Infrastructure & Deployment
#### Cloud
- [Google Cloud](https://cloud.google.com/run/docs/tutorials/gpu-gemma2-with-ollama) - [Google Cloud](https://cloud.google.com/run/docs/tutorials/gpu-gemma2-with-ollama)
- [Fly.io](https://fly.io/docs/python/do-more/add-ollama/) - [Fly.io](https://fly.io/docs/python/do-more/add-ollama/)
- [Koyeb](https://www.koyeb.com/deploy/ollama) - [Koyeb](https://www.koyeb.com/deploy/ollama)
- [Harbor](https://github.com/av/harbor) - Containerized LLM toolkit with Ollama as default backend
### Tutorial #### Package Managers
- [handy-ollama](https://github.com/datawhalechina/handy-ollama) (Chinese Tutorial for Ollama by [Datawhale ](https://github.com/datawhalechina) - China's Largest Open Source AI Learning Community)
### Terminal
- [oterm](https://github.com/ggozad/oterm)
- [Ellama Emacs client](https://github.com/s-kostyaev/ellama)
- [Emacs client](https://github.com/zweifisch/ollama)
- [neollama](https://github.com/paradoxical-dev/neollama) UI client for interacting with models from within Neovim
- [gen.nvim](https://github.com/David-Kunz/gen.nvim)
- [ollama.nvim](https://github.com/nomnivore/ollama.nvim)
- [ollero.nvim](https://github.com/marco-souza/ollero.nvim)
- [ollama-chat.nvim](https://github.com/gerazov/ollama-chat.nvim)
- [ogpt.nvim](https://github.com/huynle/ogpt.nvim)
- [gptel Emacs client](https://github.com/karthink/gptel)
- [Oatmeal](https://github.com/dustinblackman/oatmeal)
- [cmdh](https://github.com/pgibler/cmdh)
- [ooo](https://github.com/npahlfer/ooo)
- [shell-pilot](https://github.com/reid41/shell-pilot)(Interact with models via pure shell scripts on Linux or macOS)
- [tenere](https://github.com/pythops/tenere)
- [llm-ollama](https://github.com/taketwo/llm-ollama) for [Datasette's LLM CLI](https://llm.datasette.io/en/stable/).
- [typechat-cli](https://github.com/anaisbetts/typechat-cli)
- [ShellOracle](https://github.com/djcopley/ShellOracle)
- [tlm](https://github.com/yusufcanb/tlm)
- [podman-ollama](https://github.com/ericcurtin/podman-ollama)
- [gollama](https://github.com/sammcj/gollama)
- [ParLlama](https://github.com/paulrobello/parllama)
- [Ollama eBook Summary](https://github.com/cognitivetech/ollama-ebook-summary/)
- [Ollama Mixture of Experts (MOE) in 50 lines of code](https://github.com/rapidarchitect/ollama_moe)
- [vim-intelligence-bridge](https://github.com/pepo-ec/vim-intelligence-bridge) Simple interaction of "Ollama" with the Vim editor
- [x-cmd ollama](https://x-cmd.com/mod/ollama)
- [bb7](https://github.com/drunkwcodes/bb7)
- [SwollamaCLI](https://github.com/marcusziade/Swollama) bundled with the Swollama Swift package. [Demo](https://github.com/marcusziade/Swollama?tab=readme-ov-file#cli-usage)
- [aichat](https://github.com/sigoden/aichat) All-in-one LLM CLI tool featuring Shell Assistant, Chat-REPL, RAG, AI tools & agents, with access to OpenAI, Claude, Gemini, Ollama, Groq, and more.
- [PowershAI](https://github.com/rrg92/powershai) PowerShell module that brings AI to terminal on Windows, including support for Ollama
- [DeepShell](https://github.com/Abyss-c0re/deepshell) Your self-hosted AI assistant. Interactive Shell, Files and Folders analysis.
- [orbiton](https://github.com/xyproto/orbiton) Configuration-free text editor and IDE with support for tab completion with Ollama.
- [orca-cli](https://github.com/molbal/orca-cli) Ollama Registry CLI Application - Browse, pull, and download models from Ollama Registry in your terminal.
- [GGUF-to-Ollama](https://github.com/jonathanhecl/gguf-to-ollama) - Importing GGUF to Ollama made easy (multiplatform)
- [AWS-Strands-With-Ollama](https://github.com/rapidarchitect/ollama_strands) - AWS Strands Agents with Ollama Examples
- [ollama-multirun](https://github.com/attogram/ollama-multirun) - A bash shell script to run a single prompt against any or all of your locally installed ollama models, saving the output and performance statistics as easily navigable web pages. ([Demo](https://attogram.github.io/ai_test_zone/))
- [ollama-bash-toolshed](https://github.com/attogram/ollama-bash-toolshed) - Bash scripts to chat with tool using models. Add new tools to your shed with ease. Runs on Ollama.
- [hle-eval-ollama](https://github.com/mags0ft/hle-eval-ollama) - Runs benchmarks like "Humanity's Last Exam" (HLE) on your favorite local Ollama models and evaluates the quality of their responses
- [VT Code](https://github.com/vinhnx/vtcode) - VT Code is a Rust-based terminal coding agent with semantic code intelligence via Tree-sitter. Ollama integration for running local/cloud models with configurable endpoints.
### Apple Vision Pro
- [SwiftChat](https://github.com/aws-samples/swift-chat) (Cross-platform AI chat app supporting Apple Vision Pro via "Designed for iPad")
- [Enchanted](https://github.com/AugustDev/enchanted)
### Database
- [pgai](https://github.com/timescale/pgai) - PostgreSQL as a vector database (Create and search embeddings from Ollama models using pgvector)
- [Get started guide](https://github.com/timescale/pgai/blob/main/docs/vectorizer-quick-start.md)
- [MindsDB](https://github.com/mindsdb/mindsdb/blob/staging/mindsdb/integrations/handlers/ollama_handler/README.md) (Connects Ollama models with nearly 200 data platforms and apps)
- [chromem-go](https://github.com/philippgille/chromem-go/blob/v0.5.0/embed_ollama.go) with [example](https://github.com/philippgille/chromem-go/tree/v0.5.0/examples/rag-wikipedia-ollama)
- [Kangaroo](https://github.com/dbkangaroo/kangaroo) (AI-powered SQL client and admin tool for popular databases)
### Package managers
- [Pacman](https://archlinux.org/packages/extra/x86_64/ollama/) - [Pacman](https://archlinux.org/packages/extra/x86_64/ollama/)
- [Gentoo](https://github.com/gentoo/guru/tree/master/app-misc/ollama)
- [Homebrew](https://formulae.brew.sh/formula/ollama) - [Homebrew](https://formulae.brew.sh/formula/ollama)
- [Helm Chart](https://artifacthub.io/packages/helm/ollama-helm/ollama)
- [Guix channel](https://codeberg.org/tusharhero/ollama-guix)
- [Nix package](https://search.nixos.org/packages?show=ollama&from=0&size=50&sort=relevance&type=packages&query=ollama) - [Nix package](https://search.nixos.org/packages?show=ollama&from=0&size=50&sort=relevance&type=packages&query=ollama)
- [Helm Chart](https://artifacthub.io/packages/helm/ollama-helm/ollama)
- [Gentoo](https://github.com/gentoo/guru/tree/master/app-misc/ollama)
- [Flox](https://flox.dev/blog/ollama-part-one) - [Flox](https://flox.dev/blog/ollama-part-one)
- [Guix channel](https://codeberg.org/tusharhero/ollama-guix)
### Libraries
- [LangChain](https://python.langchain.com/docs/integrations/chat/ollama/) and [LangChain.js](https://js.langchain.com/docs/integrations/chat/ollama/) with [example](https://js.langchain.com/docs/tutorials/local_rag/)
- [Firebase Genkit](https://firebase.google.com/docs/genkit/plugins/ollama)
- [crewAI](https://github.com/crewAIInc/crewAI)
- [Yacana](https://remembersoftwares.github.io/yacana/) (User-friendly multi-agent framework for brainstorming and executing predetermined flows with built-in tool integration)
- [Strands Agents](https://github.com/strands-agents/sdk-python) (A model-driven approach to building AI agents in just a few lines of code)
- [Spring AI](https://github.com/spring-projects/spring-ai) with [reference](https://docs.spring.io/spring-ai/reference/api/chat/ollama-chat.html) and [example](https://github.com/tzolov/ollama-tools)
- [LangChainGo](https://github.com/tmc/langchaingo/) with [example](https://github.com/tmc/langchaingo/tree/main/examples/ollama-completion-example)
- [LangChain4j](https://github.com/langchain4j/langchain4j) with [example](https://github.com/langchain4j/langchain4j-examples/tree/main/ollama-examples/src/main/java)
- [LangChainRust](https://github.com/Abraxas-365/langchain-rust) with [example](https://github.com/Abraxas-365/langchain-rust/blob/main/examples/llm_ollama.rs)
- [LangChain for .NET](https://github.com/tryAGI/LangChain) with [example](https://github.com/tryAGI/LangChain/blob/main/examples/LangChain.Samples.OpenAI/Program.cs)
- [LLPhant](https://github.com/theodo-group/LLPhant?tab=readme-ov-file#ollama)
- [LlamaIndex](https://docs.llamaindex.ai/en/stable/examples/llm/ollama/) and [LlamaIndexTS](https://ts.llamaindex.ai/modules/llms/available_llms/ollama)
- [LiteLLM](https://github.com/BerriAI/litellm)
- [OllamaFarm for Go](https://github.com/presbrey/ollamafarm)
- [OllamaSharp for .NET](https://github.com/awaescher/OllamaSharp)
- [Ollama for Ruby](https://github.com/gbaptista/ollama-ai)
- [Ollama-rs for Rust](https://github.com/pepperoni21/ollama-rs)
- [Ollama-hpp for C++](https://github.com/jmont-dev/ollama-hpp)
- [Ollama4j for Java](https://github.com/ollama4j/ollama4j)
- [ModelFusion Typescript Library](https://modelfusion.dev/integration/model-provider/ollama)
- [OllamaKit for Swift](https://github.com/kevinhermawan/OllamaKit)
- [Ollama for Dart](https://github.com/breitburg/dart-ollama)
- [Ollama for Laravel](https://github.com/cloudstudio/ollama-laravel)
- [LangChainDart](https://github.com/davidmigloz/langchain_dart)
- [Semantic Kernel - Python](https://github.com/microsoft/semantic-kernel/tree/main/python/semantic_kernel/connectors/ai/ollama)
- [Haystack](https://github.com/deepset-ai/haystack-integrations/blob/main/integrations/ollama.md)
- [Elixir LangChain](https://github.com/brainlid/langchain)
- [Ollama for R - rollama](https://github.com/JBGruber/rollama)
- [Ollama for R - ollama-r](https://github.com/hauselin/ollama-r)
- [Ollama-ex for Elixir](https://github.com/lebrunel/ollama-ex)
- [Ollama Connector for SAP ABAP](https://github.com/b-tocs/abap_btocs_ollama)
- [Testcontainers](https://testcontainers.com/modules/ollama/)
- [Portkey](https://portkey.ai/docs/welcome/integration-guides/ollama)
- [PromptingTools.jl](https://github.com/svilupp/PromptingTools.jl) with an [example](https://svilupp.github.io/PromptingTools.jl/dev/examples/working_with_ollama)
- [LlamaScript](https://github.com/Project-Llama/llamascript)
- [llm-axe](https://github.com/emirsahin1/llm-axe) (Python Toolkit for Building LLM Powered Apps)
- [Gollm](https://docs.gollm.co/examples/ollama-example)
- [Gollama for Golang](https://github.com/jonathanhecl/gollama)
- [Ollamaclient for Golang](https://github.com/xyproto/ollamaclient)
- [High-level function abstraction in Go](https://gitlab.com/tozd/go/fun)
- [Ollama PHP](https://github.com/ArdaGnsrn/ollama-php)
- [Agents-Flex for Java](https://github.com/agents-flex/agents-flex) with [example](https://github.com/agents-flex/agents-flex/tree/main/agents-flex-llm/agents-flex-llm-ollama/src/test/java/com/agentsflex/llm/ollama)
- [Parakeet](https://github.com/parakeet-nest/parakeet) is a GoLang library, made to simplify the development of small generative AI applications with Ollama.
- [Haverscript](https://github.com/andygill/haverscript) with [examples](https://github.com/andygill/haverscript/tree/main/examples)
- [Ollama for Swift](https://github.com/mattt/ollama-swift)
- [Swollama for Swift](https://github.com/guitaripod/Swollama) with [DocC](https://guitaripod.github.io/Swollama/documentation/swollama)
- [GoLamify](https://github.com/prasad89/golamify)
- [Ollama for Haskell](https://github.com/tusharad/ollama-haskell)
- [multi-llm-ts](https://github.com/nbonamy/multi-llm-ts) (A Typescript/JavaScript library allowing access to different LLM in a unified API)
- [LlmTornado](https://github.com/lofcz/llmtornado) (C# library providing a unified interface for major FOSS & Commercial inference APIs)
- [Ollama for Zig](https://github.com/dravenk/ollama-zig)
- [Abso](https://github.com/lunary-ai/abso) (OpenAI-compatible TypeScript SDK for any LLM provider)
- [Nichey](https://github.com/goodreasonai/nichey) is a Python package for generating custom wikis for your research topic
- [Ollama for D](https://github.com/kassane/ollama-d)
- [OllamaPlusPlus](https://github.com/HardCodeDev777/OllamaPlusPlus) (Very simple C++ library for Ollama)
- [any-llm](https://github.com/mozilla-ai/any-llm) (A single interface to use different llm providers by [mozilla.ai](https://www.mozilla.ai/))
- [any-agent](https://github.com/mozilla-ai/any-agent) (A single interface to use and evaluate different agent frameworks by [mozilla.ai](https://www.mozilla.ai/))
- [Neuro SAN](https://github.com/cognizant-ai-lab/neuro-san-studio) (Data-driven multi-agent orchestration framework) with [example](https://github.com/cognizant-ai-lab/neuro-san-studio/blob/main/docs/user_guide.md#ollama)
- [achatbot-go](https://github.com/ai-bot-pro/achatbot-go) a multimodal(text/audio/image) chatbot.
- [Ollama Bash Lib](https://github.com/attogram/ollama-bash-lib) - A Bash Library for Ollama. Run LLM prompts straight from your shell, and more
### Mobile
- [SwiftChat](https://github.com/aws-samples/swift-chat) (Lightning-fast Cross-platform AI chat app with native UI for Android, iOS, and iPad)
- [Enchanted](https://github.com/AugustDev/enchanted)
- [Maid](https://github.com/Mobile-Artificial-Intelligence/maid)
- [Ollama App](https://github.com/JHubi1/ollama-app) (Modern and easy-to-use multi-platform client for Ollama)
- [ConfiChat](https://github.com/1runeberg/confichat) (Lightweight, standalone, multi-platform, and privacy-focused LLM chat interface with optional encryption)
- [Ollama Android Chat](https://github.com/sunshine0523/OllamaServer) (No need for Termux, start the Ollama service with one click on an Android device)
- [Reins](https://github.com/ibrahimcetin/reins) (Easily tweak parameters, customize system prompts per chat, and enhance your AI experiments with reasoning model support.)
### Extensions & Plugins
- [Raycast extension](https://github.com/MassimilianoPasquini97/raycast_ollama)
- [Discollama](https://github.com/mxyng/discollama) (Discord bot inside the Ollama discord channel)
- [Continue](https://github.com/continuedev/continue)
- [Vibe](https://github.com/thewh1teagle/vibe) (Transcribe and analyze meetings with Ollama)
- [Obsidian Ollama plugin](https://github.com/hinterdupfinger/obsidian-ollama)
- [Logseq Ollama plugin](https://github.com/omagdy7/ollama-logseq)
- [NotesOllama](https://github.com/andersrex/notesollama) (Apple Notes Ollama plugin)
- [Dagger Chatbot](https://github.com/samalba/dagger-chatbot)
- [Discord AI Bot](https://github.com/mekb-turtle/discord-ai-bot)
- [Ollama Telegram Bot](https://github.com/ruecat/ollama-telegram)
- [Hass Ollama Conversation](https://github.com/ej52/hass-ollama-conversation)
- [Rivet plugin](https://github.com/abrenneke/rivet-plugin-ollama)
- [Obsidian BMO Chatbot plugin](https://github.com/longy2k/obsidian-bmo-chatbot)
- [Cliobot](https://github.com/herval/cliobot) (Telegram bot with Ollama support)
- [Copilot for Obsidian plugin](https://github.com/logancyang/obsidian-copilot)
- [Obsidian Local GPT plugin](https://github.com/pfrankov/obsidian-local-gpt)
- [Open Interpreter](https://docs.openinterpreter.com/language-model-setup/local-models/ollama)
- [Llama Coder](https://github.com/ex3ndr/llama-coder) (Copilot alternative using Ollama)
- [Ollama Copilot](https://github.com/bernardo-bruning/ollama-copilot) (Proxy that allows you to use Ollama as a copilot like GitHub Copilot)
- [twinny](https://github.com/rjmacarthy/twinny) (Copilot and Copilot chat alternative using Ollama)
- [Wingman-AI](https://github.com/RussellCanfield/wingman-ai) (Copilot code and chat alternative using Ollama and Hugging Face)
- [Page Assist](https://github.com/n4ze3m/page-assist) (Chrome Extension)
- [Plasmoid Ollama Control](https://github.com/imoize/plasmoid-ollamacontrol) (KDE Plasma extension that allows you to quickly manage/control Ollama model)
- [AI Telegram Bot](https://github.com/tusharhero/aitelegrambot) (Telegram bot using Ollama in backend)
- [AI ST Completion](https://github.com/yaroslavyaroslav/OpenAI-sublime-text) (Sublime Text 4 AI assistant plugin with Ollama support)
- [Discord-Ollama Chat Bot](https://github.com/kevinthedang/discord-ollama) (Generalized TypeScript Discord Bot w/ Tuning Documentation)
- [ChatGPTBox: All in one browser extension](https://github.com/josStorer/chatGPTBox) with [Integrating Tutorial](https://github.com/josStorer/chatGPTBox/issues/616#issuecomment-1975186467)
- [Discord AI chat/moderation bot](https://github.com/rapmd73/Companion) Chat/moderation bot written in python. Uses Ollama to create personalities.
- [Headless Ollama](https://github.com/nischalj10/headless-ollama) (Scripts to automatically install ollama client & models on any OS for apps that depend on ollama server)
- [Terraform AWS Ollama & Open WebUI](https://github.com/xuyangbocn/terraform-aws-self-host-llm) (A Terraform module to deploy on AWS a ready-to-use Ollama service, together with its front-end Open WebUI service.)
- [node-red-contrib-ollama](https://github.com/jakubburkiewicz/node-red-contrib-ollama)
- [Local AI Helper](https://github.com/ivostoykov/localAI) (Chrome and Firefox extensions that enable interactions with the active tab and customisable API endpoints. Includes secure storage for user prompts.)
- [LSP-AI](https://github.com/SilasMarvin/lsp-ai) (Open-source language server for AI-powered functionality)
- [QodeAssist](https://github.com/Palm1r/QodeAssist) (AI-powered coding assistant plugin for Qt Creator)
- [Obsidian Quiz Generator plugin](https://github.com/ECuiDev/obsidian-quiz-generator)
- [AI Summary Helper plugin](https://github.com/philffm/ai-summary-helper)
- [TextCraft](https://github.com/suncloudsmoon/TextCraft) (Copilot in Word alternative using Ollama)
- [Alfred Ollama](https://github.com/zeitlings/alfred-ollama) (Alfred Workflow)
- [TextLLaMA](https://github.com/adarshM84/TextLLaMA) A Chrome Extension that helps you write emails, correct grammar, and translate into any language
- [Simple-Discord-AI](https://github.com/zyphixor/simple-discord-ai)
- [LLM Telegram Bot](https://github.com/innightwolfsleep/llm_telegram_bot) (telegram bot, primary for RP. Oobabooga-like buttons, [A1111](https://github.com/AUTOMATIC1111/stable-diffusion-webui) API integration e.t.c)
- [mcp-llm](https://github.com/sammcj/mcp-llm) (MCP Server to allow LLMs to call other LLMs)
- [SimpleOllamaUnity](https://github.com/HardCodeDev777/SimpleOllamaUnity) (Unity Engine extension for communicating with Ollama in a few lines of code. Also works at runtime)
- [UnityCodeLama](https://github.com/HardCodeDev777/UnityCodeLama) (Unity Editor tool to analyze scripts via Ollama)
- [NativeMind](https://github.com/NativeMindBrowser/NativeMindExtension) (Private, on-device AI Assistant, no cloud dependencies)
- [GMAI - Gradle Managed AI](https://gmai.premex.se/) (Gradle plugin for automated Ollama lifecycle management during build phases)
- [NOMYO Router](https://github.com/nomyo-ai/nomyo-router) (A transparent Ollama proxy with model deployment aware routing which auto-manages multiple Ollama instances in a given network)
### Supported backends
- [llama.cpp](https://github.com/ggml-org/llama.cpp) project founded by Georgi Gerganov.
### Observability
- [Opik](https://www.comet.com/docs/opik/cookbook/ollama) is an open-source platform to debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards. Opik supports native integration to Ollama.
- [Lunary](https://lunary.ai/docs/integrations/ollama) is the leading open-source LLM observability platform. It provides a variety of enterprise-grade features such as real-time analytics, prompt templates management, PII masking, and comprehensive agent tracing.
- [OpenLIT](https://github.com/openlit/openlit) is an OpenTelemetry-native tool for monitoring Ollama Applications & GPUs using traces and metrics.
- [HoneyHive](https://docs.honeyhive.ai/integrations/ollama) is an AI observability and evaluation platform for AI agents. Use HoneyHive to evaluate agent performance, interrogate failures, and monitor quality in production.
- [Langfuse](https://langfuse.com/docs/integrations/ollama) is an open source LLM observability platform that enables teams to collaboratively monitor, evaluate and debug AI applications.
- [MLflow Tracing](https://mlflow.org/docs/latest/llms/tracing/index.html#automatic-tracing) is an open source LLM observability tool with a convenient API to log and visualize traces, making it easy to debug and evaluate GenAI applications.
### Security
- [Ollama Fortress](https://github.com/ParisNeo/ollama_proxy_server)

553
anthropic/anthropic.go Normal file → Executable file
View File

@@ -1,17 +1,25 @@
package anthropic package anthropic
import ( import (
"bytes"
"context"
"crypto/rand" "crypto/rand"
"encoding/base64" "encoding/base64"
"encoding/json" "encoding/json"
"errors" "errors"
"fmt" "fmt"
"io"
"log/slog" "log/slog"
"net/http" "net/http"
"net/url"
"strconv"
"strings" "strings"
"time" "time"
"github.com/ollama/ollama/api" "github.com/ollama/ollama/api"
"github.com/ollama/ollama/auth"
internalcloud "github.com/ollama/ollama/internal/cloud"
"github.com/ollama/ollama/logutil"
) )
// Error types matching Anthropic API // Error types matching Anthropic API
@@ -60,7 +68,7 @@ type MessagesRequest struct {
Model string `json:"model"` Model string `json:"model"`
MaxTokens int `json:"max_tokens"` MaxTokens int `json:"max_tokens"`
Messages []MessageParam `json:"messages"` Messages []MessageParam `json:"messages"`
System any `json:"system,omitempty"` // string or []ContentBlock System any `json:"system,omitempty"` // string or []map[string]any (JSON-decoded ContentBlock)
Stream bool `json:"stream,omitempty"` Stream bool `json:"stream,omitempty"`
Temperature *float64 `json:"temperature,omitempty"` Temperature *float64 `json:"temperature,omitempty"`
TopP *float64 `json:"top_p,omitempty"` TopP *float64 `json:"top_p,omitempty"`
@@ -75,29 +83,51 @@ type MessagesRequest struct {
// MessageParam represents a message in the request // MessageParam represents a message in the request
type MessageParam struct { type MessageParam struct {
Role string `json:"role"` // "user" or "assistant" Role string `json:"role"` // "user" or "assistant"
Content any `json:"content"` // string or []ContentBlock Content []ContentBlock `json:"content"` // always []ContentBlock; plain strings are normalized on unmarshal
}
func (m *MessageParam) UnmarshalJSON(data []byte) error {
var raw struct {
Role string `json:"role"`
Content json.RawMessage `json:"content"`
}
if err := json.Unmarshal(data, &raw); err != nil {
return err
}
m.Role = raw.Role
var s string
if err := json.Unmarshal(raw.Content, &s); err == nil {
m.Content = []ContentBlock{{Type: "text", Text: &s}}
return nil
}
return json.Unmarshal(raw.Content, &m.Content)
} }
// ContentBlock represents a content block in a message. // ContentBlock represents a content block in a message.
// Text and Thinking use pointers so they serialize as the field being present (even if empty) // Text and Thinking use pointers so they serialize as the field being present (even if empty)
// only when set, which is required for SDK streaming accumulation. // only when set, which is required for SDK streaming accumulation.
type ContentBlock struct { type ContentBlock struct {
Type string `json:"type"` // text, image, tool_use, tool_result, thinking Type string `json:"type"` // text, image, tool_use, tool_result, thinking, server_tool_use, web_search_tool_result
// For text blocks - pointer so field only appears when set (SDK requires it for accumulation) // For text blocks - pointer so field only appears when set (SDK requires it for accumulation)
Text *string `json:"text,omitempty"` Text *string `json:"text,omitempty"`
// For text blocks with citations
Citations []Citation `json:"citations,omitempty"`
// For image blocks // For image blocks
Source *ImageSource `json:"source,omitempty"` Source *ImageSource `json:"source,omitempty"`
// For tool_use blocks // For tool_use and server_tool_use blocks
ID string `json:"id,omitempty"` ID string `json:"id,omitempty"`
Name string `json:"name,omitempty"` Name string `json:"name,omitempty"`
Input any `json:"input,omitempty"` Input api.ToolCallFunctionArguments `json:"input,omitzero"`
// For tool_result blocks // For tool_result and web_search_tool_result blocks
ToolUseID string `json:"tool_use_id,omitempty"` ToolUseID string `json:"tool_use_id,omitempty"`
Content any `json:"content,omitempty"` // string or []ContentBlock Content any `json:"content,omitempty"` // string, []ContentBlock, []WebSearchResult, or WebSearchToolResultError
IsError bool `json:"is_error,omitempty"` IsError bool `json:"is_error,omitempty"`
// For thinking blocks - pointer so field only appears when set (SDK requires it for accumulation) // For thinking blocks - pointer so field only appears when set (SDK requires it for accumulation)
@@ -105,6 +135,30 @@ type ContentBlock struct {
Signature string `json:"signature,omitempty"` Signature string `json:"signature,omitempty"`
} }
// Citation represents a citation in a text block
type Citation struct {
Type string `json:"type"` // "web_search_result_location"
URL string `json:"url"`
Title string `json:"title"`
EncryptedIndex string `json:"encrypted_index,omitempty"`
CitedText string `json:"cited_text,omitempty"`
}
// WebSearchResult represents a single web search result
type WebSearchResult struct {
Type string `json:"type"` // "web_search_result"
URL string `json:"url"`
Title string `json:"title"`
EncryptedContent string `json:"encrypted_content,omitempty"`
PageAge string `json:"page_age,omitempty"`
}
// WebSearchToolResultError represents an error from web search
type WebSearchToolResultError struct {
Type string `json:"type"` // "web_search_tool_result_error"
ErrorCode string `json:"error_code"`
}
// ImageSource represents the source of an image // ImageSource represents the source of an image
type ImageSource struct { type ImageSource struct {
Type string `json:"type"` // "base64" or "url" Type string `json:"type"` // "base64" or "url"
@@ -115,10 +169,13 @@ type ImageSource struct {
// Tool represents a tool definition // Tool represents a tool definition
type Tool struct { type Tool struct {
Type string `json:"type,omitempty"` // "custom" for user-defined tools Type string `json:"type,omitempty"` // "custom" for user-defined tools, or "web_search_20250305" for web search
Name string `json:"name"` Name string `json:"name"`
Description string `json:"description,omitempty"` Description string `json:"description,omitempty"`
InputSchema json.RawMessage `json:"input_schema,omitempty"` InputSchema json.RawMessage `json:"input_schema,omitempty"`
// Web search specific fields
MaxUses int `json:"max_uses,omitempty"`
} }
// ToolChoice controls how the model uses tools // ToolChoice controls how the model uses tools
@@ -211,6 +268,7 @@ type MessageDelta struct {
// DeltaUsage contains cumulative token usage // DeltaUsage contains cumulative token usage
type DeltaUsage struct { type DeltaUsage struct {
InputTokens int `json:"input_tokens"`
OutputTokens int `json:"output_tokens"` OutputTokens int `json:"output_tokens"`
} }
@@ -232,6 +290,8 @@ type StreamErrorEvent struct {
// FromMessagesRequest converts an Anthropic MessagesRequest to an Ollama api.ChatRequest // FromMessagesRequest converts an Anthropic MessagesRequest to an Ollama api.ChatRequest
func FromMessagesRequest(r MessagesRequest) (*api.ChatRequest, error) { func FromMessagesRequest(r MessagesRequest) (*api.ChatRequest, error) {
logutil.Trace("anthropic: converting request", "req", TraceMessagesRequest(r))
var messages []api.Message var messages []api.Message
if r.System != nil { if r.System != nil {
@@ -258,9 +318,10 @@ func FromMessagesRequest(r MessagesRequest) (*api.ChatRequest, error) {
} }
} }
for _, msg := range r.Messages { for i, msg := range r.Messages {
converted, err := convertMessage(msg) converted, err := convertMessage(msg)
if err != nil { if err != nil {
logutil.Trace("anthropic: message conversion failed", "index", i, "role", msg.Role, "err", err)
return nil, err return nil, err
} }
messages = append(messages, converted...) messages = append(messages, converted...)
@@ -287,8 +348,24 @@ func FromMessagesRequest(r MessagesRequest) (*api.ChatRequest, error) {
} }
var tools api.Tools var tools api.Tools
hasBuiltinWebSearch := false
for _, t := range r.Tools { for _, t := range r.Tools {
tool, err := convertTool(t) if strings.HasPrefix(t.Type, "web_search") {
hasBuiltinWebSearch = true
break
}
}
for _, t := range r.Tools {
// Anthropic built-in web_search maps to Ollama function name "web_search".
// If a user-defined tool also uses that name in the same request, drop the
// user-defined one to avoid ambiguous tool-call routing.
if hasBuiltinWebSearch && !strings.HasPrefix(t.Type, "web_search") && t.Name == "web_search" {
logutil.Trace("anthropic: dropping colliding custom web_search tool", "tool", TraceTool(t))
continue
}
tool, _, err := convertTool(t)
if err != nil { if err != nil {
return nil, err return nil, err
} }
@@ -301,15 +378,17 @@ func FromMessagesRequest(r MessagesRequest) (*api.ChatRequest, error) {
} }
stream := r.Stream stream := r.Stream
convertedRequest := &api.ChatRequest{
return &api.ChatRequest{
Model: r.Model, Model: r.Model,
Messages: messages, Messages: messages,
Options: options, Options: options,
Stream: &stream, Stream: &stream,
Tools: tools, Tools: tools,
Think: think, Think: think,
}, nil }
logutil.Trace("anthropic: converted request", "req", TraceChatRequest(convertedRequest))
return convertedRequest, nil
} }
// convertMessage converts an Anthropic MessageParam to Ollama api.Message(s) // convertMessage converts an Anthropic MessageParam to Ollama api.Message(s)
@@ -317,75 +396,70 @@ func convertMessage(msg MessageParam) ([]api.Message, error) {
var messages []api.Message var messages []api.Message
role := strings.ToLower(msg.Role) role := strings.ToLower(msg.Role)
switch content := msg.Content.(type) {
case string:
messages = append(messages, api.Message{Role: role, Content: content})
case []any:
var textContent strings.Builder var textContent strings.Builder
var images []api.ImageData var images []api.ImageData
var toolCalls []api.ToolCall var toolCalls []api.ToolCall
var thinking string var thinking string
var toolResults []api.Message var toolResults []api.Message
textBlocks := 0
imageBlocks := 0
toolUseBlocks := 0
toolResultBlocks := 0
serverToolUseBlocks := 0
webSearchToolResultBlocks := 0
thinkingBlocks := 0
unknownBlocks := 0
for _, block := range content { for _, block := range msg.Content {
blockMap, ok := block.(map[string]any) switch block.Type {
if !ok {
return nil, errors.New("invalid content block format")
}
blockType, _ := blockMap["type"].(string)
switch blockType {
case "text": case "text":
if text, ok := blockMap["text"].(string); ok { textBlocks++
textContent.WriteString(text) if block.Text != nil {
textContent.WriteString(*block.Text)
} }
case "image": case "image":
source, ok := blockMap["source"].(map[string]any) imageBlocks++
if !ok { if block.Source == nil {
logutil.Trace("anthropic: invalid image source", "role", role)
return nil, errors.New("invalid image source") return nil, errors.New("invalid image source")
} }
sourceType, _ := source["type"].(string) if block.Source.Type == "base64" {
if sourceType == "base64" { decoded, err := base64.StdEncoding.DecodeString(block.Source.Data)
data, _ := source["data"].(string)
decoded, err := base64.StdEncoding.DecodeString(data)
if err != nil { if err != nil {
logutil.Trace("anthropic: invalid base64 image data", "role", role, "error", err)
return nil, fmt.Errorf("invalid base64 image data: %w", err) return nil, fmt.Errorf("invalid base64 image data: %w", err)
} }
images = append(images, decoded) images = append(images, decoded)
} else { } else {
return nil, fmt.Errorf("invalid image source type: %s. Only base64 images are supported.", sourceType) logutil.Trace("anthropic: unsupported image source type", "role", role, "source_type", block.Source.Type)
return nil, fmt.Errorf("invalid image source type: %s. Only base64 images are supported.", block.Source.Type)
} }
// URL images would need to be fetched - skip for now
case "tool_use": case "tool_use":
id, ok := blockMap["id"].(string) toolUseBlocks++
if !ok { if block.ID == "" {
logutil.Trace("anthropic: tool_use block missing id", "role", role)
return nil, errors.New("tool_use block missing required 'id' field") return nil, errors.New("tool_use block missing required 'id' field")
} }
name, ok := blockMap["name"].(string) if block.Name == "" {
if !ok { logutil.Trace("anthropic: tool_use block missing name", "role", role)
return nil, errors.New("tool_use block missing required 'name' field") return nil, errors.New("tool_use block missing required 'name' field")
} }
tc := api.ToolCall{ toolCalls = append(toolCalls, api.ToolCall{
ID: id, ID: block.ID,
Function: api.ToolCallFunction{ Function: api.ToolCallFunction{
Name: name, Name: block.Name,
Arguments: block.Input,
}, },
} })
if input, ok := blockMap["input"].(map[string]any); ok {
tc.Function.Arguments = mapToArgs(input)
}
toolCalls = append(toolCalls, tc)
case "tool_result": case "tool_result":
toolUseID, _ := blockMap["tool_use_id"].(string) toolResultBlocks++
var resultContent string var resultContent string
switch c := blockMap["content"].(type) { switch c := block.Content.(type) {
case string: case string:
resultContent = c resultContent = c
case []any: case []any:
@@ -403,13 +477,34 @@ func convertMessage(msg MessageParam) ([]api.Message, error) {
toolResults = append(toolResults, api.Message{ toolResults = append(toolResults, api.Message{
Role: "tool", Role: "tool",
Content: resultContent, Content: resultContent,
ToolCallID: toolUseID, ToolCallID: block.ToolUseID,
}) })
case "thinking": case "thinking":
if t, ok := blockMap["thinking"].(string); ok { thinkingBlocks++
thinking = t if block.Thinking != nil {
thinking = *block.Thinking
} }
case "server_tool_use":
serverToolUseBlocks++
toolCalls = append(toolCalls, api.ToolCall{
ID: block.ID,
Function: api.ToolCallFunction{
Name: block.Name,
Arguments: block.Input,
},
})
case "web_search_tool_result":
webSearchToolResultBlocks++
toolResults = append(toolResults, api.Message{
Role: "tool",
Content: formatWebSearchToolResultContent(block.Content),
ToolCallID: block.ToolUseID,
})
default:
unknownBlocks++
} }
} }
@@ -426,20 +521,111 @@ func convertMessage(msg MessageParam) ([]api.Message, error) {
// Add tool results as separate messages // Add tool results as separate messages
messages = append(messages, toolResults...) messages = append(messages, toolResults...)
logutil.Trace("anthropic: converted block message",
default: "role", role,
return nil, fmt.Errorf("invalid message content type: %T", content) "blocks", len(msg.Content),
} "text", textBlocks,
"image", imageBlocks,
"tool_use", toolUseBlocks,
"tool_result", toolResultBlocks,
"server_tool_use", serverToolUseBlocks,
"web_search_result", webSearchToolResultBlocks,
"thinking", thinkingBlocks,
"unknown", unknownBlocks,
"messages", TraceAPIMessages(messages),
)
return messages, nil return messages, nil
} }
// convertTool converts an Anthropic Tool to an Ollama api.Tool func formatWebSearchToolResultContent(content any) string {
func convertTool(t Tool) (api.Tool, error) { switch c := content.(type) {
case string:
return c
case []WebSearchResult:
var resultContent strings.Builder
for _, item := range c {
if item.Type != "web_search_result" {
continue
}
fmt.Fprintf(&resultContent, "- %s: %s\n", item.Title, item.URL)
}
return resultContent.String()
case []any:
var resultContent strings.Builder
for _, item := range c {
itemMap, ok := item.(map[string]any)
if !ok {
continue
}
switch itemMap["type"] {
case "web_search_result":
title, _ := itemMap["title"].(string)
url, _ := itemMap["url"].(string)
fmt.Fprintf(&resultContent, "- %s: %s\n", title, url)
case "web_search_tool_result_error":
errorCode, _ := itemMap["error_code"].(string)
if errorCode == "" {
return "web_search_tool_result_error"
}
return "web_search_tool_result_error: " + errorCode
}
}
return resultContent.String()
case map[string]any:
if c["type"] == "web_search_tool_result_error" {
errorCode, _ := c["error_code"].(string)
if errorCode == "" {
return "web_search_tool_result_error"
}
return "web_search_tool_result_error: " + errorCode
}
data, err := json.Marshal(c)
if err != nil {
return ""
}
return string(data)
case WebSearchToolResultError:
if c.ErrorCode == "" {
return "web_search_tool_result_error"
}
return "web_search_tool_result_error: " + c.ErrorCode
default:
data, err := json.Marshal(c)
if err != nil {
return ""
}
return string(data)
}
}
// convertTool converts an Anthropic Tool to an Ollama api.Tool, returning true if it's a server tool
func convertTool(t Tool) (api.Tool, bool, error) {
if strings.HasPrefix(t.Type, "web_search") {
props := api.NewToolPropertiesMap()
props.Set("query", api.ToolProperty{
Type: api.PropertyType{"string"},
Description: "The search query to look up on the web",
})
return api.Tool{
Type: "function",
Function: api.ToolFunction{
Name: "web_search",
Description: "Search the web for current information. Use this to find up-to-date information about any topic.",
Parameters: api.ToolFunctionParameters{
Type: "object",
Required: []string{"query"},
Properties: props,
},
},
}, true, nil
}
var params api.ToolFunctionParameters var params api.ToolFunctionParameters
if len(t.InputSchema) > 0 { if len(t.InputSchema) > 0 {
if err := json.Unmarshal(t.InputSchema, &params); err != nil { if err := json.Unmarshal(t.InputSchema, &params); err != nil {
return api.Tool{}, fmt.Errorf("invalid input_schema for tool %q: %w", t.Name, err) logutil.Trace("anthropic: invalid tool schema", "tool", t.Name, "err", err)
return api.Tool{}, false, fmt.Errorf("invalid input_schema for tool %q: %w", t.Name, err)
} }
} }
@@ -450,7 +636,7 @@ func convertTool(t Tool) (api.Tool, error) {
Description: t.Description, Description: t.Description,
Parameters: params, Parameters: params,
}, },
}, nil }, false, nil
} }
// ToMessagesResponse converts an Ollama api.ChatResponse to an Anthropic MessagesResponse // ToMessagesResponse converts an Ollama api.ChatResponse to an Anthropic MessagesResponse
@@ -523,17 +709,19 @@ type StreamConverter struct {
contentIndex int contentIndex int
inputTokens int inputTokens int
outputTokens int outputTokens int
estimatedInputTokens int // Estimated tokens from request (used when actual metrics are 0)
thinkingStarted bool thinkingStarted bool
thinkingDone bool thinkingDone bool
textStarted bool textStarted bool
toolCallsSent map[string]bool toolCallsSent map[string]bool
} }
func NewStreamConverter(id, model string) *StreamConverter { func NewStreamConverter(id, model string, estimatedInputTokens int) *StreamConverter {
return &StreamConverter{ return &StreamConverter{
ID: id, ID: id,
Model: model, Model: model,
firstWrite: true, firstWrite: true,
estimatedInputTokens: estimatedInputTokens,
toolCallsSent: make(map[string]bool), toolCallsSent: make(map[string]bool),
} }
} }
@@ -550,7 +738,11 @@ func (c *StreamConverter) Process(r api.ChatResponse) []StreamEvent {
if c.firstWrite { if c.firstWrite {
c.firstWrite = false c.firstWrite = false
// Use actual metrics if available, otherwise use estimate
c.inputTokens = r.Metrics.PromptEvalCount c.inputTokens = r.Metrics.PromptEvalCount
if c.inputTokens == 0 && c.estimatedInputTokens > 0 {
c.inputTokens = c.estimatedInputTokens
}
events = append(events, StreamEvent{ events = append(events, StreamEvent{
Event: "message_start", Event: "message_start",
@@ -646,6 +838,19 @@ func (c *StreamConverter) Process(r api.ChatResponse) []StreamEvent {
continue continue
} }
// Close thinking block if still open (thinking → tool_use without text in between)
if c.thinkingStarted && !c.thinkingDone {
c.thinkingDone = true
events = append(events, StreamEvent{
Event: "content_block_stop",
Data: ContentBlockStopEvent{
Type: "content_block_stop",
Index: c.contentIndex,
},
})
c.contentIndex++
}
if c.textStarted { if c.textStarted {
events = append(events, StreamEvent{ events = append(events, StreamEvent{
Event: "content_block_stop", Event: "content_block_stop",
@@ -663,7 +868,6 @@ func (c *StreamConverter) Process(r api.ChatResponse) []StreamEvent {
slog.Error("failed to marshal tool arguments", "error", err, "tool_id", tc.ID) slog.Error("failed to marshal tool arguments", "error", err, "tool_id", tc.ID)
continue continue
} }
events = append(events, StreamEvent{ events = append(events, StreamEvent{
Event: "content_block_start", Event: "content_block_start",
Data: ContentBlockStartEvent{ Data: ContentBlockStartEvent{
@@ -673,7 +877,7 @@ func (c *StreamConverter) Process(r api.ChatResponse) []StreamEvent {
Type: "tool_use", Type: "tool_use",
ID: tc.ID, ID: tc.ID,
Name: tc.Function.Name, Name: tc.Function.Name,
Input: map[string]any{}, Input: api.NewToolCallFunctionArguments(),
}, },
}, },
}) })
@@ -721,6 +925,7 @@ func (c *StreamConverter) Process(r api.ChatResponse) []StreamEvent {
}) })
} }
c.inputTokens = r.Metrics.PromptEvalCount
c.outputTokens = r.Metrics.EvalCount c.outputTokens = r.Metrics.EvalCount
stopReason := mapStopReason(r.DoneReason, len(c.toolCallsSent) > 0) stopReason := mapStopReason(r.DoneReason, len(c.toolCallsSent) > 0)
@@ -732,6 +937,7 @@ func (c *StreamConverter) Process(r api.ChatResponse) []StreamEvent {
StopReason: stopReason, StopReason: stopReason,
}, },
Usage: DeltaUsage{ Usage: DeltaUsage{
InputTokens: c.inputTokens,
OutputTokens: c.outputTokens, OutputTokens: c.outputTokens,
}, },
}, },
@@ -768,11 +974,216 @@ func ptr(s string) *string {
return &s return &s
} }
// mapToArgs converts a map to ToolCallFunctionArguments // CountTokensRequest represents an Anthropic count_tokens request
func mapToArgs(m map[string]any) api.ToolCallFunctionArguments { type CountTokensRequest struct {
args := api.NewToolCallFunctionArguments() Model string `json:"model"`
for k, v := range m { Messages []MessageParam `json:"messages"`
args.Set(k, v) System any `json:"system,omitempty"`
Tools []Tool `json:"tools,omitempty"`
Thinking *ThinkingConfig `json:"thinking,omitempty"`
} }
return args
// EstimateInputTokens estimates input tokens from a MessagesRequest (reuses CountTokensRequest logic)
func EstimateInputTokens(req MessagesRequest) int {
return estimateTokens(CountTokensRequest{
Model: req.Model,
Messages: req.Messages,
System: req.System,
Tools: req.Tools,
Thinking: req.Thinking,
})
}
// CountTokensResponse represents an Anthropic count_tokens response
type CountTokensResponse struct {
InputTokens int `json:"input_tokens"`
}
// estimateTokens returns a rough estimate of tokens (len/4).
// TODO: Replace with actual tokenization via Tokenize API for accuracy.
// Current len/4 heuristic is a rough approximation (~4 chars/token average).
func estimateTokens(req CountTokensRequest) int {
var totalLen int
// Count system prompt
totalLen += countAnyContent(req.System)
for _, msg := range req.Messages {
// Count role (always present)
totalLen += len(msg.Role)
// Count content
totalLen += countAnyContent(msg.Content)
}
for _, tool := range req.Tools {
totalLen += len(tool.Name) + len(tool.Description) + len(tool.InputSchema)
}
// Return len/4 as rough token estimate, minimum 1 if there's any content
tokens := totalLen / 4
if tokens == 0 && (len(req.Messages) > 0 || req.System != nil) {
tokens = 1
}
return tokens
}
func countAnyContent(content any) int {
if content == nil {
return 0
}
switch c := content.(type) {
case string:
return len(c)
case []ContentBlock:
total := 0
for _, block := range c {
total += countContentBlock(block)
}
return total
case []any:
total := 0
for _, item := range c {
data, err := json.Marshal(item)
if err != nil {
continue
}
var block ContentBlock
if err := json.Unmarshal(data, &block); err == nil {
total += countContentBlock(block)
}
}
return total
default:
if data, err := json.Marshal(content); err == nil {
return len(data)
}
return 0
}
}
func countContentBlock(block ContentBlock) int {
total := 0
if block.Text != nil {
total += len(*block.Text)
}
if block.Thinking != nil {
total += len(*block.Thinking)
}
if block.Type == "tool_use" || block.Type == "tool_result" {
if data, err := json.Marshal(block); err == nil {
total += len(data)
}
}
return total
}
// OllamaWebSearchRequest represents a request to the Ollama web search API
type OllamaWebSearchRequest struct {
Query string `json:"query"`
MaxResults int `json:"max_results,omitempty"`
}
// OllamaWebSearchResult represents a single search result from Ollama API
type OllamaWebSearchResult struct {
Title string `json:"title"`
URL string `json:"url"`
Content string `json:"content"`
}
// OllamaWebSearchResponse represents the response from the Ollama web search API
type OllamaWebSearchResponse struct {
Results []OllamaWebSearchResult `json:"results"`
}
var WebSearchEndpoint = "https://ollama.com/api/web_search"
func WebSearch(ctx context.Context, query string, maxResults int) (*OllamaWebSearchResponse, error) {
if internalcloud.Disabled() {
logutil.TraceContext(ctx, "anthropic: web search blocked", "reason", "cloud_disabled")
return nil, errors.New(internalcloud.DisabledError("web search is unavailable"))
}
if maxResults <= 0 {
maxResults = 5
}
if maxResults > 10 {
maxResults = 10
}
reqBody := OllamaWebSearchRequest{
Query: query,
MaxResults: maxResults,
}
body, err := json.Marshal(reqBody)
if err != nil {
return nil, fmt.Errorf("failed to marshal web search request: %w", err)
}
searchURL, err := url.Parse(WebSearchEndpoint)
if err != nil {
return nil, fmt.Errorf("failed to parse web search URL: %w", err)
}
logutil.TraceContext(ctx, "anthropic: web search request",
"query", TraceTruncateString(query),
"max_results", maxResults,
"url", searchURL.String(),
)
q := searchURL.Query()
q.Set("ts", strconv.FormatInt(time.Now().Unix(), 10))
searchURL.RawQuery = q.Encode()
signature := ""
if strings.EqualFold(searchURL.Hostname(), "ollama.com") {
challenge := fmt.Sprintf("%s,%s", http.MethodPost, searchURL.RequestURI())
signature, err = auth.Sign(ctx, []byte(challenge))
if err != nil {
return nil, fmt.Errorf("failed to sign web search request: %w", err)
}
}
logutil.TraceContext(ctx, "anthropic: web search auth", "signed", signature != "")
req, err := http.NewRequestWithContext(ctx, "POST", searchURL.String(), bytes.NewReader(body))
if err != nil {
return nil, fmt.Errorf("failed to create web search request: %w", err)
}
req.Header.Set("Content-Type", "application/json")
if signature != "" {
req.Header.Set("Authorization", fmt.Sprintf("Bearer %s", signature))
}
resp, err := http.DefaultClient.Do(req)
if err != nil {
return nil, fmt.Errorf("web search request failed: %w", err)
}
defer resp.Body.Close()
logutil.TraceContext(ctx, "anthropic: web search response", "status", resp.StatusCode)
if resp.StatusCode != http.StatusOK {
respBody, _ := io.ReadAll(resp.Body)
return nil, fmt.Errorf("web search returned status %d: %s", resp.StatusCode, string(respBody))
}
var searchResp OllamaWebSearchResponse
if err := json.NewDecoder(resp.Body).Decode(&searchResp); err != nil {
return nil, fmt.Errorf("failed to decode web search response: %w", err)
}
logutil.TraceContext(ctx, "anthropic: web search results", "count", len(searchResp.Results))
return &searchResp, nil
}
func ConvertOllamaToAnthropicResults(ollamaResults *OllamaWebSearchResponse) []WebSearchResult {
var results []WebSearchResult
for _, r := range ollamaResults.Results {
results = append(results, WebSearchResult{
Type: "web_search_result",
URL: r.URL,
Title: r.Title,
})
}
return results
} }

845
anthropic/anthropic_test.go Normal file → Executable file

File diff suppressed because it is too large Load Diff

352
anthropic/trace.go Normal file
View File

@@ -0,0 +1,352 @@
package anthropic
import (
"encoding/json"
"fmt"
"sort"
"github.com/ollama/ollama/api"
)
// Trace truncation limits.
const (
TraceMaxStringRunes = 240
TraceMaxSliceItems = 8
TraceMaxMapEntries = 16
TraceMaxDepth = 4
)
// TraceTruncateString shortens s to TraceMaxStringRunes, appending a count of
// omitted characters when truncated.
func TraceTruncateString(s string) string {
if len(s) == 0 {
return s
}
runes := []rune(s)
if len(runes) <= TraceMaxStringRunes {
return s
}
return fmt.Sprintf("%s...(+%d chars)", string(runes[:TraceMaxStringRunes]), len(runes)-TraceMaxStringRunes)
}
// TraceJSON round-trips v through JSON and returns a compacted representation.
func TraceJSON(v any) any {
if v == nil {
return nil
}
data, err := json.Marshal(v)
if err != nil {
return map[string]any{"marshal_error": err.Error(), "type": fmt.Sprintf("%T", v)}
}
var out any
if err := json.Unmarshal(data, &out); err != nil {
return TraceTruncateString(string(data))
}
return TraceCompactValue(out, 0)
}
// TraceCompactValue recursively truncates strings, slices, and maps for trace
// output. depth tracks recursion to enforce TraceMaxDepth.
func TraceCompactValue(v any, depth int) any {
if v == nil {
return nil
}
if depth >= TraceMaxDepth {
switch t := v.(type) {
case string:
return TraceTruncateString(t)
case []any:
return fmt.Sprintf("<array len=%d>", len(t))
case map[string]any:
return fmt.Sprintf("<object keys=%d>", len(t))
default:
return fmt.Sprintf("<%T>", v)
}
}
switch t := v.(type) {
case string:
return TraceTruncateString(t)
case []any:
limit := min(len(t), TraceMaxSliceItems)
out := make([]any, 0, limit+1)
for i := range limit {
out = append(out, TraceCompactValue(t[i], depth+1))
}
if len(t) > limit {
out = append(out, fmt.Sprintf("... +%d more items", len(t)-limit))
}
return out
case map[string]any:
keys := make([]string, 0, len(t))
for k := range t {
keys = append(keys, k)
}
sort.Strings(keys)
limit := min(len(keys), TraceMaxMapEntries)
out := make(map[string]any, limit+1)
for i := range limit {
out[keys[i]] = TraceCompactValue(t[keys[i]], depth+1)
}
if len(keys) > limit {
out["__truncated_keys"] = len(keys) - limit
}
return out
default:
return t
}
}
// ---------------------------------------------------------------------------
// Anthropic request/response tracing
// ---------------------------------------------------------------------------
// TraceMessagesRequest returns a compact trace representation of a MessagesRequest.
func TraceMessagesRequest(r MessagesRequest) map[string]any {
return map[string]any{
"model": r.Model,
"max_tokens": r.MaxTokens,
"messages": traceMessageParams(r.Messages),
"system": traceAnthropicContent(r.System),
"stream": r.Stream,
"tools": traceTools(r.Tools),
"tool_choice": TraceJSON(r.ToolChoice),
"thinking": TraceJSON(r.Thinking),
"stop_sequences": r.StopSequences,
"temperature": ptrVal(r.Temperature),
"top_p": ptrVal(r.TopP),
"top_k": ptrVal(r.TopK),
}
}
// TraceMessagesResponse returns a compact trace representation of a MessagesResponse.
func TraceMessagesResponse(r MessagesResponse) map[string]any {
return map[string]any{
"id": r.ID,
"model": r.Model,
"content": TraceJSON(r.Content),
"stop_reason": r.StopReason,
"usage": r.Usage,
}
}
func traceMessageParams(msgs []MessageParam) []map[string]any {
out := make([]map[string]any, 0, len(msgs))
for _, m := range msgs {
out = append(out, map[string]any{
"role": m.Role,
"content": traceAnthropicContent(m.Content),
})
}
return out
}
func traceAnthropicContent(content any) any {
switch c := content.(type) {
case nil:
return nil
case string:
return TraceTruncateString(c)
case []any:
blocks := make([]any, 0, len(c))
for _, block := range c {
blockMap, ok := block.(map[string]any)
if !ok {
blocks = append(blocks, TraceCompactValue(block, 0))
continue
}
blocks = append(blocks, traceAnthropicBlock(blockMap))
}
return blocks
default:
return TraceJSON(c)
}
}
func traceAnthropicBlock(block map[string]any) map[string]any {
blockType, _ := block["type"].(string)
out := map[string]any{"type": blockType}
switch blockType {
case "text":
if text, ok := block["text"].(string); ok {
out["text"] = TraceTruncateString(text)
} else {
out["text"] = TraceCompactValue(block["text"], 0)
}
case "thinking":
if thinking, ok := block["thinking"].(string); ok {
out["thinking"] = TraceTruncateString(thinking)
} else {
out["thinking"] = TraceCompactValue(block["thinking"], 0)
}
case "tool_use", "server_tool_use":
out["id"] = block["id"]
out["name"] = block["name"]
out["input"] = TraceCompactValue(block["input"], 0)
case "tool_result", "web_search_tool_result":
out["tool_use_id"] = block["tool_use_id"]
out["content"] = TraceCompactValue(block["content"], 0)
case "image":
if source, ok := block["source"].(map[string]any); ok {
out["source"] = map[string]any{
"type": source["type"],
"media_type": source["media_type"],
"url": source["url"],
"data_len": len(fmt.Sprint(source["data"])),
}
}
default:
out["block"] = TraceCompactValue(block, 0)
}
return out
}
func traceTools(tools []Tool) []map[string]any {
out := make([]map[string]any, 0, len(tools))
for _, t := range tools {
out = append(out, TraceTool(t))
}
return out
}
// TraceTool returns a compact trace representation of an Anthropic Tool.
func TraceTool(t Tool) map[string]any {
return map[string]any{
"type": t.Type,
"name": t.Name,
"description": TraceTruncateString(t.Description),
"input_schema": TraceJSON(t.InputSchema),
"max_uses": t.MaxUses,
}
}
// ContentBlockTypes returns the type strings from content (when it's []any blocks).
func ContentBlockTypes(content any) []string {
blocks, ok := content.([]any)
if !ok {
return nil
}
types := make([]string, 0, len(blocks))
for _, block := range blocks {
blockMap, ok := block.(map[string]any)
if !ok {
types = append(types, fmt.Sprintf("%T", block))
continue
}
t, _ := blockMap["type"].(string)
types = append(types, t)
}
return types
}
func ptrVal[T any](v *T) any {
if v == nil {
return nil
}
return *v
}
// ---------------------------------------------------------------------------
// Ollama api.* tracing (shared between anthropic and middleware packages)
// ---------------------------------------------------------------------------
// TraceChatRequest returns a compact trace representation of an Ollama ChatRequest.
func TraceChatRequest(req *api.ChatRequest) map[string]any {
if req == nil {
return nil
}
stream := false
if req.Stream != nil {
stream = *req.Stream
}
return map[string]any{
"model": req.Model,
"messages": TraceAPIMessages(req.Messages),
"tools": TraceAPITools(req.Tools),
"stream": stream,
"options": req.Options,
"think": TraceJSON(req.Think),
}
}
// TraceChatResponse returns a compact trace representation of an Ollama ChatResponse.
func TraceChatResponse(resp api.ChatResponse) map[string]any {
return map[string]any{
"model": resp.Model,
"done": resp.Done,
"done_reason": resp.DoneReason,
"message": TraceAPIMessage(resp.Message),
"metrics": TraceJSON(resp.Metrics),
}
}
// TraceAPIMessages returns compact trace representations for a slice of api.Message.
func TraceAPIMessages(msgs []api.Message) []map[string]any {
out := make([]map[string]any, 0, len(msgs))
for _, m := range msgs {
out = append(out, TraceAPIMessage(m))
}
return out
}
// TraceAPIMessage returns a compact trace representation of a single api.Message.
func TraceAPIMessage(m api.Message) map[string]any {
return map[string]any{
"role": m.Role,
"content": TraceTruncateString(m.Content),
"thinking": TraceTruncateString(m.Thinking),
"images": traceImageSizes(m.Images),
"tool_calls": traceToolCalls(m.ToolCalls),
"tool_name": m.ToolName,
"tool_call_id": m.ToolCallID,
}
}
func traceImageSizes(images []api.ImageData) []int {
if len(images) == 0 {
return nil
}
sizes := make([]int, 0, len(images))
for _, img := range images {
sizes = append(sizes, len(img))
}
return sizes
}
// TraceAPITools returns compact trace representations for a slice of api.Tool.
func TraceAPITools(tools api.Tools) []map[string]any {
out := make([]map[string]any, 0, len(tools))
for _, t := range tools {
out = append(out, TraceAPITool(t))
}
return out
}
// TraceAPITool returns a compact trace representation of a single api.Tool.
func TraceAPITool(t api.Tool) map[string]any {
return map[string]any{
"type": t.Type,
"name": t.Function.Name,
"description": TraceTruncateString(t.Function.Description),
"parameters": TraceJSON(t.Function.Parameters),
}
}
// TraceToolCall returns a compact trace representation of an api.ToolCall.
func TraceToolCall(tc api.ToolCall) map[string]any {
return map[string]any{
"id": tc.ID,
"name": tc.Function.Name,
"args": TraceJSON(tc.Function.Arguments),
}
}
func traceToolCalls(tcs []api.ToolCall) []map[string]any {
if len(tcs) == 0 {
return nil
}
out := make([]map[string]any, 0, len(tcs))
for _, tc := range tcs {
out = append(out, TraceToolCall(tc))
}
return out
}

View File

@@ -449,6 +449,16 @@ func (c *Client) Version(ctx context.Context) (string, error) {
return version.Version, nil return version.Version, nil
} }
// CloudStatusExperimental returns whether cloud features are disabled on the server.
func (c *Client) CloudStatusExperimental(ctx context.Context) (*StatusResponse, error) {
var status StatusResponse
if err := c.do(ctx, http.MethodGet, "/api/status", nil, &status); err != nil {
return nil, err
}
return &status, nil
}
// Signout will signout a client for a local ollama server. // Signout will signout a client for a local ollama server.
func (c *Client) Signout(ctx context.Context) error { func (c *Client) Signout(ctx context.Context) error {
return c.do(ctx, http.MethodPost, "/api/signout", nil, nil) return c.do(ctx, http.MethodPost, "/api/signout", nil, nil)

View File

@@ -436,6 +436,7 @@ type ToolProperty struct {
Description string `json:"description,omitempty"` Description string `json:"description,omitempty"`
Enum []any `json:"enum,omitempty"` Enum []any `json:"enum,omitempty"`
Properties *ToolPropertiesMap `json:"properties,omitempty"` Properties *ToolPropertiesMap `json:"properties,omitempty"`
Required []string `json:"required,omitempty"`
} }
// ToTypeScriptType converts a ToolProperty to a TypeScript type string // ToTypeScriptType converts a ToolProperty to a TypeScript type string
@@ -834,6 +835,16 @@ type TokenResponse struct {
Token string `json:"token"` Token string `json:"token"`
} }
type CloudStatus struct {
Disabled bool `json:"disabled"`
Source string `json:"source"`
}
// StatusResponse is the response from [Client.CloudStatusExperimental].
type StatusResponse struct {
Cloud CloudStatus `json:"cloud"`
}
// GenerateResponse is the response passed into [GenerateResponseFunc]. // GenerateResponse is the response passed into [GenerateResponseFunc].
type GenerateResponse struct { type GenerateResponse struct {
// Model is the model name that generated the response. // Model is the model name that generated the response.

View File

@@ -75,9 +75,9 @@ The `-dev` flag enables:
CI builds with Xcode 14.1 for OS compatibility prior to v13. If you want to manually build v11+ support, you can download the older Xcode [here](https://developer.apple.com/services-account/download?path=/Developer_Tools/Xcode_14.1/Xcode_14.1.xip), extract, then `mv ./Xcode.app /Applications/Xcode_14.1.0.app` then activate with: CI builds with Xcode 14.1 for OS compatibility prior to v13. If you want to manually build v11+ support, you can download the older Xcode [here](https://developer.apple.com/services-account/download?path=/Developer_Tools/Xcode_14.1/Xcode_14.1.xip), extract, then `mv ./Xcode.app /Applications/Xcode_14.1.0.app` then activate with:
``` ```
export CGO_CFLAGS=-mmacosx-version-min=12.0 export CGO_CFLAGS="-O3 -mmacosx-version-min=12.0"
export CGO_CXXFLAGS=-mmacosx-version-min=12.0 export CGO_CXXFLAGS="-O3 -mmacosx-version-min=12.0"
export CGO_LDFLAGS=-mmacosx-version-min=12.0 export CGO_LDFLAGS="-mmacosx-version-min=12.0"
export SDKROOT=/Applications/Xcode_14.1.0.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk export SDKROOT=/Applications/Xcode_14.1.0.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk
export DEVELOPER_DIR=/Applications/Xcode_14.1.0.app/Contents/Developer export DEVELOPER_DIR=/Applications/Xcode_14.1.0.app/Contents/Developer
``` ```

View File

@@ -35,6 +35,7 @@ import (
var ( var (
wv = &Webview{} wv = &Webview{}
uiServerPort int uiServerPort int
appStore *store.Store
) )
var debug = strings.EqualFold(os.Getenv("OLLAMA_DEBUG"), "true") || os.Getenv("OLLAMA_DEBUG") == "1" var debug = strings.EqualFold(os.Getenv("OLLAMA_DEBUG"), "true") || os.Getenv("OLLAMA_DEBUG") == "1"
@@ -208,6 +209,7 @@ func main() {
uiServerPort = port uiServerPort = port
st := &store.Store{} st := &store.Store{}
appStore = st
// Enable CORS in development mode // Enable CORS in development mode
if devMode { if devMode {
@@ -253,6 +255,8 @@ func main() {
done <- osrv.Run(octx) done <- osrv.Run(octx)
}() }()
upd := &updater.Updater{Store: st}
uiServer := ui.Server{ uiServer := ui.Server{
Token: token, Token: token,
Restart: func() { Restart: func() {
@@ -267,6 +271,10 @@ func main() {
ToolRegistry: toolRegistry, ToolRegistry: toolRegistry,
Dev: devMode, Dev: devMode,
Logger: slog.Default(), Logger: slog.Default(),
Updater: upd,
UpdateAvailableFunc: func() {
UpdateAvailable("")
},
} }
srv := &http.Server{ srv := &http.Server{
@@ -284,8 +292,20 @@ func main() {
slog.Debug("background desktop server done") slog.Debug("background desktop server done")
}() }()
updater := &updater.Updater{Store: st} upd.StartBackgroundUpdaterChecker(ctx, UpdateAvailable)
updater.StartBackgroundUpdaterChecker(ctx, UpdateAvailable)
// Check for pending updates on startup (show tray notification if update is ready)
if updater.IsUpdatePending() {
// On Windows, the tray is initialized in osRun(). Calling UpdateAvailable
// before that would dereference a nil tray callback.
// TODO: refactor so the update check runs after platform init on all platforms.
if runtime.GOOS == "windows" {
slog.Debug("update pending on startup, deferring tray notification until tray initialization")
} else {
slog.Debug("update pending on startup, showing tray notification")
UpdateAvailable("")
}
}
hasCompletedFirstRun, err := st.HasCompletedFirstRun() hasCompletedFirstRun, err := st.HasCompletedFirstRun()
if err != nil { if err != nil {
@@ -348,6 +368,17 @@ func startHiddenTasks() {
// CLI triggered app startup use-case // CLI triggered app startup use-case
slog.Info("deferring pending update for fast startup") slog.Info("deferring pending update for fast startup")
} else { } else {
// Check if auto-update is enabled before automatically upgrading
settings, err := appStore.Settings()
if err != nil {
slog.Warn("failed to load settings for upgrade check", "error", err)
} else if !settings.AutoUpdateEnabled {
slog.Info("auto-update disabled, skipping automatic upgrade at startup")
// Still show tray notification so user knows update is ready
UpdateAvailable("")
return
}
if err := updater.DoUpgradeAtStartup(); err != nil { if err := updater.DoUpgradeAtStartup(); err != nil {
slog.Info("unable to perform upgrade at startup", "error", err) slog.Info("unable to perform upgrade at startup", "error", err)
// Make sure the restart to upgrade menu shows so we can attempt an interactive upgrade to get authorization // Make sure the restart to upgrade menu shows so we can attempt an interactive upgrade to get authorization

View File

@@ -154,6 +154,10 @@ func handleURLSchemeRequest(urlScheme string) {
} }
func UpdateAvailable(ver string) error { func UpdateAvailable(ver string) error {
if app.t == nil {
slog.Debug("tray not yet initialized, skipping update notification")
return nil
}
return app.t.UpdateAvailable(ver) return app.t.UpdateAvailable(ver)
} }
@@ -165,6 +169,14 @@ func osRun(shutdown func(), hasCompletedFirstRun, startHidden bool) {
log.Fatalf("Failed to start: %s", err) log.Fatalf("Failed to start: %s", err)
} }
// Check for pending updates now that the tray is initialized.
// The platform-independent check in app.go fires before osRun,
// when app.t is still nil, so we must re-check here.
if updater.IsUpdatePending() {
slog.Debug("update pending on startup, showing tray notification")
UpdateAvailable("")
}
signals := make(chan os.Signal, 1) signals := make(chan os.Signal, 1)
signal.Notify(signals, syscall.SIGINT, syscall.SIGTERM) signal.Notify(signals, syscall.SIGINT, syscall.SIGTERM)

View File

@@ -41,6 +41,11 @@ type InferenceCompute struct {
VRAM string VRAM string
} }
type InferenceInfo struct {
Computes []InferenceCompute
DefaultContextLength int
}
func New(s *store.Store, devMode bool) *Server { func New(s *store.Store, devMode bool) *Server {
p := resolvePath("ollama") p := resolvePath("ollama")
return &Server{store: s, bin: p, dev: devMode} return &Server{store: s, bin: p, dev: devMode}
@@ -205,6 +210,11 @@ func (s *Server) cmd(ctx context.Context) (*exec.Cmd, error) {
return nil, err return nil, err
} }
cloudDisabled, err := s.store.CloudDisabled()
if err != nil {
return nil, err
}
cmd := commandContext(ctx, s.bin, "serve") cmd := commandContext(ctx, s.bin, "serve")
cmd.Stdout, cmd.Stderr = s.log, s.log cmd.Stdout, cmd.Stderr = s.log, s.log
@@ -230,6 +240,11 @@ func (s *Server) cmd(ctx context.Context) (*exec.Cmd, error) {
if settings.ContextLength > 0 { if settings.ContextLength > 0 {
env["OLLAMA_CONTEXT_LENGTH"] = strconv.Itoa(settings.ContextLength) env["OLLAMA_CONTEXT_LENGTH"] = strconv.Itoa(settings.ContextLength)
} }
if cloudDisabled {
env["OLLAMA_NO_CLOUD"] = "1"
} else {
env["OLLAMA_NO_CLOUD"] = "0"
}
cmd.Env = []string{} cmd.Env = []string{}
for k, v := range env { for k, v := range env {
cmd.Env = append(cmd.Env, k+"="+v) cmd.Env = append(cmd.Env, k+"="+v)
@@ -262,9 +277,12 @@ func openRotatingLog() (io.WriteCloser, error) {
// Attempt to retrieve inference compute information from the server // Attempt to retrieve inference compute information from the server
// log. Set ctx to timeout to control how long to wait for the logs to appear // log. Set ctx to timeout to control how long to wait for the logs to appear
func GetInferenceComputer(ctx context.Context) ([]InferenceCompute, error) { func GetInferenceInfo(ctx context.Context) (*InferenceInfo, error) {
inference := []InferenceCompute{} info := &InferenceInfo{}
marker := regexp.MustCompile(`inference compute.*library=`) computeMarker := regexp.MustCompile(`inference compute.*library=`)
defaultCtxMarker := regexp.MustCompile(`vram-based default context`)
defaultCtxRegex := regexp.MustCompile(`default_num_ctx=(\d+)`)
q := `inference compute.*%s=["]([^"]*)["]` q := `inference compute.*%s=["]([^"]*)["]`
nq := `inference compute.*%s=(\S+)\s` nq := `inference compute.*%s=(\S+)\s`
type regex struct { type regex struct {
@@ -330,8 +348,8 @@ func GetInferenceComputer(ctx context.Context) ([]InferenceCompute, error) {
scanner := bufio.NewScanner(file) scanner := bufio.NewScanner(file)
for scanner.Scan() { for scanner.Scan() {
line := scanner.Text() line := scanner.Text()
match := marker.FindStringSubmatch(line) // Check for inference compute lines
if len(match) > 0 { if computeMarker.MatchString(line) {
ic := InferenceCompute{ ic := InferenceCompute{
Library: get("library", line), Library: get("library", line),
Variant: get("variant", line), Variant: get("variant", line),
@@ -342,12 +360,25 @@ func GetInferenceComputer(ctx context.Context) ([]InferenceCompute, error) {
} }
slog.Info("Matched", "inference compute", ic) slog.Info("Matched", "inference compute", ic)
inference = append(inference, ic) info.Computes = append(info.Computes, ic)
} else { continue
// Break out on first non matching line after we start matching
if len(inference) > 0 {
return inference, nil
} }
// Check for default context length line
if defaultCtxMarker.MatchString(line) {
match := defaultCtxRegex.FindStringSubmatch(line)
if len(match) > 1 {
numCtx, err := strconv.Atoi(match[1])
if err == nil {
info.DefaultContextLength = numCtx
slog.Info("Matched default context length", "default_num_ctx", numCtx)
}
}
return info, nil
}
// If we've found compute info but hit a non-matching line, return what we have
// This handles older server versions that don't log the default context line
if len(info.Computes) > 0 {
return info, nil
} }
} }
time.Sleep(100 * time.Millisecond) time.Sleep(100 * time.Millisecond)

View File

@@ -111,7 +111,7 @@ func TestServerCmd(t *testing.T) {
for _, want := range tt.want { for _, want := range tt.want {
found := false found := false
for _, env := range cmd.Env { for _, env := range cmd.Env {
if strings.Contains(env, want) { if strings.HasPrefix(env, want) {
found = true found = true
break break
} }
@@ -123,7 +123,7 @@ func TestServerCmd(t *testing.T) {
for _, dont := range tt.dont { for _, dont := range tt.dont {
for _, env := range cmd.Env { for _, env := range cmd.Env {
if strings.Contains(env, dont) { if strings.HasPrefix(env, dont) {
t.Errorf("unexpected environment variable: %s", env) t.Errorf("unexpected environment variable: %s", env)
} }
} }
@@ -136,44 +136,119 @@ func TestServerCmd(t *testing.T) {
} }
} }
func TestGetInferenceComputer(t *testing.T) { func TestServerCmdCloudSettingEnv(t *testing.T) {
tests := []struct {
name string
envValue string
configContent string
want string
}{
{
name: "default cloud enabled",
want: "OLLAMA_NO_CLOUD=0",
},
{
name: "env disables cloud",
envValue: "1",
want: "OLLAMA_NO_CLOUD=1",
},
{
name: "config disables cloud",
configContent: `{"disable_ollama_cloud": true}`,
want: "OLLAMA_NO_CLOUD=1",
},
{
name: "invalid env disables cloud",
envValue: "invalid",
want: "OLLAMA_NO_CLOUD=1",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
tmpHome := t.TempDir()
t.Setenv("HOME", tmpHome)
t.Setenv("USERPROFILE", tmpHome)
t.Setenv("OLLAMA_NO_CLOUD", tt.envValue)
if tt.configContent != "" {
configDir := filepath.Join(tmpHome, ".ollama")
if err := os.MkdirAll(configDir, 0o755); err != nil {
t.Fatalf("mkdir config dir: %v", err)
}
configPath := filepath.Join(configDir, "server.json")
if err := os.WriteFile(configPath, []byte(tt.configContent), 0o644); err != nil {
t.Fatalf("write config: %v", err)
}
}
st := &store.Store{DBPath: filepath.Join(t.TempDir(), "db.sqlite")}
defer st.Close()
s := &Server{store: st}
cmd, err := s.cmd(t.Context())
if err != nil {
t.Fatalf("s.cmd() error = %v", err)
}
found := false
for _, env := range cmd.Env {
if env == tt.want {
found = true
break
}
}
if !found {
t.Fatalf("expected environment variable %q in command env", tt.want)
}
})
}
}
func TestGetInferenceInfo(t *testing.T) {
tests := []struct { tests := []struct {
name string name string
log string log string
exp []InferenceCompute expComputes []InferenceCompute
expDefaultCtxLen int
}{ }{
{ {
name: "metal", name: "metal",
log: `time=2025-06-30T09:23:07.374-07:00 level=DEBUG source=sched.go:108 msg="starting llm scheduler" log: `time=2025-06-30T09:23:07.374-07:00 level=DEBUG source=sched.go:108 msg="starting llm scheduler"
time=2025-06-30T09:23:07.416-07:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=metal variant="" compute="" driver=0.0 name="" total="96.0 GiB" available="96.0 GiB" time=2025-06-30T09:23:07.416-07:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=metal variant="" compute="" driver=0.0 name="" total="96.0 GiB" available="96.0 GiB"
time=2025-06-30T09:23:07.417-07:00 level=INFO source=routes.go:1721 msg="vram-based default context" total_vram="96.0 GiB" default_num_ctx=262144
time=2025-06-30T09:25:56.197-07:00 level=DEBUG source=ggml.go:155 msg="key not found" key=general.alignment default=32 time=2025-06-30T09:25:56.197-07:00 level=DEBUG source=ggml.go:155 msg="key not found" key=general.alignment default=32
`, `,
exp: []InferenceCompute{{ expComputes: []InferenceCompute{{
Library: "metal", Library: "metal",
Driver: "0.0", Driver: "0.0",
VRAM: "96.0 GiB", VRAM: "96.0 GiB",
}}, }},
expDefaultCtxLen: 262144,
}, },
{ {
name: "cpu", name: "cpu",
log: `time=2025-07-01T17:59:51.470Z level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered" log: `time=2025-07-01T17:59:51.470Z level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"
time=2025-07-01T17:59:51.470Z level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="31.3 GiB" available="30.4 GiB" time=2025-07-01T17:59:51.470Z level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="31.3 GiB" available="30.4 GiB"
time=2025-07-01T17:59:51.471Z level=INFO source=routes.go:1721 msg="vram-based default context" total_vram="31.3 GiB" default_num_ctx=32768
[GIN] 2025/07/01 - 18:00:09 | 200 | 50.263µs | 100.126.204.152 | HEAD "/" [GIN] 2025/07/01 - 18:00:09 | 200 | 50.263µs | 100.126.204.152 | HEAD "/"
`, `,
exp: []InferenceCompute{{ expComputes: []InferenceCompute{{
Library: "cpu", Library: "cpu",
Driver: "0.0", Driver: "0.0",
VRAM: "31.3 GiB", VRAM: "31.3 GiB",
}}, }},
expDefaultCtxLen: 32768,
}, },
{ {
name: "cuda1", name: "cuda1",
log: `time=2025-07-01T19:33:43.162Z level=DEBUG source=amd_linux.go:419 msg="amdgpu driver not detected /sys/module/amdgpu" log: `time=2025-07-01T19:33:43.162Z level=DEBUG source=amd_linux.go:419 msg="amdgpu driver not detected /sys/module/amdgpu"
releasing cuda driver library releasing cuda driver library
time=2025-07-01T19:33:43.162Z level=INFO source=types.go:130 msg="inference compute" id=GPU-452cac9f-6960-839c-4fb3-0cec83699196 library=cuda variant=v12 compute=6.1 driver=12.7 name="NVIDIA GeForce GT 1030" total="3.9 GiB" available="3.9 GiB" time=2025-07-01T19:33:43.162Z level=INFO source=types.go:130 msg="inference compute" id=GPU-452cac9f-6960-839c-4fb3-0cec83699196 library=cuda variant=v12 compute=6.1 driver=12.7 name="NVIDIA GeForce GT 1030" total="3.9 GiB" available="3.9 GiB"
time=2025-07-01T19:33:43.163Z level=INFO source=routes.go:1721 msg="vram-based default context" total_vram="3.9 GiB" default_num_ctx=4096
[GIN] 2025/07/01 - 18:00:09 | 200 | 50.263µs | 100.126.204.152 | HEAD "/" [GIN] 2025/07/01 - 18:00:09 | 200 | 50.263µs | 100.126.204.152 | HEAD "/"
`, `,
exp: []InferenceCompute{{ expComputes: []InferenceCompute{{
Library: "cuda", Library: "cuda",
Variant: "v12", Variant: "v12",
Compute: "6.1", Compute: "6.1",
@@ -181,6 +256,7 @@ time=2025-07-01T19:33:43.162Z level=INFO source=types.go:130 msg="inference comp
Name: "NVIDIA GeForce GT 1030", Name: "NVIDIA GeForce GT 1030",
VRAM: "3.9 GiB", VRAM: "3.9 GiB",
}}, }},
expDefaultCtxLen: 4096,
}, },
{ {
name: "frank", name: "frank",
@@ -188,9 +264,10 @@ time=2025-07-01T19:33:43.162Z level=INFO source=types.go:130 msg="inference comp
releasing cuda driver library releasing cuda driver library
time=2025-07-01T19:36:13.315Z level=INFO source=types.go:130 msg="inference compute" id=GPU-d6de3398-9932-6902-11ec-fee8e424c8a2 library=cuda variant=v12 compute=7.5 driver=12.8 name="NVIDIA GeForce RTX 2080 Ti" total="10.6 GiB" available="10.4 GiB" time=2025-07-01T19:36:13.315Z level=INFO source=types.go:130 msg="inference compute" id=GPU-d6de3398-9932-6902-11ec-fee8e424c8a2 library=cuda variant=v12 compute=7.5 driver=12.8 name="NVIDIA GeForce RTX 2080 Ti" total="10.6 GiB" available="10.4 GiB"
time=2025-07-01T19:36:13.315Z level=INFO source=types.go:130 msg="inference compute" id=GPU-9abb57639fa80c50 library=rocm variant="" compute=gfx1030 driver=6.3 name=1002:73bf total="16.0 GiB" available="1.3 GiB" time=2025-07-01T19:36:13.315Z level=INFO source=types.go:130 msg="inference compute" id=GPU-9abb57639fa80c50 library=rocm variant="" compute=gfx1030 driver=6.3 name=1002:73bf total="16.0 GiB" available="1.3 GiB"
time=2025-07-01T19:36:13.316Z level=INFO source=routes.go:1721 msg="vram-based default context" total_vram="26.6 GiB" default_num_ctx=32768
[GIN] 2025/07/01 - 18:00:09 | 200 | 50.263µs | 100.126.204.152 | HEAD "/" [GIN] 2025/07/01 - 18:00:09 | 200 | 50.263µs | 100.126.204.152 | HEAD "/"
`, `,
exp: []InferenceCompute{ expComputes: []InferenceCompute{
{ {
Library: "cuda", Library: "cuda",
Variant: "v12", Variant: "v12",
@@ -207,6 +284,20 @@ time=2025-07-01T19:33:43.162Z level=INFO source=types.go:130 msg="inference comp
VRAM: "16.0 GiB", VRAM: "16.0 GiB",
}, },
}, },
expDefaultCtxLen: 32768,
},
{
name: "missing_default_context",
log: `time=2025-06-30T09:23:07.374-07:00 level=DEBUG source=sched.go:108 msg="starting llm scheduler"
time=2025-06-30T09:23:07.416-07:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=metal variant="" compute="" driver=0.0 name="" total="96.0 GiB" available="96.0 GiB"
time=2025-06-30T09:25:56.197-07:00 level=DEBUG source=ggml.go:155 msg="key not found" key=general.alignment default=32
`,
expComputes: []InferenceCompute{{
Library: "metal",
Driver: "0.0",
VRAM: "96.0 GiB",
}},
expDefaultCtxLen: 0, // No default context line, should return 0
}, },
} }
for _, tt := range tests { for _, tt := range tests {
@@ -219,18 +310,21 @@ time=2025-07-01T19:33:43.162Z level=INFO source=types.go:130 msg="inference comp
} }
ctx, cancel := context.WithTimeout(t.Context(), 10*time.Millisecond) ctx, cancel := context.WithTimeout(t.Context(), 10*time.Millisecond)
defer cancel() defer cancel()
ics, err := GetInferenceComputer(ctx) info, err := GetInferenceInfo(ctx)
if err != nil { if err != nil {
t.Fatalf(" failed to get inference compute: %v", err) t.Fatalf("failed to get inference info: %v", err)
} }
if !reflect.DeepEqual(ics, tt.exp) { if !reflect.DeepEqual(info.Computes, tt.expComputes) {
t.Fatalf("got:\n%#v\nwant:\n%#v", ics, tt.exp) t.Fatalf("computes mismatch\ngot:\n%#v\nwant:\n%#v", info.Computes, tt.expComputes)
}
if info.DefaultContextLength != tt.expDefaultCtxLen {
t.Fatalf("default context length mismatch: got %d, want %d", info.DefaultContextLength, tt.expDefaultCtxLen)
} }
}) })
} }
} }
func TestGetInferenceComputerTimeout(t *testing.T) { func TestGetInferenceInfoTimeout(t *testing.T) {
ctx, cancel := context.WithTimeout(t.Context(), 10*time.Millisecond) ctx, cancel := context.WithTimeout(t.Context(), 10*time.Millisecond)
defer cancel() defer cancel()
tmpDir := t.TempDir() tmpDir := t.TempDir()
@@ -239,7 +333,7 @@ func TestGetInferenceComputerTimeout(t *testing.T) {
if err != nil { if err != nil {
t.Fatalf("failed to write log file %s: %s", serverLogPath, err) t.Fatalf("failed to write log file %s: %s", serverLogPath, err)
} }
_, err = GetInferenceComputer(ctx) _, err = GetInferenceInfo(ctx)
if err == nil { if err == nil {
t.Fatal("expected timeout") t.Fatal("expected timeout")
} }

128
app/store/cloud_config.go Normal file
View File

@@ -0,0 +1,128 @@
//go:build windows || darwin
package store
import (
"encoding/json"
"errors"
"fmt"
"os"
"path/filepath"
"github.com/ollama/ollama/envconfig"
)
const serverConfigFilename = "server.json"
type serverConfig struct {
DisableOllamaCloud bool `json:"disable_ollama_cloud,omitempty"`
}
// CloudDisabled returns whether cloud features should be disabled.
// The source of truth is: OLLAMA_NO_CLOUD OR ~/.ollama/server.json:disable_ollama_cloud.
func (s *Store) CloudDisabled() (bool, error) {
disabled, _, err := s.CloudStatus()
return disabled, err
}
// CloudStatus returns whether cloud is disabled and the source of that decision.
// Source is one of: "none", "env", "config", "both".
func (s *Store) CloudStatus() (bool, string, error) {
if err := s.ensureDB(); err != nil {
return false, "", err
}
configDisabled, err := readServerConfigCloudDisabled()
if err != nil {
return false, "", err
}
envDisabled := envconfig.NoCloudEnv()
return envDisabled || configDisabled, cloudStatusSource(envDisabled, configDisabled), nil
}
// SetCloudEnabled writes the cloud setting to ~/.ollama/server.json.
func (s *Store) SetCloudEnabled(enabled bool) error {
if err := s.ensureDB(); err != nil {
return err
}
return setCloudEnabled(enabled)
}
func setCloudEnabled(enabled bool) error {
configPath, err := serverConfigPath()
if err != nil {
return err
}
if err := os.MkdirAll(filepath.Dir(configPath), 0o755); err != nil {
return fmt.Errorf("create server config directory: %w", err)
}
configMap := map[string]any{}
if data, err := os.ReadFile(configPath); err == nil {
if err := json.Unmarshal(data, &configMap); err != nil {
// If the existing file is invalid JSON, overwrite with a fresh object.
configMap = map[string]any{}
}
} else if !errors.Is(err, os.ErrNotExist) {
return fmt.Errorf("read server config: %w", err)
}
configMap["disable_ollama_cloud"] = !enabled
data, err := json.MarshalIndent(configMap, "", " ")
if err != nil {
return fmt.Errorf("marshal server config: %w", err)
}
data = append(data, '\n')
if err := os.WriteFile(configPath, data, 0o644); err != nil {
return fmt.Errorf("write server config: %w", err)
}
return nil
}
func readServerConfigCloudDisabled() (bool, error) {
configPath, err := serverConfigPath()
if err != nil {
return false, err
}
data, err := os.ReadFile(configPath)
if err != nil {
if errors.Is(err, os.ErrNotExist) {
return false, nil
}
return false, fmt.Errorf("read server config: %w", err)
}
var cfg serverConfig
// Invalid or unexpected JSON should not block startup; treat as default.
if json.Unmarshal(data, &cfg) == nil {
return cfg.DisableOllamaCloud, nil
}
return false, nil
}
func serverConfigPath() (string, error) {
home, err := os.UserHomeDir()
if err != nil {
return "", fmt.Errorf("resolve home directory: %w", err)
}
return filepath.Join(home, ".ollama", serverConfigFilename), nil
}
func cloudStatusSource(envDisabled bool, configDisabled bool) string {
switch {
case envDisabled && configDisabled:
return "both"
case envDisabled:
return "env"
case configDisabled:
return "config"
default:
return "none"
}
}

View File

@@ -0,0 +1,130 @@
//go:build windows || darwin
package store
import (
"encoding/json"
"os"
"path/filepath"
"testing"
)
func TestCloudDisabled(t *testing.T) {
tests := []struct {
name string
envValue string
configContent string
wantDisabled bool
wantSource string
}{
{
name: "default enabled",
wantDisabled: false,
wantSource: "none",
},
{
name: "env disables cloud",
envValue: "1",
wantDisabled: true,
wantSource: "env",
},
{
name: "config disables cloud",
configContent: `{"disable_ollama_cloud": true}`,
wantDisabled: true,
wantSource: "config",
},
{
name: "env and config",
envValue: "1",
configContent: `{"disable_ollama_cloud": false}`,
wantDisabled: true,
wantSource: "env",
},
{
name: "invalid config is ignored",
configContent: `{bad`,
wantDisabled: false,
wantSource: "none",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
tmpHome := t.TempDir()
setTestHome(t, tmpHome)
t.Setenv("OLLAMA_NO_CLOUD", tt.envValue)
if tt.configContent != "" {
configDir := filepath.Join(tmpHome, ".ollama")
if err := os.MkdirAll(configDir, 0o755); err != nil {
t.Fatalf("mkdir config dir: %v", err)
}
configPath := filepath.Join(configDir, serverConfigFilename)
if err := os.WriteFile(configPath, []byte(tt.configContent), 0o644); err != nil {
t.Fatalf("write config: %v", err)
}
}
s := &Store{DBPath: filepath.Join(tmpHome, "db.sqlite")}
defer s.Close()
disabled, err := s.CloudDisabled()
if err != nil {
t.Fatalf("CloudDisabled() error = %v", err)
}
if disabled != tt.wantDisabled {
t.Fatalf("CloudDisabled() = %v, want %v", disabled, tt.wantDisabled)
}
statusDisabled, source, err := s.CloudStatus()
if err != nil {
t.Fatalf("CloudStatus() error = %v", err)
}
if statusDisabled != tt.wantDisabled {
t.Fatalf("CloudStatus() disabled = %v, want %v", statusDisabled, tt.wantDisabled)
}
if source != tt.wantSource {
t.Fatalf("CloudStatus() source = %v, want %v", source, tt.wantSource)
}
})
}
}
func TestSetCloudEnabled(t *testing.T) {
tmpHome := t.TempDir()
setTestHome(t, tmpHome)
configDir := filepath.Join(tmpHome, ".ollama")
if err := os.MkdirAll(configDir, 0o755); err != nil {
t.Fatalf("mkdir config dir: %v", err)
}
configPath := filepath.Join(configDir, serverConfigFilename)
if err := os.WriteFile(configPath, []byte(`{"another_key":"value","disable_ollama_cloud":true}`), 0o644); err != nil {
t.Fatalf("seed config: %v", err)
}
s := &Store{DBPath: filepath.Join(tmpHome, "db.sqlite")}
defer s.Close()
if err := s.SetCloudEnabled(true); err != nil {
t.Fatalf("SetCloudEnabled(true) error = %v", err)
}
data, err := os.ReadFile(configPath)
if err != nil {
t.Fatalf("read config: %v", err)
}
var got map[string]any
if err := json.Unmarshal(data, &got); err != nil {
t.Fatalf("unmarshal config: %v", err)
}
if got["disable_ollama_cloud"] != false {
t.Fatalf("disable_ollama_cloud = %v, want false", got["disable_ollama_cloud"])
}
if got["another_key"] != "value" {
t.Fatalf("another_key = %v, want value", got["another_key"])
}
}

View File

@@ -9,12 +9,12 @@ import (
"strings" "strings"
"time" "time"
sqlite3 "github.com/mattn/go-sqlite3" _ "github.com/mattn/go-sqlite3"
) )
// currentSchemaVersion defines the current database schema version. // currentSchemaVersion defines the current database schema version.
// Increment this when making schema changes that require migrations. // Increment this when making schema changes that require migrations.
const currentSchemaVersion = 12 const currentSchemaVersion = 16
// database wraps the SQLite connection. // database wraps the SQLite connection.
// SQLite handles its own locking for concurrent access: // SQLite handles its own locking for concurrent access:
@@ -73,7 +73,7 @@ func (db *database) init() error {
agent BOOLEAN NOT NULL DEFAULT 0, agent BOOLEAN NOT NULL DEFAULT 0,
tools BOOLEAN NOT NULL DEFAULT 0, tools BOOLEAN NOT NULL DEFAULT 0,
working_dir TEXT NOT NULL DEFAULT '', working_dir TEXT NOT NULL DEFAULT '',
context_length INTEGER NOT NULL DEFAULT 4096, context_length INTEGER NOT NULL DEFAULT 0,
window_width INTEGER NOT NULL DEFAULT 0, window_width INTEGER NOT NULL DEFAULT 0,
window_height INTEGER NOT NULL DEFAULT 0, window_height INTEGER NOT NULL DEFAULT 0,
config_migrated BOOLEAN NOT NULL DEFAULT 0, config_migrated BOOLEAN NOT NULL DEFAULT 0,
@@ -82,9 +82,12 @@ func (db *database) init() error {
websearch_enabled BOOLEAN NOT NULL DEFAULT 0, websearch_enabled BOOLEAN NOT NULL DEFAULT 0,
selected_model TEXT NOT NULL DEFAULT '', selected_model TEXT NOT NULL DEFAULT '',
sidebar_open BOOLEAN NOT NULL DEFAULT 0, sidebar_open BOOLEAN NOT NULL DEFAULT 0,
last_home_view TEXT NOT NULL DEFAULT 'launch',
think_enabled BOOLEAN NOT NULL DEFAULT 0, think_enabled BOOLEAN NOT NULL DEFAULT 0,
think_level TEXT NOT NULL DEFAULT '', think_level TEXT NOT NULL DEFAULT '',
cloud_setting_migrated BOOLEAN NOT NULL DEFAULT 0,
remote TEXT NOT NULL DEFAULT '', -- deprecated remote TEXT NOT NULL DEFAULT '', -- deprecated
auto_update_enabled BOOLEAN NOT NULL DEFAULT 1,
schema_version INTEGER NOT NULL DEFAULT %d schema_version INTEGER NOT NULL DEFAULT %d
); );
@@ -244,6 +247,30 @@ func (db *database) migrate() error {
return fmt.Errorf("migrate v11 to v12: %w", err) return fmt.Errorf("migrate v11 to v12: %w", err)
} }
version = 12 version = 12
case 12:
// add cloud_setting_migrated column to settings table
if err := db.migrateV12ToV13(); err != nil {
return fmt.Errorf("migrate v12 to v13: %w", err)
}
version = 13
case 13:
// change default context_length from 4096 to 0 (VRAM-based tiered defaults)
if err := db.migrateV13ToV14(); err != nil {
return fmt.Errorf("migrate v13 to v14: %w", err)
}
version = 14
case 14:
// add auto_update_enabled column to settings table
if err := db.migrateV14ToV15(); err != nil {
return fmt.Errorf("migrate v14 to v15: %w", err)
}
version = 15
case 15:
// add last_home_view column to settings table
if err := db.migrateV15ToV16(); err != nil {
return fmt.Errorf("migrate v15 to v16: %w", err)
}
version = 16
default: default:
// If we have a version we don't recognize, just set it to current // If we have a version we don't recognize, just set it to current
// This might happen during development // This might happen during development
@@ -452,6 +479,67 @@ func (db *database) migrateV11ToV12() error {
return nil return nil
} }
// migrateV12ToV13 adds cloud_setting_migrated to settings.
func (db *database) migrateV12ToV13() error {
_, err := db.conn.Exec(`ALTER TABLE settings ADD COLUMN cloud_setting_migrated BOOLEAN NOT NULL DEFAULT 0`)
if err != nil && !duplicateColumnError(err) {
return fmt.Errorf("add cloud_setting_migrated column: %w", err)
}
_, err = db.conn.Exec(`UPDATE settings SET schema_version = 13`)
if err != nil {
return fmt.Errorf("update schema version: %w", err)
}
return nil
}
// migrateV13ToV14 changes the default context_length from 4096 to 0.
// When context_length is 0, the ollama server uses VRAM-based tiered defaults.
func (db *database) migrateV13ToV14() error {
_, err := db.conn.Exec(`UPDATE settings SET context_length = 0 WHERE context_length = 4096`)
if err != nil {
return fmt.Errorf("update context_length default: %w", err)
}
_, err = db.conn.Exec(`UPDATE settings SET schema_version = 14`)
if err != nil {
return fmt.Errorf("update schema version: %w", err)
}
return nil
}
// migrateV14ToV15 adds the auto_update_enabled column to the settings table
func (db *database) migrateV14ToV15() error {
_, err := db.conn.Exec(`ALTER TABLE settings ADD COLUMN auto_update_enabled BOOLEAN NOT NULL DEFAULT 1`)
if err != nil && !duplicateColumnError(err) {
return fmt.Errorf("add auto_update_enabled column: %w", err)
}
_, err = db.conn.Exec(`UPDATE settings SET schema_version = 15`)
if err != nil {
return fmt.Errorf("update schema version: %w", err)
}
return nil
}
// migrateV15ToV16 adds the last_home_view column to the settings table
func (db *database) migrateV15ToV16() error {
_, err := db.conn.Exec(`ALTER TABLE settings ADD COLUMN last_home_view TEXT NOT NULL DEFAULT 'launch'`)
if err != nil && !duplicateColumnError(err) {
return fmt.Errorf("add last_home_view column: %w", err)
}
_, err = db.conn.Exec(`UPDATE settings SET schema_version = 16`)
if err != nil {
return fmt.Errorf("update schema version: %w", err)
}
return nil
}
// cleanupOrphanedData removes orphaned records that may exist due to the foreign key bug // cleanupOrphanedData removes orphaned records that may exist due to the foreign key bug
func (db *database) cleanupOrphanedData() error { func (db *database) cleanupOrphanedData() error {
_, err := db.conn.Exec(` _, err := db.conn.Exec(`
@@ -482,19 +570,11 @@ func (db *database) cleanupOrphanedData() error {
} }
func duplicateColumnError(err error) bool { func duplicateColumnError(err error) bool {
if sqlite3Err, ok := err.(sqlite3.Error); ok { return err != nil && strings.Contains(err.Error(), "duplicate column name")
return sqlite3Err.Code == sqlite3.ErrError &&
strings.Contains(sqlite3Err.Error(), "duplicate column name")
}
return false
} }
func columnNotExists(err error) bool { func columnNotExists(err error) bool {
if sqlite3Err, ok := err.(sqlite3.Error); ok { return err != nil && strings.Contains(err.Error(), "no such column")
return sqlite3Err.Code == sqlite3.ErrError &&
strings.Contains(sqlite3Err.Error(), "no such column")
}
return false
} }
func (db *database) getAllChats() ([]Chat, error) { func (db *database) getAllChats() ([]Chat, error) {
@@ -1108,9 +1188,9 @@ func (db *database) getSettings() (Settings, error) {
var s Settings var s Settings
err := db.conn.QueryRow(` err := db.conn.QueryRow(`
SELECT expose, survey, browser, models, agent, tools, working_dir, context_length, airplane_mode, turbo_enabled, websearch_enabled, selected_model, sidebar_open, think_enabled, think_level SELECT expose, survey, browser, models, agent, tools, working_dir, context_length, turbo_enabled, websearch_enabled, selected_model, sidebar_open, last_home_view, think_enabled, think_level, auto_update_enabled
FROM settings FROM settings
`).Scan(&s.Expose, &s.Survey, &s.Browser, &s.Models, &s.Agent, &s.Tools, &s.WorkingDir, &s.ContextLength, &s.AirplaneMode, &s.TurboEnabled, &s.WebSearchEnabled, &s.SelectedModel, &s.SidebarOpen, &s.ThinkEnabled, &s.ThinkLevel) `).Scan(&s.Expose, &s.Survey, &s.Browser, &s.Models, &s.Agent, &s.Tools, &s.WorkingDir, &s.ContextLength, &s.TurboEnabled, &s.WebSearchEnabled, &s.SelectedModel, &s.SidebarOpen, &s.LastHomeView, &s.ThinkEnabled, &s.ThinkLevel, &s.AutoUpdateEnabled)
if err != nil { if err != nil {
return Settings{}, fmt.Errorf("get settings: %w", err) return Settings{}, fmt.Errorf("get settings: %w", err)
} }
@@ -1119,16 +1199,58 @@ func (db *database) getSettings() (Settings, error) {
} }
func (db *database) setSettings(s Settings) error { func (db *database) setSettings(s Settings) error {
lastHomeView := strings.ToLower(strings.TrimSpace(s.LastHomeView))
validLaunchView := map[string]struct{}{
"launch": {},
"openclaw": {},
"claude": {},
"codex": {},
"opencode": {},
"droid": {},
"pi": {},
}
if lastHomeView != "chat" {
if _, ok := validLaunchView[lastHomeView]; !ok {
lastHomeView = "launch"
}
}
_, err := db.conn.Exec(` _, err := db.conn.Exec(`
UPDATE settings UPDATE settings
SET expose = ?, survey = ?, browser = ?, models = ?, agent = ?, tools = ?, working_dir = ?, context_length = ?, airplane_mode = ?, turbo_enabled = ?, websearch_enabled = ?, selected_model = ?, sidebar_open = ?, think_enabled = ?, think_level = ? SET expose = ?, survey = ?, browser = ?, models = ?, agent = ?, tools = ?, working_dir = ?, context_length = ?, turbo_enabled = ?, websearch_enabled = ?, selected_model = ?, sidebar_open = ?, last_home_view = ?, think_enabled = ?, think_level = ?, auto_update_enabled = ?
`, s.Expose, s.Survey, s.Browser, s.Models, s.Agent, s.Tools, s.WorkingDir, s.ContextLength, s.AirplaneMode, s.TurboEnabled, s.WebSearchEnabled, s.SelectedModel, s.SidebarOpen, s.ThinkEnabled, s.ThinkLevel) `, s.Expose, s.Survey, s.Browser, s.Models, s.Agent, s.Tools, s.WorkingDir, s.ContextLength, s.TurboEnabled, s.WebSearchEnabled, s.SelectedModel, s.SidebarOpen, lastHomeView, s.ThinkEnabled, s.ThinkLevel, s.AutoUpdateEnabled)
if err != nil { if err != nil {
return fmt.Errorf("set settings: %w", err) return fmt.Errorf("set settings: %w", err)
} }
return nil return nil
} }
func (db *database) isCloudSettingMigrated() (bool, error) {
var migrated bool
err := db.conn.QueryRow("SELECT cloud_setting_migrated FROM settings").Scan(&migrated)
if err != nil {
return false, fmt.Errorf("get cloud setting migration status: %w", err)
}
return migrated, nil
}
func (db *database) setCloudSettingMigrated(migrated bool) error {
_, err := db.conn.Exec("UPDATE settings SET cloud_setting_migrated = ?", migrated)
if err != nil {
return fmt.Errorf("set cloud setting migration status: %w", err)
}
return nil
}
func (db *database) getAirplaneMode() (bool, error) {
var airplaneMode bool
err := db.conn.QueryRow("SELECT airplane_mode FROM settings").Scan(&airplaneMode)
if err != nil {
return false, fmt.Errorf("get airplane_mode: %w", err)
}
return airplaneMode, nil
}
func (db *database) getWindowSize() (int, int, error) { func (db *database) getWindowSize() (int, int, error) {
var width, height int var width, height int
err := db.conn.QueryRow("SELECT window_width, window_height FROM settings").Scan(&width, &height) err := db.conn.QueryRow("SELECT window_width, window_height FROM settings").Scan(&width, &height)

View File

@@ -98,6 +98,82 @@ func TestSchemaMigrations(t *testing.T) {
}) })
} }
func TestMigrationV13ToV14ContextLength(t *testing.T) {
tmpDir := t.TempDir()
dbPath := filepath.Join(tmpDir, "test.db")
db, err := newDatabase(dbPath)
if err != nil {
t.Fatalf("failed to create database: %v", err)
}
defer db.Close()
_, err = db.conn.Exec("UPDATE settings SET context_length = 4096, schema_version = 13")
if err != nil {
t.Fatalf("failed to seed v13 settings row: %v", err)
}
if err := db.migrate(); err != nil {
t.Fatalf("migration from v13 to v14 failed: %v", err)
}
var contextLength int
if err := db.conn.QueryRow("SELECT context_length FROM settings").Scan(&contextLength); err != nil {
t.Fatalf("failed to read context_length: %v", err)
}
if contextLength != 0 {
t.Fatalf("expected context_length to migrate to 0, got %d", contextLength)
}
version, err := db.getSchemaVersion()
if err != nil {
t.Fatalf("failed to get schema version: %v", err)
}
if version != currentSchemaVersion {
t.Fatalf("expected schema version %d, got %d", currentSchemaVersion, version)
}
}
func TestMigrationV15ToV16LastHomeViewDefaultsToLaunch(t *testing.T) {
tmpDir := t.TempDir()
dbPath := filepath.Join(tmpDir, "test.db")
db, err := newDatabase(dbPath)
if err != nil {
t.Fatalf("failed to create database: %v", err)
}
defer db.Close()
if _, err := db.conn.Exec(`
ALTER TABLE settings DROP COLUMN last_home_view;
UPDATE settings SET schema_version = 15;
`); err != nil {
t.Fatalf("failed to seed v15 settings row: %v", err)
}
if err := db.migrate(); err != nil {
t.Fatalf("migration from v15 to v16 failed: %v", err)
}
var lastHomeView string
if err := db.conn.QueryRow("SELECT last_home_view FROM settings").Scan(&lastHomeView); err != nil {
t.Fatalf("failed to read last_home_view: %v", err)
}
if lastHomeView != "launch" {
t.Fatalf("expected last_home_view to default to launch after migration, got %q", lastHomeView)
}
version, err := db.getSchemaVersion()
if err != nil {
t.Fatalf("failed to get schema version: %v", err)
}
if version != currentSchemaVersion {
t.Fatalf("expected schema version %d, got %d", currentSchemaVersion, version)
}
}
func TestChatDeletionWithCascade(t *testing.T) { func TestChatDeletionWithCascade(t *testing.T) {
t.Run("chat deletion cascades to related messages", func(t *testing.T) { t.Run("chat deletion cascades to related messages", func(t *testing.T) {
tmpDir := t.TempDir() tmpDir := t.TempDir()

View File

@@ -127,6 +127,65 @@ func TestNoConfigToMigrate(t *testing.T) {
} }
} }
func TestCloudMigrationFromAirplaneMode(t *testing.T) {
tmpHome := t.TempDir()
setTestHome(t, tmpHome)
t.Setenv("OLLAMA_NO_CLOUD", "")
dbPath := filepath.Join(tmpHome, "db.sqlite")
db, err := newDatabase(dbPath)
if err != nil {
t.Fatalf("failed to create database: %v", err)
}
if _, err := db.conn.Exec("UPDATE settings SET airplane_mode = 1, cloud_setting_migrated = 0"); err != nil {
db.Close()
t.Fatalf("failed to seed airplane migration state: %v", err)
}
db.Close()
s := Store{DBPath: dbPath}
defer s.Close()
// Trigger DB initialization + one-time cloud migration.
if _, err := s.ID(); err != nil {
t.Fatalf("failed to initialize store: %v", err)
}
disabled, err := s.CloudDisabled()
if err != nil {
t.Fatalf("CloudDisabled() error: %v", err)
}
if !disabled {
t.Fatal("expected cloud to be disabled after migrating airplane_mode=true")
}
configPath := filepath.Join(tmpHome, ".ollama", serverConfigFilename)
data, err := os.ReadFile(configPath)
if err != nil {
t.Fatalf("failed to read migrated server config: %v", err)
}
var cfg map[string]any
if err := json.Unmarshal(data, &cfg); err != nil {
t.Fatalf("failed to parse migrated server config: %v", err)
}
if cfg["disable_ollama_cloud"] != true {
t.Fatalf("disable_ollama_cloud = %v, want true", cfg["disable_ollama_cloud"])
}
var airplaneMode, migrated bool
if err := s.db.conn.QueryRow("SELECT airplane_mode, cloud_setting_migrated FROM settings").Scan(&airplaneMode, &migrated); err != nil {
t.Fatalf("failed to read migration flags from DB: %v", err)
}
if !airplaneMode {
t.Fatal("expected legacy airplane_mode value to remain unchanged")
}
if !migrated {
t.Fatal("expected cloud_setting_migrated to be true")
}
}
const ( const (
v1Schema = ` v1Schema = `
CREATE TABLE IF NOT EXISTS settings ( CREATE TABLE IF NOT EXISTS settings (

View File

@@ -149,9 +149,6 @@ type Settings struct {
// ContextLength specifies the context length for the ollama server (using OLLAMA_CONTEXT_LENGTH) // ContextLength specifies the context length for the ollama server (using OLLAMA_CONTEXT_LENGTH)
ContextLength int ContextLength int
// AirplaneMode when true, turns off Ollama Turbo features and only uses local models
AirplaneMode bool
// TurboEnabled indicates if Ollama Turbo features are enabled // TurboEnabled indicates if Ollama Turbo features are enabled
TurboEnabled bool TurboEnabled bool
@@ -169,6 +166,12 @@ type Settings struct {
// SidebarOpen indicates if the chat sidebar is open // SidebarOpen indicates if the chat sidebar is open
SidebarOpen bool SidebarOpen bool
// LastHomeView stores the preferred home route target ("chat" or integration name)
LastHomeView string
// AutoUpdateEnabled indicates if automatic updates should be downloaded
AutoUpdateEnabled bool
} }
type Store struct { type Store struct {
@@ -259,6 +262,40 @@ func (s *Store) ensureDB() error {
} }
} }
// Run one-time migration from legacy airplane_mode behavior.
if err := s.migrateCloudSetting(database); err != nil {
return fmt.Errorf("migrate cloud setting: %w", err)
}
return nil
}
// migrateCloudSetting migrates legacy airplane_mode into server.json exactly once.
// After this, cloud state is sourced from server.json OR OLLAMA_NO_CLOUD.
func (s *Store) migrateCloudSetting(database *database) error {
migrated, err := database.isCloudSettingMigrated()
if err != nil {
return err
}
if migrated {
return nil
}
airplaneMode, err := database.getAirplaneMode()
if err != nil {
return err
}
if airplaneMode {
if err := setCloudEnabled(false); err != nil {
return fmt.Errorf("migrate airplane_mode to cloud disabled: %w", err)
}
}
if err := database.setCloudSettingMigrated(true); err != nil {
return err
}
return nil return nil
} }
@@ -355,6 +392,10 @@ func (s *Store) Settings() (Settings, error) {
} }
} }
if settings.LastHomeView == "" {
settings.LastHomeView = "launch"
}
return settings, nil return settings, nil
} }

View File

@@ -81,6 +81,32 @@ func TestStore(t *testing.T) {
} }
}) })
t.Run("settings default home view is launch", func(t *testing.T) {
loaded, err := s.Settings()
if err != nil {
t.Fatal(err)
}
if loaded.LastHomeView != "launch" {
t.Fatalf("expected default LastHomeView to be launch, got %q", loaded.LastHomeView)
}
})
t.Run("settings empty home view falls back to launch", func(t *testing.T) {
if err := s.SetSettings(Settings{LastHomeView: ""}); err != nil {
t.Fatal(err)
}
loaded, err := s.Settings()
if err != nil {
t.Fatal(err)
}
if loaded.LastHomeView != "launch" {
t.Fatalf("expected empty LastHomeView to fall back to launch, got %q", loaded.LastHomeView)
}
})
t.Run("window size", func(t *testing.T) { t.Run("window size", func(t *testing.T) {
if err := s.SetWindowSize(1024, 768); err != nil { if err := s.SetWindowSize(1024, 768); err != nil {
t.Fatal(err) t.Fatal(err)

View File

@@ -0,0 +1,11 @@
//go:build windows || darwin
package store
import "testing"
func setTestHome(t *testing.T, home string) {
t.Helper()
t.Setenv("HOME", home)
t.Setenv("USERPROFILE", home)
}

View File

@@ -13,7 +13,7 @@ CREATE TABLE IF NOT EXISTS settings (
agent BOOLEAN NOT NULL DEFAULT 0, agent BOOLEAN NOT NULL DEFAULT 0,
tools BOOLEAN NOT NULL DEFAULT 0, tools BOOLEAN NOT NULL DEFAULT 0,
working_dir TEXT NOT NULL DEFAULT '', working_dir TEXT NOT NULL DEFAULT '',
context_length INTEGER NOT NULL DEFAULT 4096, context_length INTEGER NOT NULL DEFAULT 0,
window_width INTEGER NOT NULL DEFAULT 0, window_width INTEGER NOT NULL DEFAULT 0,
window_height INTEGER NOT NULL DEFAULT 0, window_height INTEGER NOT NULL DEFAULT 0,
config_migrated BOOLEAN NOT NULL DEFAULT 0, config_migrated BOOLEAN NOT NULL DEFAULT 0,

35
app/tools/cloud_policy.go Normal file
View File

@@ -0,0 +1,35 @@
//go:build windows || darwin
package tools
import (
"context"
"errors"
"github.com/ollama/ollama/api"
internalcloud "github.com/ollama/ollama/internal/cloud"
)
// ensureCloudEnabledForTool checks cloud policy from the connected Ollama server.
// If policy cannot be determined, this fails closed and blocks the operation.
func ensureCloudEnabledForTool(ctx context.Context, operation string) error {
// Reuse shared message formatting; policy evaluation is still done via
// the connected server's /api/status endpoint below.
disabledMessage := internalcloud.DisabledError(operation)
client, err := api.ClientFromEnvironment()
if err != nil {
return errors.New(disabledMessage + " (unable to verify server cloud policy)")
}
status, err := client.CloudStatusExperimental(ctx)
if err != nil {
return errors.New(disabledMessage + " (unable to verify server cloud policy)")
}
if status.Cloud.Disabled {
return errors.New(disabledMessage)
}
return nil
}

View File

@@ -0,0 +1,73 @@
//go:build windows || darwin
package tools
import (
"context"
"net/http"
"net/http/httptest"
"strings"
"testing"
)
func TestEnsureCloudEnabledForTool(t *testing.T) {
const op = "web search is unavailable"
const disabledPrefix = "ollama cloud is disabled: web search is unavailable"
t.Run("enabled allows tool execution", func(t *testing.T) {
ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/api/status" {
http.NotFound(w, r)
return
}
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"cloud":{"disabled":false,"source":"none"}}`))
}))
t.Cleanup(ts.Close)
t.Setenv("OLLAMA_HOST", ts.URL)
if err := ensureCloudEnabledForTool(context.Background(), op); err != nil {
t.Fatalf("expected nil error, got %v", err)
}
})
t.Run("disabled blocks tool execution", func(t *testing.T) {
ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/api/status" {
http.NotFound(w, r)
return
}
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"cloud":{"disabled":true,"source":"config"}}`))
}))
t.Cleanup(ts.Close)
t.Setenv("OLLAMA_HOST", ts.URL)
err := ensureCloudEnabledForTool(context.Background(), op)
if err == nil {
t.Fatal("expected error, got nil")
}
if got := err.Error(); got != disabledPrefix {
t.Fatalf("unexpected error: %q", got)
}
})
t.Run("status unavailable fails closed", func(t *testing.T) {
ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
http.NotFound(w, r)
}))
t.Cleanup(ts.Close)
t.Setenv("OLLAMA_HOST", ts.URL)
err := ensureCloudEnabledForTool(context.Background(), op)
if err == nil {
t.Fatal("expected error, got nil")
}
if got := err.Error(); !strings.Contains(got, disabledPrefix) {
t.Fatalf("expected disabled prefix, got %q", got)
}
if got := err.Error(); !strings.Contains(got, "unable to verify server cloud policy") {
t.Fatalf("expected verification failure detail, got %q", got)
}
})
}

View File

@@ -77,6 +77,10 @@ func (w *WebFetch) Execute(ctx context.Context, args map[string]any) (any, strin
} }
func performWebFetch(ctx context.Context, targetURL string) (*FetchResponse, error) { func performWebFetch(ctx context.Context, targetURL string) (*FetchResponse, error) {
if err := ensureCloudEnabledForTool(ctx, "web fetch is unavailable"); err != nil {
return nil, err
}
reqBody := FetchRequest{URL: targetURL} reqBody := FetchRequest{URL: targetURL}
jsonBody, err := json.Marshal(reqBody) jsonBody, err := json.Marshal(reqBody)
if err != nil { if err != nil {

View File

@@ -93,6 +93,10 @@ func (w *WebSearch) Execute(ctx context.Context, args map[string]any) (any, stri
} }
func performWebSearch(ctx context.Context, query string, maxResults int) (*SearchResponse, error) { func performWebSearch(ctx context.Context, query string, maxResults int) (*SearchResponse, error) {
if err := ensureCloudEnabledForTool(ctx, "web search is unavailable"); err != nil {
return nil, err
}
reqBody := SearchRequest{Query: query, MaxResults: maxResults} reqBody := SearchRequest{Query: query, MaxResults: maxResults}
jsonBody, err := json.Marshal(reqBody) jsonBody, err := json.Marshal(reqBody)

View File

@@ -289,10 +289,12 @@ export class InferenceCompute {
} }
export class InferenceComputeResponse { export class InferenceComputeResponse {
inferenceComputes: InferenceCompute[]; inferenceComputes: InferenceCompute[];
defaultContextLength: number;
constructor(source: any = {}) { constructor(source: any = {}) {
if ('string' === typeof source) source = JSON.parse(source); if ('string' === typeof source) source = JSON.parse(source);
this.inferenceComputes = this.convertValues(source["inferenceComputes"], InferenceCompute); this.inferenceComputes = this.convertValues(source["inferenceComputes"], InferenceCompute);
this.defaultContextLength = source["defaultContextLength"];
} }
convertValues(a: any, classs: any, asMap: boolean = false): any { convertValues(a: any, classs: any, asMap: boolean = false): any {
@@ -406,13 +408,14 @@ export class Settings {
Tools: boolean; Tools: boolean;
WorkingDir: string; WorkingDir: string;
ContextLength: number; ContextLength: number;
AirplaneMode: boolean;
TurboEnabled: boolean; TurboEnabled: boolean;
WebSearchEnabled: boolean; WebSearchEnabled: boolean;
ThinkEnabled: boolean; ThinkEnabled: boolean;
ThinkLevel: string; ThinkLevel: string;
SelectedModel: string; SelectedModel: string;
SidebarOpen: boolean; SidebarOpen: boolean;
LastHomeView: string;
AutoUpdateEnabled: boolean;
constructor(source: any = {}) { constructor(source: any = {}) {
if ('string' === typeof source) source = JSON.parse(source); if ('string' === typeof source) source = JSON.parse(source);
@@ -424,13 +427,14 @@ export class Settings {
this.Tools = source["Tools"]; this.Tools = source["Tools"];
this.WorkingDir = source["WorkingDir"]; this.WorkingDir = source["WorkingDir"];
this.ContextLength = source["ContextLength"]; this.ContextLength = source["ContextLength"];
this.AirplaneMode = source["AirplaneMode"];
this.TurboEnabled = source["TurboEnabled"]; this.TurboEnabled = source["TurboEnabled"];
this.WebSearchEnabled = source["WebSearchEnabled"]; this.WebSearchEnabled = source["WebSearchEnabled"];
this.ThinkEnabled = source["ThinkEnabled"]; this.ThinkEnabled = source["ThinkEnabled"];
this.ThinkLevel = source["ThinkLevel"]; this.ThinkLevel = source["ThinkLevel"];
this.SelectedModel = source["SelectedModel"]; this.SelectedModel = source["SelectedModel"];
this.SidebarOpen = source["SidebarOpen"]; this.SidebarOpen = source["SidebarOpen"];
this.LastHomeView = source["LastHomeView"];
this.AutoUpdateEnabled = source["AutoUpdateEnabled"];
} }
} }
export class SettingsResponse { export class SettingsResponse {
@@ -548,14 +552,12 @@ export class Error {
} }
} }
export class ModelUpstreamResponse { export class ModelUpstreamResponse {
digest?: string; stale: boolean;
pushTime: number;
error?: string; error?: string;
constructor(source: any = {}) { constructor(source: any = {}) {
if ('string' === typeof source) source = JSON.parse(source); if ('string' === typeof source) source = JSON.parse(source);
this.digest = source["digest"]; this.stale = source["stale"];
this.pushTime = source["pushTime"];
this.error = source["error"]; this.error = source["error"];
} }
} }

View File

@@ -0,0 +1,7 @@
<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated by Pixelmator Pro 3.6.17 -->
<svg width="1200" height="1200" viewBox="0 0 1200 1200" xmlns="http://www.w3.org/2000/svg">
<g id="g314">
<path id="path147" fill="#d97757" stroke="none" d="M 233.959793 800.214905 L 468.644287 668.536987 L 472.590637 657.100647 L 468.644287 650.738403 L 457.208069 650.738403 L 417.986633 648.322144 L 283.892639 644.69812 L 167.597321 639.865845 L 54.926208 633.825623 L 26.577238 627.785339 L 3.3e-05 592.751709 L 2.73832 575.27533 L 26.577238 559.248352 L 60.724873 562.228149 L 136.187973 567.382629 L 249.422867 575.194763 L 331.570496 580.026978 L 453.261841 592.671082 L 472.590637 592.671082 L 475.328857 584.859009 L 468.724915 580.026978 L 463.570557 575.194763 L 346.389313 495.785217 L 219.543671 411.865906 L 153.100723 363.543762 L 117.181267 339.060425 L 99.060455 316.107361 L 91.248367 266.01355 L 123.865784 230.093994 L 167.677887 233.073853 L 178.872513 236.053772 L 223.248367 270.201477 L 318.040283 343.570496 L 441.825592 434.738342 L 459.946411 449.798706 L 467.194672 444.64447 L 468.080597 441.020203 L 459.946411 427.409485 L 392.617493 305.718323 L 320.778564 181.932983 L 288.80542 130.630859 L 280.348999 99.865845 C 277.369171 87.221436 275.194641 76.590698 275.194641 63.624268 L 312.322174 13.20813 L 332.8591 6.604126 L 382.389313 13.20813 L 403.248352 31.328979 L 434.013519 101.71814 L 483.865753 212.537048 L 561.181274 363.221497 L 583.812134 407.919434 L 595.892639 449.315491 L 600.40271 461.959839 L 608.214783 461.959839 L 608.214783 454.711609 L 614.577271 369.825623 L 626.335632 265.61084 L 637.771851 131.516846 L 641.718201 93.745117 L 660.402832 48.483276 L 697.530334 24.000122 L 726.52356 37.852417 L 750.362549 72 L 747.060486 94.067139 L 732.886047 186.201416 L 705.100708 330.52356 L 686.979919 427.167847 L 697.530334 427.167847 L 709.61084 415.087341 L 758.496704 350.174561 L 840.644348 247.490051 L 876.885925 206.738342 L 919.167847 161.71814 L 946.308838 140.29541 L 997.61084 140.29541 L 1035.38269 196.429626 L 1018.469849 254.416199 L 965.637634 321.422852 L 921.825562 378.201538 L 859.006714 462.765259 L 819.785278 530.41626 L 823.409424 535.812073 L 832.75177 534.92627 L 974.657776 504.724915 L 1051.328979 490.872559 L 1142.818848 475.167786 L 1184.214844 494.496582 L 1188.724854 514.147644 L 1172.456421 554.335693 L 1074.604126 578.496765 L 959.838989 601.449829 L 788.939636 641.879272 L 786.845764 643.409485 L 789.261841 646.389343 L 866.255127 653.637634 L 899.194702 655.409424 L 979.812134 655.409424 L 1129.932861 666.604187 L 1169.154419 692.537109 L 1192.671265 724.268677 L 1188.724854 748.429688 L 1128.322144 779.194641 L 1046.818848 759.865845 L 856.590759 714.604126 L 791.355774 698.335754 L 782.335693 698.335754 L 782.335693 703.731567 L 836.69812 756.885986 L 936.322205 846.845581 L 1061.073975 962.81897 L 1067.436279 991.490112 L 1051.409424 1014.120911 L 1034.496704 1011.704712 L 924.885986 929.234924 L 882.604126 892.107544 L 786.845764 811.48999 L 780.483276 811.48999 L 780.483276 819.946289 L 802.550415 852.241699 L 919.087341 1027.409424 L 925.127625 1081.127686 L 916.671204 1098.604126 L 886.469849 1109.154419 L 853.288696 1103.114136 L 785.073914 1007.355835 L 714.684631 899.516785 L 657.906067 802.872498 L 650.979858 806.81897 L 617.476624 1167.704834 L 601.771851 1186.147705 L 565.530212 1200 L 535.328857 1177.046997 L 519.302124 1139.919556 L 535.328857 1066.550537 L 554.657776 970.792053 L 570.362488 894.68457 L 584.536926 800.134277 L 592.993347 768.724976 L 592.429626 766.630859 L 585.503479 767.516968 L 514.22821 865.369263 L 405.825531 1011.865906 L 320.053711 1103.677979 L 299.516815 1111.812256 L 263.919525 1093.369263 L 267.221497 1060.429688 L 287.114136 1031.114136 L 405.825531 880.107361 L 477.422913 786.52356 L 523.651062 732.483276 L 523.328918 724.671265 L 520.590698 724.671265 L 205.288605 929.395935 L 149.154434 936.644409 L 124.993355 914.01355 L 127.973183 876.885986 L 139.409409 864.80542 L 234.201385 799.570435 L 233.879227 799.8927 Z"/>
</g>
</svg>

After

Width:  |  Height:  |  Size: 4.0 KiB

View File

@@ -0,0 +1 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 320"><path fill="#fff" d="m297.06 130.97c7.26-21.79 4.76-45.66-6.85-65.48-17.46-30.4-52.56-46.04-86.84-38.68-15.25-17.18-37.16-26.95-60.13-26.81-35.04-.08-66.13 22.48-76.91 55.82-22.51 4.61-41.94 18.7-53.31 38.67-17.59 30.32-13.58 68.54 9.92 94.54-7.26 21.79-4.76 45.66 6.85 65.48 17.46 30.4 52.56 46.04 86.84 38.68 15.24 17.18 37.16 26.95 60.13 26.8 35.06.09 66.16-22.49 76.94-55.86 22.51-4.61 41.94-18.7 53.31-38.67 17.57-30.32 13.55-68.51-9.94-94.51zm-120.28 168.11c-14.03.02-27.62-4.89-38.39-13.88.49-.26 1.34-.73 1.89-1.07l63.72-36.8c3.26-1.85 5.26-5.32 5.24-9.07v-89.83l26.93 15.55c.29.14.48.42.52.74v74.39c-.04 33.08-26.83 59.9-59.91 59.97zm-128.84-55.03c-7.03-12.14-9.56-26.37-7.15-40.18.47.28 1.3.79 1.89 1.13l63.72 36.8c3.23 1.89 7.23 1.89 10.47 0l77.79-44.92v31.1c.02.32-.13.63-.38.83l-64.41 37.19c-28.69 16.52-65.33 6.7-81.92-21.95zm-16.77-139.09c7-12.16 18.05-21.46 31.21-26.29 0 .55-.03 1.52-.03 2.2v73.61c-.02 3.74 1.98 7.21 5.23 9.06l77.79 44.91-26.93 15.55c-.27.18-.61.21-.91.08l-64.42-37.22c-28.63-16.58-38.45-53.21-21.95-81.89zm221.26 51.49-77.79-44.92 26.93-15.54c.27-.18.61-.21.91-.08l64.42 37.19c28.68 16.57 38.51 53.26 21.94 81.94-7.01 12.14-18.05 21.44-31.2 26.28v-75.81c.03-3.74-1.96-7.2-5.2-9.06zm26.8-40.34c-.47-.29-1.3-.79-1.89-1.13l-63.72-36.8c-3.23-1.89-7.23-1.89-10.47 0l-77.79 44.92v-31.1c-.02-.32.13-.63.38-.83l64.41-37.16c28.69-16.55 65.37-6.7 81.91 22 6.99 12.12 9.52 26.31 7.15 40.1zm-168.51 55.43-26.94-15.55c-.29-.14-.48-.42-.52-.74v-74.39c.02-33.12 26.89-59.96 60.01-59.94 14.01 0 27.57 4.92 38.34 13.88-.49.26-1.33.73-1.89 1.07l-63.72 36.8c-3.26 1.85-5.26 5.31-5.24 9.06l-.04 89.79zm14.63-31.54 34.65-20.01 34.65 20v40.01l-34.65 20-34.65-20z"/></svg>

After

Width:  |  Height:  |  Size: 1.7 KiB

View File

@@ -0,0 +1 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 320"><path d="m297.06 130.97c7.26-21.79 4.76-45.66-6.85-65.48-17.46-30.4-52.56-46.04-86.84-38.68-15.25-17.18-37.16-26.95-60.13-26.81-35.04-.08-66.13 22.48-76.91 55.82-22.51 4.61-41.94 18.7-53.31 38.67-17.59 30.32-13.58 68.54 9.92 94.54-7.26 21.79-4.76 45.66 6.85 65.48 17.46 30.4 52.56 46.04 86.84 38.68 15.24 17.18 37.16 26.95 60.13 26.8 35.06.09 66.16-22.49 76.94-55.86 22.51-4.61 41.94-18.7 53.31-38.67 17.57-30.32 13.55-68.51-9.94-94.51zm-120.28 168.11c-14.03.02-27.62-4.89-38.39-13.88.49-.26 1.34-.73 1.89-1.07l63.72-36.8c3.26-1.85 5.26-5.32 5.24-9.07v-89.83l26.93 15.55c.29.14.48.42.52.74v74.39c-.04 33.08-26.83 59.9-59.91 59.97zm-128.84-55.03c-7.03-12.14-9.56-26.37-7.15-40.18.47.28 1.3.79 1.89 1.13l63.72 36.8c3.23 1.89 7.23 1.89 10.47 0l77.79-44.92v31.1c.02.32-.13.63-.38.83l-64.41 37.19c-28.69 16.52-65.33 6.7-81.92-21.95zm-16.77-139.09c7-12.16 18.05-21.46 31.21-26.29 0 .55-.03 1.52-.03 2.2v73.61c-.02 3.74 1.98 7.21 5.23 9.06l77.79 44.91-26.93 15.55c-.27.18-.61.21-.91.08l-64.42-37.22c-28.63-16.58-38.45-53.21-21.95-81.89zm221.26 51.49-77.79-44.92 26.93-15.54c.27-.18.61-.21.91-.08l64.42 37.19c28.68 16.57 38.51 53.26 21.94 81.94-7.01 12.14-18.05 21.44-31.2 26.28v-75.81c.03-3.74-1.96-7.2-5.2-9.06zm26.8-40.34c-.47-.29-1.3-.79-1.89-1.13l-63.72-36.8c-3.23-1.89-7.23-1.89-10.47 0l-77.79 44.92v-31.1c-.02-.32.13-.63.38-.83l64.41-37.16c28.69-16.55 65.37-6.7 81.91 22 6.99 12.12 9.52 26.31 7.15 40.1zm-168.51 55.43-26.94-15.55c-.29-.14-.48-.42-.52-.74v-74.39c.02-33.12 26.89-59.96 60.01-59.94 14.01 0 27.57 4.92 38.34 13.88-.49.26-1.33.73-1.89 1.07l-63.72 36.8c-3.26 1.85-5.26 5.31-5.24 9.06l-.04 89.79zm14.63-31.54 34.65-20.01 34.65 20v40.01l-34.65 20-34.65-20z"/></svg>

After

Width:  |  Height:  |  Size: 1.7 KiB

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 6.2 KiB

View File

@@ -0,0 +1,242 @@
<svg version="1.2" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 500 500" width="500" height="500">
<style>
.s0 { fill: #f6f4f4 }
.s1 { fill: #0b0303 }
.s2 { fill: #ef0011 }
.s3 { fill: #f3e2e2 }
.s4 { fill: #f00212 }
.s5 { fill: #ba000d }
.s6 { fill: #faf1f1 }
.s7 { fill: #0b0100 }
.s8 { fill: #fbedee }
.s9 { fill: #faeaea }
.s10 { fill: #ab797d }
.s11 { fill: #f8eaea }
.s12 { fill: #902021 }
.s13 { fill: #f9eeee }
.s14 { fill: #f6ecec }
.s15 { fill: #080201 }
.s16 { fill: #150100 }
.s17 { fill: #f2e7e7 }
.s18 { fill: #fbe7e8 }
.s19 { fill: #060101 }
.s20 { fill: #f5e7e7 }
.s21 { fill: #fa999e }
.s22 { fill: #c46064 }
.s23 { fill: #180300 }
.s24 { fill: #f6dcdd }
.s25 { fill: #f2e6e6 }
.s26 { fill: #110200 }
.s27 { fill: #eb0011 }
.s28 { fill: #e20010 }
.s29 { fill: #ea0011 }
.s30 { fill: #760007 }
.s31 { fill: #f00514 }
.s32 { fill: #fcebeb }
.s33 { fill: #ecd6d6 }
.s34 { fill: #f5e3e3 }
.s35 { fill: #f5e4e4 }
.s36 { fill: #faf6f6 }
.s37 { fill: #e50010 }
.s38 { fill: #d5000f }
.s39 { fill: #f2e2e3 }
.s40 { fill: #ef1018 }
.s41 { fill: #f4e8e9 }
.s42 { fill: #ef0513 }
.s43 { fill: #f5e5e5 }
.s44 { fill: #f00413 }
.s45 { fill: #f4e9ea }
.s46 { fill: #ed0011 }
.s47 { fill: #e80011 }
.s48 { fill: #e60613 }
.s49 { fill: #f0d6d6 }
.s50 { fill: #fca9ac }
.s51 { fill: #9c000c }
.s52 { fill: #73393b }
</style>
<g>
<path fill-rule="evenodd" class="s0" d="m166.5 52.5q3.5 0 7 0 2.75 2.99 1.5 7-21.27 45.61-20.5 96 39.99 2.76 72 26.5 7.87 6.86 13.5 15.5 42.88-56.39 103.5-92.5 47.35-25.46 101-25 14.52 0.38 23.5 11.5 3.19 7.74 2 16-1.81 7.18-4.5 14-1 0-1 1-5.04 6.05-9 13-1 0-1 1 0 0.5 0 1-12.42 12.15-28.5 19-6.02 36.27-41.5 45-0.83 2.75 0 5 19.02-12.85 41.5-9 10.85-8.09 23.5-13 15.01-6.37 31-2.5 14.09 7.43 14 23.5-2.83 23.25-15.5 43-6.42 9.92-14 19-10.04 8.8-19.5 18-72.02 48.88-156.5 27-19.63 9.6-41.5 10.5-4.59 1.27-9 3 2 1 4 2 20.09-1.11 35 12 25.46 6.95 37.5 30.5 1.26 5.69-1 11-3.38 3.79-7.5 6.5 5.74 10.07 1.5 20.5-7.55 7.47-17.5 3.5-11.01-5.34-22.5-9.5-18.26 10-38.5 13-15.5 0-31 0-26.62-4.54-51-17-4.17 1.33-8 3.5-7.23 5.87-15 11-8.62 2.58-13.5-4.5-1.82 2.32-4.5 3.5-6.06 2.24-12 3.5-7.5 0-15 0-27.42-2.56-50-18.5-18-17.25-23-41.5 0-11.5 0-23 4.12-22.7 25-33 6.95-16.67 22-26.5-20.39-20.8-14.5-49.5 7.01-26.98 28.5-44.5 7.56-5.27 15-10.5-13.09-30.88-7.5-64 3.16-15.57 14.5-26.5 6.85-2.48 8 4.5-6.59 39.53 11 75.5 7.99-0.49 16-2 2.42-34.57 14.5-67.5 8.51-22.23 27.5-36z"/>
</g>
<g>
<path fill-rule="evenodd" class="s1" d="m113.5 401.5q0.48-5.1-1-10-0.91 0.19-1 1-2.46 1.74-5 3.5 5.65 9.54-5 13-32.21 5.55-61-10-32.89-23.11-29.5-63.5 2.96-22.67 23.5-32 7.99-19.75 27-29.5-27.65-23.7-15.5-58.5 7.33-16.82 20.5-29.5 10.79-8.14 22-15.5-16.49-37.08-5.5-76 3.19-6.13 7.5-11.5 1.48-0.89 2 1-5.69 41.09 12.5 78.5 1 1 2 2 9.97-3.24 20.5-4 2 0 4 0 0-7.5 0-15 0.99-42.22 24.5-77 6.12-7.12 14-12-4.65 13.43-10 27-11.93 37.6-9.5 77 49.38 0.7 83.5 36 2.75 4.5 5.5 9 38.99-52.24 93-88.5 45.84-29.03 100-32.5 15.69-1.56 29 6.5 5.68 7.29 3.5 16.5-10.38 33.62-43.5 45-4.39 37.33-41 45-0.79 8.63-6 15.5 1.91 1.83 4.5 2.5 22.27-17.25 50.5-14.5 12.93-9.41 28-15 36.22-8.28 31.5 28.5-15.19 51.69-62.5 77.5-65.92 35.87-138 15.5-19.67 10.42-42 10.5-8.39 2.88-17 5 3.58 6.08 10 9 20.92-1.14 36 13 22.67 5.23 34.5 25.5 3.33 7.13-3.5 11.5-3.88 1.8-8 3 7.36 8.45 6.5 19.5-4.43 5.66-11.5 3.5-12.84-5.67-26-10.5-39.4 21.02-83 10.5-18.85-5.78-36.5-14.5-13.65 4.14-23.5 14.5-9.51 3.74-11-6.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s2" d="m153.5 173.5q24.62 1.46 46 13.5 12.11 8.1 17.5 21.5 0.74 2.45 0.5 5 0.09 0.81 1 1 1.48-4.9 1-10 5.04 10.48 1.5 22-9.81 27.86-35.5 42.5-26.17 14.97-56 19.5-2.77-0.4-2 1 2.86 1.27 6 1 25.64 1.53 48.5-10 0.34 10.08 2 20 1.08 5.76 5 10 1 1.5 0 3-31.11 20.84-68.5 17.5-23.7-5.7-32.5-28.5-4.39-9.18-3.5-19 15.41 6.23 32 4.5-20.68-6.39-39-18-34.81-27.22-12.5-65.5 11.84-14.83 29-23 4.21 7.66 11.5 12.5 3 1 6 0-26.04-34.62-29-78-0.13-8.46 2-16.5 1 6.5 2 13 3.43 39.53 24.5 73 2.03 2.28 4.5 4 0.5-1.25 1-2.5-1.27-6.54-5-12 0.5-0.75 1-1.5 9.72-3.43 20-4 0.55 10.34 8 17.5 1.94 0.74 4 0.5-17.8-64.6 16.5-122 0.98-1.79 1.5 0-28.21 56.64-13.5 118 1.08 1.43 2.5 0.5 2.21-4.98 2-10.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s3" d="m454.5 97.5q-18.37-2.97-37-1.5-16.14 2.08-32 5.5 32.38-14.09 67-7.5 1.98 1.22 2 3.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s4" d="m454.5 97.5q-1.33 11.18-8.5 20-21.81 26.28-55.5 32-1.11-0.2-2 0.5 2.31 2.82 5.5 4.5 1 2 0 4-9.56 11.3-19.5 20 19.71-8.72 31-27 2.68-0.43 5 1-14.24 30.97-48 36.5-9.93 1.71-20 1.5-6.8-0.48-13 1 5.81 6.92 14 11-10.78 16.03-27 26.5 27.16-7.4 38-33.5 4.34 1.35 9 1-9.08 23.84-33 33.5-18.45 6.41-38 7 22.59 8.92 45-1 12.05-5.52 24-11 9.01-1.79 17 2.5 5.28-4.38 11-8 12.8-6.07 27-5 0 0.5 0 1-19.34 2.69-34 15.5 0.5 0.25 1 0.5 17.79-8.09 36-15 2.71-0.79 5-2 2.5-1 5-2 5.53-4.04 11-8 11.7-4.18 24-6.5 7.78-1.36 15 1.5-2.97 18.45-13.5 34-34.92 49.37-94.5 62.5-59.27 12.45-108-23-15.53-12.52-21.5-31.5-2.47-14.26 4-27-3.15 24.41 14 42-4.92-10.28-7-22-1.97-17.63 7-33 47.28-69.5 125.5-100 15.86-3.42 32-5.5 18.63-1.47 37 1.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s5" d="m86.5 112.5q-1-6.5-2-13 0.7-5.34 3.5-10-1.8 11.32-1.5 23z"/>
</g>
<g>
<path fill-rule="evenodd" class="s6" d="m433.5 97.5q2.22-0.39 4 1-10 13.75-27 14-0.24-2.06 0.5-4 10.3-7.78 22.5-11z"/>
</g>
<g>
<path fill-rule="evenodd" class="s7" d="m407.5 101.5q2.55-0.24 5 0.5-52.87 18.31-84.5 64.5-6.94 7.95-17 11-9.38-2.38-5-11 40.38-48.62 101.5-65z"/>
</g>
<g>
<path fill-rule="evenodd" class="s8" d="m402.5 112.5q3 0 6 0-2.56 8.8-12 7-0.22-1.58 0.5-3 2.72-2.22 5.5-4z"/>
</g>
<g>
</g>
<g>
</g>
<g>
</g>
<g>
</g>
<g>
</g>
<g>
</g>
<g>
</g>
<g>
<path fill-rule="evenodd" class="s9" d="m390.5 149.5q7.77 0.52 15 2-11.29 18.28-31 27 9.94-8.7 19.5-20 1-2 0-4-3.19-1.68-5.5-4.5 0.89-0.7 2-0.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s10" d="m131.5 145.5q0 7.5 0 15-2 0-4 0 1.06-1.36 3-1-0.48-7.29 1-14z"/>
</g>
<g>
<path fill-rule="evenodd" class="s11" d="m219.5 204.5q-1 4.5-2 9 0.24-2.55-0.5-5-5.39-13.4-17.5-21.5-21.38-12.04-46-13.5 0-2 0-4 36.7-0.86 61.5 26 3.06 4.11 4.5 9z"/>
</g>
<g>
<path fill-rule="evenodd" class="s12" d="m329.5 191.5q6.2-1.48 13-1-3.5 1-7 2-2.9-0.97-6-1z"/>
</g>
<g>
<path fill-rule="evenodd" class="s13" d="m329.5 191.5q3.1 0.03 6 1 9.55 1.31 19 3-10.84 26.1-38 33.5 16.22-10.47 27-26.5-8.19-4.08-14-11z"/>
</g>
<g>
<path fill-rule="evenodd" class="s14" d="m479.5 199.5q-7.22-2.86-15-1.5-12.3 2.32-24 6.5 15.6-13.11 36-11.5 3.63 2.26 3 6.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s15" d="m193.5 216.5q-12.01 1.52-22 8-2.83 1.29-5.5 3-4.79-4.57-6.5-11-5.04 2.2-9.5-1-3.47-6.4 3.5-3 4.4 0.05 8-2.5 9.22-9.73 21-16 6.3-3.24 12 1-2.9 1.22-6 1.5 2.61 5.74 4.5 12 0.75 3.97 0.5 8z"/>
</g>
<g>
<path fill-rule="evenodd" class="s16" d="m458.5 200.5q3.04-0.24 6 0.5-18.02 7.05-33 19-1 1-2 0 11.53-14.3 29-19.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s17" d="m178.5 202.5q6.85-0.63 4.5 6-7.6 5.09-6-4 1.08-0.82 1.5-2z"/>
</g>
<g>
<path fill-rule="evenodd" class="s18" d="m469.5 201.5q-2.26 13.65-14.5 22-0.47-2.11 1-4 7.08-8.82 13.5-18z"/>
</g>
<g>
<path fill-rule="evenodd" class="s19" d="m74.5 208.5q8.22-0.2 16 2.5 11.8 4.26 23.5 8.5 5.65-0.63 8-6 2.41 11.83-9.5 13 0.55 3.61 2 7-0.5 1-1 2-4.67-0.94-9.5-1-9.96 0.44-19.5 2.5-5.05-3.55-6.5-9.5-0.75-7.48-0.5-15-6.47 0.15-3-4z"/>
</g>
<g>
<path fill-rule="evenodd" class="s20" d="m429.5 212.5q-2.5 1-5 2-4 0-8 0-14.2-1.07-27 5 15.27-12.44 35-9.5 2.72 1.14 5 2.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s21" d="m219.5 204.5q0.48 5.1-1 10-0.91-0.19-1-1 1-4.5 2-9z"/>
</g>
<g>
<path fill-rule="evenodd" class="s22" d="m416.5 215.5q0-0.5 0-1 4 0 8 0-2.29 1.21-5 2-1.06-1.36-3-1z"/>
</g>
<g>
<path fill-rule="evenodd" class="s23" d="m416.5 215.5q1.94-0.36 3 1-18.21 6.91-36 15-0.5-0.25-1-0.5 14.66-12.81 34-15.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s24" d="m193.5 216.5q4.39 1.3 9 3-0.79 1.04-2 1.5-14.77-0.13-29 3.5 9.99-6.48 22-8z"/>
</g>
<g>
<path fill-rule="evenodd" class="s25" d="m98.5 219.5q6.09-0.98 6 5-3.04 0.24-6-0.5-1.84-2.24 0-4.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s26" d="m176.5 229.5q8.85-1.14 16 4-4.98 1.75-10 0-13.56 14.3-33 19.5-28.06 8.2-55 1 3.32-6.4 10-5.5-0.71 1.47-2 2.5 36.58 4.24 69-14 4.68-2.13 1-5 2.35-0.91 4-2.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s27" d="m231.5 238.5q1.31-0.2 2 1-3.13 28.62 15 51-16.25 6.75-27-7.5-1-1-2 0 14.73 29.34 46 18.5 1.79 0.52 0 1.5-37.63 16.82-50.5-22.5-5.1-26.48 16.5-42z"/>
</g>
<g>
<path fill-rule="evenodd" class="s28" d="m243.5 259.5q5.88 3.62 10.5 9 12.96 18.46 32.5 29.5-31.51-7.75-43-38.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s29" d="m203.5 266.5q1.31-0.2 2 1-2.48 22.08 12 39-6.99 1.35-14 0.5 4.59 4.08 10 7-8.71 0.28-14.5-6.5-16.98-22.76 4.5-41z"/>
</g>
<g>
<path fill-rule="evenodd" class="s27" d="m58.5 284.5q9.6-2.17 14.5 6 5.15 14.18-1 28-11.05-13.14-27.5-17.5 5.15-9.9 14-16.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s30" d="m129.5 288.5q2 1 4 2-3.14 0.27-6-1-0.77-1.4 2-1z"/>
</g>
<g>
<path fill-rule="evenodd" class="s31" d="m56.5 313.5q3.43 5.43 8 10-4.88 0.44-8 4-1.11-0.2-2 0.5 28.91 1.65 38 28.5 0.45 3.16-1 6-11.02-7.01-23-12.5-4.75-3.75-9.5-7.5 1.47 7.42 7 13 8.34 27.18 32 43 0.99 2.41-1.5 3.5-40.25 5.58-66.5-25.5-15.67-22.01-8-48 10.46-23.87 34.5-15z"/>
</g>
<g>
<path fill-rule="evenodd" class="s32" d="m45.5 317.5q4.03-0.25 8 0.5 2.46 4.16-2 6-6.04 2.01-9-3.5 1.26-1.85 3-3z"/>
</g>
<g>
<path fill-rule="evenodd" class="s33" d="m56.5 313.5q4.91 3.14 9.5 7 0.88 2.25-1.5 3-4.57-4.57-8-10z"/>
</g>
<g>
<path fill-rule="evenodd" class="s34" d="m198.5 319.5q-11.1 11.56-27 15.5-15.75 4.88-32 2.5 28.81-3.69 54-18.5 2.65-0.96 5 0.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s4" d="m198.5 319.5q1.44 0.68 2.5 2 2.41 8.23 6 16 1.2 2.64-0.5 5-30.65 21.41-68 18.5-25.16-6.17-32.5-30.5 6.96 4.99 15.5 6.5 8.99 0.75 18 0.5 16.25 2.38 32-2.5 15.9-3.94 27-15.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s35" d="m92.5 356.5q-9.09-26.85-38-28.5 0.89-0.7 2-0.5 25.47-4.89 35.5 19 0.75 4.98 0.5 10z"/>
</g>
<g>
<path fill-rule="evenodd" class="s36" d="m72.5 335.5q3.62-0.38 5 3-4.22 1.83-5-3z"/>
</g>
<g>
<path fill-rule="evenodd" class="s37" d="m223.5 336.5q5.59-0.48 11 1-4.04 4.16-8.5 8-5.99-3.8-2.5-9z"/>
</g>
<g>
<path fill-rule="evenodd" class="s38" d="m90.5 334.5q0.59-1.54 2-0.5 3.94 5.45 9 10 7 6 14 12-6.91-1.7-13-6-6.21-7.72-12-15.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s39" d="m261.5 346.5q-3.54-2.44-8-3.5-6.98-0.75-14-0.5 0.63-1.08 2-1.5 13.82-2.52 26 4-2.63 1.98-6 1.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s40" d="m239.5 342.5q7.02-0.25 14 0.5 4.46 1.06 8 3.5-5.2 2.35-10 5.5-3.88 4.65-9 7.5-9.89-3.09-9.5-13 2.36-3.63 6.5-4z"/>
</g>
<g>
<path fill-rule="evenodd" class="s41" d="m214.5 349.5q-21.43 15.48-48 16 22.82-5.9 43-18.5 3.64-1.12 5 2.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s42" d="m214.5 349.5q5.96 7.2 13.5 13 1 1 0 2-28.58 23.34-65.5 20.5-18.15-4.24-27.5-19.5 1.13 0.94 2.5 1.5 14.7 1.42 29-1.5 26.57-0.52 48-16z"/>
</g>
<g>
<path fill-rule="evenodd" class="s43" d="m302.5 373.5q-14.74-16.73-37-19-4.55 0.25-9 1 25.3-10.24 43.5 11 2.85 2.91 2.5 7z"/>
</g>
<g>
<path fill-rule="evenodd" class="s44" d="m302.5 373.5q0.21 2.44-2 3.5-28.69 7.6-50.5-12.5-0.06-6.71 6.5-9 4.45-0.75 9-1 22.26 2.27 37 19z"/>
</g>
<g>
<path fill-rule="evenodd" class="s45" d="m100.5 356.5q5.42 2.71 11 5.5-13.04 7.54-18.5 21.5-7.57-7.14-10.5-17 5.58 1.54 10 5.5 4.2 0.84 5.5-3.5 1.41-5.99 2.5-12z"/>
</g>
<g>
<path fill-rule="evenodd" class="s8" d="m83.5 394.5q-18.9-10.15-29.5-29-1.54-3.52-2-7 5.79 2.39 10 7 7.82 16.63 21.5 29z"/>
</g>
<g>
<path fill-rule="evenodd" class="s46" d="m232.5 365.5q17.6 6.19 10.5 23-10.6 10.42-25.5 11.5-25.94 3.21-49-9 36.75-1.65 64-25.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s47" d="m113.5 367.5q7.7-0.01 9.5 7-9.69 7.19-18.5 15.5-7.23 5.76-5.5-3.5 3.12-12.84 14.5-19z"/>
</g>
<g>
<path fill-rule="evenodd" class="s29" d="m126.5 380.5q7.88-0.4 12 6.5-8.5 7.25-17 14.5-5.62-12.55 5-21z"/>
</g>
<g>
<path fill-rule="evenodd" class="s48" d="m283.5 385.5q3.22 2.95 7 5.5 2.8 4.03 6 7.5 0.42 2.77-2 4-15.5-9.75-31-19.5-1.79-0.98 0-1.5 9.96 2.49 20 4z"/>
</g>
<g>
<path fill-rule="evenodd" class="s49" d="m283.5 385.5q8.71-1.27 11.5 7 1.22 2.9 1.5 6-3.2-3.47-6-7.5-3.78-2.55-7-5.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s50" d="m83.5 394.5q1.88-0.06 3 1.5-2.25 0.88-3-1.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s51" d="m258.5 392.5q3.51 0.41 0 2.5-2.33 1.93-5 2 2.61-2.28 5-4.5z"/>
</g>
<g>
<path fill-rule="evenodd" class="s52" d="m111.5 392.5q0.09-0.81 1-1 1.48 4.9 1 10-1-4.5-2-9z"/>
</g>
</svg>

After

Width:  |  Height:  |  Size: 13 KiB

View File

@@ -0,0 +1,7 @@
<svg xmlns="http://www.w3.org/2000/svg" version="1.1" xmlns:xlink="http://www.w3.org/1999/xlink" width="512" height="512"><svg width="512" height="512" viewBox="0 0 512 512" fill="none" xmlns="http://www.w3.org/2000/svg">
<rect width="512" height="512" fill="#131010"></rect>
<path d="M320 224V352H192V224H320Z" fill="#5A5858"></path>
<path fill-rule="evenodd" clip-rule="evenodd" d="M384 416H128V96H384V416ZM320 160H192V352H320V160Z" fill="white"></path>
</svg><style>@media (prefers-color-scheme: light) { :root { filter: none; } }
@media (prefers-color-scheme: dark) { :root { filter: none; } }
</style></svg>

After

Width:  |  Height:  |  Size: 612 B

View File

@@ -0,0 +1,9 @@
<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 800 800">
<rect width="800" height="800" rx="160" fill="#fff"/>
<path fill="#000" fill-rule="evenodd" d="
M165.29 165.29 H517.36 V400 H400 V517.36 H282.65 V634.72 H165.29 Z
M282.65 282.65 V400 H400 V282.65 Z
"/>
<path fill="#000" d="M517.36 400 H634.72 V634.72 H517.36 Z"/>
</svg>

After

Width:  |  Height:  |  Size: 389 B

View File

@@ -0,0 +1,9 @@
<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 800 800">
<rect width="800" height="800" rx="160" fill="#000"/>
<path fill="#fff" fill-rule="evenodd" d="
M165.29 165.29 H517.36 V400 H400 V517.36 H282.65 V634.72 H165.29 Z
M282.65 282.65 V400 H400 V282.65 Z
"/>
<path fill="#fff" d="M517.36 400 H634.72 V634.72 H517.36 Z"/>
</svg>

After

Width:  |  Height:  |  Size: 389 B

View File

@@ -4,7 +4,6 @@ import {
ChatEvent, ChatEvent,
DownloadEvent, DownloadEvent,
ErrorEvent, ErrorEvent,
InferenceCompute,
InferenceComputeResponse, InferenceComputeResponse,
ModelCapabilitiesResponse, ModelCapabilitiesResponse,
Model, Model,
@@ -27,6 +26,12 @@ declare module "@/gotypes" {
Model.prototype.isCloud = function (): boolean { Model.prototype.isCloud = function (): boolean {
return this.model.endsWith("cloud"); return this.model.endsWith("cloud");
}; };
export type CloudStatusSource = "env" | "config" | "both" | "none";
export interface CloudStatusResponse {
disabled: boolean;
source: CloudStatusSource;
}
// Helper function to convert Uint8Array to base64 // Helper function to convert Uint8Array to base64
function uint8ArrayToBase64(uint8Array: Uint8Array): string { function uint8ArrayToBase64(uint8Array: Uint8Array): string {
const chunkSize = 0x8000; // 32KB chunks to avoid stack overflow const chunkSize = 0x8000; // 32KB chunks to avoid stack overflow
@@ -156,7 +161,7 @@ export async function getModels(query?: string): Promise<Model[]> {
// Add query if it's in the registry and not already in the list // Add query if it's in the registry and not already in the list
if (!exactMatch) { if (!exactMatch) {
const result = await getModelUpstreamInfo(new Model({ model: query })); const result = await getModelUpstreamInfo(new Model({ model: query }));
const existsUpstream = !!result.digest && !result.error; const existsUpstream = result.exists;
if (existsUpstream) { if (existsUpstream) {
filteredModels.push(new Model({ model: query })); filteredModels.push(new Model({ model: query }));
} }
@@ -285,6 +290,28 @@ export async function updateSettings(settings: Settings): Promise<{
}; };
} }
export async function updateCloudSetting(
enabled: boolean,
): Promise<CloudStatusResponse> {
const response = await fetch(`${API_BASE}/api/v1/cloud`, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({ enabled }),
});
if (!response.ok) {
const error = await response.text();
throw new Error(error || "Failed to update cloud setting");
}
const data = await response.json();
return {
disabled: Boolean(data.disabled),
source: (data.source as CloudStatusSource) || "none",
};
}
export async function renameChat(chatId: string, title: string): Promise<void> { export async function renameChat(chatId: string, title: string): Promise<void> {
const response = await fetch(`${API_BASE}/api/v1/chat/${chatId}/rename`, { const response = await fetch(`${API_BASE}/api/v1/chat/${chatId}/rename`, {
method: "PUT", method: "PUT",
@@ -312,7 +339,7 @@ export async function deleteChat(chatId: string): Promise<void> {
// Get upstream information for model staleness checking // Get upstream information for model staleness checking
export async function getModelUpstreamInfo( export async function getModelUpstreamInfo(
model: Model, model: Model,
): Promise<{ digest?: string; pushTime: number; error?: string }> { ): Promise<{ stale: boolean; exists: boolean; error?: string }> {
try { try {
const response = await fetch(`${API_BASE}/api/v1/model/upstream`, { const response = await fetch(`${API_BASE}/api/v1/model/upstream`, {
method: "POST", method: "POST",
@@ -326,22 +353,22 @@ export async function getModelUpstreamInfo(
if (!response.ok) { if (!response.ok) {
console.warn( console.warn(
`Failed to check upstream digest for ${model.model}: ${response.status}`, `Failed to check upstream for ${model.model}: ${response.status}`,
); );
return { pushTime: 0 }; return { stale: false, exists: false };
} }
const data = await response.json(); const data = await response.json();
if (data.error) { if (data.error) {
console.warn(`Upstream digest check: ${data.error}`); console.warn(`Upstream check: ${data.error}`);
return { error: data.error, pushTime: 0 }; return { stale: false, exists: false, error: data.error };
} }
return { digest: data.digest, pushTime: data.pushTime || 0 }; return { stale: !!data.stale, exists: true };
} catch (error) { } catch (error) {
console.warn(`Error checking model staleness:`, error); console.warn(`Error checking model staleness:`, error);
return { pushTime: 0 }; return { stale: false, exists: false };
} }
} }
@@ -379,7 +406,7 @@ export async function* pullModel(
} }
} }
export async function getInferenceCompute(): Promise<InferenceCompute[]> { export async function getInferenceCompute(): Promise<InferenceComputeResponse> {
const response = await fetch(`${API_BASE}/api/v1/inference-compute`); const response = await fetch(`${API_BASE}/api/v1/inference-compute`);
if (!response.ok) { if (!response.ok) {
throw new Error( throw new Error(
@@ -388,8 +415,7 @@ export async function getInferenceCompute(): Promise<InferenceCompute[]> {
} }
const data = await response.json(); const data = await response.json();
const inferenceComputeResponse = new InferenceComputeResponse(data); return new InferenceComputeResponse(data);
return inferenceComputeResponse.inferenceComputes || [];
} }
export async function fetchHealth(): Promise<boolean> { export async function fetchHealth(): Promise<boolean> {
@@ -414,3 +440,16 @@ export async function fetchHealth(): Promise<boolean> {
return false; return false;
} }
} }
export async function getCloudStatus(): Promise<CloudStatusResponse | null> {
const response = await fetch(`${API_BASE}/api/v1/cloud`);
if (!response.ok) {
throw new Error(`Failed to fetch cloud status: ${response.status}`);
}
const data = await response.json();
return {
disabled: Boolean(data.disabled),
source: (data.source as CloudStatusSource) || "none",
};
}

View File

@@ -17,11 +17,15 @@ import {
} from "@/hooks/useChats"; } from "@/hooks/useChats";
import { useNavigate } from "@tanstack/react-router"; import { useNavigate } from "@tanstack/react-router";
import { useSelectedModel } from "@/hooks/useSelectedModel"; import { useSelectedModel } from "@/hooks/useSelectedModel";
import { useHasVisionCapability } from "@/hooks/useModelCapabilities"; import {
useHasVisionCapability,
useHasToolsCapability,
} from "@/hooks/useModelCapabilities";
import { useUser } from "@/hooks/useUser"; import { useUser } from "@/hooks/useUser";
import { DisplayLogin } from "@/components/DisplayLogin"; import { DisplayLogin } from "@/components/DisplayLogin";
import { ErrorEvent, Message } from "@/gotypes"; import { ErrorEvent, Message } from "@/gotypes";
import { useSettings } from "@/hooks/useSettings"; import { useSettings } from "@/hooks/useSettings";
import { useCloudStatus } from "@/hooks/useCloudStatus";
import { ThinkButton } from "./ThinkButton"; import { ThinkButton } from "./ThinkButton";
import { ErrorMessage } from "./ErrorMessage"; import { ErrorMessage } from "./ErrorMessage";
import { processFiles } from "@/utils/fileValidation"; import { processFiles } from "@/utils/fileValidation";
@@ -141,19 +145,14 @@ function ChatForm({
const { const {
settings: { settings: {
webSearchEnabled, webSearchEnabled,
airplaneMode,
thinkEnabled, thinkEnabled,
thinkLevel: settingsThinkLevel, thinkLevel: settingsThinkLevel,
}, },
setSettings, setSettings,
} = useSettings(); } = useSettings();
const { cloudDisabled } = useCloudStatus();
// current supported models for web search const supportsWebSearch = useHasToolsCapability(selectedModel?.model);
const modelLower = selectedModel?.model.toLowerCase() || "";
const supportsWebSearch =
modelLower.startsWith("gpt-oss") ||
modelLower.startsWith("qwen3") ||
modelLower.startsWith("deepseek-v3");
// Use per-chat thinking level instead of global // Use per-chat thinking level instead of global
const thinkLevel: ThinkingLevel = const thinkLevel: ThinkingLevel =
settingsThinkLevel === "none" || !settingsThinkLevel settingsThinkLevel === "none" || !settingsThinkLevel
@@ -180,6 +179,12 @@ function ChatForm({
setSettings, setSettings,
]); ]);
useEffect(() => {
if (cloudDisabled && webSearchEnabled) {
setSettings({ WebSearchEnabled: false });
}
}, [cloudDisabled, webSearchEnabled, setSettings]);
const removeFile = (index: number) => { const removeFile = (index: number) => {
setMessage((prev) => ({ setMessage((prev) => ({
...prev, ...prev,
@@ -234,19 +239,19 @@ function ChatForm({
// Determine if login banner should be shown // Determine if login banner should be shown
const shouldShowLoginBanner = const shouldShowLoginBanner =
!cloudDisabled &&
!isLoadingUser && !isLoadingUser &&
!isAuthenticated && !isAuthenticated &&
((webSearchEnabled && supportsWebSearch) || ((webSearchEnabled && supportsWebSearch) || selectedModel?.isCloud());
(selectedModel?.isCloud() && !airplaneMode));
// Determine which feature to highlight in the banner // Determine which feature to highlight in the banner
const getActiveFeatureForBanner = () => { const getActiveFeatureForBanner = () => {
if (cloudDisabled) return null;
if (!isAuthenticated) { if (!isAuthenticated) {
if (loginPromptFeature) return loginPromptFeature; if (loginPromptFeature) return loginPromptFeature;
if (webSearchEnabled && selectedModel?.isCloud() && !airplaneMode) if (webSearchEnabled && selectedModel?.isCloud()) return "webSearch";
return "webSearch";
if (webSearchEnabled) return "webSearch"; if (webSearchEnabled) return "webSearch";
if (selectedModel?.isCloud() && !airplaneMode) return "turbo"; if (selectedModel?.isCloud()) return "turbo";
} }
return null; return null;
}; };
@@ -269,11 +274,12 @@ function ChatForm({
useEffect(() => { useEffect(() => {
if ( if (
isAuthenticated || isAuthenticated ||
(!webSearchEnabled && !!selectedModel?.isCloud() && !airplaneMode) cloudDisabled ||
(!webSearchEnabled && !!selectedModel?.isCloud())
) { ) {
setLoginPromptFeature(null); setLoginPromptFeature(null);
} }
}, [isAuthenticated, webSearchEnabled, selectedModel, airplaneMode]); }, [isAuthenticated, webSearchEnabled, selectedModel, cloudDisabled]);
// When entering edit mode, populate the composition with existing data // When entering edit mode, populate the composition with existing data
useEffect(() => { useEffect(() => {
@@ -465,20 +471,27 @@ function ChatForm({
const handleSubmit = async () => { const handleSubmit = async () => {
if (!message.content.trim() || isStreaming || isDownloading) return; if (!message.content.trim() || isStreaming || isDownloading) return;
if (cloudDisabled && selectedModel?.isCloud()) {
return;
}
// Check if cloud mode is enabled but user is not authenticated // Check if cloud mode is enabled but user is not authenticated
if (shouldShowLoginBanner) { if (shouldShowLoginBanner) {
return; return;
} }
// Prepare attachments for submission // Prepare attachments for submission, excluding unsupported images
const attachmentsToSend: FileAttachment[] = message.attachments.map( const attachmentsToSend: FileAttachment[] = message.attachments
(att) => ({ .filter(
(att) => hasVisionCapability || !isImageFile(att.filename),
)
.map((att) => ({
filename: att.filename, filename: att.filename,
data: att.data || new Uint8Array(0), // Empty data for existing files data: att.data || new Uint8Array(0), // Empty data for existing files
}), }));
);
const useWebSearch = supportsWebSearch && webSearchEnabled && !airplaneMode; const useWebSearch =
supportsWebSearch && webSearchEnabled && !cloudDisabled;
const useThink = modelSupportsThinkingLevels const useThink = modelSupportsThinkingLevels
? thinkLevel ? thinkLevel
: supportsThinkToggling : supportsThinkToggling
@@ -725,10 +738,17 @@ function ChatForm({
)} )}
{(message.attachments.length > 0 || message.fileErrors.length > 0) && ( {(message.attachments.length > 0 || message.fileErrors.length > 0) && (
<div className="flex gap-2 overflow-x-auto px-3 pt pb-3 w-full scrollbar-hide"> <div className="flex gap-2 overflow-x-auto px-3 pt pb-3 w-full scrollbar-hide">
{message.attachments.map((attachment, index) => ( {message.attachments.map((attachment, index) => {
const isUnsupportedImage =
!hasVisionCapability && isImageFile(attachment.filename);
return (
<div <div
key={attachment.id} key={attachment.id}
className="group flex items-center gap-2 py-2 px-3 rounded-lg bg-neutral-50 dark:bg-neutral-700/50 hover:bg-neutral-100 dark:hover:bg-neutral-700 transition-colors flex-shrink-0" className={`group flex items-center gap-2 py-2 px-3 rounded-lg transition-colors flex-shrink-0 ${
isUnsupportedImage
? "bg-red-50 dark:bg-red-900/20 border border-red-200 dark:border-red-800"
: "bg-neutral-50 dark:bg-neutral-700/50 hover:bg-neutral-100 dark:hover:bg-neutral-700"
}`}
> >
{isImageFile(attachment.filename) ? ( {isImageFile(attachment.filename) ? (
<ImageThumbnail <ImageThumbnail
@@ -753,9 +773,16 @@ function ChatForm({
/> />
</svg> </svg>
)} )}
<span className="text-sm text-neutral-700 dark:text-neutral-300 max-w-[150px] truncate"> <div className="flex flex-col min-w-0">
<span className={`text-sm max-w-36 truncate ${isUnsupportedImage ? "text-red-700 dark:text-red-300" : "text-neutral-700 dark:text-neutral-300"}`}>
{attachment.filename} {attachment.filename}
</span> </span>
{isUnsupportedImage && (
<span className="text-xs text-red-600 dark:text-red-400 opacity-75">
This model does not support images
</span>
)}
</div>
<button <button
type="button" type="button"
onClick={() => removeFile(index)} onClick={() => removeFile(index)}
@@ -777,7 +804,8 @@ function ChatForm({
</svg> </svg>
</button> </button>
</div> </div>
))} );
})}
{message.fileErrors.map((fileError, index) => ( {message.fileErrors.map((fileError, index) => (
<div <div
key={`error-${index}`} key={`error-${index}`}
@@ -899,7 +927,7 @@ function ChatForm({
)} )}
<WebSearchButton <WebSearchButton
ref={webSearchButtonRef} ref={webSearchButtonRef}
isVisible={supportsWebSearch && airplaneMode === false} isVisible={supportsWebSearch && cloudDisabled === false}
isActive={webSearchEnabled} isActive={webSearchEnabled}
onToggle={() => { onToggle={() => {
if (!webSearchEnabled && !isAuthenticated) { if (!webSearchEnabled && !isAuthenticated) {
@@ -940,6 +968,7 @@ function ChatForm({
!isDownloading && !isDownloading &&
(!message.content.trim() || (!message.content.trim() ||
shouldShowLoginBanner || shouldShowLoginBanner ||
(cloudDisabled && selectedModel?.isCloud()) ||
message.fileErrors.length > 0) message.fileErrors.length > 0)
} }
className={`flex items-center justify-center h-9 w-9 rounded-full disabled:cursor-default cursor-pointer bg-black text-white dark:bg-white dark:text-black disabled:opacity-10 focus:outline-none focus:ring-2 focus:ring-blue-500`} className={`flex items-center justify-center h-9 w-9 rounded-full disabled:cursor-default cursor-pointer bg-black text-white dark:bg-white dark:text-black disabled:opacity-10 focus:outline-none focus:ring-2 focus:ring-blue-500`}

View File

@@ -6,12 +6,13 @@ import { getChat } from "@/api";
import { Link } from "@/components/ui/link"; import { Link } from "@/components/ui/link";
import { useState, useRef, useEffect, useCallback, useMemo } from "react"; import { useState, useRef, useEffect, useCallback, useMemo } from "react";
import { ChatsResponse } from "@/gotypes"; import { ChatsResponse } from "@/gotypes";
import { CogIcon } from "@heroicons/react/24/outline"; import { CogIcon, RocketLaunchIcon } from "@heroicons/react/24/outline";
// there's a hidden debug feature to copy a chat's data to the clipboard by // there's a hidden debug feature to copy a chat's data to the clipboard by
// holding shift and clicking this many times within this many seconds // holding shift and clicking this many times within this many seconds
const DEBUG_SHIFT_CLICKS_REQUIRED = 5; const DEBUG_SHIFT_CLICKS_REQUIRED = 5;
const DEBUG_SHIFT_CLICK_WINDOW_MS = 7000; // 7 seconds const DEBUG_SHIFT_CLICK_WINDOW_MS = 7000; // 7 seconds
const launchSidebarRequestedKey = "ollama.launchSidebarRequested";
interface ChatSidebarProps { interface ChatSidebarProps {
currentChatId?: string; currentChatId?: string;
@@ -267,8 +268,7 @@ export function ChatSidebar({ currentChatId }: ChatSidebarProps) {
<Link <Link
href="/c/new" href="/c/new"
mask={{ to: "/" }} mask={{ to: "/" }}
className={`flex w-full items-center gap-3 rounded-lg px-2 py-2 text-left text-sm text-neutral-700 hover:bg-neutral-100 dark:hover:bg-neutral-800 dark:text-neutral-100 ${ className={`flex w-full items-center gap-3 rounded-lg px-2 py-2 text-left text-sm text-neutral-700 hover:bg-neutral-100 dark:hover:bg-neutral-800 dark:text-neutral-100 ${currentChatId === "new" ? "bg-neutral-100 dark:bg-neutral-800" : ""
currentChatId === "new" ? "bg-neutral-100 dark:bg-neutral-800" : ""
}`} }`}
draggable={false} draggable={false}
> >
@@ -283,6 +283,23 @@ export function ChatSidebar({ currentChatId }: ChatSidebarProps) {
</svg> </svg>
<span className="truncate">New Chat</span> <span className="truncate">New Chat</span>
</Link> </Link>
<Link
to="/c/$chatId"
params={{ chatId: "launch" }}
onClick={() => {
if (currentChatId !== "launch") {
sessionStorage.setItem(launchSidebarRequestedKey, "1");
}
}}
className={`flex w-full items-center gap-3 rounded-lg px-2 py-2 text-left text-sm text-neutral-700 hover:bg-neutral-100 dark:hover:bg-neutral-800 dark:text-neutral-100 cursor-pointer ${currentChatId === "launch"
? "bg-neutral-100 dark:bg-neutral-800"
: ""
}`}
draggable={false}
>
<RocketLaunchIcon className="h-5 w-5 stroke-current" />
<span className="truncate">Launch</span>
</Link>
{isWindows && ( {isWindows && (
<Link <Link
href="/settings" href="/settings"
@@ -304,8 +321,7 @@ export function ChatSidebar({ currentChatId }: ChatSidebarProps) {
{group.chats.map((chat) => ( {group.chats.map((chat) => (
<div <div
key={chat.id} key={chat.id}
className={`allow-context-menu flex items-center relative text-sm text-neutral-800 dark:text-neutral-400 rounded-lg hover:bg-neutral-100 dark:hover:bg-neutral-800 ${ className={`allow-context-menu flex items-center relative text-sm text-neutral-800 dark:text-neutral-400 rounded-lg hover:bg-neutral-100 dark:hover:bg-neutral-800 ${chat.id === currentChatId
chat.id === currentChatId
? "bg-neutral-100 text-black dark:bg-neutral-800" ? "bg-neutral-100 text-black dark:bg-neutral-800"
: "" : ""
}`} }`}

View File

@@ -10,6 +10,7 @@ interface CopyButtonProps {
showLabels?: boolean; showLabels?: boolean;
className?: string; className?: string;
title?: string; title?: string;
onCopy?: () => void;
} }
const CopyButton: React.FC<CopyButtonProps> = ({ const CopyButton: React.FC<CopyButtonProps> = ({
@@ -20,6 +21,7 @@ const CopyButton: React.FC<CopyButtonProps> = ({
showLabels = false, showLabels = false,
className = "", className = "",
title = "", title = "",
onCopy,
}) => { }) => {
const [isCopied, setIsCopied] = useState(false); const [isCopied, setIsCopied] = useState(false);
@@ -48,12 +50,14 @@ const CopyButton: React.FC<CopyButtonProps> = ({
} }
setIsCopied(true); setIsCopied(true);
onCopy?.();
setTimeout(() => setIsCopied(false), 2000); setTimeout(() => setIsCopied(false), 2000);
} catch (error) { } catch (error) {
console.error("Clipboard API failed, falling back to plain text", error); console.error("Clipboard API failed, falling back to plain text", error);
try { try {
await navigator.clipboard.writeText(content); await navigator.clipboard.writeText(content);
setIsCopied(true); setIsCopied(true);
onCopy?.();
setTimeout(() => setIsCopied(false), 2000); setTimeout(() => setIsCopied(false), 2000);
} catch (fallbackError) { } catch (fallbackError) {
console.error("Fallback copy also failed:", fallbackError); console.error("Fallback copy also failed:", fallbackError);

View File

@@ -0,0 +1,133 @@
import { useSettings } from "@/hooks/useSettings";
import CopyButton from "@/components/CopyButton";
interface LaunchCommand {
id: string;
name: string;
command: string;
description: string;
icon: string;
darkIcon?: string;
iconClassName?: string;
borderless?: boolean;
}
const LAUNCH_COMMANDS: LaunchCommand[] = [
{
id: "openclaw",
name: "OpenClaw",
command: "ollama launch openclaw",
description: "Personal AI with 100+ skills",
icon: "/launch-icons/openclaw.svg",
},
{
id: "claude",
name: "Claude",
command: "ollama launch claude",
description: "Anthropic's coding tool with subagents",
icon: "/launch-icons/claude.svg",
iconClassName: "h-7 w-7",
},
{
id: "codex",
name: "Codex",
command: "ollama launch codex",
description: "OpenAI's open-source coding agent",
icon: "/launch-icons/codex.svg",
darkIcon: "/launch-icons/codex-dark.svg",
iconClassName: "h-7 w-7",
},
{
id: "opencode",
name: "OpenCode",
command: "ollama launch opencode",
description: "Anomaly's open-source coding agent",
icon: "/launch-icons/opencode.svg",
iconClassName: "h-7 w-7 rounded",
},
{
id: "droid",
name: "Droid",
command: "ollama launch droid",
description: "Factory's coding agent across terminal and IDEs",
icon: "/launch-icons/droid.svg",
},
{
id: "pi",
name: "Pi",
command: "ollama launch pi",
description: "Minimal AI agent toolkit with plugin support",
icon: "/launch-icons/pi.svg",
darkIcon: "/launch-icons/pi-dark.svg",
iconClassName: "h-7 w-7",
},
];
export default function LaunchCommands() {
const isWindows = navigator.platform.toLowerCase().includes("win");
const { setSettings } = useSettings();
const renderCommandCard = (item: LaunchCommand) => (
<div key={item.command} className="w-full text-left">
<div className="flex items-start gap-4 sm:gap-5">
<div
aria-hidden="true"
className={`flex h-10 w-10 shrink-0 items-center justify-center rounded-lg overflow-hidden ${item.borderless ? "" : "border border-neutral-200 bg-white dark:border-neutral-700 dark:bg-neutral-900"}`}
>
{item.darkIcon ? (
<picture>
<source srcSet={item.darkIcon} media="(prefers-color-scheme: dark)" />
<img src={item.icon} alt="" className={`${item.iconClassName ?? "h-8 w-8"} rounded-sm`} />
</picture>
) : (
<img src={item.icon} alt="" className={item.borderless ? "h-full w-full rounded-xl" : `${item.iconClassName ?? "h-8 w-8"} rounded-sm`} />
)}
</div>
<div className="min-w-0 flex-1">
<span className="text-sm font-medium text-neutral-900 dark:text-neutral-100">
{item.name}
</span>
<p className="mt-0.5 text-xs text-neutral-500 dark:text-neutral-400">
{item.description}
</p>
<div className="mt-2 flex items-center gap-2 rounded-xl border-neutral-200 dark:border-neutral-700 bg-neutral-50 dark:bg-neutral-800 px-3 py-2">
<code className="min-w-0 flex-1 truncate text-xs text-neutral-600 dark:text-neutral-300">
{item.command}
</code>
<CopyButton
content={item.command}
size="md"
title="Copy command to clipboard"
className="text-neutral-500 dark:text-neutral-400 hover:text-neutral-700 dark:hover:text-neutral-200 hover:bg-neutral-200/60 dark:hover:bg-neutral-700/70"
onCopy={() => {
setSettings({ LastHomeView: item.id }).catch(() => { });
}}
/>
</div>
</div>
</div>
</div>
);
return (
<main className="flex h-screen w-full flex-col relative">
<section
className={`flex-1 overflow-y-auto overscroll-contain relative min-h-0 ${isWindows ? "xl:pt-4" : "xl:pt-8"}`}
>
<div className="max-w-[730px] mx-auto w-full px-4 pt-4 pb-20 sm:px-6 sm:pt-6 sm:pb-24 lg:px-8 lg:pt-8 lg:pb-28">
<h1 className="text-xl font-semibold text-neutral-900 dark:text-neutral-100">
Launch
</h1>
<p className="mt-1 text-sm text-neutral-500 dark:text-neutral-400">
Copy a command and run it in your terminal.
</p>
<div className="mt-6 grid gap-7">
{LAUNCH_COMMANDS.map(renderCommandCard)}
</div>
</div>
</section>
</main>
);
}

View File

@@ -536,7 +536,7 @@ function ToolCallDisplay({
let args: Record<string, unknown> | null = null; let args: Record<string, unknown> | null = null;
try { try {
args = JSON.parse(toolCall.function.arguments) as Record<string, unknown>; args = JSON.parse(toolCall.function.arguments) as Record<string, unknown>;
} catch (e) { } catch {
args = null; args = null;
} }
const query = args && typeof args.query === "string" ? args.query : ""; const query = args && typeof args.query === "string" ? args.query : "";
@@ -562,7 +562,7 @@ function ToolCallDisplay({
let args: Record<string, unknown> | null = null; let args: Record<string, unknown> | null = null;
try { try {
args = JSON.parse(toolCall.function.arguments) as Record<string, unknown>; args = JSON.parse(toolCall.function.arguments) as Record<string, unknown>;
} catch (e) { } catch {
args = null; args = null;
} }
const url = args && typeof args.url === "string" ? args.url : ""; const url = args && typeof args.url === "string" ? args.url : "";

View File

@@ -73,7 +73,7 @@ export default function MessageList({
? String(args.url).trim() ? String(args.url).trim()
: ""; : "";
if (candidate) lastQuery = candidate; if (candidate) lastQuery = candidate;
} catch {} } catch { /* ignored */ }
} }
} }
} }

View File

@@ -8,7 +8,7 @@ import {
} from "react"; } from "react";
import { Model } from "@/gotypes"; import { Model } from "@/gotypes";
import { useSelectedModel } from "@/hooks/useSelectedModel"; import { useSelectedModel } from "@/hooks/useSelectedModel";
import { useSettings } from "@/hooks/useSettings"; import { useCloudStatus } from "@/hooks/useCloudStatus";
import { useQueryClient } from "@tanstack/react-query"; import { useQueryClient } from "@tanstack/react-query";
import { getModelUpstreamInfo } from "@/api"; import { getModelUpstreamInfo } from "@/api";
import { ArrowDownTrayIcon } from "@heroicons/react/24/outline"; import { ArrowDownTrayIcon } from "@heroicons/react/24/outline";
@@ -34,7 +34,7 @@ export const ModelPicker = forwardRef<
chatId, chatId,
searchQuery, searchQuery,
); );
const { settings } = useSettings(); const { cloudDisabled } = useCloudStatus();
const dropdownRef = useRef<HTMLDivElement>(null); const dropdownRef = useRef<HTMLDivElement>(null);
const searchInputRef = useRef<HTMLInputElement>(null); const searchInputRef = useRef<HTMLInputElement>(null);
const queryClient = useQueryClient(); const queryClient = useQueryClient();
@@ -61,24 +61,7 @@ export const ModelPicker = forwardRef<
try { try {
const upstreamInfo = await getModelUpstreamInfo(model); const upstreamInfo = await getModelUpstreamInfo(model);
// Compare local digest with upstream digest if (upstreamInfo.stale) {
let isStale =
model.digest &&
upstreamInfo.digest &&
model.digest !== upstreamInfo.digest;
// If the model has a modified time and upstream has a push time,
// check if the model was modified after the push time - if so, it's not stale
if (isStale && model.modified_at && upstreamInfo.pushTime > 0) {
const modifiedAtTime =
new Date(model.modified_at as string | number | Date).getTime() /
1000;
if (modifiedAtTime > upstreamInfo.pushTime) {
isStale = false;
}
}
if (isStale) {
const currentStaleModels = const currentStaleModels =
queryClient.getQueryData<Map<string, boolean>>(["staleModels"]) || queryClient.getQueryData<Map<string, boolean>>(["staleModels"]) ||
new Map(); new Map();
@@ -219,7 +202,7 @@ export const ModelPicker = forwardRef<
models={models} models={models}
selectedModel={selectedModel} selectedModel={selectedModel}
onModelSelect={handleModelSelect} onModelSelect={handleModelSelect}
airplaneMode={settings.airplaneMode} cloudDisabled={cloudDisabled}
isOpen={isOpen} isOpen={isOpen}
/> />
</div> </div>
@@ -233,13 +216,13 @@ export const ModelList = forwardRef(function ModelList(
models, models,
selectedModel, selectedModel,
onModelSelect, onModelSelect,
airplaneMode, cloudDisabled,
isOpen, isOpen,
}: { }: {
models: Model[]; models: Model[];
selectedModel: Model | null; selectedModel: Model | null;
onModelSelect: (model: Model) => void; onModelSelect: (model: Model) => void;
airplaneMode: boolean; cloudDisabled: boolean;
isOpen: boolean; isOpen: boolean;
}, },
ref, ref,
@@ -348,7 +331,7 @@ export const ModelList = forwardRef(function ModelList(
</svg> </svg>
)} )}
{model.digest === undefined && {model.digest === undefined &&
(airplaneMode || !model.isCloud()) && ( (cloudDisabled || !model.isCloud()) && (
<ArrowDownTrayIcon <ArrowDownTrayIcon
className="h-4 w-4 text-neutral-500 dark:text-neutral-400" className="h-4 w-4 text-neutral-500 dark:text-neutral-400"
strokeWidth={1.75} strokeWidth={1.75}

View File

@@ -11,15 +11,24 @@ import {
FolderIcon, FolderIcon,
BoltIcon, BoltIcon,
WrenchIcon, WrenchIcon,
CloudIcon,
XMarkIcon, XMarkIcon,
CogIcon, CogIcon,
ArrowLeftIcon, ArrowLeftIcon,
ArrowDownTrayIcon,
} from "@heroicons/react/20/solid"; } from "@heroicons/react/20/solid";
import { Settings as SettingsType } from "@/gotypes"; import { Settings as SettingsType } from "@/gotypes";
import { useNavigate } from "@tanstack/react-router"; import { useNavigate } from "@tanstack/react-router";
import { useUser } from "@/hooks/useUser"; import { useUser } from "@/hooks/useUser";
import { useCloudStatus } from "@/hooks/useCloudStatus";
import { useQuery, useMutation, useQueryClient } from "@tanstack/react-query"; import { useQuery, useMutation, useQueryClient } from "@tanstack/react-query";
import { getSettings, updateSettings } from "@/api"; import {
getSettings,
type CloudStatusResponse,
updateCloudSetting,
updateSettings,
getInferenceCompute,
} from "@/api";
function AnimatedDots() { function AnimatedDots() {
return ( return (
@@ -53,6 +62,11 @@ export default function Settings() {
const [connectionError, setConnectionError] = useState<string | null>(null); const [connectionError, setConnectionError] = useState<string | null>(null);
const [pollingInterval, setPollingInterval] = useState<number | null>(null); const [pollingInterval, setPollingInterval] = useState<number | null>(null);
const navigate = useNavigate(); const navigate = useNavigate();
const {
cloudDisabled,
cloudStatus,
isLoading: cloudStatusLoading,
} = useCloudStatus();
const { const {
data: settingsData, data: settingsData,
@@ -65,6 +79,13 @@ export default function Settings() {
const settings = settingsData?.settings || null; const settings = settingsData?.settings || null;
const { data: inferenceComputeResponse } = useQuery({
queryKey: ["inferenceCompute"],
queryFn: getInferenceCompute,
});
const defaultContextLength = inferenceComputeResponse?.defaultContextLength;
const updateSettingsMutation = useMutation({ const updateSettingsMutation = useMutation({
mutationFn: updateSettings, mutationFn: updateSettings,
onSuccess: () => { onSuccess: () => {
@@ -74,6 +95,50 @@ export default function Settings() {
}, },
}); });
const updateCloudMutation = useMutation({
mutationFn: (enabled: boolean) => updateCloudSetting(enabled),
onMutate: async (enabled: boolean) => {
await queryClient.cancelQueries({ queryKey: ["cloudStatus"] });
const previous = queryClient.getQueryData<CloudStatusResponse | null>([
"cloudStatus",
]);
const envForcesDisabled =
previous?.source === "env" || previous?.source === "both";
queryClient.setQueryData<CloudStatusResponse | null>(
["cloudStatus"],
previous
? {
...previous,
disabled: !enabled || envForcesDisabled,
}
: {
disabled: !enabled,
source: "config",
},
);
return { previous };
},
onError: (_error, _enabled, context) => {
if (context?.previous !== undefined) {
queryClient.setQueryData(["cloudStatus"], context.previous);
}
},
onSuccess: (status) => {
queryClient.setQueryData<CloudStatusResponse | null>(
["cloudStatus"],
status,
);
queryClient.invalidateQueries({ queryKey: ["models"] });
queryClient.invalidateQueries({ queryKey: ["cloudStatus"] });
setShowSaved(true);
setTimeout(() => setShowSaved(false), 1500);
},
});
useEffect(() => { useEffect(() => {
refetchUser(); refetchUser();
}, []); // eslint-disable-line react-hooks/exhaustive-deps }, []); // eslint-disable-line react-hooks/exhaustive-deps
@@ -148,13 +213,18 @@ export default function Settings() {
Models: "", Models: "",
Agent: false, Agent: false,
Tools: false, Tools: false,
ContextLength: 4096, ContextLength: 0,
AirplaneMode: false, AutoUpdateEnabled: true,
}); });
updateSettingsMutation.mutate(defaultSettings); updateSettingsMutation.mutate(defaultSettings);
} }
}; };
const cloudOverriddenByEnv =
cloudStatus?.source === "env" || cloudStatus?.source === "both";
const cloudToggleDisabled =
cloudStatusLoading || updateCloudMutation.isPending || cloudOverriddenByEnv;
const handleConnectOllamaAccount = async () => { const handleConnectOllamaAccount = async () => {
setConnectionError(null); setConnectionError(null);
@@ -203,6 +273,10 @@ export default function Settings() {
} }
const isWindows = navigator.platform.toLowerCase().includes("win"); const isWindows = navigator.platform.toLowerCase().includes("win");
const handleCloseSettings = () => {
const chatId = settings.LastHomeView === "chat" ? "new" : "launch";
navigate({ to: "/c/$chatId", params: { chatId } });
};
return ( return (
<main className="flex h-screen w-full flex-col select-none dark:bg-neutral-900"> <main className="flex h-screen w-full flex-col select-none dark:bg-neutral-900">
@@ -216,7 +290,7 @@ export default function Settings() {
> >
{isWindows && ( {isWindows && (
<button <button
onClick={() => navigate({ to: "/" })} onClick={handleCloseSettings}
className="hover:bg-neutral-100 mr-3 dark:hover:bg-neutral-800 rounded-full p-1.5" className="hover:bg-neutral-100 mr-3 dark:hover:bg-neutral-800 rounded-full p-1.5"
> >
<ArrowLeftIcon className="w-5 h-5 dark:text-white" /> <ArrowLeftIcon className="w-5 h-5 dark:text-white" />
@@ -226,7 +300,7 @@ export default function Settings() {
</h1> </h1>
{!isWindows && ( {!isWindows && (
<button <button
onClick={() => navigate({ to: "/" })} onClick={handleCloseSettings}
className="p-1 hover:bg-neutral-100 mr-3 dark:hover:bg-neutral-800 rounded-full" className="p-1 hover:bg-neutral-100 mr-3 dark:hover:bg-neutral-800 rounded-full"
> >
<XMarkIcon className="w-6 h-6 dark:text-white" /> <XMarkIcon className="w-6 h-6 dark:text-white" />
@@ -237,7 +311,7 @@ export default function Settings() {
<div className="space-y-4 max-w-2xl mx-auto"> <div className="space-y-4 max-w-2xl mx-auto">
{/* Connect Ollama Account */} {/* Connect Ollama Account */}
<div className="overflow-hidden rounded-xl bg-white dark:bg-neutral-800"> <div className="overflow-hidden rounded-xl bg-white dark:bg-neutral-800">
<div className="p-4 border-b border-neutral-200 dark:border-neutral-800"> <div className="p-4">
<Field> <Field>
{isLoading ? ( {isLoading ? (
// Loading skeleton, this will only happen if the app started recently // Loading skeleton, this will only happen if the app started recently
@@ -344,6 +418,57 @@ export default function Settings() {
{/* Local Configuration */} {/* Local Configuration */}
<div className="relative overflow-hidden rounded-xl bg-white dark:bg-neutral-800"> <div className="relative overflow-hidden rounded-xl bg-white dark:bg-neutral-800">
<div className="space-y-4 p-4"> <div className="space-y-4 p-4">
<Field>
<div className="flex items-start justify-between gap-4">
<div className="flex items-start space-x-3 flex-1">
<CloudIcon className="mt-1 h-5 w-5 flex-shrink-0 text-black dark:text-neutral-100" />
<div>
<Label>Cloud</Label>
<Description>
{cloudOverriddenByEnv
? "The OLLAMA_NO_CLOUD environment variable is currently forcing cloud off."
: "Enable cloud models and web search."}
</Description>
</div>
</div>
<div className="flex-shrink-0">
<Switch
checked={!cloudDisabled}
disabled={cloudToggleDisabled}
onChange={(checked) => {
if (cloudOverriddenByEnv) {
return;
}
updateCloudMutation.mutate(checked);
}}
/>
</div>
</div>
</Field>
{/* Auto Update */}
<Field>
<div className="flex items-start justify-between gap-4">
<div className="flex items-start space-x-3 flex-1">
<ArrowDownTrayIcon className="mt-1 h-5 w-5 flex-shrink-0 text-black dark:text-neutral-100" />
<div>
<Label>Auto-download updates</Label>
<Description>
{settings.AutoUpdateEnabled
? "Automatically download updates when available."
: "Updates will not be downloaded automatically."}
</Description>
</div>
</div>
<div className="flex-shrink-0">
<Switch
checked={settings.AutoUpdateEnabled}
onChange={(checked) => handleChange("AutoUpdateEnabled", checked)}
/>
</div>
</div>
</Field>
{/* Expose Ollama */} {/* Expose Ollama */}
<Field> <Field>
<div className="flex items-start justify-between gap-4"> <div className="flex items-start justify-between gap-4">
@@ -419,13 +544,11 @@ export default function Settings() {
</Description> </Description>
<div className="mt-3"> <div className="mt-3">
<Slider <Slider
value={(() => { value={settings.ContextLength || defaultContextLength || 0}
// Otherwise use the settings value
return settings.ContextLength || 4096;
})()}
onChange={(value) => { onChange={(value) => {
handleChange("ContextLength", value); handleChange("ContextLength", value);
}} }}
disabled={!defaultContextLength}
options={[ options={[
{ value: 4096, label: "4k" }, { value: 4096, label: "4k" },
{ value: 8192, label: "8k" }, { value: 8192, label: "8k" },
@@ -440,35 +563,6 @@ export default function Settings() {
</div> </div>
</div> </div>
</Field> </Field>
{/* Airplane Mode */}
<Field>
<div className="flex items-start justify-between gap-4">
<div className="flex items-start space-x-3 flex-1">
<svg
className="mt-1 h-5 w-5 flex-shrink-0 text-black dark:text-neutral-100"
viewBox="0 0 21.5508 17.9033"
fill="currentColor"
>
<path d="M21.5508 8.94727C21.542 7.91895 20.1445 7.17188 18.4658 7.17188L14.9238 7.17188C14.4316 7.17188 14.2471 7.09277 13.957 6.75879L8.05078 0.316406C7.86621 0.105469 7.6377 0 7.37402 0L6.35449 0C6.12598 0 5.99414 0.202148 6.1084 0.448242L9.14941 7.17188L4.68457 7.68164L3.09375 4.76367C2.97949 4.54395 2.78613 4.44727 2.49609 4.44727L2.11816 4.44727C1.88965 4.44727 1.74023 4.59668 1.74023 4.8252L1.74023 13.0693C1.74023 13.2979 1.88965 13.4385 2.11816 13.4385L2.49609 13.4385C2.78613 13.4385 2.97949 13.3418 3.09375 13.1309L4.68457 10.2129L9.14941 10.7227L6.1084 17.4463C5.99414 17.6836 6.12598 17.8945 6.35449 17.8945L7.37402 17.8945C7.6377 17.8945 7.86621 17.7803 8.05078 17.5781L13.957 11.127C14.2471 10.8018 14.4316 10.7227 14.9238 10.7227L18.4658 10.7227C20.1445 10.7227 21.542 9.9668 21.5508 8.94727Z" />
</svg>
<div>
<Label>Airplane mode</Label>
<Description>
Airplane mode keeps data local, disabling cloud models
and web search.
</Description>
</div>
</div>
<div className="flex-shrink-0">
<Switch
checked={settings.AirplaneMode}
onChange={(checked) =>
handleChange("AirplaneMode", checked)
}
/>
</div>
</div>
</Field>
</div> </div>
</div> </div>

View File

@@ -65,7 +65,7 @@ export const BadgeButton = forwardRef(function BadgeButton(
), ),
ref: React.ForwardedRef<HTMLElement>, ref: React.ForwardedRef<HTMLElement>,
) { ) {
let classes = clsx( const classes = clsx(
className, className,
"group relative inline-flex rounded-md focus:not-data-focus:outline-hidden data-focus:outline-2 data-focus:outline-offset-2 data-focus:outline-blue-500", "group relative inline-flex rounded-md focus:not-data-focus:outline-hidden data-focus:outline-2 data-focus:outline-offset-2 data-focus:outline-blue-500",
); );

View File

@@ -171,7 +171,7 @@ export const Button = forwardRef(function Button(
{ color, outline, plain, className, children, ...props }: ButtonProps, { color, outline, plain, className, children, ...props }: ButtonProps,
ref: React.ForwardedRef<HTMLElement>, ref: React.ForwardedRef<HTMLElement>,
) { ) {
let classes = clsx( const classes = clsx(
className, className,
styles.base, styles.base,
outline outline

View File

@@ -6,10 +6,11 @@ export interface SliderProps {
value?: number; value?: number;
onChange?: (value: number) => void; onChange?: (value: number) => void;
className?: string; className?: string;
disabled?: boolean;
} }
const Slider = React.forwardRef<HTMLDivElement, SliderProps>( const Slider = React.forwardRef<HTMLDivElement, SliderProps>(
({ label, options, value = 0, onChange }, ref) => { ({ label, options, value = 0, onChange, disabled = false }, ref) => {
const [selectedValue, setSelectedValue] = React.useState(value); const [selectedValue, setSelectedValue] = React.useState(value);
const [isDragging, setIsDragging] = React.useState(false); const [isDragging, setIsDragging] = React.useState(false);
const containerRef = React.useRef<HTMLDivElement>(null); const containerRef = React.useRef<HTMLDivElement>(null);
@@ -20,6 +21,7 @@ const Slider = React.forwardRef<HTMLDivElement, SliderProps>(
}, [value]); }, [value]);
const handleClick = (optionValue: number) => { const handleClick = (optionValue: number) => {
if (disabled) return;
setSelectedValue(optionValue); setSelectedValue(optionValue);
onChange?.(optionValue); onChange?.(optionValue);
}; };
@@ -39,6 +41,7 @@ const Slider = React.forwardRef<HTMLDivElement, SliderProps>(
}; };
const handleMouseDown = (e: React.MouseEvent) => { const handleMouseDown = (e: React.MouseEvent) => {
if (disabled) return;
setIsDragging(true); setIsDragging(true);
e.preventDefault(); e.preventDefault();
}; };
@@ -77,7 +80,7 @@ const Slider = React.forwardRef<HTMLDivElement, SliderProps>(
} }
return ( return (
<div className="space-y-2" ref={ref}> <div className={`space-y-2 ${disabled ? "opacity-50" : ""}`} ref={ref}>
{label && <label className="text-sm font-medium">{label}</label>} {label && <label className="text-sm font-medium">{label}</label>}
<div className="relative"> <div className="relative">
<div className="absolute top-[9px] left-2 right-2 h-1 bg-neutral-200 dark:bg-neutral-700 pointer-events-none rounded-full" /> <div className="absolute top-[9px] left-2 right-2 h-1 bg-neutral-200 dark:bg-neutral-700 pointer-events-none rounded-full" />
@@ -88,10 +91,11 @@ const Slider = React.forwardRef<HTMLDivElement, SliderProps>(
<button <button
onClick={() => handleClick(option.value)} onClick={() => handleClick(option.value)}
onMouseDown={handleMouseDown} onMouseDown={handleMouseDown}
className="relative px-3 py-6 -mx-3 -my-6 z-10 cursor-pointer" disabled={disabled}
className={`relative px-3 py-6 -mx-3 -my-6 z-10 ${disabled ? "cursor-not-allowed" : "cursor-pointer"}`}
> >
<div className="relative w-5 h-5 flex items-center justify-center"> <div className="relative w-5 h-5 flex items-center justify-center">
{selectedValue === option.value && ( {selectedValue === option.value && !disabled && (
<div className="w-4 h-4 bg-white dark:bg-white border border-neutral-400 dark:border-neutral-500 rounded-full cursor-grab active:cursor-grabbing" /> <div className="w-4 h-4 bg-white dark:bg-white border border-neutral-400 dark:border-neutral-500 rounded-full cursor-grab active:cursor-grabbing" />
)} )}
</div> </div>

View File

@@ -6,8 +6,8 @@ import { useSelectedModel } from "./useSelectedModel";
import { createQueryBatcher } from "./useQueryBatcher"; import { createQueryBatcher } from "./useQueryBatcher";
import { useRefetchModels } from "./useModels"; import { useRefetchModels } from "./useModels";
import { useStreamingContext } from "@/contexts/StreamingContext"; import { useStreamingContext } from "@/contexts/StreamingContext";
import { useSettings } from "./useSettings";
import { getModelCapabilities } from "@/api"; import { getModelCapabilities } from "@/api";
import { useCloudStatus } from "./useCloudStatus";
export const useChats = () => { export const useChats = () => {
return useQuery({ return useQuery({
@@ -116,11 +116,9 @@ export const useIsModelStale = (modelName: string) => {
export const useShouldShowStaleDisplay = (model: Model | null) => { export const useShouldShowStaleDisplay = (model: Model | null) => {
const isStale = useIsModelStale(model?.model || ""); const isStale = useIsModelStale(model?.model || "");
const { data: dismissedModels } = useDismissedStaleModels(); const { data: dismissedModels } = useDismissedStaleModels();
const { const { cloudDisabled } = useCloudStatus();
settings: { airplaneMode },
} = useSettings();
if (model?.isCloud() && !airplaneMode) { if (model?.isCloud() && !cloudDisabled) {
return false; return false;
} }

View File

@@ -0,0 +1,20 @@
import { useQuery } from "@tanstack/react-query";
import { getCloudStatus, type CloudStatusResponse } from "@/api";
export function useCloudStatus() {
const cloudQuery = useQuery<CloudStatusResponse | null>({
queryKey: ["cloudStatus"],
queryFn: getCloudStatus,
retry: false,
staleTime: 60 * 1000,
});
return {
cloudStatus: cloudQuery.data,
cloudDisabled: cloudQuery.data?.disabled ?? false,
isKnown: cloudQuery.data !== null && cloudQuery.data !== undefined,
isLoading: cloudQuery.isLoading,
isError: cloudQuery.isError,
error: cloudQuery.error,
};
}

View File

@@ -20,3 +20,8 @@ export function useHasVisionCapability(modelName: string | undefined) {
const { data: capabilitiesResponse } = useModelCapabilities(modelName); const { data: capabilitiesResponse } = useModelCapabilities(modelName);
return capabilitiesResponse?.capabilities?.includes("vision") ?? false; return capabilitiesResponse?.capabilities?.includes("vision") ?? false;
} }
export function useHasToolsCapability(modelName: string | undefined) {
const { data: capabilitiesResponse } = useModelCapabilities(modelName);
return capabilitiesResponse?.capabilities?.includes("tools") ?? false;
}

View File

@@ -2,11 +2,11 @@ import { useQuery } from "@tanstack/react-query";
import { Model } from "@/gotypes"; import { Model } from "@/gotypes";
import { getModels } from "@/api"; import { getModels } from "@/api";
import { mergeModels } from "@/utils/mergeModels"; import { mergeModels } from "@/utils/mergeModels";
import { useSettings } from "./useSettings";
import { useMemo } from "react"; import { useMemo } from "react";
import { useCloudStatus } from "./useCloudStatus";
export function useModels(searchQuery = "") { export function useModels(searchQuery = "") {
const { settings } = useSettings(); const { cloudDisabled } = useCloudStatus();
const localQuery = useQuery<Model[], Error>({ const localQuery = useQuery<Model[], Error>({
queryKey: ["models", searchQuery], queryKey: ["models", searchQuery],
queryFn: () => getModels(searchQuery), queryFn: () => getModels(searchQuery),
@@ -20,7 +20,7 @@ export function useModels(searchQuery = "") {
}); });
const allModels = useMemo(() => { const allModels = useMemo(() => {
const models = mergeModels(localQuery.data || [], settings.airplaneMode); const models = mergeModels(localQuery.data || [], cloudDisabled);
if (searchQuery && searchQuery.trim()) { if (searchQuery && searchQuery.trim()) {
const query = searchQuery.toLowerCase().trim(); const query = searchQuery.toLowerCase().trim();
@@ -40,7 +40,7 @@ export function useModels(searchQuery = "") {
} }
return models; return models;
}, [localQuery.data, searchQuery, settings.airplaneMode]); }, [localQuery.data, searchQuery, cloudDisabled]);
return { return {
...localQuery, ...localQuery,

View File

@@ -7,6 +7,7 @@ import { Model } from "@/gotypes";
import { FEATURED_MODELS } from "@/utils/mergeModels"; import { FEATURED_MODELS } from "@/utils/mergeModels";
import { getTotalVRAM } from "@/utils/vram.ts"; import { getTotalVRAM } from "@/utils/vram.ts";
import { getInferenceCompute } from "@/api"; import { getInferenceCompute } from "@/api";
import { useCloudStatus } from "./useCloudStatus";
export function recommendDefaultModel(totalVRAM: number): string { export function recommendDefaultModel(totalVRAM: number): string {
const vram = Math.max(0, Number(totalVRAM) || 0); const vram = Math.max(0, Number(totalVRAM) || 0);
@@ -22,16 +23,19 @@ export function recommendDefaultModel(totalVRAM: number): string {
export function useSelectedModel(currentChatId?: string, searchQuery?: string) { export function useSelectedModel(currentChatId?: string, searchQuery?: string) {
const { settings, setSettings } = useSettings(); const { settings, setSettings } = useSettings();
const { data: models = [], isLoading } = useModels(searchQuery || ""); const { data: models = [], isLoading } = useModels(searchQuery || "");
const { cloudDisabled } = useCloudStatus();
const { data: chatData, isLoading: isChatLoading } = useChat( const { data: chatData, isLoading: isChatLoading } = useChat(
currentChatId && currentChatId !== "new" ? currentChatId : "", currentChatId && currentChatId !== "new" ? currentChatId : "",
); );
const { data: inferenceComputes = [] } = useQuery({ const { data: inferenceComputeResponse } = useQuery({
queryKey: ["inference-compute"], queryKey: ["inferenceCompute"],
queryFn: getInferenceCompute, queryFn: getInferenceCompute,
enabled: !settings.selectedModel, // Only fetch if no model is selected enabled: !settings.selectedModel, // Only fetch if no model is selected
}); });
const inferenceComputes = inferenceComputeResponse?.inferenceComputes || [];
const totalVRAM = useMemo( const totalVRAM = useMemo(
() => getTotalVRAM(inferenceComputes), () => getTotalVRAM(inferenceComputes),
[inferenceComputes], [inferenceComputes],
@@ -46,12 +50,11 @@ export function useSelectedModel(currentChatId?: string, searchQuery?: string) {
const restoredChatRef = useRef<string | null>(null); const restoredChatRef = useRef<string | null>(null);
const selectedModel: Model | null = useMemo(() => { const selectedModel: Model | null = useMemo(() => {
// if airplane mode is on and selected model ends with cloud, // If cloud is disabled and selected model ends with cloud, switch to a local default.
// switch to recommended default model if (cloudDisabled && settings.selectedModel?.endsWith("cloud")) {
if (settings.airplaneMode && settings.selectedModel?.endsWith("cloud")) {
return ( return (
models.find((m) => m.model === recommendedModel) || models.find((m) => m.model === recommendedModel) ||
models.find((m) => m.isCloud) || models.find((m) => !m.isCloud()) ||
models.find((m) => m.digest === undefined || m.digest === "") || models.find((m) => m.digest === undefined || m.digest === "") ||
models[0] || models[0] ||
null null
@@ -68,7 +71,7 @@ export function useSelectedModel(currentChatId?: string, searchQuery?: string) {
"qwen3-coder:480b", "qwen3-coder:480b",
]; ];
const shouldMigrate = const shouldMigrate =
!settings.airplaneMode && !cloudDisabled &&
settings.turboEnabled && settings.turboEnabled &&
baseModelsToMigrate.includes(settings.selectedModel); baseModelsToMigrate.includes(settings.selectedModel);
@@ -96,13 +99,18 @@ export function useSelectedModel(currentChatId?: string, searchQuery?: string) {
})) || })) ||
null null
); );
}, [models, settings.selectedModel, settings.airplaneMode, recommendedModel]); }, [
models,
settings.selectedModel,
cloudDisabled,
recommendedModel,
]);
useEffect(() => { useEffect(() => {
if (!selectedModel) return; if (!selectedModel) return;
if ( if (
settings.airplaneMode && cloudDisabled &&
settings.selectedModel?.endsWith("cloud") && settings.selectedModel?.endsWith("cloud") &&
selectedModel.model !== settings.selectedModel selectedModel.model !== settings.selectedModel
) { ) {
@@ -110,13 +118,17 @@ export function useSelectedModel(currentChatId?: string, searchQuery?: string) {
} }
if ( if (
!settings.airplaneMode && !cloudDisabled &&
settings.turboEnabled && settings.turboEnabled &&
selectedModel.model !== settings.selectedModel selectedModel.model !== settings.selectedModel
) { ) {
setSettings({ SelectedModel: selectedModel.model, TurboEnabled: false }); setSettings({ SelectedModel: selectedModel.model, TurboEnabled: false });
} }
}, [selectedModel, settings.airplaneMode, settings.selectedModel]); }, [
selectedModel,
cloudDisabled,
settings.selectedModel,
]);
// Set model from chat history when chat data loads // Set model from chat history when chat data loads
useEffect(() => { useEffect(() => {
@@ -169,7 +181,9 @@ export function useSelectedModel(currentChatId?: string, searchQuery?: string) {
const defaultModel = const defaultModel =
models.find((m) => m.model === recommendedModel) || models.find((m) => m.model === recommendedModel) ||
models.find((m) => m.isCloud()) || (cloudDisabled
? models.find((m) => !m.isCloud())
: models.find((m) => m.isCloud())) ||
models.find((m) => m.digest === undefined || m.digest === "") || models.find((m) => m.digest === undefined || m.digest === "") ||
models[0]; models[0];
@@ -181,6 +195,7 @@ export function useSelectedModel(currentChatId?: string, searchQuery?: string) {
inferenceComputes.length, inferenceComputes.length,
models.length, models.length,
settings.selectedModel, settings.selectedModel,
cloudDisabled,
]); ]);
// Add the selected model to the models list if it's not already there // Add the selected model to the models list if it's not already there

View File

@@ -9,7 +9,7 @@ interface SettingsState {
webSearchEnabled: boolean; webSearchEnabled: boolean;
selectedModel: string; selectedModel: string;
sidebarOpen: boolean; sidebarOpen: boolean;
airplaneMode: boolean; lastHomeView: string;
thinkEnabled: boolean; thinkEnabled: boolean;
thinkLevel: string; thinkLevel: string;
} }
@@ -22,6 +22,7 @@ type SettingsUpdate = Partial<{
ThinkLevel: string; ThinkLevel: string;
SelectedModel: string; SelectedModel: string;
SidebarOpen: boolean; SidebarOpen: boolean;
LastHomeView: string;
}>; }>;
export function useSettings() { export function useSettings() {
@@ -51,7 +52,7 @@ export function useSettings() {
thinkLevel: settingsData?.settings?.ThinkLevel ?? "none", thinkLevel: settingsData?.settings?.ThinkLevel ?? "none",
selectedModel: settingsData?.settings?.SelectedModel ?? "", selectedModel: settingsData?.settings?.SelectedModel ?? "",
sidebarOpen: settingsData?.settings?.SidebarOpen ?? false, sidebarOpen: settingsData?.settings?.SidebarOpen ?? false,
airplaneMode: settingsData?.settings?.AirplaneMode ?? false, lastHomeView: settingsData?.settings?.LastHomeView ?? "launch",
}), }),
[settingsData?.settings], [settingsData?.settings],
); );

View File

@@ -2,6 +2,7 @@ import type { QueryClient } from "@tanstack/react-query";
import { createRootRouteWithContext, Outlet } from "@tanstack/react-router"; import { createRootRouteWithContext, Outlet } from "@tanstack/react-router";
import { getSettings } from "@/api"; import { getSettings } from "@/api";
import { useQuery } from "@tanstack/react-query"; import { useQuery } from "@tanstack/react-query";
import { useCloudStatus } from "@/hooks/useCloudStatus";
function RootComponent() { function RootComponent() {
// This hook ensures settings are fetched on app startup // This hook ensures settings are fetched on app startup
@@ -9,6 +10,8 @@ function RootComponent() {
queryKey: ["settings"], queryKey: ["settings"],
queryFn: getSettings, queryFn: getSettings,
}); });
// Fetch cloud status on startup (best-effort)
useCloudStatus();
return ( return (
<div> <div>

View File

@@ -4,12 +4,37 @@ import Chat from "@/components/Chat";
import { getChat } from "@/api"; import { getChat } from "@/api";
import { SidebarLayout } from "@/components/layout/layout"; import { SidebarLayout } from "@/components/layout/layout";
import { ChatSidebar } from "@/components/ChatSidebar"; import { ChatSidebar } from "@/components/ChatSidebar";
import LaunchCommands from "@/components/LaunchCommands";
import { useEffect, useRef } from "react";
import { useSettings } from "@/hooks/useSettings";
const launchSidebarRequestedKey = "ollama.launchSidebarRequested";
const launchSidebarSeenKey = "ollama.launchSidebarSeen";
const fallbackSessionState = new Map<string, string>();
function getSessionState() {
if (typeof sessionStorage !== "undefined") {
return sessionStorage;
}
return {
getItem(key: string) {
return fallbackSessionState.get(key) ?? null;
},
setItem(key: string, value: string) {
fallbackSessionState.set(key, value);
},
removeItem(key: string) {
fallbackSessionState.delete(key);
},
};
}
export const Route = createFileRoute("/c/$chatId")({ export const Route = createFileRoute("/c/$chatId")({
component: RouteComponent, component: RouteComponent,
loader: async ({ context, params }) => { loader: async ({ context, params }) => {
// Skip loading for "new" chat // Skip loading for special non-chat views
if (params.chatId !== "new") { if (params.chatId !== "new" && params.chatId !== "launch") {
context.queryClient.ensureQueryData({ context.queryClient.ensureQueryData({
queryKey: ["chat", params.chatId], queryKey: ["chat", params.chatId],
queryFn: () => getChat(params.chatId), queryFn: () => getChat(params.chatId),
@@ -21,13 +46,70 @@ export const Route = createFileRoute("/c/$chatId")({
function RouteComponent() { function RouteComponent() {
const { chatId } = Route.useParams(); const { chatId } = Route.useParams();
const { settingsData, setSettings } = useSettings();
const previousChatIdRef = useRef<string | null>(null);
// Always call hooks at the top level - use a flag to skip data when chatId is "new" // Always call hooks at the top level - use a flag to skip data when chatId is a special view
const { const {
data: chatData, data: chatData,
isLoading: chatLoading, isLoading: chatLoading,
error: chatError, error: chatError,
} = useChat(chatId === "new" ? "" : chatId); } = useChat(chatId === "new" || chatId === "launch" ? "" : chatId);
useEffect(() => {
if (!settingsData) {
return;
}
const previousChatId = previousChatIdRef.current;
previousChatIdRef.current = chatId;
if (chatId === "launch") {
const sessionState = getSessionState();
const shouldOpenSidebar =
previousChatId !== "launch" &&
(() => {
if (sessionState.getItem(launchSidebarRequestedKey) === "1") {
sessionState.removeItem(launchSidebarRequestedKey);
sessionState.setItem(launchSidebarSeenKey, "1");
return true;
}
if (sessionState.getItem(launchSidebarSeenKey) !== "1") {
sessionState.setItem(launchSidebarSeenKey, "1");
return true;
}
return false;
})();
const updates: { LastHomeView?: string; SidebarOpen?: boolean } = {};
if (settingsData.LastHomeView !== "launch") {
updates.LastHomeView = "launch";
}
if (shouldOpenSidebar && !settingsData.SidebarOpen) {
updates.SidebarOpen = true;
}
if (Object.keys(updates).length === 0) {
return;
}
setSettings(updates).catch(() => {
// Best effort persistence for home view preference.
});
return;
}
if (settingsData.LastHomeView === "chat") {
return;
}
setSettings({ LastHomeView: "chat" }).catch(() => {
// Best effort persistence for home view preference.
});
}, [chatId, settingsData, setSettings]);
// Handle "new" chat case - just use Chat component which handles everything // Handle "new" chat case - just use Chat component which handles everything
if (chatId === "new") { if (chatId === "new") {
@@ -38,6 +120,14 @@ function RouteComponent() {
); );
} }
if (chatId === "launch") {
return (
<SidebarLayout sidebar={<ChatSidebar currentChatId={chatId} />}>
<LaunchCommands />
</SidebarLayout>
);
}
// Handle existing chat case // Handle existing chat case
if (chatLoading) { if (chatLoading) {
return ( return (

View File

@@ -1,10 +1,18 @@
import { createFileRoute, redirect } from "@tanstack/react-router"; import { createFileRoute, redirect } from "@tanstack/react-router";
import { getSettings } from "@/api";
export const Route = createFileRoute("/")({ export const Route = createFileRoute("/")({
beforeLoad: () => { beforeLoad: async ({ context }) => {
const settingsData = await context.queryClient.ensureQueryData({
queryKey: ["settings"],
queryFn: getSettings,
});
const chatId =
settingsData?.settings?.LastHomeView === "chat" ? "new" : "launch";
throw redirect({ throw redirect({
to: "/c/$chatId", to: "/c/$chatId",
params: { chatId: "new" }, params: { chatId },
mask: { mask: {
to: "/", to: "/",
}, },

View File

@@ -0,0 +1,57 @@
import { describe, expect, it, vi, beforeEach } from "vitest";
import { copyTextToClipboard } from "./clipboard";
describe("copyTextToClipboard", () => {
beforeEach(() => {
vi.restoreAllMocks();
});
it("copies via Clipboard API when available", async () => {
const writeText = vi.fn().mockResolvedValue(undefined);
vi.stubGlobal("navigator", {
clipboard: {
writeText,
},
});
const copied = await copyTextToClipboard("ollama launch claude");
expect(copied).toBe(true);
expect(writeText).toHaveBeenCalledWith("ollama launch claude");
});
it("falls back to execCommand when Clipboard API fails", async () => {
const writeText = vi.fn().mockRejectedValue(new Error("not allowed"));
vi.stubGlobal("navigator", {
clipboard: {
writeText,
},
});
const textarea = {
value: "",
setAttribute: vi.fn(),
style: {} as Record<string, string>,
focus: vi.fn(),
select: vi.fn(),
};
const appendChild = vi.fn();
const removeChild = vi.fn();
const execCommand = vi.fn().mockReturnValue(true);
vi.stubGlobal("document", {
createElement: vi.fn().mockReturnValue(textarea),
body: {
appendChild,
removeChild,
},
execCommand,
});
const copied = await copyTextToClipboard("ollama launch openclaw");
expect(copied).toBe(true);
expect(execCommand).toHaveBeenCalledWith("copy");
expect(appendChild).toHaveBeenCalled();
expect(removeChild).toHaveBeenCalled();
});
});

View File

@@ -0,0 +1,30 @@
export async function copyTextToClipboard(text: string): Promise<boolean> {
try {
await navigator.clipboard.writeText(text);
return true;
} catch (clipboardError) {
console.error(
"Clipboard API failed, falling back to execCommand",
clipboardError,
);
}
try {
const textarea = document.createElement("textarea");
textarea.value = text;
textarea.setAttribute("readonly", "true");
textarea.style.position = "fixed";
textarea.style.left = "-9999px";
textarea.style.opacity = "0";
document.body.appendChild(textarea);
textarea.focus();
textarea.select();
const copied = document.execCommand("copy");
document.body.removeChild(textarea);
return copied;
} catch (fallbackError) {
console.error("Fallback copy failed", fallbackError);
return false;
}
}

View File

@@ -29,13 +29,15 @@ describe("fileValidation", () => {
expect(result.valid).toBe(true); expect(result.valid).toBe(true);
}); });
it("should reject WebP images when vision capability is disabled", () => { it("should accept images regardless of vision capability", () => {
// Vision capability check is handled at the UI layer (ChatForm),
// not at validation time, so users can switch models without
// needing to re-upload files.
const file = createMockFile("test.webp", 1024, "image/webp"); const file = createMockFile("test.webp", 1024, "image/webp");
const result = validateFile(file, { const result = validateFile(file, {
hasVisionCapability: false, hasVisionCapability: false,
}); });
expect(result.valid).toBe(false); expect(result.valid).toBe(true);
expect(result.error).toBe("This model does not support images");
}); });
it("should accept PNG images when vision capability is enabled", () => { it("should accept PNG images when vision capability is enabled", () => {

View File

@@ -63,7 +63,6 @@ export function validateFile(
const { const {
maxFileSize = 10, maxFileSize = 10,
allowedExtensions = [...TEXT_FILE_EXTENSIONS, ...IMAGE_EXTENSIONS], allowedExtensions = [...TEXT_FILE_EXTENSIONS, ...IMAGE_EXTENSIONS],
hasVisionCapability = false,
customValidator, customValidator,
} = options; } = options;
@@ -83,10 +82,6 @@ export function validateFile(
return { valid: false, error: "File type not supported" }; return { valid: false, error: "File type not supported" };
} }
if (IMAGE_EXTENSIONS.includes(fileExtension) && !hasVisionCapability) {
return { valid: false, error: "This model does not support images" };
}
// File size validation // File size validation
if (file.size > MAX_FILE_SIZE) { if (file.size > MAX_FILE_SIZE) {
return { valid: false, error: "File too large" }; return { valid: false, error: "File too large" };

View File

@@ -41,14 +41,14 @@ describe("Model merging logic", () => {
expect(merged.length).toBe(FEATURED_MODELS.length + 2); expect(merged.length).toBe(FEATURED_MODELS.length + 2);
}); });
it("should hide cloud models in airplane mode", () => { it("should hide cloud models when cloud is disabled", () => {
const localModels: Model[] = [ const localModels: Model[] = [
new Model({ model: "gpt-oss:120b-cloud" }), new Model({ model: "gpt-oss:120b-cloud" }),
new Model({ model: "llama3:latest" }), new Model({ model: "llama3:latest" }),
new Model({ model: "mistral:latest" }), new Model({ model: "mistral:latest" }),
]; ];
const merged = mergeModels(localModels, true); // airplane mode = true const merged = mergeModels(localModels, true); // cloud disabled = true
// No cloud models should be present // No cloud models should be present
const cloudModels = merged.filter((m) => m.isCloud()); const cloudModels = merged.filter((m) => m.isCloud());

View File

@@ -2,27 +2,28 @@ import { Model } from "@/gotypes";
// Featured models list (in priority order) // Featured models list (in priority order)
export const FEATURED_MODELS = [ export const FEATURED_MODELS = [
"kimi-k2.5:cloud",
"glm-5:cloud",
"minimax-m2.7:cloud",
"gemma4:31b-cloud",
"qwen3.5:397b-cloud",
"gpt-oss:120b-cloud", "gpt-oss:120b-cloud",
"gpt-oss:20b-cloud", "gpt-oss:20b-cloud",
"deepseek-v3.1:671b-cloud", "deepseek-v3.1:671b-cloud",
"qwen3-coder:480b-cloud",
"qwen3-vl:235b-cloud",
"minimax-m2:cloud",
"glm-4.6:cloud",
"gpt-oss:120b", "gpt-oss:120b",
"gpt-oss:20b", "gpt-oss:20b",
"gemma3:27b", "gemma4:31b",
"gemma3:12b", "gemma4:26b",
"gemma3:4b", "gemma4:e4b",
"gemma3:1b", "gemma4:e2b",
"deepseek-r1:8b", "deepseek-r1:8b",
"qwen3-coder:30b", "qwen3-coder:30b",
"qwen3-vl:30b", "qwen3-vl:30b",
"qwen3-vl:8b", "qwen3-vl:8b",
"qwen3-vl:4b", "qwen3-vl:4b",
"qwen3:30b", "qwen3.5:27b",
"qwen3:8b", "qwen3.5:9b",
"qwen3:4b", "qwen3.5:4b",
]; ];
function alphabeticalSort(a: Model, b: Model): number { function alphabeticalSort(a: Model, b: Model): number {
@@ -32,7 +33,7 @@ function alphabeticalSort(a: Model, b: Model): number {
//Merges models, sorting cloud models first, then other models //Merges models, sorting cloud models first, then other models
export function mergeModels( export function mergeModels(
localModels: Model[], localModels: Model[],
airplaneMode: boolean = false, hideCloudModels: boolean = false,
): Model[] { ): Model[] {
const allModels = (localModels || []).map((model) => model); const allModels = (localModels || []).map((model) => model);
@@ -95,7 +96,7 @@ export function mergeModels(
remainingModels.sort(alphabeticalSort); remainingModels.sort(alphabeticalSort);
return airplaneMode return hideCloudModels
? [...featuredModels, ...remainingModels] ? [...featuredModels, ...remainingModels]
: [...cloudModels, ...featuredModels, ...remainingModels]; : [...cloudModels, ...featuredModels, ...remainingModels];
} }

View File

@@ -46,6 +46,7 @@ type InferenceCompute struct {
type InferenceComputeResponse struct { type InferenceComputeResponse struct {
InferenceComputes []InferenceCompute `json:"inferenceComputes"` InferenceComputes []InferenceCompute `json:"inferenceComputes"`
DefaultContextLength int `json:"defaultContextLength"`
} }
type ModelCapabilitiesResponse struct { type ModelCapabilitiesResponse struct {
@@ -132,8 +133,7 @@ type Error struct {
} }
type ModelUpstreamResponse struct { type ModelUpstreamResponse struct {
Digest string `json:"digest,omitempty"` Stale bool `json:"stale"`
PushTime int64 `json:"pushTime"`
Error string `json:"error,omitempty"` Error string `json:"error,omitempty"`
} }

View File

@@ -28,9 +28,11 @@ import (
"github.com/ollama/ollama/app/tools" "github.com/ollama/ollama/app/tools"
"github.com/ollama/ollama/app/types/not" "github.com/ollama/ollama/app/types/not"
"github.com/ollama/ollama/app/ui/responses" "github.com/ollama/ollama/app/ui/responses"
"github.com/ollama/ollama/app/updater"
"github.com/ollama/ollama/app/version" "github.com/ollama/ollama/app/version"
ollamaAuth "github.com/ollama/ollama/auth" ollamaAuth "github.com/ollama/ollama/auth"
"github.com/ollama/ollama/envconfig" "github.com/ollama/ollama/envconfig"
"github.com/ollama/ollama/manifest"
"github.com/ollama/ollama/types/model" "github.com/ollama/ollama/types/model"
_ "github.com/tkrajina/typescriptify-golang-structs/typescriptify" _ "github.com/tkrajina/typescriptify-golang-structs/typescriptify"
) )
@@ -106,6 +108,10 @@ type Server struct {
// Dev is true if the server is running in development mode // Dev is true if the server is running in development mode
Dev bool Dev bool
// Updater for checking and downloading updates
Updater *updater.Updater
UpdateAvailableFunc func()
} }
func (s *Server) log() *slog.Logger { func (s *Server) log() *slog.Logger {
@@ -150,7 +156,7 @@ func (s *Server) ollamaProxy() http.Handler {
return return
} }
target := envconfig.Host() target := envconfig.ConnectableHost()
s.log().Info("configuring ollama proxy", "target", target.String()) s.log().Info("configuring ollama proxy", "target", target.String())
newProxy := httputil.NewSingleHostReverseProxy(target) newProxy := httputil.NewSingleHostReverseProxy(target)
@@ -188,7 +194,7 @@ func (s *Server) Handler() http.Handler {
if CORS() { if CORS() {
w.Header().Set("Access-Control-Allow-Origin", "*") w.Header().Set("Access-Control-Allow-Origin", "*")
w.Header().Set("Access-Control-Allow-Methods", "GET, POST, PUT, DELETE, OPTIONS") w.Header().Set("Access-Control-Allow-Methods", "GET, POST, PUT, DELETE, OPTIONS")
w.Header().Set("Access-Control-Allow-Headers", "Content-Type, Authorization, X-Requested-With") w.Header().Set("Access-Control-Allow-Headers", "Content-Type, Authorization, User-Agent, Accept, X-Requested-With")
w.Header().Set("Access-Control-Allow-Credentials", "true") w.Header().Set("Access-Control-Allow-Credentials", "true")
// Handle preflight requests // Handle preflight requests
@@ -284,12 +290,15 @@ func (s *Server) Handler() http.Handler {
mux.Handle("POST /api/v1/model/upstream", handle(s.modelUpstream)) mux.Handle("POST /api/v1/model/upstream", handle(s.modelUpstream))
mux.Handle("GET /api/v1/settings", handle(s.getSettings)) mux.Handle("GET /api/v1/settings", handle(s.getSettings))
mux.Handle("POST /api/v1/settings", handle(s.settings)) mux.Handle("POST /api/v1/settings", handle(s.settings))
mux.Handle("GET /api/v1/cloud", handle(s.getCloudSetting))
mux.Handle("POST /api/v1/cloud", handle(s.cloudSetting))
// Ollama proxy endpoints // Ollama proxy endpoints
ollamaProxy := s.ollamaProxy() ollamaProxy := s.ollamaProxy()
mux.Handle("GET /api/tags", ollamaProxy) mux.Handle("GET /api/tags", ollamaProxy)
mux.Handle("POST /api/show", ollamaProxy) mux.Handle("POST /api/show", ollamaProxy)
mux.Handle("GET /api/version", ollamaProxy) mux.Handle("GET /api/version", ollamaProxy)
mux.Handle("GET /api/status", ollamaProxy)
mux.Handle("HEAD /api/version", ollamaProxy) mux.Handle("HEAD /api/version", ollamaProxy)
mux.Handle("POST /api/me", ollamaProxy) mux.Handle("POST /api/me", ollamaProxy)
mux.Handle("POST /api/signout", ollamaProxy) mux.Handle("POST /api/signout", ollamaProxy)
@@ -310,7 +319,7 @@ func (s *Server) handleError(w http.ResponseWriter, e error) {
if CORS() { if CORS() {
w.Header().Set("Access-Control-Allow-Origin", "*") w.Header().Set("Access-Control-Allow-Origin", "*")
w.Header().Set("Access-Control-Allow-Methods", "GET, POST, PUT, DELETE, OPTIONS") w.Header().Set("Access-Control-Allow-Methods", "GET, POST, PUT, DELETE, OPTIONS")
w.Header().Set("Access-Control-Allow-Headers", "Content-Type, Authorization, X-Requested-With") w.Header().Set("Access-Control-Allow-Headers", "Content-Type, Authorization, User-Agent, Accept, X-Requested-With")
w.Header().Set("Access-Control-Allow-Credentials", "true") w.Header().Set("Access-Control-Allow-Credentials", "true")
} }
@@ -333,8 +342,18 @@ func (t *userAgentTransport) RoundTrip(req *http.Request) (*http.Response, error
// httpClient returns an HTTP client that automatically adds the User-Agent header // httpClient returns an HTTP client that automatically adds the User-Agent header
func (s *Server) httpClient() *http.Client { func (s *Server) httpClient() *http.Client {
return userAgentHTTPClient(10 * time.Second)
}
// inferenceClient uses almost the same HTTP client, but without a timeout so
// long requests aren't truncated
func (s *Server) inferenceClient() *api.Client {
return api.NewClient(envconfig.Host(), userAgentHTTPClient(0))
}
func userAgentHTTPClient(timeout time.Duration) *http.Client {
return &http.Client{ return &http.Client{
Timeout: 10 * time.Second, Timeout: timeout,
Transport: &userAgentTransport{ Transport: &userAgentTransport{
base: http.DefaultTransport, base: http.DefaultTransport,
}, },
@@ -712,11 +731,7 @@ func (s *Server) chat(w http.ResponseWriter, r *http.Request) error {
_, cancelLoading := context.WithCancel(ctx) _, cancelLoading := context.WithCancel(ctx)
loading := false loading := false
c, err := api.ClientFromEnvironment() c := s.inferenceClient()
if err != nil {
cancelLoading()
return err
}
// Check if the model exists locally by trying to show it // Check if the model exists locally by trying to show it
// TODO (jmorganca): skip this round trip and instead just act // TODO (jmorganca): skip this round trip and instead just act
@@ -826,8 +841,9 @@ func (s *Server) chat(w http.ResponseWriter, r *http.Request) error {
if !hasAttachments { if !hasAttachments {
WebSearchEnabled := req.WebSearch != nil && *req.WebSearch WebSearchEnabled := req.WebSearch != nil && *req.WebSearch
hasToolsCapability := slices.Contains(details.Capabilities, model.CapabilityTools)
if WebSearchEnabled { if WebSearchEnabled && hasToolsCapability {
if supportsBrowserTools(req.Model) { if supportsBrowserTools(req.Model) {
browserState, ok := s.browserState(chat) browserState, ok := s.browserState(chat)
if !ok { if !ok {
@@ -837,7 +853,7 @@ func (s *Server) chat(w http.ResponseWriter, r *http.Request) error {
registry.Register(tools.NewBrowserSearch(browser)) registry.Register(tools.NewBrowserSearch(browser))
registry.Register(tools.NewBrowserOpen(browser)) registry.Register(tools.NewBrowserOpen(browser))
registry.Register(tools.NewBrowserFind(browser)) registry.Register(tools.NewBrowserFind(browser))
} else if supportsWebSearchTools(req.Model) { } else {
registry.Register(&tools.WebSearch{}) registry.Register(&tools.WebSearch{})
registry.Register(&tools.WebFetch{}) registry.Register(&tools.WebFetch{})
} }
@@ -1417,11 +1433,6 @@ func (s *Server) getSettings(w http.ResponseWriter, r *http.Request) error {
settings.Models = envconfig.Models() settings.Models = envconfig.Models()
} }
// set default context length if not set
if settings.ContextLength == 0 {
settings.ContextLength = 4096
}
// Include current runtime settings // Include current runtime settings
settings.Agent = s.Agent settings.Agent = s.Agent
settings.Tools = s.Tools settings.Tools = s.Tools
@@ -1448,6 +1459,24 @@ func (s *Server) settings(w http.ResponseWriter, r *http.Request) error {
return fmt.Errorf("failed to save settings: %w", err) return fmt.Errorf("failed to save settings: %w", err)
} }
// Handle auto-update toggle changes
if old.AutoUpdateEnabled != settings.AutoUpdateEnabled {
if !settings.AutoUpdateEnabled {
// Auto-update disabled: cancel any ongoing download
if s.Updater != nil {
s.Updater.CancelOngoingDownload()
}
} else {
// Auto-update re-enabled: show notification if update is already staged, or trigger immediate check
if (updater.IsUpdatePending() || updater.UpdateDownloaded) && s.UpdateAvailableFunc != nil {
s.UpdateAvailableFunc()
} else if s.Updater != nil {
// Trigger the background checker to run immediately
s.Updater.TriggerImmediateCheck()
}
}
}
if old.ContextLength != settings.ContextLength || if old.ContextLength != settings.ContextLength ||
old.Models != settings.Models || old.Models != settings.Models ||
old.Expose != settings.Expose { old.Expose != settings.Expose {
@@ -1460,17 +1489,51 @@ func (s *Server) settings(w http.ResponseWriter, r *http.Request) error {
}) })
} }
func (s *Server) cloudSetting(w http.ResponseWriter, r *http.Request) error {
var req struct {
Enabled bool `json:"enabled"`
}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
return fmt.Errorf("invalid request body: %w", err)
}
if err := s.Store.SetCloudEnabled(req.Enabled); err != nil {
return fmt.Errorf("failed to persist cloud setting: %w", err)
}
s.Restart()
return s.writeCloudStatus(w)
}
func (s *Server) getCloudSetting(w http.ResponseWriter, r *http.Request) error {
return s.writeCloudStatus(w)
}
func (s *Server) writeCloudStatus(w http.ResponseWriter) error {
disabled, source, err := s.Store.CloudStatus()
if err != nil {
return fmt.Errorf("failed to load cloud status: %w", err)
}
w.Header().Set("Content-Type", "application/json")
return json.NewEncoder(w).Encode(map[string]any{
"disabled": disabled,
"source": source,
})
}
func (s *Server) getInferenceCompute(w http.ResponseWriter, r *http.Request) error { func (s *Server) getInferenceCompute(w http.ResponseWriter, r *http.Request) error {
ctx, cancel := context.WithTimeout(r.Context(), 500*time.Millisecond) ctx, cancel := context.WithTimeout(r.Context(), 500*time.Millisecond)
defer cancel() defer cancel()
serverInferenceComputes, err := server.GetInferenceComputer(ctx) info, err := server.GetInferenceInfo(ctx)
if err != nil { if err != nil {
s.log().Error("failed to get inference compute", "error", err) s.log().Error("failed to get inference info", "error", err)
return fmt.Errorf("failed to get inference compute: %w", err) return fmt.Errorf("failed to get inference info: %w", err)
} }
inferenceComputes := make([]responses.InferenceCompute, len(serverInferenceComputes)) inferenceComputes := make([]responses.InferenceCompute, len(info.Computes))
for i, ic := range serverInferenceComputes { for i, ic := range info.Computes {
inferenceComputes[i] = responses.InferenceCompute{ inferenceComputes[i] = responses.InferenceCompute{
Library: ic.Library, Library: ic.Library,
Variant: ic.Variant, Variant: ic.Variant,
@@ -1483,6 +1546,7 @@ func (s *Server) getInferenceCompute(w http.ResponseWriter, r *http.Request) err
response := responses.InferenceComputeResponse{ response := responses.InferenceComputeResponse{
InferenceComputes: inferenceComputes, InferenceComputes: inferenceComputes,
DefaultContextLength: info.DefaultContextLength,
} }
w.Header().Set("Content-Type", "application/json") w.Header().Set("Content-Type", "application/json")
@@ -1515,9 +1579,18 @@ func (s *Server) modelUpstream(w http.ResponseWriter, r *http.Request) error {
return json.NewEncoder(w).Encode(response) return json.NewEncoder(w).Encode(response)
} }
n := model.ParseName(req.Model)
stale := true
if m, err := manifest.ParseNamedManifest(n); err == nil {
if m.Digest() == digest {
stale = false
} else if pushTime > 0 && m.FileInfo().ModTime().Unix() >= pushTime {
stale = false
}
}
response := responses.ModelUpstreamResponse{ response := responses.ModelUpstreamResponse{
Digest: digest, Stale: stale,
PushTime: pushTime,
} }
w.Header().Set("Content-Type", "application/json") w.Header().Set("Content-Type", "application/json")
@@ -1615,18 +1688,6 @@ func supportsBrowserTools(model string) bool {
return strings.HasPrefix(strings.ToLower(model), "gpt-oss") return strings.HasPrefix(strings.ToLower(model), "gpt-oss")
} }
// Web search tools are simpler, providing only basic web search and fetch capabilities (e.g., "web_search", "web_fetch") without simulating a browser. Currently only qwen3 and deepseek-v3 support web search tools.
func supportsWebSearchTools(model string) bool {
model = strings.ToLower(model)
prefixes := []string{"qwen3", "deepseek-v3"}
for _, p := range prefixes {
if strings.HasPrefix(model, p) {
return true
}
}
return false
}
// buildChatRequest converts store.Chat to api.ChatRequest // buildChatRequest converts store.Chat to api.ChatRequest
func (s *Server) buildChatRequest(chat *store.Chat, model string, think any, availableTools []map[string]any) (*api.ChatRequest, error) { func (s *Server) buildChatRequest(chat *store.Chat, model string, think any, availableTools []map[string]any) (*api.ChatRequest, error) {
var msgs []api.Message var msgs []api.Message

View File

@@ -4,6 +4,7 @@ package ui
import ( import (
"bytes" "bytes"
"context"
"encoding/json" "encoding/json"
"io" "io"
"net/http" "net/http"
@@ -11,9 +12,12 @@ import (
"path/filepath" "path/filepath"
"runtime" "runtime"
"strings" "strings"
"sync/atomic"
"testing" "testing"
"github.com/ollama/ollama/api"
"github.com/ollama/ollama/app/store" "github.com/ollama/ollama/app/store"
"github.com/ollama/ollama/app/updater"
) )
func TestHandlePostApiSettings(t *testing.T) { func TestHandlePostApiSettings(t *testing.T) {
@@ -115,6 +119,107 @@ func TestHandlePostApiSettings(t *testing.T) {
} }
} }
func TestHandlePostApiCloudSetting(t *testing.T) {
tmpHome := t.TempDir()
t.Setenv("HOME", tmpHome)
t.Setenv("OLLAMA_NO_CLOUD", "")
testStore := &store.Store{
DBPath: filepath.Join(t.TempDir(), "db.sqlite"),
}
defer testStore.Close()
restartCount := 0
server := &Server{
Store: testStore,
Restart: func() {
restartCount++
},
}
for _, tc := range []struct {
name string
body string
wantEnabled bool
}{
{name: "disable cloud", body: `{"enabled": false}`, wantEnabled: false},
{name: "enable cloud", body: `{"enabled": true}`, wantEnabled: true},
} {
t.Run(tc.name, func(t *testing.T) {
req := httptest.NewRequest("POST", "/api/v1/cloud", bytes.NewBufferString(tc.body))
req.Header.Set("Content-Type", "application/json")
rr := httptest.NewRecorder()
if err := server.cloudSetting(rr, req); err != nil {
t.Fatalf("cloudSetting() error = %v", err)
}
if rr.Code != http.StatusOK {
t.Fatalf("cloudSetting() status = %d, want %d", rr.Code, http.StatusOK)
}
var got map[string]any
if err := json.Unmarshal(rr.Body.Bytes(), &got); err != nil {
t.Fatalf("cloudSetting() invalid response JSON: %v", err)
}
if got["disabled"] != !tc.wantEnabled {
t.Fatalf("response disabled = %v, want %v", got["disabled"], !tc.wantEnabled)
}
disabled, err := testStore.CloudDisabled()
if err != nil {
t.Fatalf("CloudDisabled() error = %v", err)
}
if gotEnabled := !disabled; gotEnabled != tc.wantEnabled {
t.Fatalf("cloud enabled = %v, want %v", gotEnabled, tc.wantEnabled)
}
})
}
if restartCount != 2 {
t.Fatalf("Restart called %d times, want 2", restartCount)
}
}
func TestHandleGetApiCloudSetting(t *testing.T) {
tmpHome := t.TempDir()
t.Setenv("HOME", tmpHome)
t.Setenv("OLLAMA_NO_CLOUD", "")
testStore := &store.Store{
DBPath: filepath.Join(t.TempDir(), "db.sqlite"),
}
defer testStore.Close()
if err := testStore.SetCloudEnabled(false); err != nil {
t.Fatalf("SetCloudEnabled(false) error = %v", err)
}
server := &Server{
Store: testStore,
Restart: func() {},
}
req := httptest.NewRequest("GET", "/api/v1/cloud", nil)
rr := httptest.NewRecorder()
if err := server.getCloudSetting(rr, req); err != nil {
t.Fatalf("getCloudSetting() error = %v", err)
}
if rr.Code != http.StatusOK {
t.Fatalf("getCloudSetting() status = %d, want %d", rr.Code, http.StatusOK)
}
var got map[string]any
if err := json.Unmarshal(rr.Body.Bytes(), &got); err != nil {
t.Fatalf("getCloudSetting() invalid response JSON: %v", err)
}
if got["disabled"] != true {
t.Fatalf("response disabled = %v, want true", got["disabled"])
}
if got["source"] != "config" {
t.Fatalf("response source = %v, want config", got["source"])
}
}
func TestAuthenticationMiddleware(t *testing.T) { func TestAuthenticationMiddleware(t *testing.T) {
tests := []struct { tests := []struct {
name string name string
@@ -421,3 +526,317 @@ func TestUserAgentTransport(t *testing.T) {
t.Logf("User-Agent transport successfully set: %s", receivedUA) t.Logf("User-Agent transport successfully set: %s", receivedUA)
} }
func TestInferenceClientUsesUserAgent(t *testing.T) {
var gotUserAgent atomic.Value
ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotUserAgent.Store(r.Header.Get("User-Agent"))
w.Header().Set("Content-Type", "application/json")
w.Write([]byte(`{}`))
}))
defer ts.Close()
t.Setenv("OLLAMA_HOST", ts.URL)
server := &Server{}
client := server.inferenceClient()
_, err := client.Show(context.Background(), &api.ShowRequest{Model: "test"})
if err != nil {
t.Fatalf("show request failed: %v", err)
}
receivedUA, _ := gotUserAgent.Load().(string)
expectedUA := userAgent()
if receivedUA != expectedUA {
t.Errorf("User-Agent mismatch\nExpected: %s\nReceived: %s", expectedUA, receivedUA)
}
}
func TestSupportsBrowserTools(t *testing.T) {
tests := []struct {
model string
want bool
}{
{"gpt-oss", true},
{"gpt-oss-latest", true},
{"GPT-OSS", true},
{"Gpt-Oss-v2", true},
{"qwen3", false},
{"deepseek-v3", false},
{"llama3.3", false},
{"", false},
}
for _, tt := range tests {
t.Run(tt.model, func(t *testing.T) {
if got := supportsBrowserTools(tt.model); got != tt.want {
t.Errorf("supportsBrowserTools(%q) = %v, want %v", tt.model, got, tt.want)
}
})
}
}
func TestWebSearchToolRegistration(t *testing.T) {
// Validates that the capability-gating logic in chat() correctly
// decides which tools to register based on model capabilities and
// the web search flag.
tests := []struct {
name string
webSearchEnabled bool
hasToolsCap bool
model string
wantBrowser bool // expects browser tools (gpt-oss)
wantWebSearch bool // expects basic web search/fetch tools
wantNone bool // expects no tools registered
}{
{
name: "web search enabled with tools capability - browser model",
webSearchEnabled: true,
hasToolsCap: true,
model: "gpt-oss-latest",
wantBrowser: true,
},
{
name: "web search enabled with tools capability - non-browser model",
webSearchEnabled: true,
hasToolsCap: true,
model: "qwen3",
wantWebSearch: true,
},
{
name: "web search enabled without tools capability",
webSearchEnabled: true,
hasToolsCap: false,
model: "llama3.3",
wantNone: true,
},
{
name: "web search disabled with tools capability",
webSearchEnabled: false,
hasToolsCap: true,
model: "qwen3",
wantNone: true,
},
{
name: "web search disabled without tools capability",
webSearchEnabled: false,
hasToolsCap: false,
model: "llama3.3",
wantNone: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
// Replicate the decision logic from chat() handler
gotBrowser := false
gotWebSearch := false
if tt.webSearchEnabled && tt.hasToolsCap {
if supportsBrowserTools(tt.model) {
gotBrowser = true
} else {
gotWebSearch = true
}
}
if tt.wantBrowser && !gotBrowser {
t.Error("expected browser tools to be registered")
}
if tt.wantWebSearch && !gotWebSearch {
t.Error("expected web search tools to be registered")
}
if tt.wantNone && (gotBrowser || gotWebSearch) {
t.Error("expected no tools to be registered")
}
if !tt.wantBrowser && gotBrowser {
t.Error("unexpected browser tools registered")
}
if !tt.wantWebSearch && gotWebSearch {
t.Error("unexpected web search tools registered")
}
})
}
}
func TestSettingsToggleAutoUpdateOff_CancelsDownload(t *testing.T) {
testStore := &store.Store{
DBPath: filepath.Join(t.TempDir(), "db.sqlite"),
}
defer testStore.Close()
// Start with auto-update enabled
settings, err := testStore.Settings()
if err != nil {
t.Fatal(err)
}
settings.AutoUpdateEnabled = true
if err := testStore.SetSettings(settings); err != nil {
t.Fatal(err)
}
upd := &updater.Updater{Store: &store.Store{
DBPath: filepath.Join(t.TempDir(), "db2.sqlite"),
}}
defer upd.Store.Close()
// We can't easily mock CancelOngoingDownload, but we can verify
// the full settings handler flow works without error
server := &Server{
Store: testStore,
Restart: func() {},
Updater: upd,
}
// Disable auto-update via settings API
settings.AutoUpdateEnabled = false
body, err := json.Marshal(settings)
if err != nil {
t.Fatal(err)
}
req := httptest.NewRequest("POST", "/api/v1/settings", bytes.NewReader(body))
req.Header.Set("Content-Type", "application/json")
rr := httptest.NewRecorder()
if err := server.settings(rr, req); err != nil {
t.Fatalf("settings() error = %v", err)
}
if rr.Code != http.StatusOK {
t.Fatalf("settings() status = %d, want %d", rr.Code, http.StatusOK)
}
// Verify settings were saved with auto-update disabled
saved, err := testStore.Settings()
if err != nil {
t.Fatal(err)
}
if saved.AutoUpdateEnabled {
t.Fatal("expected AutoUpdateEnabled to be false after toggle off")
}
}
func TestSettingsToggleAutoUpdateOn_WithPendingUpdate_ShowsNotification(t *testing.T) {
testStore := &store.Store{
DBPath: filepath.Join(t.TempDir(), "db.sqlite"),
}
defer testStore.Close()
// Start with auto-update disabled
settings, err := testStore.Settings()
if err != nil {
t.Fatal(err)
}
settings.AutoUpdateEnabled = false
if err := testStore.SetSettings(settings); err != nil {
t.Fatal(err)
}
// Simulate that an update was previously downloaded
oldVal := updater.UpdateDownloaded
updater.UpdateDownloaded = true
defer func() { updater.UpdateDownloaded = oldVal }()
var notificationCalled atomic.Bool
server := &Server{
Store: testStore,
Restart: func() {},
UpdateAvailableFunc: func() {
notificationCalled.Store(true)
},
}
// Re-enable auto-update via settings API
settings.AutoUpdateEnabled = true
body, err := json.Marshal(settings)
if err != nil {
t.Fatal(err)
}
req := httptest.NewRequest("POST", "/api/v1/settings", bytes.NewReader(body))
req.Header.Set("Content-Type", "application/json")
rr := httptest.NewRecorder()
if err := server.settings(rr, req); err != nil {
t.Fatalf("settings() error = %v", err)
}
if rr.Code != http.StatusOK {
t.Fatalf("settings() status = %d, want %d", rr.Code, http.StatusOK)
}
if !notificationCalled.Load() {
t.Fatal("expected UpdateAvailableFunc to be called when re-enabling with a downloaded update")
}
}
func TestSettingsToggleAutoUpdateOn_NoPendingUpdate_TriggersCheck(t *testing.T) {
testStore := &store.Store{
DBPath: filepath.Join(t.TempDir(), "db.sqlite"),
}
defer testStore.Close()
// Start with auto-update disabled
settings, err := testStore.Settings()
if err != nil {
t.Fatal(err)
}
settings.AutoUpdateEnabled = false
if err := testStore.SetSettings(settings); err != nil {
t.Fatal(err)
}
// Ensure no pending update - clear both the downloaded flag and the stage dir
oldVal := updater.UpdateDownloaded
updater.UpdateDownloaded = false
defer func() { updater.UpdateDownloaded = oldVal }()
oldStageDir := updater.UpdateStageDir
updater.UpdateStageDir = t.TempDir() // empty dir means IsUpdatePending() returns false
defer func() { updater.UpdateStageDir = oldStageDir }()
upd := &updater.Updater{Store: &store.Store{
DBPath: filepath.Join(t.TempDir(), "db2.sqlite"),
}}
defer upd.Store.Close()
// Initialize the checkNow channel by starting (and immediately stopping) the checker
// so TriggerImmediateCheck doesn't panic on nil channel
ctx, cancel := context.WithCancel(t.Context())
upd.StartBackgroundUpdaterChecker(ctx, func(string) error { return nil })
defer cancel()
var notificationCalled atomic.Bool
server := &Server{
Store: testStore,
Restart: func() {},
Updater: upd,
UpdateAvailableFunc: func() {
notificationCalled.Store(true)
},
}
// Re-enable auto-update via settings API
settings.AutoUpdateEnabled = true
body, err := json.Marshal(settings)
if err != nil {
t.Fatal(err)
}
req := httptest.NewRequest("POST", "/api/v1/settings", bytes.NewReader(body))
req.Header.Set("Content-Type", "application/json")
rr := httptest.NewRecorder()
if err := server.settings(rr, req); err != nil {
t.Fatalf("settings() error = %v", err)
}
if rr.Code != http.StatusOK {
t.Fatalf("settings() status = %d, want %d", rr.Code, http.StatusOK)
}
// UpdateAvailableFunc should NOT be called since there's no pending update
if notificationCalled.Load() {
t.Fatal("UpdateAvailableFunc should not be called when there is no pending update")
}
}

View File

@@ -19,6 +19,7 @@ import (
"runtime" "runtime"
"strconv" "strconv"
"strings" "strings"
"sync"
"time" "time"
"github.com/ollama/ollama/app/store" "github.com/ollama/ollama/app/store"
@@ -58,7 +59,8 @@ func (u *Updater) checkForUpdate(ctx context.Context) (bool, UpdateResponse) {
query := requestURL.Query() query := requestURL.Query()
query.Add("os", runtime.GOOS) query.Add("os", runtime.GOOS)
query.Add("arch", runtime.GOARCH) query.Add("arch", runtime.GOARCH)
query.Add("version", version.Version) currentVersion := version.Version
query.Add("version", currentVersion)
query.Add("ts", strconv.FormatInt(time.Now().Unix(), 10)) query.Add("ts", strconv.FormatInt(time.Now().Unix(), 10))
// The original macOS app used to use the device ID // The original macOS app used to use the device ID
@@ -131,15 +133,27 @@ func (u *Updater) checkForUpdate(ctx context.Context) (bool, UpdateResponse) {
} }
func (u *Updater) DownloadNewRelease(ctx context.Context, updateResp UpdateResponse) error { func (u *Updater) DownloadNewRelease(ctx context.Context, updateResp UpdateResponse) error {
// Create a cancellable context for this download
downloadCtx, cancel := context.WithCancel(ctx)
u.cancelDownloadLock.Lock()
u.cancelDownload = cancel
u.cancelDownloadLock.Unlock()
defer func() {
u.cancelDownloadLock.Lock()
u.cancelDownload = nil
u.cancelDownloadLock.Unlock()
cancel()
}()
// Do a head first to check etag info // Do a head first to check etag info
req, err := http.NewRequestWithContext(ctx, http.MethodHead, updateResp.UpdateURL, nil) req, err := http.NewRequestWithContext(downloadCtx, http.MethodHead, updateResp.UpdateURL, nil)
if err != nil { if err != nil {
return err return err
} }
// In case of slow downloads, continue the update check in the background // In case of slow downloads, continue the update check in the background
bgctx, cancel := context.WithCancel(ctx) bgctx, bgcancel := context.WithCancel(downloadCtx)
defer cancel() defer bgcancel()
go func() { go func() {
for { for {
select { select {
@@ -176,6 +190,7 @@ func (u *Updater) DownloadNewRelease(ctx context.Context, updateResp UpdateRespo
_, err = os.Stat(stageFilename) _, err = os.Stat(stageFilename)
if err == nil { if err == nil {
slog.Info("update already downloaded", "bundle", stageFilename) slog.Info("update already downloaded", "bundle", stageFilename)
UpdateDownloaded = true
return nil return nil
} }
@@ -245,32 +260,84 @@ func cleanupOldDownloads(stageDir string) {
type Updater struct { type Updater struct {
Store *store.Store Store *store.Store
cancelDownload context.CancelFunc
cancelDownloadLock sync.Mutex
checkNow chan struct{}
}
// CancelOngoingDownload cancels any currently running download
func (u *Updater) CancelOngoingDownload() {
u.cancelDownloadLock.Lock()
defer u.cancelDownloadLock.Unlock()
if u.cancelDownload != nil {
slog.Info("cancelling ongoing update download")
u.cancelDownload()
u.cancelDownload = nil
}
}
// TriggerImmediateCheck signals the background checker to check for updates immediately
func (u *Updater) TriggerImmediateCheck() {
if u.checkNow != nil {
select {
case u.checkNow <- struct{}{}:
default:
// Check already pending, no need to queue another
}
}
} }
func (u *Updater) StartBackgroundUpdaterChecker(ctx context.Context, cb func(string) error) { func (u *Updater) StartBackgroundUpdaterChecker(ctx context.Context, cb func(string) error) {
u.checkNow = make(chan struct{}, 1)
u.checkNow <- struct{}{} // Trigger first check after initial delay
go func() { go func() {
// Don't blast an update message immediately after startup // Don't blast an update message immediately after startup
time.Sleep(UpdateCheckInitialDelay) time.Sleep(UpdateCheckInitialDelay)
slog.Info("beginning update checker", "interval", UpdateCheckInterval) slog.Info("beginning update checker", "interval", UpdateCheckInterval)
ticker := time.NewTicker(UpdateCheckInterval)
defer ticker.Stop()
for { for {
available, resp := u.checkForUpdate(ctx)
if available {
err := u.DownloadNewRelease(ctx, resp)
if err != nil {
slog.Error(fmt.Sprintf("failed to download new release: %s", err))
} else {
err = cb(resp.UpdateVersion)
if err != nil {
slog.Warn(fmt.Sprintf("failed to register update available with tray: %s", err))
}
}
}
select { select {
case <-ctx.Done(): case <-ctx.Done():
slog.Debug("stopping background update checker") slog.Debug("stopping background update checker")
return return
default: case <-u.checkNow:
time.Sleep(UpdateCheckInterval) // Immediate check triggered
case <-ticker.C:
// Regular interval check
}
// Always check for updates
available, resp := u.checkForUpdate(ctx)
if !available {
continue
}
// Update is available - check if auto-update is enabled for downloading
settings, err := u.Store.Settings()
if err != nil {
slog.Error("failed to load settings", "error", err)
continue
}
if !settings.AutoUpdateEnabled {
// Auto-update disabled - don't download, just log
slog.Debug("update available but auto-update disabled", "version", resp.UpdateVersion)
continue
}
// Auto-update is enabled - download
err = u.DownloadNewRelease(ctx, resp)
if err != nil {
slog.Error("failed to download new release", "error", err)
continue
}
// Download successful - show tray notification
err = cb(resp.UpdateVersion)
if err != nil {
slog.Warn("failed to register update available with tray", "error", err)
} }
} }
}() }()

View File

@@ -11,6 +11,8 @@ import (
"log/slog" "log/slog"
"net/http" "net/http"
"net/http/httptest" "net/http/httptest"
"path/filepath"
"sync/atomic"
"testing" "testing"
"time" "time"
@@ -33,7 +35,7 @@ func TestIsNewReleaseAvailable(t *testing.T) {
defer server.Close() defer server.Close()
slog.Debug("server", "url", server.URL) slog.Debug("server", "url", server.URL)
updater := &Updater{Store: &store.Store{}} updater := &Updater{Store: &store.Store{DBPath: filepath.Join(t.TempDir(), "test.db")}}
defer updater.Store.Close() // Ensure database is closed defer updater.Store.Close() // Ensure database is closed
UpdateCheckURLBase = server.URL + "/update.json" UpdateCheckURLBase = server.URL + "/update.json"
updatePresent, resp := updater.checkForUpdate(t.Context()) updatePresent, resp := updater.checkForUpdate(t.Context())
@@ -84,8 +86,18 @@ func TestBackgoundChecker(t *testing.T) {
defer server.Close() defer server.Close()
UpdateCheckURLBase = server.URL + "/update.json" UpdateCheckURLBase = server.URL + "/update.json"
updater := &Updater{Store: &store.Store{}} updater := &Updater{Store: &store.Store{DBPath: filepath.Join(t.TempDir(), "test.db")}}
defer updater.Store.Close() // Ensure database is closed defer updater.Store.Close()
settings, err := updater.Store.Settings()
if err != nil {
t.Fatal(err)
}
settings.AutoUpdateEnabled = true
if err := updater.Store.SetSettings(settings); err != nil {
t.Fatal(err)
}
updater.StartBackgroundUpdaterChecker(ctx, cb) updater.StartBackgroundUpdaterChecker(ctx, cb)
select { select {
case <-stallTimer.C: case <-stallTimer.C:
@@ -99,3 +111,267 @@ func TestBackgoundChecker(t *testing.T) {
} }
} }
} }
func TestAutoUpdateDisabledSkipsDownload(t *testing.T) {
UpdateStageDir = t.TempDir()
var downloadAttempted atomic.Bool
done := make(chan struct{})
ctx, cancel := context.WithCancel(t.Context())
defer cancel()
UpdateCheckInitialDelay = 5 * time.Millisecond
UpdateCheckInterval = 5 * time.Millisecond
VerifyDownload = func() error {
return nil
}
var server *httptest.Server
server = httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path == "/update.json" {
w.Write([]byte(
fmt.Sprintf(`{"version": "9.9.9", "url": "%s"}`,
server.URL+"/9.9.9/"+Installer)))
} else if r.URL.Path == "/9.9.9/"+Installer {
downloadAttempted.Store(true)
buf := &bytes.Buffer{}
zw := zip.NewWriter(buf)
zw.Close()
io.Copy(w, buf)
}
}))
defer server.Close()
UpdateCheckURLBase = server.URL + "/update.json"
updater := &Updater{Store: &store.Store{DBPath: filepath.Join(t.TempDir(), "test.db")}}
defer updater.Store.Close()
// Ensure auto-update is disabled
settings, err := updater.Store.Settings()
if err != nil {
t.Fatal(err)
}
settings.AutoUpdateEnabled = false
if err := updater.Store.SetSettings(settings); err != nil {
t.Fatal(err)
}
cb := func(ver string) error {
t.Fatal("callback should not be called when auto-update is disabled")
return nil
}
updater.StartBackgroundUpdaterChecker(ctx, cb)
// Wait enough time for multiple check cycles
time.Sleep(50 * time.Millisecond)
close(done)
if downloadAttempted.Load() {
t.Fatal("download should not be attempted when auto-update is disabled")
}
}
func TestAutoUpdateReenabledDownloadsUpdate(t *testing.T) {
UpdateStageDir = t.TempDir()
var downloadAttempted atomic.Bool
callbackCalled := make(chan struct{}, 1)
ctx, cancel := context.WithCancel(t.Context())
defer cancel()
UpdateCheckInitialDelay = 5 * time.Millisecond
UpdateCheckInterval = 5 * time.Millisecond
VerifyDownload = func() error {
return nil
}
var server *httptest.Server
server = httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path == "/update.json" {
w.Write([]byte(
fmt.Sprintf(`{"version": "9.9.9", "url": "%s"}`,
server.URL+"/9.9.9/"+Installer)))
} else if r.URL.Path == "/9.9.9/"+Installer {
downloadAttempted.Store(true)
buf := &bytes.Buffer{}
zw := zip.NewWriter(buf)
zw.Close()
io.Copy(w, buf)
}
}))
defer server.Close()
UpdateCheckURLBase = server.URL + "/update.json"
upd := &Updater{Store: &store.Store{DBPath: filepath.Join(t.TempDir(), "test.db")}}
defer upd.Store.Close()
// Start with auto-update disabled
settings, err := upd.Store.Settings()
if err != nil {
t.Fatal(err)
}
settings.AutoUpdateEnabled = false
if err := upd.Store.SetSettings(settings); err != nil {
t.Fatal(err)
}
cb := func(ver string) error {
select {
case callbackCalled <- struct{}{}:
default:
}
return nil
}
upd.StartBackgroundUpdaterChecker(ctx, cb)
// Wait for a few cycles with auto-update disabled - no download should happen
time.Sleep(50 * time.Millisecond)
if downloadAttempted.Load() {
t.Fatal("download should not happen while auto-update is disabled")
}
// Re-enable auto-update
settings.AutoUpdateEnabled = true
if err := upd.Store.SetSettings(settings); err != nil {
t.Fatal(err)
}
// Wait for the checker to pick it up and download
select {
case <-callbackCalled:
// Success: download happened and callback was called after re-enabling
if !downloadAttempted.Load() {
t.Fatal("expected download to be attempted after re-enabling")
}
case <-time.After(5 * time.Second):
t.Fatal("expected download and callback after re-enabling auto-update")
}
}
func TestCancelOngoingDownload(t *testing.T) {
UpdateStageDir = t.TempDir()
downloadStarted := make(chan struct{})
downloadCancelled := make(chan struct{})
ctx := t.Context()
VerifyDownload = func() error {
return nil
}
var server *httptest.Server
server = httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path == "/update.json" {
w.Write([]byte(
fmt.Sprintf(`{"version": "9.9.9", "url": "%s"}`,
server.URL+"/9.9.9/"+Installer)))
} else if r.URL.Path == "/9.9.9/"+Installer {
if r.Method == http.MethodHead {
w.Header().Set("Content-Length", "1000000")
w.WriteHeader(http.StatusOK)
return
}
// Signal that download has started
close(downloadStarted)
// Wait for cancellation or timeout
select {
case <-r.Context().Done():
close(downloadCancelled)
return
case <-time.After(5 * time.Second):
t.Error("download was not cancelled in time")
}
}
}))
defer server.Close()
UpdateCheckURLBase = server.URL + "/update.json"
updater := &Updater{Store: &store.Store{DBPath: filepath.Join(t.TempDir(), "test.db")}}
defer updater.Store.Close()
_, resp := updater.checkForUpdate(ctx)
// Start download in goroutine
go func() {
_ = updater.DownloadNewRelease(ctx, resp)
}()
// Wait for download to start
select {
case <-downloadStarted:
case <-time.After(2 * time.Second):
t.Fatal("download did not start in time")
}
// Cancel the download
updater.CancelOngoingDownload()
// Verify cancellation was received
select {
case <-downloadCancelled:
// Success
case <-time.After(2 * time.Second):
t.Fatal("download cancellation was not received by server")
}
}
func TestTriggerImmediateCheck(t *testing.T) {
UpdateStageDir = t.TempDir()
checkCount := atomic.Int32{}
checkDone := make(chan struct{}, 10)
ctx, cancel := context.WithCancel(t.Context())
defer cancel()
// Set a very long interval so only TriggerImmediateCheck causes checks
UpdateCheckInitialDelay = 1 * time.Millisecond
UpdateCheckInterval = 1 * time.Hour
VerifyDownload = func() error {
return nil
}
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path == "/update.json" {
checkCount.Add(1)
select {
case checkDone <- struct{}{}:
default:
}
// Return no update available
w.WriteHeader(http.StatusNoContent)
}
}))
defer server.Close()
UpdateCheckURLBase = server.URL + "/update.json"
updater := &Updater{Store: &store.Store{DBPath: filepath.Join(t.TempDir(), "test.db")}}
defer updater.Store.Close()
cb := func(ver string) error {
return nil
}
updater.StartBackgroundUpdaterChecker(ctx, cb)
// Wait for the initial check that fires after the initial delay
select {
case <-checkDone:
case <-time.After(2 * time.Second):
t.Fatal("initial check did not happen")
}
initialCount := checkCount.Load()
// Trigger immediate check
updater.TriggerImmediateCheck()
// Wait for the triggered check
select {
case <-checkDone:
case <-time.After(2 * time.Second):
t.Fatal("triggered check did not happen")
}
finalCount := checkCount.Load()
if finalCount <= initialCount {
t.Fatalf("TriggerImmediateCheck did not cause additional check: initial=%d, final=%d", initialCount, finalCount)
}
}

View File

@@ -369,25 +369,6 @@ func (t *winTray) addSeparatorMenuItem(menuItemId, parentId uint32) error {
return nil return nil
} }
// func (t *winTray) hideMenuItem(menuItemId, parentId uint32) error {
// const ERROR_SUCCESS syscall.Errno = 0
// t.muMenus.RLock()
// menu := uintptr(t.menus[parentId])
// t.muMenus.RUnlock()
// res, _, err := pRemoveMenu.Call(
// menu,
// uintptr(menuItemId),
// MF_BYCOMMAND,
// )
// if res == 0 && err.(syscall.Errno) != ERROR_SUCCESS {
// return err
// }
// t.delFromVisibleItems(parentId, menuItemId)
// return nil
// }
func (t *winTray) showMenu() error { func (t *winTray) showMenu() error {
p := point{} p := point{}
boolRet, _, err := pGetCursorPos.Call(uintptr(unsafe.Pointer(&p))) boolRet, _, err := pGetCursorPos.Call(uintptr(unsafe.Pointer(&p)))

View File

@@ -51,7 +51,6 @@ const (
IMAGE_ICON = 1 // Loads an icon IMAGE_ICON = 1 // Loads an icon
LR_DEFAULTSIZE = 0x00000040 // Loads default-size icon for windows(SM_CXICON x SM_CYICON) if cx, cy are set to zero LR_DEFAULTSIZE = 0x00000040 // Loads default-size icon for windows(SM_CXICON x SM_CYICON) if cx, cy are set to zero
LR_LOADFROMFILE = 0x00000010 // Loads the stand-alone image from the file LR_LOADFROMFILE = 0x00000010 // Loads the stand-alone image from the file
MF_BYCOMMAND = 0x00000000
MFS_DISABLED = 0x00000003 MFS_DISABLED = 0x00000003
MFT_SEPARATOR = 0x00000800 MFT_SEPARATOR = 0x00000800
MFT_STRING = 0x00000000 MFT_STRING = 0x00000000

13
cmd/background_unix.go Normal file
View File

@@ -0,0 +1,13 @@
//go:build !windows
package cmd
import "syscall"
// backgroundServerSysProcAttr returns SysProcAttr for running the server in the background on Unix.
// Setpgid prevents the server from being killed when the parent process exits.
func backgroundServerSysProcAttr() *syscall.SysProcAttr {
return &syscall.SysProcAttr{
Setpgid: true,
}
}

12
cmd/background_windows.go Normal file
View File

@@ -0,0 +1,12 @@
package cmd
import "syscall"
// backgroundServerSysProcAttr returns SysProcAttr for running the server in the background on Windows.
// CREATE_NO_WINDOW (0x08000000) prevents a console window from appearing.
func backgroundServerSysProcAttr() *syscall.SysProcAttr {
return &syscall.SysProcAttr{
CreationFlags: 0x08000000,
HideWindow: true,
}
}

View File

@@ -1,27 +1,31 @@
Ollama Benchmark Tool Ollama Benchmark Tool
--------------------- ---------------------
A Go-based command-line tool for benchmarking Ollama models with configurable parameters and multiple output formats. A Go-based command-line tool for benchmarking Ollama models with configurable parameters, warmup phases, TTFT tracking, VRAM monitoring, and benchstat/CSV output.
## Features ## Features
* Benchmark multiple models in a single run * Benchmark multiple models in a single run
* Support for both text and image prompts * Support for both text and image prompts
* Configurable generation parameters (temperature, max tokens, seed, etc.) * Configurable generation parameters (temperature, max tokens, seed, etc.)
* Supports benchstat and CSV output formats * Warmup phase before timed epochs to stabilize measurements
* Detailed performance metrics (prefill, generate, load, total durations) * Time-to-first-token (TTFT) tracking per epoch
* Model metadata display (parameter size, quantization level, family)
* VRAM and CPU memory usage tracking via running process info
* Controlled prompt token length for reproducible benchmarks
* Benchstat and CSV output formats
## Building from Source ## Building from Source
``` ```
go build -o ollama-bench bench.go go build -o ollama-bench ./cmd/bench
./ollama-bench -model gpt-oss:20b -epochs 6 -format csv ./ollama-bench -model gemma3 -epochs 6 -format csv
``` ```
Using Go Run (without building) Using Go Run (without building)
``` ```
go run bench.go -model gpt-oss:20b -epochs 3 go run ./cmd/bench -model gemma3 -epochs 3
``` ```
## Usage ## Usage
@@ -45,10 +49,16 @@ benchstat -col /name gemma.bench
./ollama-bench -model qwen3-vl -image photo.jpg -epochs 6 -max-tokens 100 -p "Describe this image" ./ollama-bench -model qwen3-vl -image photo.jpg -epochs 6 -max-tokens 100 -p "Describe this image"
``` ```
### Controlled Prompt Length
```
./ollama-bench -model gemma3 -epochs 6 -prompt-tokens 512
```
### Advanced Example ### Advanced Example
``` ```
./ollama-bench -model llama3 -epochs 10 -temperature 0.7 -max-tokens 500 -seed 42 -format csv -output results.csv ./ollama-bench -model llama3 -epochs 10 -temperature 0.7 -max-tokens 500 -seed 42 -warmup 2 -format csv -output results.csv
``` ```
## Command Line Options ## Command Line Options
@@ -56,41 +66,48 @@ benchstat -col /name gemma.bench
| Option | Description | Default | | Option | Description | Default |
|----------|-------------|---------| |----------|-------------|---------|
| -model | Comma-separated list of models to benchmark | (required) | | -model | Comma-separated list of models to benchmark | (required) |
| -epochs | Number of iterations per model | 1 | | -epochs | Number of iterations per model | 6 |
| -max-tokens | Maximum tokens for model response | 0 (unlimited) | | -max-tokens | Maximum tokens for model response | 200 |
| -temperature | Temperature parameter | 0.0 | | -temperature | Temperature parameter | 0.0 |
| -seed | Random seed | 0 (random) | | -seed | Random seed | 0 (random) |
| -timeout | Timeout in seconds | 300 | | -timeout | Timeout in seconds | 300 |
| -p | Prompt text | "Write a long story." | | -p | Prompt text | (default story prompt) |
| -image | Image file to include in prompt | | | -image | Image file to include in prompt | |
| -k | Keep-alive duration in seconds | 0 | | -k | Keep-alive duration in seconds | 0 |
| -format | Output format (benchstat, csv) | benchstat | | -format | Output format (benchstat, csv) | benchstat |
| -output | Output file for results | "" (stdout) | | -output | Output file for results | "" (stdout) |
| -warmup | Number of warmup requests before timing | 1 |
| -prompt-tokens | Generate prompt targeting ~N tokens (0 = use -p) | 0 |
| -v | Verbose mode | false | | -v | Verbose mode | false |
| -debug | Show debug information | false | | -debug | Show debug information | false |
## Output Formats ## Output Formats
### Markdown Format ### Benchstat Format (default)
The default markdown format is suitable for copying and pasting into a GitHub issue and will look like: Compatible with Go's benchstat tool for statistical analysis. Uses one value/unit pair per line, standard `ns/op` for timing metrics, and `ns/token` for throughput. Each epoch produces one set of lines -- benchstat aggregates across repeated runs to compute statistics.
```
Model | Step | Count | Duration | nsPerToken | tokensPerSec |
|-------|------|-------|----------|------------|--------------|
| gpt-oss:20b | prefill | 124 | 30.006458ms | 241987.56 | 4132.44 |
| gpt-oss:20b | generate | 200 | 2.646843954s | 13234219.77 | 75.56 |
| gpt-oss:20b | load | 1 | 121.674208ms | - | - |
| gpt-oss:20b | total | 1 | 2.861047625s | - | - |
```
### Benchstat Format
Compatible with Go's benchstat tool for statistical analysis:
``` ```
BenchmarkModel/name=gpt-oss:20b/step=prefill 128 78125.00 ns/token 12800.00 token/sec # Model: gemma3 | Params: 4.3B | Quant: Q4_K_M | Family: gemma3 | Size: 4080218931 | VRAM: 4080218931
BenchmarkModel/name=gpt-oss:20b/step=generate 512 19531.25 ns/token 51200.00 token/sec BenchmarkModel/name=gemma3/step=prefill 1 78125.00 ns/token 12800.00 token/sec
BenchmarkModel/name=gpt-oss:20b/step=load 1 1500000000 ns/request BenchmarkModel/name=gemma3/step=generate 1 19531.25 ns/token 51200.00 token/sec
BenchmarkModel/name=gemma3/step=ttft 1 45123000 ns/op
BenchmarkModel/name=gemma3/step=load 1 1500000000 ns/op
BenchmarkModel/name=gemma3/step=total 1 2861047625 ns/op
```
Use with benchstat:
```
./ollama-bench -model gemma3 -epochs 6 > gemma3.bench
benchstat -col /step gemma3.bench
```
Compare two runs:
```
./ollama-bench -model gemma3 -epochs 6 > before.bench
# ... make changes ...
./ollama-bench -model gemma3 -epochs 6 > after.bench
benchstat before.bench after.bench
``` ```
### CSV Format ### CSV Format
@@ -99,17 +116,28 @@ Machine-readable comma-separated values:
``` ```
NAME,STEP,COUNT,NS_PER_COUNT,TOKEN_PER_SEC NAME,STEP,COUNT,NS_PER_COUNT,TOKEN_PER_SEC
gpt-oss:20b,prefill,128,78125.00,12800.00 # Model: gemma3 | Params: 4.3B | Quant: Q4_K_M | Family: gemma3 | Size: 4080218931 | VRAM: 4080218931
gpt-oss:20b,generate,512,19531.25,51200.00 gemma3,prefill,128,78125.00,12800.00
gpt-oss:20b,load,1,1500000000,0 gemma3,generate,512,19531.25,51200.00
gemma3,ttft,1,45123000,0
gemma3,load,1,1500000000,0
gemma3,total,1,2861047625,0
``` ```
## Metrics Explained ## Metrics Explained
The tool reports four types of metrics for each model: The tool reports the following metrics for each epoch:
* prefill: Time spent processing the prompt * **prefill**: Time spent processing the prompt (ns/token)
* generate: Time spent generating the response * **generate**: Time spent generating the response (ns/token)
* load: Model loading time (one-time cost) * **ttft**: Time to first token -- latency from request start to first response content
* total: Total request duration * **load**: Model loading time (one-time cost)
* **total**: Total request duration
Additionally, the model info comment line (displayed once per model before epochs) includes:
* **Params**: Model parameter count (e.g., 4.3B)
* **Quant**: Quantization level (e.g., Q4_K_M)
* **Family**: Model family (e.g., gemma3)
* **Size**: Total model memory in bytes
* **VRAM**: GPU memory used by the loaded model (when Size > VRAM, the difference is CPU spill)

View File

@@ -30,6 +30,9 @@ type flagOptions struct {
outputFile *string outputFile *string
debug *bool debug *bool
verbose *bool verbose *bool
warmup *int
promptTokens *int
numCtx *int
} }
type Metrics struct { type Metrics struct {
@@ -39,48 +42,203 @@ type Metrics struct {
Duration time.Duration Duration time.Duration
} }
var once sync.Once type ModelInfo struct {
Name string
ParameterSize string
QuantizationLevel string
Family string
SizeBytes int64
VRAMBytes int64
NumCtx int64
}
const DefaultPrompt = `Please write a descriptive story about a llama named Alonso who grows up to be President of the Land of Llamas. Include details about Alonso's childhood, adolescent years, and how he grew up to be a political mover and shaker. Write the story with a sense of whimsy.` const DefaultPrompt = `Please write a descriptive story about a llama named Alonso who grows up to be President of the Land of Llamas. Include details about Alonso's childhood, adolescent years, and how he grew up to be a political mover and shaker. Write the story with a sense of whimsy.`
// Word list for generating prompts targeting a specific token count.
var promptWordList = []string{
"the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog",
"a", "bright", "sunny", "day", "in", "the", "meadow", "where",
"flowers", "bloom", "and", "birds", "sing", "their", "morning",
"songs", "while", "gentle", "breeze", "carries", "sweet", "scent",
"of", "pine", "trees", "across", "rolling", "hills", "toward",
"distant", "mountains", "covered", "with", "fresh", "snow",
"beneath", "clear", "blue", "sky", "children", "play", "near",
"old", "stone", "bridge", "that", "crosses", "winding", "river",
}
// tokensPerWord is the calibrated ratio of tokens to words for the current model.
// Initialized with a heuristic, then updated during warmup based on actual tokenization.
var tokensPerWord = 1.3
func generatePromptForTokenCount(targetTokens int, epoch int) string {
targetWords := int(float64(targetTokens) / tokensPerWord)
if targetWords < 1 {
targetWords = 1
}
// Vary the starting offset by epoch to defeat KV cache prefix matching
offset := epoch * 7 // stride by a prime to get good distribution
n := len(promptWordList)
words := make([]string, targetWords)
for i := range words {
words[i] = promptWordList[((i+offset)%n+n)%n]
}
return strings.Join(words, " ")
}
// calibratePromptTokens adjusts tokensPerWord based on actual tokenization from a warmup run.
func calibratePromptTokens(targetTokens, actualTokens, wordCount int) {
if actualTokens <= 0 || wordCount <= 0 {
return
}
tokensPerWord = float64(actualTokens) / float64(wordCount)
newWords := int(float64(targetTokens) / tokensPerWord)
fmt.Fprintf(os.Stderr, "bench: calibrated %.2f tokens/word (target=%d, got=%d, words=%d → %d)\n",
tokensPerWord, targetTokens, actualTokens, wordCount, newWords)
}
func buildGenerateRequest(model string, fOpt flagOptions, imgData api.ImageData, epoch int) *api.GenerateRequest {
options := make(map[string]interface{})
if *fOpt.maxTokens > 0 {
options["num_predict"] = *fOpt.maxTokens
}
options["temperature"] = *fOpt.temperature
if fOpt.seed != nil && *fOpt.seed > 0 {
options["seed"] = *fOpt.seed
}
if fOpt.numCtx != nil && *fOpt.numCtx > 0 {
options["num_ctx"] = *fOpt.numCtx
}
var keepAliveDuration *api.Duration
if *fOpt.keepAlive > 0 {
duration := api.Duration{Duration: time.Duration(*fOpt.keepAlive * float64(time.Second))}
keepAliveDuration = &duration
}
prompt := *fOpt.prompt
if *fOpt.promptTokens > 0 {
prompt = generatePromptForTokenCount(*fOpt.promptTokens, epoch)
} else {
// Vary the prompt per epoch to defeat KV cache prefix matching
prompt = fmt.Sprintf("[%d] %s", epoch, prompt)
}
req := &api.GenerateRequest{
Model: model,
Prompt: prompt,
Raw: true,
Options: options,
KeepAlive: keepAliveDuration,
}
if imgData != nil {
req.Images = []api.ImageData{imgData}
}
return req
}
func fetchModelInfo(ctx context.Context, client *api.Client, model string) ModelInfo {
info := ModelInfo{Name: model}
resp, err := client.Show(ctx, &api.ShowRequest{Model: model})
if err != nil {
fmt.Fprintf(os.Stderr, "WARNING: Could not fetch model info for '%s': %v\n", model, err)
return info
}
info.ParameterSize = resp.Details.ParameterSize
info.QuantizationLevel = resp.Details.QuantizationLevel
info.Family = resp.Details.Family
return info
}
func fetchMemoryUsage(ctx context.Context, client *api.Client, model string) (size, vram int64) {
resp, err := client.ListRunning(ctx)
if err != nil {
if debug := os.Getenv("OLLAMA_DEBUG"); debug != "" {
fmt.Fprintf(os.Stderr, "WARNING: Could not fetch memory usage: %v\n", err)
}
return 0, 0
}
for _, m := range resp.Models {
if m.Name == model || m.Model == model {
return m.Size, m.SizeVRAM
}
}
for _, m := range resp.Models {
if strings.HasPrefix(m.Name, model) || strings.HasPrefix(m.Model, model) {
return m.Size, m.SizeVRAM
}
}
return 0, 0
}
func fetchContextLength(ctx context.Context, client *api.Client, model string) int64 {
resp, err := client.ListRunning(ctx)
if err != nil {
return 0
}
for _, m := range resp.Models {
if m.Name == model || m.Model == model || strings.HasPrefix(m.Name, model) || strings.HasPrefix(m.Model, model) {
return int64(m.ContextLength)
}
}
return 0
}
func outputFormatHeader(w io.Writer, format string, verbose bool) {
switch format {
case "benchstat":
if verbose {
fmt.Fprintf(w, "goos: %s\n", runtime.GOOS)
fmt.Fprintf(w, "goarch: %s\n", runtime.GOARCH)
}
case "csv":
headings := []string{"NAME", "STEP", "COUNT", "NS_PER_COUNT", "TOKEN_PER_SEC"}
fmt.Fprintln(w, strings.Join(headings, ","))
}
}
func outputModelInfo(w io.Writer, format string, info ModelInfo) {
params := cmp.Or(info.ParameterSize, "unknown")
quant := cmp.Or(info.QuantizationLevel, "unknown")
family := cmp.Or(info.Family, "unknown")
memStr := ""
if info.SizeBytes > 0 {
memStr = fmt.Sprintf(" | Size: %d | VRAM: %d", info.SizeBytes, info.VRAMBytes)
}
ctxStr := ""
if info.NumCtx > 0 {
ctxStr = fmt.Sprintf(" | NumCtx: %d", info.NumCtx)
}
fmt.Fprintf(w, "# Model: %s | Params: %s | Quant: %s | Family: %s%s%s\n",
info.Name, params, quant, family, memStr, ctxStr)
}
func OutputMetrics(w io.Writer, format string, metrics []Metrics, verbose bool) { func OutputMetrics(w io.Writer, format string, metrics []Metrics, verbose bool) {
switch format { switch format {
case "benchstat": case "benchstat":
if verbose {
printHeader := func() {
fmt.Fprintf(w, "sysname: %s\n", runtime.GOOS)
fmt.Fprintf(w, "machine: %s\n", runtime.GOARCH)
}
once.Do(printHeader)
}
for _, m := range metrics { for _, m := range metrics {
if m.Step == "generate" || m.Step == "prefill" { if m.Step == "generate" || m.Step == "prefill" {
if m.Count > 0 { if m.Count > 0 {
nsPerToken := float64(m.Duration.Nanoseconds()) / float64(m.Count) nsPerToken := float64(m.Duration.Nanoseconds()) / float64(m.Count)
tokensPerSec := float64(m.Count) / (float64(m.Duration.Nanoseconds()) + 1e-12) * 1e9 tokensPerSec := float64(m.Count) / (float64(m.Duration.Nanoseconds()) + 1e-12) * 1e9
fmt.Fprintf(w, "BenchmarkModel/name=%s/step=%s 1 %.2f ns/token %.2f token/sec\n",
fmt.Fprintf(w, "BenchmarkModel/name=%s/step=%s %d %.2f ns/token %.2f token/sec\n", m.Model, m.Step, nsPerToken, tokensPerSec)
m.Model, m.Step, m.Count, nsPerToken, tokensPerSec)
} else { } else {
fmt.Fprintf(w, "BenchmarkModel/name=%s/step=%s %d 0 ns/token 0 token/sec\n", fmt.Fprintf(w, "BenchmarkModel/name=%s/step=%s 1 0 ns/token 0 token/sec\n",
m.Model, m.Step, m.Count) m.Model, m.Step)
} }
} else if m.Step == "ttft" {
fmt.Fprintf(w, "BenchmarkModel/name=%s/step=ttft 1 %d ns/op\n",
m.Model, m.Duration.Nanoseconds())
} else { } else {
var suffix string fmt.Fprintf(w, "BenchmarkModel/name=%s/step=%s 1 %d ns/op\n",
if m.Step == "load" { m.Model, m.Step, m.Duration.Nanoseconds())
suffix = "/step=load"
}
fmt.Fprintf(w, "BenchmarkModel/name=%s%s 1 %d ns/request\n",
m.Model, suffix, m.Duration.Nanoseconds())
} }
} }
case "csv": case "csv":
printHeader := func() {
headings := []string{"NAME", "STEP", "COUNT", "NS_PER_COUNT", "TOKEN_PER_SEC"}
fmt.Fprintln(w, strings.Join(headings, ","))
}
once.Do(printHeader)
for _, m := range metrics { for _, m := range metrics {
if m.Step == "generate" || m.Step == "prefill" { if m.Step == "generate" || m.Step == "prefill" {
var nsPerToken float64 var nsPerToken float64
@@ -94,39 +252,14 @@ func OutputMetrics(w io.Writer, format string, metrics []Metrics, verbose bool)
fmt.Fprintf(w, "%s,%s,1,%d,0\n", m.Model, m.Step, m.Duration.Nanoseconds()) fmt.Fprintf(w, "%s,%s,1,%d,0\n", m.Model, m.Step, m.Duration.Nanoseconds())
} }
} }
case "markdown":
printHeader := func() {
fmt.Fprintln(w, "| Model | Step | Count | Duration | nsPerToken | tokensPerSec |")
fmt.Fprintln(w, "|-------|------|-------|----------|------------|--------------|")
}
once.Do(printHeader)
for _, m := range metrics {
var nsPerToken, tokensPerSec float64
var nsPerTokenStr, tokensPerSecStr string
if m.Step == "generate" || m.Step == "prefill" {
nsPerToken = float64(m.Duration.Nanoseconds()) / float64(m.Count)
tokensPerSec = float64(m.Count) / (float64(m.Duration.Nanoseconds()) + 1e-12) * 1e9
nsPerTokenStr = fmt.Sprintf("%.2f", nsPerToken)
tokensPerSecStr = fmt.Sprintf("%.2f", tokensPerSec)
} else {
nsPerTokenStr = "-"
tokensPerSecStr = "-"
}
fmt.Fprintf(w, "| %s | %s | %d | %v | %s | %s |\n",
m.Model, m.Step, m.Count, m.Duration, nsPerTokenStr, tokensPerSecStr)
}
default: default:
fmt.Fprintf(os.Stderr, "Unknown output format '%s'\n", format) fmt.Fprintf(os.Stderr, "Unknown output format '%s'\n", format)
} }
} }
func BenchmarkChat(fOpt flagOptions) error { func BenchmarkModel(fOpt flagOptions) error {
models := strings.Split(*fOpt.models, ",") models := strings.Split(*fOpt.models, ",")
// todo - add multi-image support
var imgData api.ImageData var imgData api.ImageData
var err error var err error
if *fOpt.imageFile != "" { if *fOpt.imageFile != "" {
@@ -158,54 +291,100 @@ func BenchmarkChat(fOpt flagOptions) error {
out = f out = f
} }
outputFormatHeader(out, *fOpt.format, *fOpt.verbose)
// Log prompt-tokens info in debug mode
if *fOpt.debug && *fOpt.promptTokens > 0 {
prompt := generatePromptForTokenCount(*fOpt.promptTokens, 0)
wordCount := len(strings.Fields(prompt))
fmt.Fprintf(os.Stderr, "Generated prompt targeting ~%d tokens (%d words, varied per epoch)\n", *fOpt.promptTokens, wordCount)
}
for _, model := range models { for _, model := range models {
for range *fOpt.epochs { // Fetch model info
options := make(map[string]interface{}) infoCtx, infoCancel := context.WithTimeout(context.Background(), 10*time.Second)
if *fOpt.maxTokens > 0 { info := fetchModelInfo(infoCtx, client, model)
options["num_predict"] = *fOpt.maxTokens infoCancel()
// Warmup phase (uses negative epoch numbers to avoid colliding with timed epochs)
for i := range *fOpt.warmup {
req := buildGenerateRequest(model, fOpt, imgData, -(i + 1))
ctx, cancel := context.WithTimeout(context.Background(), time.Duration(*fOpt.timeout)*time.Second)
var warmupMetrics *api.Metrics
err = client.Generate(ctx, req, func(resp api.GenerateResponse) error {
if resp.Done {
warmupMetrics = &resp.Metrics
}
return nil
})
cancel()
if err != nil {
fmt.Fprintf(os.Stderr, "WARNING: Warmup %d/%d for %s failed: %v\n", i+1, *fOpt.warmup, model, err)
} else {
if *fOpt.debug {
fmt.Fprintf(os.Stderr, "Warmup %d/%d for %s complete\n", i+1, *fOpt.warmup, model)
}
// Calibrate prompt token count on last warmup run
if i == *fOpt.warmup-1 && *fOpt.promptTokens > 0 && warmupMetrics != nil {
prompt := generatePromptForTokenCount(*fOpt.promptTokens, -(i + 1))
wordCount := len(strings.Fields(prompt))
calibratePromptTokens(*fOpt.promptTokens, warmupMetrics.PromptEvalCount, wordCount)
}
} }
options["temperature"] = *fOpt.temperature
if fOpt.seed != nil && *fOpt.seed > 0 {
options["seed"] = *fOpt.seed
} }
var keepAliveDuration *api.Duration // Fetch memory/context info once after warmup (model is loaded and stable)
if *fOpt.keepAlive > 0 { memCtx, memCancel := context.WithTimeout(context.Background(), 5*time.Second)
duration := api.Duration{Duration: time.Duration(*fOpt.keepAlive * float64(time.Second))} info.SizeBytes, info.VRAMBytes = fetchMemoryUsage(memCtx, client, model)
keepAliveDuration = &duration if fOpt.numCtx != nil && *fOpt.numCtx > 0 {
info.NumCtx = int64(*fOpt.numCtx)
} else {
info.NumCtx = fetchContextLength(memCtx, client, model)
} }
memCancel()
req := &api.ChatRequest{ outputModelInfo(out, *fOpt.format, info)
Model: model,
Messages: []api.Message{
{
Role: "user",
Content: *fOpt.prompt,
},
},
Options: options,
KeepAlive: keepAliveDuration,
}
if imgData != nil {
req.Messages[0].Images = []api.ImageData{imgData}
}
// Timed epoch loop
shortCount := 0
for epoch := range *fOpt.epochs {
var responseMetrics *api.Metrics var responseMetrics *api.Metrics
var ttft time.Duration
short := false
// Retry loop: if the model hits a stop token before max-tokens,
// retry with a different prompt (up to maxRetries times).
const maxRetries = 3
for attempt := range maxRetries + 1 {
responseMetrics = nil
ttft = 0
var ttftOnce sync.Once
req := buildGenerateRequest(model, fOpt, imgData, epoch+attempt*1000)
requestStart := time.Now()
ctx, cancel := context.WithTimeout(context.Background(), time.Duration(*fOpt.timeout)*time.Second) ctx, cancel := context.WithTimeout(context.Background(), time.Duration(*fOpt.timeout)*time.Second)
defer cancel()
err = client.Chat(ctx, req, func(resp api.ChatResponse) error { err = client.Generate(ctx, req, func(resp api.GenerateResponse) error {
if *fOpt.debug { if *fOpt.debug {
fmt.Fprintf(os.Stderr, "%s", cmp.Or(resp.Message.Thinking, resp.Message.Content)) fmt.Fprintf(os.Stderr, "%s", cmp.Or(resp.Thinking, resp.Response))
} }
// Capture TTFT on first content
ttftOnce.Do(func() {
if resp.Response != "" || resp.Thinking != "" {
ttft = time.Since(requestStart)
}
})
if resp.Done { if resp.Done {
responseMetrics = &resp.Metrics responseMetrics = &resp.Metrics
} }
return nil return nil
}) })
cancel()
if *fOpt.debug { if *fOpt.debug {
fmt.Fprintln(os.Stderr) fmt.Fprintln(os.Stderr)
@@ -213,18 +392,42 @@ func BenchmarkChat(fOpt flagOptions) error {
if err != nil { if err != nil {
if ctx.Err() == context.DeadlineExceeded { if ctx.Err() == context.DeadlineExceeded {
fmt.Fprintf(os.Stderr, "ERROR: Chat request timed out with model '%s' after %vs\n", model, 1) fmt.Fprintf(os.Stderr, "ERROR: Request timed out with model '%s' after %vs\n", model, *fOpt.timeout)
continue } else {
fmt.Fprintf(os.Stderr, "ERROR: Couldn't generate with model '%s': %v\n", model, err)
} }
fmt.Fprintf(os.Stderr, "ERROR: Couldn't chat with model '%s': %v\n", model, err) break
continue
} }
if responseMetrics == nil { if responseMetrics == nil {
fmt.Fprintf(os.Stderr, "ERROR: No metrics received for model '%s'\n", model) fmt.Fprintf(os.Stderr, "ERROR: No metrics received for model '%s'\n", model)
break
}
// Check if the response was shorter than requested
short = *fOpt.maxTokens > 0 && responseMetrics.EvalCount < *fOpt.maxTokens
if !short || attempt == maxRetries {
break
}
if *fOpt.debug {
fmt.Fprintf(os.Stderr, "Short response (%d/%d tokens), retrying with different prompt (attempt %d/%d)\n",
responseMetrics.EvalCount, *fOpt.maxTokens, attempt+1, maxRetries)
}
}
if err != nil || responseMetrics == nil {
continue continue
} }
if short {
shortCount++
if *fOpt.debug {
fmt.Fprintf(os.Stderr, "WARNING: Short response (%d/%d tokens) after %d retries for epoch %d\n",
responseMetrics.EvalCount, *fOpt.maxTokens, maxRetries, epoch+1)
}
}
metrics := []Metrics{ metrics := []Metrics{
{ {
Model: model, Model: model,
@@ -238,6 +441,12 @@ func BenchmarkChat(fOpt flagOptions) error {
Count: responseMetrics.EvalCount, Count: responseMetrics.EvalCount,
Duration: responseMetrics.EvalDuration, Duration: responseMetrics.EvalDuration,
}, },
{
Model: model,
Step: "ttft",
Count: 1,
Duration: ttft,
},
{ {
Model: model, Model: model,
Step: "load", Step: "load",
@@ -254,15 +463,42 @@ func BenchmarkChat(fOpt flagOptions) error {
OutputMetrics(out, *fOpt.format, metrics, *fOpt.verbose) OutputMetrics(out, *fOpt.format, metrics, *fOpt.verbose)
if *fOpt.debug && *fOpt.promptTokens > 0 {
fmt.Fprintf(os.Stderr, "Generated prompt targeting ~%d tokens (actual: %d)\n",
*fOpt.promptTokens, responseMetrics.PromptEvalCount)
}
if *fOpt.keepAlive > 0 { if *fOpt.keepAlive > 0 {
time.Sleep(time.Duration(*fOpt.keepAlive*float64(time.Second)) + 200*time.Millisecond) time.Sleep(time.Duration(*fOpt.keepAlive*float64(time.Second)) + 200*time.Millisecond)
} }
} }
if shortCount > 0 {
fmt.Fprintf(os.Stderr, "WARNING: %d/%d epochs for '%s' had short responses (<%d tokens). Generation metrics may be unreliable.\n",
shortCount, *fOpt.epochs, model, *fOpt.maxTokens)
}
// Unload model before moving to the next one
unloadModel(client, model, *fOpt.timeout)
} }
return nil return nil
} }
func unloadModel(client *api.Client, model string, timeout int) {
ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeout)*time.Second)
defer cancel()
zero := api.Duration{Duration: 0}
req := &api.GenerateRequest{
Model: model,
KeepAlive: &zero,
}
_ = client.Generate(ctx, req, func(resp api.GenerateResponse) error {
return nil
})
}
func readImage(filePath string) (api.ImageData, error) { func readImage(filePath string) (api.ImageData, error) {
file, err := os.Open(filePath) file, err := os.Open(filePath)
if err != nil { if err != nil {
@@ -289,10 +525,13 @@ func main() {
prompt: flag.String("p", DefaultPrompt, "Prompt to use"), prompt: flag.String("p", DefaultPrompt, "Prompt to use"),
imageFile: flag.String("image", "", "Filename for an image to include"), imageFile: flag.String("image", "", "Filename for an image to include"),
keepAlive: flag.Float64("k", 0, "Keep alive duration in seconds"), keepAlive: flag.Float64("k", 0, "Keep alive duration in seconds"),
format: flag.String("format", "markdown", "Output format [benchstat|csv] (default benchstat)"), format: flag.String("format", "benchstat", "Output format [benchstat|csv]"),
outputFile: flag.String("output", "", "Output file for results (stdout if empty)"), outputFile: flag.String("output", "", "Output file for results (stdout if empty)"),
verbose: flag.Bool("v", false, "Show system information"), verbose: flag.Bool("v", false, "Show system information"),
debug: flag.Bool("debug", false, "Show debug information"), debug: flag.Bool("debug", false, "Show debug information"),
warmup: flag.Int("warmup", 1, "Number of warmup requests before timing"),
promptTokens: flag.Int("prompt-tokens", 0, "Generate prompt targeting ~N tokens (0 = use -p prompt)"),
numCtx: flag.Int("num-ctx", 0, "Context size (0 = server default)"),
} }
flag.Usage = func() { flag.Usage = func() {
@@ -302,11 +541,12 @@ func main() {
fmt.Fprintf(os.Stderr, "Options:\n") fmt.Fprintf(os.Stderr, "Options:\n")
flag.PrintDefaults() flag.PrintDefaults()
fmt.Fprintf(os.Stderr, "\nExamples:\n") fmt.Fprintf(os.Stderr, "\nExamples:\n")
fmt.Fprintf(os.Stderr, " bench -model gpt-oss:20b -epochs 3 -temperature 0.7\n") fmt.Fprintf(os.Stderr, " bench -model gemma3,llama3 -epochs 6\n")
fmt.Fprintf(os.Stderr, " bench -model gemma3 -epochs 6 -prompt-tokens 512 -format csv\n")
} }
flag.Parse() flag.Parse()
if !slices.Contains([]string{"markdown", "benchstat", "csv"}, *fOpt.format) { if !slices.Contains([]string{"benchstat", "csv"}, *fOpt.format) {
fmt.Fprintf(os.Stderr, "ERROR: Unknown format '%s'\n", *fOpt.format) fmt.Fprintf(os.Stderr, "ERROR: Unknown format '%s'\n", *fOpt.format)
os.Exit(1) os.Exit(1)
} }
@@ -317,5 +557,5 @@ func main() {
return return
} }
BenchmarkChat(fOpt) BenchmarkModel(fOpt)
} }

File diff suppressed because it is too large Load Diff

View File

@@ -11,10 +11,12 @@ import (
"fmt" "fmt"
"io" "io"
"log" "log"
"log/slog"
"math" "math"
"net" "net"
"net/http" "net/http"
"os" "os"
"os/exec"
"os/signal" "os/signal"
"path/filepath" "path/filepath"
"runtime" "runtime"
@@ -29,14 +31,20 @@ import (
"github.com/containerd/console" "github.com/containerd/console"
"github.com/mattn/go-runewidth" "github.com/mattn/go-runewidth"
"github.com/olekukonko/tablewriter" "github.com/olekukonko/tablewriter"
"github.com/pkg/browser"
"github.com/spf13/cobra" "github.com/spf13/cobra"
"golang.org/x/crypto/ssh" "golang.org/x/crypto/ssh"
"golang.org/x/sync/errgroup" "golang.org/x/sync/errgroup"
"golang.org/x/term" "golang.org/x/term"
"github.com/ollama/ollama/api" "github.com/ollama/ollama/api"
"github.com/ollama/ollama/cmd/config"
"github.com/ollama/ollama/cmd/launch"
"github.com/ollama/ollama/cmd/tui"
"github.com/ollama/ollama/envconfig" "github.com/ollama/ollama/envconfig"
"github.com/ollama/ollama/format" "github.com/ollama/ollama/format"
"github.com/ollama/ollama/internal/modelref"
"github.com/ollama/ollama/logutil"
"github.com/ollama/ollama/parser" "github.com/ollama/ollama/parser"
"github.com/ollama/ollama/progress" "github.com/ollama/ollama/progress"
"github.com/ollama/ollama/readline" "github.com/ollama/ollama/readline"
@@ -46,12 +54,48 @@ import (
"github.com/ollama/ollama/types/syncmap" "github.com/ollama/ollama/types/syncmap"
"github.com/ollama/ollama/version" "github.com/ollama/ollama/version"
xcmd "github.com/ollama/ollama/x/cmd" xcmd "github.com/ollama/ollama/x/cmd"
"github.com/ollama/ollama/x/create"
xcreateclient "github.com/ollama/ollama/x/create/client" xcreateclient "github.com/ollama/ollama/x/create/client"
"github.com/ollama/ollama/x/imagegen" "github.com/ollama/ollama/x/imagegen"
) )
const ConnectInstructions = "To sign in, navigate to:\n %s\n\n" func init() {
// Override default selectors to use Bubbletea TUI instead of raw terminal I/O.
launch.DefaultSingleSelector = func(title string, items []launch.ModelItem, current string) (string, error) {
if !term.IsTerminal(int(os.Stdin.Fd())) || !term.IsTerminal(int(os.Stdout.Fd())) {
return "", fmt.Errorf("model selection requires an interactive terminal; use --model to run in headless mode")
}
tuiItems := tui.ReorderItems(tui.ConvertItems(items))
result, err := tui.SelectSingle(title, tuiItems, current)
if errors.Is(err, tui.ErrCancelled) {
return "", launch.ErrCancelled
}
return result, err
}
launch.DefaultMultiSelector = func(title string, items []launch.ModelItem, preChecked []string) ([]string, error) {
if !term.IsTerminal(int(os.Stdin.Fd())) || !term.IsTerminal(int(os.Stdout.Fd())) {
return nil, fmt.Errorf("model selection requires an interactive terminal; use --model to run in headless mode")
}
tuiItems := tui.ReorderItems(tui.ConvertItems(items))
result, err := tui.SelectMultiple(title, tuiItems, preChecked)
if errors.Is(err, tui.ErrCancelled) {
return nil, launch.ErrCancelled
}
return result, err
}
launch.DefaultSignIn = func(modelName, signInURL string) (string, error) {
userName, err := tui.RunSignIn(modelName, signInURL)
if errors.Is(err, tui.ErrCancelled) {
return "", launch.ErrCancelled
}
return userName, err
}
launch.DefaultConfirmPrompt = tui.RunConfirmWithOptions
}
const ConnectInstructions = "If your browser did not open, navigate to:\n %s\n\n"
// ensureThinkingSupport emits a warning if the model does not advertise thinking support // ensureThinkingSupport emits a warning if the model does not advertise thinking support
func ensureThinkingSupport(ctx context.Context, client *api.Client, name string) { func ensureThinkingSupport(ctx context.Context, client *api.Client, name string) {
@@ -90,6 +134,17 @@ func getModelfileName(cmd *cobra.Command) (string, error) {
return absName, nil return absName, nil
} }
// isLocalhost returns true if the configured Ollama host is a loopback or unspecified address.
func isLocalhost() bool {
host := envconfig.Host()
h, _, _ := net.SplitHostPort(host.Host)
if h == "localhost" {
return true
}
ip := net.ParseIP(h)
return ip != nil && (ip.IsLoopback() || ip.IsUnspecified())
}
func CreateHandler(cmd *cobra.Command, args []string) error { func CreateHandler(cmd *cobra.Command, args []string) error {
p := progress.NewProgress(os.Stderr) p := progress.NewProgress(os.Stderr)
defer p.Stop() defer p.Stop()
@@ -102,8 +157,13 @@ func CreateHandler(cmd *cobra.Command, args []string) error {
} }
// Check for --experimental flag for safetensors model creation // Check for --experimental flag for safetensors model creation
// This gates both safetensors LLM and imagegen model creation
experimental, _ := cmd.Flags().GetBool("experimental") experimental, _ := cmd.Flags().GetBool("experimental")
if experimental { if experimental {
if !isLocalhost() {
return errors.New("remote safetensor model creation not yet supported")
}
// Get Modelfile content - either from -f flag or default to "FROM ." // Get Modelfile content - either from -f flag or default to "FROM ."
var reader io.Reader var reader io.Reader
filename, err := getModelfileName(cmd) filename, err := getModelfileName(cmd)
@@ -127,25 +187,9 @@ func CreateHandler(cmd *cobra.Command, args []string) error {
return fmt.Errorf("failed to parse Modelfile: %w", err) return fmt.Errorf("failed to parse Modelfile: %w", err)
} }
// Extract FROM path and configuration modelDir, mfConfig, err := xcreateclient.ConfigFromModelfile(modelfile)
var modelDir string if err != nil {
mfConfig := &xcreateclient.ModelfileConfig{} return err
for _, cmd := range modelfile.Commands {
switch cmd.Name {
case "model":
modelDir = cmd.Args
case "template":
mfConfig.Template = cmd.Args
case "system":
mfConfig.System = cmd.Args
case "license":
mfConfig.License = cmd.Args
}
}
if modelDir == "" {
modelDir = "."
} }
// Resolve relative paths based on Modelfile location // Resolve relative paths based on Modelfile location
@@ -162,20 +206,12 @@ func CreateHandler(cmd *cobra.Command, args []string) error {
}, p) }, p)
} }
// Standard Modelfile + API path
var reader io.Reader var reader io.Reader
filename, err := getModelfileName(cmd) filename, err := getModelfileName(cmd)
if os.IsNotExist(err) { if os.IsNotExist(err) {
if filename == "" { if filename == "" {
// No Modelfile found - check if current directory is an image gen model
if create.IsTensorModelDir(".") {
quantize, _ := cmd.Flags().GetString("quantize")
return xcreateclient.CreateModel(xcreateclient.CreateOptions{
ModelName: modelName,
ModelDir: ".",
Quantize: quantize,
}, p)
}
reader = strings.NewReader("FROM .\n") reader = strings.NewReader("FROM .\n")
} else { } else {
return errModelfileNotFound return errModelfileNotFound
@@ -361,18 +397,35 @@ func loadOrUnloadModel(cmd *cobra.Command, opts *runOptions) error {
return err return err
} }
requestedCloud := modelref.HasExplicitCloudSource(opts.Model)
if info, err := client.Show(cmd.Context(), &api.ShowRequest{Model: opts.Model}); err != nil { if info, err := client.Show(cmd.Context(), &api.ShowRequest{Model: opts.Model}); err != nil {
return err return err
} else if info.RemoteHost != "" { } else if info.RemoteHost != "" || requestedCloud {
// Cloud model, no need to load/unload // Cloud model, no need to load/unload
isCloud := requestedCloud || strings.HasPrefix(info.RemoteHost, "https://ollama.com")
// Check if user is signed in for ollama.com cloud models
if isCloud {
if _, err := client.Whoami(cmd.Context()); err != nil {
return err
}
}
if opts.ShowConnect { if opts.ShowConnect {
p.StopAndClear() p.StopAndClear()
if strings.HasPrefix(info.RemoteHost, "https://ollama.com") { remoteModel := info.RemoteModel
fmt.Fprintf(os.Stderr, "Connecting to '%s' on 'ollama.com' ⚡\n", info.RemoteModel) if remoteModel == "" {
remoteModel = opts.Model
}
if isCloud {
fmt.Fprintf(os.Stderr, "Connecting to '%s' on 'ollama.com' ⚡\n", remoteModel)
} else { } else {
fmt.Fprintf(os.Stderr, "Connecting to '%s' on '%s'\n", info.RemoteModel, info.RemoteHost) fmt.Fprintf(os.Stderr, "Connecting to '%s' on '%s'\n", remoteModel, info.RemoteHost)
} }
} }
return nil return nil
} }
@@ -441,6 +494,64 @@ func generateEmbedding(cmd *cobra.Command, modelName, input string, keepAlive *a
return nil return nil
} }
// TODO(parthsareen): consolidate with TUI signin flow
func handleCloudAuthorizationError(err error) bool {
var authErr api.AuthorizationError
if errors.As(err, &authErr) && authErr.StatusCode == http.StatusUnauthorized {
fmt.Printf("You need to be signed in to Ollama to run Cloud models.\n\n")
if authErr.SigninURL != "" {
fmt.Printf(ConnectInstructions, authErr.SigninURL)
}
return true
}
return false
}
// TEMP(drifkin): To match legacy `ollama run some-model:cloud` behavior, we
// best-effort pull cloud stub files for any explicit cloud source models.
// Remove this once `/api/tags` is cloud-aware.
func ensureCloudStub(ctx context.Context, client *api.Client, modelName string) {
if !modelref.HasExplicitCloudSource(modelName) {
return
}
normalizedName, _, err := modelref.NormalizePullName(modelName)
if err != nil {
slog.Warn("failed to normalize pull name", "model", modelName, "error", err, "normalizedName", normalizedName)
return
}
listResp, err := client.List(ctx)
if err != nil {
slog.Warn("failed to list models", "error", err)
return
}
if hasListedModelName(listResp.Models, modelName) || hasListedModelName(listResp.Models, normalizedName) {
return
}
logutil.Trace("pulling cloud stub", "model", modelName, "normalizedName", normalizedName)
err = client.Pull(ctx, &api.PullRequest{
Model: normalizedName,
}, func(api.ProgressResponse) error {
return nil
})
if err != nil {
slog.Warn("failed to pull cloud stub", "model", modelName, "error", err)
}
}
func hasListedModelName(models []api.ListModelResponse, name string) bool {
for _, m := range models {
if strings.EqualFold(m.Name, name) || strings.EqualFold(m.Model, name) {
return true
}
}
return false
}
func RunHandler(cmd *cobra.Command, args []string) error { func RunHandler(cmd *cobra.Command, args []string) error {
interactive := true interactive := true
@@ -537,12 +648,16 @@ func RunHandler(cmd *cobra.Command, args []string) error {
} }
name := args[0] name := args[0]
requestedCloud := modelref.HasExplicitCloudSource(name)
info, err := func() (*api.ShowResponse, error) { info, err := func() (*api.ShowResponse, error) {
showReq := &api.ShowRequest{Name: name} showReq := &api.ShowRequest{Name: name}
info, err := client.Show(cmd.Context(), showReq) info, err := client.Show(cmd.Context(), showReq)
var se api.StatusError var se api.StatusError
if errors.As(err, &se) && se.StatusCode == http.StatusNotFound { if errors.As(err, &se) && se.StatusCode == http.StatusNotFound {
if requestedCloud {
return nil, err
}
if err := PullHandler(cmd, []string{name}); err != nil { if err := PullHandler(cmd, []string{name}); err != nil {
return nil, err return nil, err
} }
@@ -551,15 +666,21 @@ func RunHandler(cmd *cobra.Command, args []string) error {
return info, err return info, err
}() }()
if err != nil { if err != nil {
if handleCloudAuthorizationError(err) {
return nil
}
return err return err
} }
ensureCloudStub(cmd.Context(), client, name)
opts.Think, err = inferThinkingOption(&info.Capabilities, &opts, thinkFlag.Changed) opts.Think, err = inferThinkingOption(&info.Capabilities, &opts, thinkFlag.Changed)
if err != nil { if err != nil {
return err return err
} }
opts.MultiModal = slices.Contains(info.Capabilities, model.CapabilityVision) audioCapable := slices.Contains(info.Capabilities, model.CapabilityAudio)
opts.MultiModal = slices.Contains(info.Capabilities, model.CapabilityVision) || audioCapable
// TODO: remove the projector info and vision info checks below, // TODO: remove the projector info and vision info checks below,
// these are left in for backwards compatibility with older servers // these are left in for backwards compatibility with older servers
@@ -574,7 +695,7 @@ func RunHandler(cmd *cobra.Command, args []string) error {
} }
} }
opts.ParentModel = info.Details.ParentModel applyShowResponseToRunOptions(&opts, info)
// Check if this is an embedding model // Check if this is an embedding model
isEmbeddingModel := slices.Contains(info.Capabilities, model.CapabilityEmbedding) isEmbeddingModel := slices.Contains(info.Capabilities, model.CapabilityEmbedding)
@@ -645,7 +766,13 @@ func RunHandler(cmd *cobra.Command, args []string) error {
return generateInteractive(cmd, opts) return generateInteractive(cmd, opts)
} }
return generate(cmd, opts) if err := generate(cmd, opts); err != nil {
if handleCloudAuthorizationError(err) {
return nil
}
return err
}
return nil
} }
func SigninHandler(cmd *cobra.Command, args []string) error { func SigninHandler(cmd *cobra.Command, args []string) error {
@@ -662,6 +789,7 @@ func SigninHandler(cmd *cobra.Command, args []string) error {
fmt.Println() fmt.Println()
if aErr.SigninURL != "" { if aErr.SigninURL != "" {
_ = browser.OpenURL(aErr.SigninURL)
fmt.Printf(ConnectInstructions, aErr.SigninURL) fmt.Printf(ConnectInstructions, aErr.SigninURL)
} }
return nil return nil
@@ -1018,8 +1146,10 @@ func showInfo(resp *api.ShowResponse, verbose bool, w io.Writer) error {
} }
if resp.ModelInfo != nil { if resp.ModelInfo != nil {
arch := resp.ModelInfo["general.architecture"].(string) arch, _ := resp.ModelInfo["general.architecture"].(string)
if arch != "" {
rows = append(rows, []string{"", "architecture", arch}) rows = append(rows, []string{"", "architecture", arch})
}
var paramStr string var paramStr string
if resp.Details.ParameterSize != "" { if resp.Details.ParameterSize != "" {
@@ -1029,7 +1159,9 @@ func showInfo(resp *api.ShowResponse, verbose bool, w io.Writer) error {
paramStr = format.HumanNumber(uint64(f)) paramStr = format.HumanNumber(uint64(f))
} }
} }
if paramStr != "" {
rows = append(rows, []string{"", "parameters", paramStr}) rows = append(rows, []string{"", "parameters", paramStr})
}
if v, ok := resp.ModelInfo[fmt.Sprintf("%s.context_length", arch)]; ok { if v, ok := resp.ModelInfo[fmt.Sprintf("%s.context_length", arch)]; ok {
if f, ok := v.(float64); ok { if f, ok := v.(float64); ok {
@@ -1281,6 +1413,7 @@ type generateContextKey string
type runOptions struct { type runOptions struct {
Model string Model string
ParentModel string ParentModel string
LoadedMessages []api.Message
Prompt string Prompt string
Messages []api.Message Messages []api.Message
WordWrap bool WordWrap bool
@@ -1296,6 +1429,12 @@ type runOptions struct {
} }
func (r runOptions) Copy() runOptions { func (r runOptions) Copy() runOptions {
var loadedMessages []api.Message
if r.LoadedMessages != nil {
loadedMessages = make([]api.Message, len(r.LoadedMessages))
copy(loadedMessages, r.LoadedMessages)
}
var messages []api.Message var messages []api.Message
if r.Messages != nil { if r.Messages != nil {
messages = make([]api.Message, len(r.Messages)) messages = make([]api.Message, len(r.Messages))
@@ -1325,6 +1464,7 @@ func (r runOptions) Copy() runOptions {
return runOptions{ return runOptions{
Model: r.Model, Model: r.Model,
ParentModel: r.ParentModel, ParentModel: r.ParentModel,
LoadedMessages: loadedMessages,
Prompt: r.Prompt, Prompt: r.Prompt,
Messages: messages, Messages: messages,
WordWrap: r.WordWrap, WordWrap: r.WordWrap,
@@ -1340,6 +1480,11 @@ func (r runOptions) Copy() runOptions {
} }
} }
func applyShowResponseToRunOptions(opts *runOptions, info *api.ShowResponse) {
opts.ParentModel = info.Details.ParentModel
opts.LoadedMessages = slices.Clone(info.Messages)
}
type displayResponseState struct { type displayResponseState struct {
lineLength int lineLength int
wordBuffer string wordBuffer string
@@ -1347,6 +1492,9 @@ type displayResponseState struct {
func displayResponse(content string, wordWrap bool, state *displayResponseState) { func displayResponse(content string, wordWrap bool, state *displayResponseState) {
termWidth, _, _ := term.GetSize(int(os.Stdout.Fd())) termWidth, _, _ := term.GetSize(int(os.Stdout.Fd()))
if termWidth == 0 {
termWidth = 80
}
if wordWrap && termWidth >= 10 { if wordWrap && termWidth >= 10 {
for _, ch := range content { for _, ch := range content {
if state.lineLength+1 > termWidth-5 { if state.lineLength+1 > termWidth-5 {
@@ -1745,7 +1893,7 @@ func checkServerHeartbeat(cmd *cobra.Command, _ []string) error {
return err return err
} }
if err := startApp(cmd.Context(), client); err != nil { if err := startApp(cmd.Context(), client); err != nil {
return fmt.Errorf("ollama server not responding - %w", err) return err
} }
} }
return nil return nil
@@ -1786,6 +1934,148 @@ Environment Variables:
cmd.SetUsageTemplate(cmd.UsageTemplate() + envUsage) cmd.SetUsageTemplate(cmd.UsageTemplate() + envUsage)
} }
// ensureServerRunning checks if the ollama server is running and starts it in the background if not.
func ensureServerRunning(ctx context.Context) error {
client, err := api.ClientFromEnvironment()
if err != nil {
return err
}
// Check if server is already running
if err := client.Heartbeat(ctx); err == nil {
return nil // server is already running
}
// Server not running, start it in the background
exe, err := os.Executable()
if err != nil {
return fmt.Errorf("could not find executable: %w", err)
}
serverCmd := exec.CommandContext(ctx, exe, "serve")
serverCmd.Env = os.Environ()
serverCmd.SysProcAttr = backgroundServerSysProcAttr()
if err := serverCmd.Start(); err != nil {
return fmt.Errorf("failed to start server: %w", err)
}
// Wait for the server to be ready
for {
time.Sleep(500 * time.Millisecond)
if err := client.Heartbeat(ctx); err == nil {
return nil // server has started
}
}
}
func launchInteractiveModel(cmd *cobra.Command, modelName string) error {
opts := runOptions{
Model: modelName,
WordWrap: os.Getenv("TERM") == "xterm-256color",
Options: map[string]any{},
ShowConnect: true,
}
// loadOrUnloadModel is cloud-safe here: remote/cloud models skip local preload
// and only validate auth/connectivity before interactive chat starts.
if err := loadOrUnloadModel(cmd, &opts); err != nil {
return fmt.Errorf("error loading model: %w", err)
}
if err := generateInteractive(cmd, opts); err != nil {
return fmt.Errorf("error running model: %w", err)
}
return nil
}
// runInteractiveTUI runs the main interactive TUI menu.
func runInteractiveTUI(cmd *cobra.Command) {
// Ensure the server is running before showing the TUI
if err := ensureServerRunning(cmd.Context()); err != nil {
fmt.Fprintf(os.Stderr, "Error starting server: %v\n", err)
return
}
deps := launcherDeps{
buildState: launch.BuildLauncherState,
runMenu: tui.RunMenu,
resolveRunModel: launch.ResolveRunModel,
launchIntegration: launch.LaunchIntegration,
runModel: launchInteractiveModel,
}
for {
continueLoop, err := runInteractiveTUIStep(cmd, deps)
if err != nil {
fmt.Fprintf(os.Stderr, "Error: %v\n", err)
}
if !continueLoop {
return
}
}
}
type launcherDeps struct {
buildState func(context.Context) (*launch.LauncherState, error)
runMenu func(*launch.LauncherState) (tui.TUIAction, error)
resolveRunModel func(context.Context, launch.RunModelRequest) (string, error)
launchIntegration func(context.Context, launch.IntegrationLaunchRequest) error
runModel func(*cobra.Command, string) error
}
func runInteractiveTUIStep(cmd *cobra.Command, deps launcherDeps) (bool, error) {
state, err := deps.buildState(cmd.Context())
if err != nil {
return false, fmt.Errorf("build launcher state: %w", err)
}
action, err := deps.runMenu(state)
if err != nil {
return false, fmt.Errorf("run launcher menu: %w", err)
}
return runLauncherAction(cmd, action, deps)
}
func saveLauncherSelection(action tui.TUIAction) {
// Best effort only: this affects menu recall, not launch correctness.
_ = config.SetLastSelection(action.LastSelection())
}
func runLauncherAction(cmd *cobra.Command, action tui.TUIAction, deps launcherDeps) (bool, error) {
switch action.Kind {
case tui.TUIActionNone:
return false, nil
case tui.TUIActionRunModel:
saveLauncherSelection(action)
modelName, err := deps.resolveRunModel(cmd.Context(), action.RunModelRequest())
if errors.Is(err, launch.ErrCancelled) {
return true, nil
}
if err != nil {
return true, fmt.Errorf("selecting model: %w", err)
}
if err := deps.runModel(cmd, modelName); err != nil {
return true, err
}
return true, nil
case tui.TUIActionLaunchIntegration:
saveLauncherSelection(action)
err := deps.launchIntegration(cmd.Context(), action.IntegrationLaunchRequest())
if errors.Is(err, launch.ErrCancelled) {
return true, nil
}
if err != nil {
return true, fmt.Errorf("launching %s: %w", action.Integration, err)
}
// VS Code is a GUI app — exit the TUI loop after launching
if action.Integration == "vscode" {
return false, nil
}
return true, nil
default:
return false, fmt.Errorf("unknown launcher action: %d", action.Kind)
}
}
func NewCLI() *cobra.Command { func NewCLI() *cobra.Command {
log.SetFlags(log.LstdFlags | log.Lshortfile) log.SetFlags(log.LstdFlags | log.Lshortfile)
cobra.EnableCommandSorting = false cobra.EnableCommandSorting = false
@@ -1808,11 +2098,13 @@ func NewCLI() *cobra.Command {
return return
} }
cmd.Print(cmd.UsageString()) runInteractiveTUI(cmd)
}, },
} }
rootCmd.Flags().BoolP("version", "v", false, "Show version information") rootCmd.Flags().BoolP("version", "v", false, "Show version information")
rootCmd.Flags().Bool("verbose", false, "Show timings for response")
rootCmd.Flags().Bool("nowordwrap", false, "Don't wrap words to the next line automatically")
createCmd := &cobra.Command{ createCmd := &cobra.Command{
Use: "create MODEL", Use: "create MODEL",
@@ -1872,6 +2164,9 @@ func NewCLI() *cobra.Command {
// Image generation flags (width, height, steps, seed, etc.) // Image generation flags (width, height, steps, seed, etc.)
imagegen.RegisterFlags(runCmd) imagegen.RegisterFlags(runCmd)
runCmd.Flags().Bool("imagegen", false, "Use the imagegen runner for LLM inference")
runCmd.Flags().MarkHidden("imagegen")
stopCmd := &cobra.Command{ stopCmd := &cobra.Command{
Use: "stop MODEL", Use: "stop MODEL",
Short: "Stop a running model", Short: "Stop a running model",
@@ -1883,7 +2178,7 @@ func NewCLI() *cobra.Command {
serveCmd := &cobra.Command{ serveCmd := &cobra.Command{
Use: "serve", Use: "serve",
Aliases: []string{"start"}, Aliases: []string{"start"},
Short: "Start ollama", Short: "Start Ollama",
Args: cobra.ExactArgs(0), Args: cobra.ExactArgs(0),
RunE: RunServer, RunE: RunServer,
} }
@@ -1916,6 +2211,15 @@ func NewCLI() *cobra.Command {
RunE: SigninHandler, RunE: SigninHandler,
} }
loginCmd := &cobra.Command{
Use: "login",
Short: "Sign in to ollama.com",
Hidden: true,
Args: cobra.ExactArgs(0),
PreRunE: checkServerHeartbeat,
RunE: SigninHandler,
}
signoutCmd := &cobra.Command{ signoutCmd := &cobra.Command{
Use: "signout", Use: "signout",
Short: "Sign out from ollama.com", Short: "Sign out from ollama.com",
@@ -1924,6 +2228,15 @@ func NewCLI() *cobra.Command {
RunE: SignoutHandler, RunE: SignoutHandler,
} }
logoutCmd := &cobra.Command{
Use: "logout",
Short: "Sign out from ollama.com",
Hidden: true,
Args: cobra.ExactArgs(0),
PreRunE: checkServerHeartbeat,
RunE: SignoutHandler,
}
listCmd := &cobra.Command{ listCmd := &cobra.Command{
Use: "list", Use: "list",
Aliases: []string{"ls"}, Aliases: []string{"ls"},
@@ -1986,7 +2299,7 @@ func NewCLI() *cobra.Command {
switch cmd { switch cmd {
case runCmd: case runCmd:
imagegen.AppendFlagsDocs(cmd) imagegen.AppendFlagsDocs(cmd)
appendEnvDocs(cmd, []envconfig.EnvVar{envVars["OLLAMA_HOST"], envVars["OLLAMA_NOHISTORY"]}) appendEnvDocs(cmd, []envconfig.EnvVar{envVars["OLLAMA_EDITOR"], envVars["OLLAMA_HOST"], envVars["OLLAMA_NOHISTORY"]})
case serveCmd: case serveCmd:
appendEnvDocs(cmd, []envconfig.EnvVar{ appendEnvDocs(cmd, []envconfig.EnvVar{
envVars["OLLAMA_DEBUG"], envVars["OLLAMA_DEBUG"],
@@ -1997,6 +2310,7 @@ func NewCLI() *cobra.Command {
envVars["OLLAMA_MAX_QUEUE"], envVars["OLLAMA_MAX_QUEUE"],
envVars["OLLAMA_MODELS"], envVars["OLLAMA_MODELS"],
envVars["OLLAMA_NUM_PARALLEL"], envVars["OLLAMA_NUM_PARALLEL"],
envVars["OLLAMA_NO_CLOUD"],
envVars["OLLAMA_NOPRUNE"], envVars["OLLAMA_NOPRUNE"],
envVars["OLLAMA_ORIGINS"], envVars["OLLAMA_ORIGINS"],
envVars["OLLAMA_SCHED_SPREAD"], envVars["OLLAMA_SCHED_SPREAD"],
@@ -2020,12 +2334,15 @@ func NewCLI() *cobra.Command {
pullCmd, pullCmd,
pushCmd, pushCmd,
signinCmd, signinCmd,
loginCmd,
signoutCmd, signoutCmd,
logoutCmd,
listCmd, listCmd,
psCmd, psCmd,
copyCmd, copyCmd,
deleteCmd, deleteCmd,
runnerCmd, runnerCmd,
launch.LaunchCmd(checkServerHeartbeat, runInteractiveTUI),
) )
return rootCmd return rootCmd

270
cmd/cmd_launcher_test.go Normal file
View File

@@ -0,0 +1,270 @@
package cmd
import (
"context"
"testing"
"github.com/spf13/cobra"
"github.com/ollama/ollama/cmd/config"
"github.com/ollama/ollama/cmd/launch"
"github.com/ollama/ollama/cmd/tui"
)
func setCmdTestHome(t *testing.T, dir string) {
t.Helper()
t.Setenv("HOME", dir)
t.Setenv("USERPROFILE", dir)
}
func unexpectedRunModelResolution(t *testing.T) func(context.Context, launch.RunModelRequest) (string, error) {
t.Helper()
return func(ctx context.Context, req launch.RunModelRequest) (string, error) {
t.Fatalf("did not expect run-model resolution: %+v", req)
return "", nil
}
}
func unexpectedIntegrationLaunch(t *testing.T) func(context.Context, launch.IntegrationLaunchRequest) error {
t.Helper()
return func(ctx context.Context, req launch.IntegrationLaunchRequest) error {
t.Fatalf("did not expect integration launch: %+v", req)
return nil
}
}
func unexpectedModelLaunch(t *testing.T) func(*cobra.Command, string) error {
t.Helper()
return func(cmd *cobra.Command, model string) error {
t.Fatalf("did not expect chat launch: %s", model)
return nil
}
}
func TestRunInteractiveTUI_RunModelActionsUseResolveRunModel(t *testing.T) {
tests := []struct {
name string
action tui.TUIAction
wantForce bool
wantModel string
}{
{
name: "enter uses saved model flow",
action: tui.TUIAction{Kind: tui.TUIActionRunModel},
wantModel: "qwen3:8b",
},
{
name: "right forces picker",
action: tui.TUIAction{Kind: tui.TUIActionRunModel, ForceConfigure: true},
wantForce: true,
wantModel: "glm-5:cloud",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
setCmdTestHome(t, t.TempDir())
var menuCalls int
runMenu := func(state *launch.LauncherState) (tui.TUIAction, error) {
menuCalls++
if menuCalls == 1 {
return tt.action, nil
}
return tui.TUIAction{Kind: tui.TUIActionNone}, nil
}
var gotReq launch.RunModelRequest
var launched string
deps := launcherDeps{
buildState: func(ctx context.Context) (*launch.LauncherState, error) {
return &launch.LauncherState{}, nil
},
runMenu: runMenu,
resolveRunModel: func(ctx context.Context, req launch.RunModelRequest) (string, error) {
gotReq = req
return tt.wantModel, nil
},
launchIntegration: unexpectedIntegrationLaunch(t),
runModel: func(cmd *cobra.Command, model string) error {
launched = model
return nil
},
}
cmd := &cobra.Command{}
cmd.SetContext(context.Background())
for {
continueLoop, err := runInteractiveTUIStep(cmd, deps)
if err != nil {
t.Fatalf("unexpected step error: %v", err)
}
if !continueLoop {
break
}
}
if gotReq.ForcePicker != tt.wantForce {
t.Fatalf("expected ForcePicker=%v, got %v", tt.wantForce, gotReq.ForcePicker)
}
if launched != tt.wantModel {
t.Fatalf("expected interactive launcher to run %q, got %q", tt.wantModel, launched)
}
if got := config.LastSelection(); got != "run" {
t.Fatalf("expected last selection to be run, got %q", got)
}
})
}
}
func TestRunInteractiveTUI_IntegrationActionsUseLaunchIntegration(t *testing.T) {
tests := []struct {
name string
action tui.TUIAction
wantForce bool
}{
{
name: "enter launches integration",
action: tui.TUIAction{Kind: tui.TUIActionLaunchIntegration, Integration: "claude"},
},
{
name: "right forces configure",
action: tui.TUIAction{Kind: tui.TUIActionLaunchIntegration, Integration: "claude", ForceConfigure: true},
wantForce: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
setCmdTestHome(t, t.TempDir())
var menuCalls int
runMenu := func(state *launch.LauncherState) (tui.TUIAction, error) {
menuCalls++
if menuCalls == 1 {
return tt.action, nil
}
return tui.TUIAction{Kind: tui.TUIActionNone}, nil
}
var gotReq launch.IntegrationLaunchRequest
deps := launcherDeps{
buildState: func(ctx context.Context) (*launch.LauncherState, error) {
return &launch.LauncherState{}, nil
},
runMenu: runMenu,
resolveRunModel: unexpectedRunModelResolution(t),
launchIntegration: func(ctx context.Context, req launch.IntegrationLaunchRequest) error {
gotReq = req
return nil
},
runModel: unexpectedModelLaunch(t),
}
cmd := &cobra.Command{}
cmd.SetContext(context.Background())
for {
continueLoop, err := runInteractiveTUIStep(cmd, deps)
if err != nil {
t.Fatalf("unexpected step error: %v", err)
}
if !continueLoop {
break
}
}
if gotReq.Name != "claude" {
t.Fatalf("expected integration name to be passed through, got %q", gotReq.Name)
}
if gotReq.ForceConfigure != tt.wantForce {
t.Fatalf("expected ForceConfigure=%v, got %v", tt.wantForce, gotReq.ForceConfigure)
}
if got := config.LastSelection(); got != "claude" {
t.Fatalf("expected last selection to be claude, got %q", got)
}
})
}
}
func TestRunLauncherAction_RunModelContinuesAfterCancellation(t *testing.T) {
setCmdTestHome(t, t.TempDir())
cmd := &cobra.Command{}
cmd.SetContext(context.Background())
continueLoop, err := runLauncherAction(cmd, tui.TUIAction{Kind: tui.TUIActionRunModel}, launcherDeps{
buildState: nil,
runMenu: nil,
resolveRunModel: func(ctx context.Context, req launch.RunModelRequest) (string, error) {
return "", launch.ErrCancelled
},
launchIntegration: unexpectedIntegrationLaunch(t),
runModel: unexpectedModelLaunch(t),
})
if err != nil {
t.Fatalf("expected nil error on cancellation, got %v", err)
}
if !continueLoop {
t.Fatal("expected cancellation to continue the menu loop")
}
}
func TestRunLauncherAction_VSCodeExitsTUILoop(t *testing.T) {
setCmdTestHome(t, t.TempDir())
cmd := &cobra.Command{}
cmd.SetContext(context.Background())
// VS Code should exit the TUI loop (return false) after a successful launch.
continueLoop, err := runLauncherAction(cmd, tui.TUIAction{Kind: tui.TUIActionLaunchIntegration, Integration: "vscode"}, launcherDeps{
resolveRunModel: unexpectedRunModelResolution(t),
launchIntegration: func(ctx context.Context, req launch.IntegrationLaunchRequest) error {
return nil
},
runModel: unexpectedModelLaunch(t),
})
if err != nil {
t.Fatalf("expected nil error, got %v", err)
}
if continueLoop {
t.Fatal("expected vscode launch to exit the TUI loop (return false)")
}
// Other integrations should continue the TUI loop (return true).
continueLoop, err = runLauncherAction(cmd, tui.TUIAction{Kind: tui.TUIActionLaunchIntegration, Integration: "claude"}, launcherDeps{
resolveRunModel: unexpectedRunModelResolution(t),
launchIntegration: func(ctx context.Context, req launch.IntegrationLaunchRequest) error {
return nil
},
runModel: unexpectedModelLaunch(t),
})
if err != nil {
t.Fatalf("expected nil error, got %v", err)
}
if !continueLoop {
t.Fatal("expected non-vscode integration to continue the TUI loop (return true)")
}
}
func TestRunLauncherAction_IntegrationContinuesAfterCancellation(t *testing.T) {
setCmdTestHome(t, t.TempDir())
cmd := &cobra.Command{}
cmd.SetContext(context.Background())
continueLoop, err := runLauncherAction(cmd, tui.TUIAction{Kind: tui.TUIActionLaunchIntegration, Integration: "claude"}, launcherDeps{
buildState: nil,
runMenu: nil,
resolveRunModel: unexpectedRunModelResolution(t),
launchIntegration: func(ctx context.Context, req launch.IntegrationLaunchRequest) error {
return launch.ErrCancelled
},
runModel: unexpectedModelLaunch(t),
})
if err != nil {
t.Fatalf("expected nil error on cancellation, got %v", err)
}
if !continueLoop {
t.Fatal("expected cancellation to continue the menu loop")
}
}

View File

@@ -3,6 +3,7 @@ package cmd
import ( import (
"bytes" "bytes"
"encoding/json" "encoding/json"
"errors"
"fmt" "fmt"
"io" "io"
"net/http" "net/http"
@@ -300,7 +301,7 @@ Weigh anchor!
ParameterSize: "7B", ParameterSize: "7B",
QuantizationLevel: "FP16", QuantizationLevel: "FP16",
}, },
Requires: "0.14.0", Requires: "0.19.0",
}, false, &b); err != nil { }, false, &b); err != nil {
t.Fatal(err) t.Fatal(err)
} }
@@ -309,10 +310,17 @@ Weigh anchor!
architecture test architecture test
parameters 7B parameters 7B
quantization FP16 quantization FP16
requires 0.14.0 requires 0.19.0
` `
if diff := cmp.Diff(expect, b.String()); diff != "" { trimLinePadding := func(s string) string {
lines := strings.Split(s, "\n")
for i, line := range lines {
lines[i] = strings.TrimRight(line, " \t\r")
}
return strings.Join(lines, "\n")
}
if diff := cmp.Diff(trimLinePadding(expect), trimLinePadding(b.String())); diff != "" {
t.Errorf("unexpected output (-want +got):\n%s", diff) t.Errorf("unexpected output (-want +got):\n%s", diff)
} }
}) })
@@ -704,6 +712,347 @@ func TestRunEmbeddingModelNoInput(t *testing.T) {
} }
} }
func TestRunHandler_CloudAuthErrorOnShow_PrintsSigninMessage(t *testing.T) {
var generateCalled bool
mockServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch {
case r.URL.Path == "/api/show" && r.Method == http.MethodPost:
w.WriteHeader(http.StatusUnauthorized)
if err := json.NewEncoder(w).Encode(map[string]string{
"error": "unauthorized",
"signin_url": "https://ollama.com/signin",
}); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
return
case r.URL.Path == "/api/generate" && r.Method == http.MethodPost:
generateCalled = true
w.WriteHeader(http.StatusOK)
if err := json.NewEncoder(w).Encode(api.GenerateResponse{Done: true}); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
return
default:
http.NotFound(w, r)
}
}))
t.Setenv("OLLAMA_HOST", mockServer.URL)
t.Cleanup(mockServer.Close)
cmd := &cobra.Command{}
cmd.SetContext(t.Context())
cmd.Flags().String("keepalive", "", "")
cmd.Flags().Bool("truncate", false, "")
cmd.Flags().Int("dimensions", 0, "")
cmd.Flags().Bool("verbose", false, "")
cmd.Flags().Bool("insecure", false, "")
cmd.Flags().Bool("nowordwrap", false, "")
cmd.Flags().String("format", "", "")
cmd.Flags().String("think", "", "")
cmd.Flags().Bool("hidethinking", false, "")
oldStdout := os.Stdout
readOut, writeOut, _ := os.Pipe()
os.Stdout = writeOut
t.Cleanup(func() { os.Stdout = oldStdout })
err := RunHandler(cmd, []string{"gpt-oss:20b:cloud", "hi"})
_ = writeOut.Close()
var out bytes.Buffer
_, _ = io.Copy(&out, readOut)
if err != nil {
t.Fatalf("RunHandler returned error: %v", err)
}
if generateCalled {
t.Fatal("expected run to stop before /api/generate after unauthorized /api/show")
}
if !strings.Contains(out.String(), "You need to be signed in to Ollama to run Cloud models.") {
t.Fatalf("expected sign-in guidance message, got %q", out.String())
}
if !strings.Contains(out.String(), "https://ollama.com/signin") {
t.Fatalf("expected signin_url in output, got %q", out.String())
}
}
func TestRunHandler_CloudAuthErrorOnGenerate_PrintsSigninMessage(t *testing.T) {
mockServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch {
case r.URL.Path == "/api/show" && r.Method == http.MethodPost:
w.WriteHeader(http.StatusOK)
if err := json.NewEncoder(w).Encode(api.ShowResponse{
Capabilities: []model.Capability{model.CapabilityCompletion},
}); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
return
case r.URL.Path == "/api/generate" && r.Method == http.MethodPost:
w.WriteHeader(http.StatusUnauthorized)
if err := json.NewEncoder(w).Encode(map[string]string{
"error": "unauthorized",
"signin_url": "https://ollama.com/signin",
}); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
return
default:
http.NotFound(w, r)
}
}))
t.Setenv("OLLAMA_HOST", mockServer.URL)
t.Cleanup(mockServer.Close)
cmd := &cobra.Command{}
cmd.SetContext(t.Context())
cmd.Flags().String("keepalive", "", "")
cmd.Flags().Bool("truncate", false, "")
cmd.Flags().Int("dimensions", 0, "")
cmd.Flags().Bool("verbose", false, "")
cmd.Flags().Bool("insecure", false, "")
cmd.Flags().Bool("nowordwrap", false, "")
cmd.Flags().String("format", "", "")
cmd.Flags().String("think", "", "")
cmd.Flags().Bool("hidethinking", false, "")
oldStdout := os.Stdout
readOut, writeOut, _ := os.Pipe()
os.Stdout = writeOut
t.Cleanup(func() { os.Stdout = oldStdout })
err := RunHandler(cmd, []string{"gpt-oss:20b:cloud", "hi"})
_ = writeOut.Close()
var out bytes.Buffer
_, _ = io.Copy(&out, readOut)
if err != nil {
t.Fatalf("RunHandler returned error: %v", err)
}
if !strings.Contains(out.String(), "You need to be signed in to Ollama to run Cloud models.") {
t.Fatalf("expected sign-in guidance message, got %q", out.String())
}
if !strings.Contains(out.String(), "https://ollama.com/signin") {
t.Fatalf("expected signin_url in output, got %q", out.String())
}
}
func TestRunHandler_ExplicitCloudStubMissing_PullsNormalizedNameTEMP(t *testing.T) {
var pulledModel string
var generateCalled bool
mockServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch {
case r.URL.Path == "/api/show" && r.Method == http.MethodPost:
w.WriteHeader(http.StatusOK)
if err := json.NewEncoder(w).Encode(api.ShowResponse{
Capabilities: []model.Capability{model.CapabilityCompletion},
RemoteModel: "gpt-oss:20b",
}); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
return
case r.URL.Path == "/api/tags" && r.Method == http.MethodGet:
w.WriteHeader(http.StatusOK)
if err := json.NewEncoder(w).Encode(api.ListResponse{Models: nil}); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
return
case r.URL.Path == "/api/pull" && r.Method == http.MethodPost:
var req api.PullRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
http.Error(w, err.Error(), http.StatusBadRequest)
return
}
pulledModel = req.Model
w.WriteHeader(http.StatusOK)
if err := json.NewEncoder(w).Encode(api.ProgressResponse{Status: "success"}); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
return
case r.URL.Path == "/api/generate" && r.Method == http.MethodPost:
generateCalled = true
w.WriteHeader(http.StatusOK)
if err := json.NewEncoder(w).Encode(api.GenerateResponse{Done: true}); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
return
default:
http.NotFound(w, r)
}
}))
t.Setenv("OLLAMA_HOST", mockServer.URL)
t.Cleanup(mockServer.Close)
cmd := &cobra.Command{}
cmd.SetContext(t.Context())
cmd.Flags().String("keepalive", "", "")
cmd.Flags().Bool("truncate", false, "")
cmd.Flags().Int("dimensions", 0, "")
cmd.Flags().Bool("verbose", false, "")
cmd.Flags().Bool("insecure", false, "")
cmd.Flags().Bool("nowordwrap", false, "")
cmd.Flags().String("format", "", "")
cmd.Flags().String("think", "", "")
cmd.Flags().Bool("hidethinking", false, "")
err := RunHandler(cmd, []string{"gpt-oss:20b:cloud", "hi"})
if err != nil {
t.Fatalf("RunHandler returned error: %v", err)
}
if pulledModel != "gpt-oss:20b-cloud" {
t.Fatalf("expected normalized pull model %q, got %q", "gpt-oss:20b-cloud", pulledModel)
}
if !generateCalled {
t.Fatal("expected /api/generate to be called")
}
}
func TestRunHandler_ExplicitCloudStubPresent_SkipsPullTEMP(t *testing.T) {
var pullCalled bool
var generateCalled bool
mockServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch {
case r.URL.Path == "/api/show" && r.Method == http.MethodPost:
w.WriteHeader(http.StatusOK)
if err := json.NewEncoder(w).Encode(api.ShowResponse{
Capabilities: []model.Capability{model.CapabilityCompletion},
RemoteModel: "gpt-oss:20b",
}); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
return
case r.URL.Path == "/api/tags" && r.Method == http.MethodGet:
w.WriteHeader(http.StatusOK)
if err := json.NewEncoder(w).Encode(api.ListResponse{
Models: []api.ListModelResponse{{Name: "gpt-oss:20b-cloud"}},
}); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
return
case r.URL.Path == "/api/pull" && r.Method == http.MethodPost:
pullCalled = true
w.WriteHeader(http.StatusOK)
if err := json.NewEncoder(w).Encode(api.ProgressResponse{Status: "success"}); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
return
case r.URL.Path == "/api/generate" && r.Method == http.MethodPost:
generateCalled = true
w.WriteHeader(http.StatusOK)
if err := json.NewEncoder(w).Encode(api.GenerateResponse{Done: true}); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
return
default:
http.NotFound(w, r)
}
}))
t.Setenv("OLLAMA_HOST", mockServer.URL)
t.Cleanup(mockServer.Close)
cmd := &cobra.Command{}
cmd.SetContext(t.Context())
cmd.Flags().String("keepalive", "", "")
cmd.Flags().Bool("truncate", false, "")
cmd.Flags().Int("dimensions", 0, "")
cmd.Flags().Bool("verbose", false, "")
cmd.Flags().Bool("insecure", false, "")
cmd.Flags().Bool("nowordwrap", false, "")
cmd.Flags().String("format", "", "")
cmd.Flags().String("think", "", "")
cmd.Flags().Bool("hidethinking", false, "")
err := RunHandler(cmd, []string{"gpt-oss:20b:cloud", "hi"})
if err != nil {
t.Fatalf("RunHandler returned error: %v", err)
}
if pullCalled {
t.Fatal("expected /api/pull not to be called when cloud stub already exists")
}
if !generateCalled {
t.Fatal("expected /api/generate to be called")
}
}
func TestRunHandler_ExplicitCloudStubPullFailure_IsBestEffortTEMP(t *testing.T) {
var generateCalled bool
mockServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch {
case r.URL.Path == "/api/show" && r.Method == http.MethodPost:
w.WriteHeader(http.StatusOK)
if err := json.NewEncoder(w).Encode(api.ShowResponse{
Capabilities: []model.Capability{model.CapabilityCompletion},
RemoteModel: "gpt-oss:20b",
}); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
return
case r.URL.Path == "/api/tags" && r.Method == http.MethodGet:
w.WriteHeader(http.StatusOK)
if err := json.NewEncoder(w).Encode(api.ListResponse{Models: nil}); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
return
case r.URL.Path == "/api/pull" && r.Method == http.MethodPost:
w.WriteHeader(http.StatusInternalServerError)
if err := json.NewEncoder(w).Encode(map[string]string{"error": "pull failed"}); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
return
case r.URL.Path == "/api/generate" && r.Method == http.MethodPost:
generateCalled = true
w.WriteHeader(http.StatusOK)
if err := json.NewEncoder(w).Encode(api.GenerateResponse{Done: true}); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
return
default:
http.NotFound(w, r)
}
}))
t.Setenv("OLLAMA_HOST", mockServer.URL)
t.Cleanup(mockServer.Close)
cmd := &cobra.Command{}
cmd.SetContext(t.Context())
cmd.Flags().String("keepalive", "", "")
cmd.Flags().Bool("truncate", false, "")
cmd.Flags().Int("dimensions", 0, "")
cmd.Flags().Bool("verbose", false, "")
cmd.Flags().Bool("insecure", false, "")
cmd.Flags().Bool("nowordwrap", false, "")
cmd.Flags().String("format", "", "")
cmd.Flags().String("think", "", "")
cmd.Flags().Bool("hidethinking", false, "")
err := RunHandler(cmd, []string{"gpt-oss:20b:cloud", "hi"})
if err != nil {
t.Fatalf("RunHandler returned error: %v", err)
}
if !generateCalled {
t.Fatal("expected /api/generate to be called despite pull failure")
}
}
func TestGetModelfileName(t *testing.T) { func TestGetModelfileName(t *testing.T) {
tests := []struct { tests := []struct {
name string name string
@@ -1211,6 +1560,20 @@ func TestNewCreateRequest(t *testing.T) {
Model: "newmodel", Model: "newmodel",
}, },
}, },
{
"explicit cloud model preserves source when parent lacks it",
"newmodel",
runOptions{
Model: "qwen3.5:cloud",
ParentModel: "qwen3.5",
Messages: []api.Message{},
WordWrap: true,
},
&api.CreateRequest{
From: "qwen3.5:cloud",
Model: "newmodel",
},
},
{ {
"parent model as filepath test", "parent model as filepath test",
"newmodel", "newmodel",
@@ -1292,6 +1655,24 @@ func TestNewCreateRequest(t *testing.T) {
}, },
}, },
}, },
{
"loaded messages are preserved when saving",
"newmodel",
runOptions{
Model: "mymodel",
ParentModel: "parentmodel",
LoadedMessages: []api.Message{{Role: "assistant", Content: "loaded"}},
Messages: []api.Message{{Role: "user", Content: "new"}},
},
&api.CreateRequest{
From: "parentmodel",
Model: "newmodel",
Messages: []api.Message{
{Role: "assistant", Content: "loaded"},
{Role: "user", Content: "new"},
},
},
},
} }
for _, tt := range tests { for _, tt := range tests {
@@ -1304,6 +1685,33 @@ func TestNewCreateRequest(t *testing.T) {
} }
} }
func TestApplyShowResponseToRunOptions(t *testing.T) {
opts := runOptions{}
info := &api.ShowResponse{
Details: api.ModelDetails{
ParentModel: "parentmodel",
},
Messages: []api.Message{
{Role: "assistant", Content: "loaded"},
},
}
applyShowResponseToRunOptions(&opts, info)
if opts.ParentModel != "parentmodel" {
t.Fatalf("ParentModel = %q, want %q", opts.ParentModel, "parentmodel")
}
if !cmp.Equal(opts.LoadedMessages, info.Messages) {
t.Fatalf("LoadedMessages = %#v, want %#v", opts.LoadedMessages, info.Messages)
}
info.Messages[0].Content = "modified"
if opts.LoadedMessages[0].Content == "modified" {
t.Fatal("LoadedMessages should be copied independently from ShowResponse")
}
}
func TestRunOptions_Copy(t *testing.T) { func TestRunOptions_Copy(t *testing.T) {
// Setup test data // Setup test data
originalKeepAlive := &api.Duration{Duration: 5 * time.Minute} originalKeepAlive := &api.Duration{Duration: 5 * time.Minute}
@@ -1312,6 +1720,7 @@ func TestRunOptions_Copy(t *testing.T) {
original := runOptions{ original := runOptions{
Model: "test-model", Model: "test-model",
ParentModel: "parent-model", ParentModel: "parent-model",
LoadedMessages: []api.Message{{Role: "assistant", Content: "loaded hello"}},
Prompt: "test prompt", Prompt: "test prompt",
Messages: []api.Message{ Messages: []api.Message{
{Role: "user", Content: "hello"}, {Role: "user", Content: "hello"},
@@ -1352,6 +1761,7 @@ func TestRunOptions_Copy(t *testing.T) {
}{ }{
{"Model", copied.Model, original.Model}, {"Model", copied.Model, original.Model},
{"ParentModel", copied.ParentModel, original.ParentModel}, {"ParentModel", copied.ParentModel, original.ParentModel},
{"LoadedMessages", copied.LoadedMessages, original.LoadedMessages},
{"Prompt", copied.Prompt, original.Prompt}, {"Prompt", copied.Prompt, original.Prompt},
{"WordWrap", copied.WordWrap, original.WordWrap}, {"WordWrap", copied.WordWrap, original.WordWrap},
{"Format", copied.Format, original.Format}, {"Format", copied.Format, original.Format},
@@ -1456,6 +1866,7 @@ func TestRunOptions_Copy(t *testing.T) {
func TestRunOptions_Copy_EmptySlicesAndMaps(t *testing.T) { func TestRunOptions_Copy_EmptySlicesAndMaps(t *testing.T) {
// Test with empty slices and maps // Test with empty slices and maps
original := runOptions{ original := runOptions{
LoadedMessages: []api.Message{},
Messages: []api.Message{}, Messages: []api.Message{},
Images: []api.ImageData{}, Images: []api.ImageData{},
Options: map[string]any{}, Options: map[string]any{},
@@ -1463,6 +1874,10 @@ func TestRunOptions_Copy_EmptySlicesAndMaps(t *testing.T) {
copied := original.Copy() copied := original.Copy()
if copied.LoadedMessages == nil {
t.Error("Empty LoadedMessages slice should remain empty, not nil")
}
if copied.Messages == nil { if copied.Messages == nil {
t.Error("Empty Messages slice should remain empty, not nil") t.Error("Empty Messages slice should remain empty, not nil")
} }
@@ -1479,6 +1894,10 @@ func TestRunOptions_Copy_EmptySlicesAndMaps(t *testing.T) {
t.Error("Empty Messages slice should remain empty") t.Error("Empty Messages slice should remain empty")
} }
if len(copied.LoadedMessages) != 0 {
t.Error("Empty LoadedMessages slice should remain empty")
}
if len(copied.Images) != 0 { if len(copied.Images) != 0 {
t.Error("Empty Images slice should remain empty") t.Error("Empty Images slice should remain empty")
} }
@@ -1553,10 +1972,10 @@ func TestShowInfoImageGen(t *testing.T) {
Details: api.ModelDetails{ Details: api.ModelDetails{
Family: "ZImagePipeline", Family: "ZImagePipeline",
ParameterSize: "10.3B", ParameterSize: "10.3B",
QuantizationLevel: "FP8", QuantizationLevel: "Q8",
}, },
Capabilities: []model.Capability{model.CapabilityImage}, Capabilities: []model.Capability{model.CapabilityImage},
Requires: "0.14.0", Requires: "0.19.0",
}, false, &b) }, false, &b)
if err != nil { if err != nil {
t.Fatal(err) t.Fatal(err)
@@ -1565,8 +1984,8 @@ func TestShowInfoImageGen(t *testing.T) {
expect := " Model\n" + expect := " Model\n" +
" architecture ZImagePipeline \n" + " architecture ZImagePipeline \n" +
" parameters 10.3B \n" + " parameters 10.3B \n" +
" quantization FP8 \n" + " quantization Q8 \n" +
" requires 0.14.0 \n" + " requires 0.19.0 \n" +
"\n" + "\n" +
" Capabilities\n" + " Capabilities\n" +
" image \n" + " image \n" +
@@ -1625,6 +2044,7 @@ func TestRunOptions_Copy_Independence(t *testing.T) {
originalThink := &api.ThinkValue{Value: "original"} originalThink := &api.ThinkValue{Value: "original"}
original := runOptions{ original := runOptions{
Model: "original-model", Model: "original-model",
LoadedMessages: []api.Message{{Role: "assistant", Content: "loaded"}},
Messages: []api.Message{{Role: "user", Content: "original"}}, Messages: []api.Message{{Role: "user", Content: "original"}},
Options: map[string]any{"key": "value"}, Options: map[string]any{"key": "value"},
Think: originalThink, Think: originalThink,
@@ -1634,6 +2054,9 @@ func TestRunOptions_Copy_Independence(t *testing.T) {
// Modify original // Modify original
original.Model = "modified-model" original.Model = "modified-model"
if len(original.LoadedMessages) > 0 {
original.LoadedMessages[0].Content = "modified loaded"
}
if len(original.Messages) > 0 { if len(original.Messages) > 0 {
original.Messages[0].Content = "modified" original.Messages[0].Content = "modified"
} }
@@ -1647,6 +2070,10 @@ func TestRunOptions_Copy_Independence(t *testing.T) {
t.Error("Copy Model should not be affected by original modification") t.Error("Copy Model should not be affected by original modification")
} }
if len(copied.LoadedMessages) > 0 && copied.LoadedMessages[0].Content == "modified loaded" {
t.Error("Copy LoadedMessages should not be affected by original modification")
}
if len(copied.Messages) > 0 && copied.Messages[0].Content == "modified" { if len(copied.Messages) > 0 && copied.Messages[0].Content == "modified" {
t.Error("Copy Messages should not be affected by original modification") t.Error("Copy Messages should not be affected by original modification")
} }
@@ -1659,3 +2086,194 @@ func TestRunOptions_Copy_Independence(t *testing.T) {
t.Error("Copy Think should not be affected by original modification") t.Error("Copy Think should not be affected by original modification")
} }
} }
func TestLoadOrUnloadModel_CloudModelAuth(t *testing.T) {
tests := []struct {
name string
model string
showStatus int
remoteHost string
remoteModel string
whoamiStatus int
whoamiResp any
expectWhoami bool
expectedError string
expectAuthError bool
}{
{
name: "ollama.com cloud model - user signed in",
model: "test-cloud-model",
remoteHost: "https://ollama.com",
remoteModel: "test-model",
whoamiStatus: http.StatusOK,
whoamiResp: api.UserResponse{Name: "testuser"},
expectWhoami: true,
},
{
name: "ollama.com cloud model - user not signed in",
model: "test-cloud-model",
remoteHost: "https://ollama.com",
remoteModel: "test-model",
whoamiStatus: http.StatusUnauthorized,
whoamiResp: map[string]string{
"error": "unauthorized",
"signin_url": "https://ollama.com/signin",
},
expectWhoami: true,
expectedError: "unauthorized",
expectAuthError: true,
},
{
name: "non-ollama.com remote - no auth check",
model: "test-cloud-model",
remoteHost: "https://other-remote.com",
remoteModel: "test-model",
whoamiStatus: http.StatusUnauthorized, // should not be called
whoamiResp: nil,
},
{
name: "explicit :cloud model - auth check without remote metadata",
model: "kimi-k2.5:cloud",
remoteHost: "",
remoteModel: "",
whoamiStatus: http.StatusOK,
whoamiResp: api.UserResponse{Name: "testuser"},
expectWhoami: true,
},
{
name: "explicit :cloud model without local stub returns not found by default",
model: "minimax-m2.7:cloud",
showStatus: http.StatusNotFound,
whoamiStatus: http.StatusOK,
whoamiResp: api.UserResponse{Name: "testuser"},
expectedError: "not found",
expectWhoami: false,
expectAuthError: false,
},
{
name: "explicit -cloud model - auth check without remote metadata",
model: "kimi-k2.5:latest-cloud",
remoteHost: "",
remoteModel: "",
whoamiStatus: http.StatusOK,
whoamiResp: api.UserResponse{Name: "testuser"},
expectWhoami: true,
},
{
name: "dash cloud-like name without explicit source does not require auth",
model: "test-cloud-model",
remoteHost: "",
remoteModel: "",
whoamiStatus: http.StatusUnauthorized, // should not be called
whoamiResp: nil,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
whoamiCalled := false
mockServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/api/show":
if tt.showStatus != 0 && tt.showStatus != http.StatusOK {
w.WriteHeader(tt.showStatus)
_ = json.NewEncoder(w).Encode(map[string]string{"error": "not found"})
return
}
w.Header().Set("Content-Type", "application/json")
if err := json.NewEncoder(w).Encode(api.ShowResponse{
RemoteHost: tt.remoteHost,
RemoteModel: tt.remoteModel,
}); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
case "/api/me":
whoamiCalled = true
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(tt.whoamiStatus)
if tt.whoamiResp != nil {
if err := json.NewEncoder(w).Encode(tt.whoamiResp); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
}
}
case "/api/generate":
w.WriteHeader(http.StatusOK)
default:
http.NotFound(w, r)
}
}))
defer mockServer.Close()
t.Setenv("OLLAMA_HOST", mockServer.URL)
cmd := &cobra.Command{}
cmd.SetContext(t.Context())
opts := &runOptions{
Model: tt.model,
ShowConnect: false,
}
err := loadOrUnloadModel(cmd, opts)
if whoamiCalled != tt.expectWhoami {
t.Errorf("whoami called = %v, want %v", whoamiCalled, tt.expectWhoami)
}
if tt.expectedError != "" {
if err == nil {
t.Errorf("expected error containing %q, got nil", tt.expectedError)
} else {
if !tt.expectAuthError && !strings.Contains(strings.ToLower(err.Error()), strings.ToLower(tt.expectedError)) {
t.Errorf("expected error containing %q, got %v", tt.expectedError, err)
}
if tt.expectAuthError {
var authErr api.AuthorizationError
if !errors.As(err, &authErr) {
t.Errorf("expected AuthorizationError, got %T: %v", err, err)
}
}
}
} else {
if err != nil {
t.Errorf("expected no error, got %v", err)
}
}
})
}
}
func TestIsLocalhost(t *testing.T) {
tests := []struct {
name string
host string
expected bool
}{
{"default empty", "", true},
{"localhost no port", "localhost", true},
{"localhost with port", "localhost:11435", true},
{"127.0.0.1 no port", "127.0.0.1", true},
{"127.0.0.1 with port", "127.0.0.1:11434", true},
{"0.0.0.0 no port", "0.0.0.0", true},
{"0.0.0.0 with port", "0.0.0.0:11434", true},
{"::1 no port", "::1", true},
{"[::1] with port", "[::1]:11434", true},
{"loopback with scheme", "http://localhost:11434", true},
{"remote hostname", "example.com", false},
{"remote hostname with port", "example.com:11434", false},
{"remote IP", "192.168.1.1", false},
{"remote IP with port", "192.168.1.1:11434", false},
{"remote with scheme", "http://example.com:11434", false},
{"https remote", "https://example.com:443", false},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
t.Setenv("OLLAMA_HOST", tt.host)
got := isLocalhost()
if got != tt.expected {
t.Errorf("isLocalhost() with OLLAMA_HOST=%q = %v, want %v", tt.host, got, tt.expected)
}
})
}
}

284
cmd/config/config.go Normal file
View File

@@ -0,0 +1,284 @@
// Package config provides integration configuration for external coding tools
// (Claude Code, Codex, Droid, OpenCode) to use Ollama models.
package config
import (
"encoding/json"
"errors"
"fmt"
"os"
"path/filepath"
"strings"
"github.com/ollama/ollama/cmd/internal/fileutil"
)
type integration struct {
Models []string `json:"models"`
Aliases map[string]string `json:"aliases,omitempty"`
Onboarded bool `json:"onboarded,omitempty"`
}
// IntegrationConfig is the persisted config for one integration.
type IntegrationConfig = integration
type config struct {
Integrations map[string]*integration `json:"integrations"`
LastModel string `json:"last_model,omitempty"`
LastSelection string `json:"last_selection,omitempty"` // "run" or integration name
}
func configPath() (string, error) {
home, err := os.UserHomeDir()
if err != nil {
return "", err
}
return filepath.Join(home, ".ollama", "config.json"), nil
}
func legacyConfigPath() (string, error) {
home, err := os.UserHomeDir()
if err != nil {
return "", err
}
return filepath.Join(home, ".ollama", "config", "config.json"), nil
}
// migrateConfig moves the config from the legacy path to ~/.ollama/config.json
func migrateConfig() (bool, error) {
oldPath, err := legacyConfigPath()
if err != nil {
return false, err
}
oldData, err := os.ReadFile(oldPath)
if err != nil {
if os.IsNotExist(err) {
return false, nil
}
return false, err
}
// Ignore legacy files with invalid JSON and continue startup.
if !json.Valid(oldData) {
return false, nil
}
newPath, err := configPath()
if err != nil {
return false, err
}
if err := os.MkdirAll(filepath.Dir(newPath), 0o755); err != nil {
return false, err
}
if err := os.WriteFile(newPath, oldData, 0o644); err != nil {
return false, fmt.Errorf("write new config: %w", err)
}
_ = os.Remove(oldPath)
_ = os.Remove(filepath.Dir(oldPath)) // clean up empty directory
return true, nil
}
func load() (*config, error) {
path, err := configPath()
if err != nil {
return nil, err
}
data, err := os.ReadFile(path)
if err != nil && os.IsNotExist(err) {
if migrated, merr := migrateConfig(); merr == nil && migrated {
data, err = os.ReadFile(path)
}
}
if err != nil {
if os.IsNotExist(err) {
return &config{Integrations: make(map[string]*integration)}, nil
}
return nil, err
}
var cfg config
if err := json.Unmarshal(data, &cfg); err != nil {
return nil, fmt.Errorf("failed to parse config: %w, at: %s", err, path)
}
if cfg.Integrations == nil {
cfg.Integrations = make(map[string]*integration)
}
return &cfg, nil
}
func save(cfg *config) error {
path, err := configPath()
if err != nil {
return err
}
if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
return err
}
data, err := json.MarshalIndent(cfg, "", " ")
if err != nil {
return err
}
return fileutil.WriteWithBackup(path, data)
}
func SaveIntegration(appName string, models []string) error {
if appName == "" {
return errors.New("app name cannot be empty")
}
cfg, err := load()
if err != nil {
return err
}
key := strings.ToLower(appName)
existing := cfg.Integrations[key]
var aliases map[string]string
var onboarded bool
if existing != nil {
aliases = existing.Aliases
onboarded = existing.Onboarded
}
cfg.Integrations[key] = &integration{
Models: models,
Aliases: aliases,
Onboarded: onboarded,
}
return save(cfg)
}
// MarkIntegrationOnboarded marks an integration as onboarded in Ollama's config.
func MarkIntegrationOnboarded(appName string) error {
cfg, err := load()
if err != nil {
return err
}
key := strings.ToLower(appName)
existing := cfg.Integrations[key]
if existing == nil {
existing = &integration{}
}
existing.Onboarded = true
cfg.Integrations[key] = existing
return save(cfg)
}
// IntegrationModel returns the first configured model for an integration, or empty string if not configured.
func IntegrationModel(appName string) string {
integrationConfig, err := LoadIntegration(appName)
if err != nil || len(integrationConfig.Models) == 0 {
return ""
}
return integrationConfig.Models[0]
}
// IntegrationModels returns all configured models for an integration, or nil.
func IntegrationModels(appName string) []string {
integrationConfig, err := LoadIntegration(appName)
if err != nil || len(integrationConfig.Models) == 0 {
return nil
}
return integrationConfig.Models
}
// LastModel returns the last model that was run, or empty string if none.
func LastModel() string {
cfg, err := load()
if err != nil {
return ""
}
return cfg.LastModel
}
// SetLastModel saves the last model that was run.
func SetLastModel(model string) error {
cfg, err := load()
if err != nil {
return err
}
cfg.LastModel = model
return save(cfg)
}
// LastSelection returns the last menu selection ("run" or integration name), or empty string if none.
func LastSelection() string {
cfg, err := load()
if err != nil {
return ""
}
return cfg.LastSelection
}
// SetLastSelection saves the last menu selection ("run" or integration name).
func SetLastSelection(selection string) error {
cfg, err := load()
if err != nil {
return err
}
cfg.LastSelection = selection
return save(cfg)
}
// LoadIntegration returns the saved config for one integration.
func LoadIntegration(appName string) (*integration, error) {
cfg, err := load()
if err != nil {
return nil, err
}
integrationConfig, ok := cfg.Integrations[strings.ToLower(appName)]
if !ok {
return nil, os.ErrNotExist
}
return integrationConfig, nil
}
// SaveAliases replaces the saved aliases for one integration.
func SaveAliases(appName string, aliases map[string]string) error {
if appName == "" {
return errors.New("app name cannot be empty")
}
cfg, err := load()
if err != nil {
return err
}
key := strings.ToLower(appName)
existing := cfg.Integrations[key]
if existing == nil {
existing = &integration{}
}
// Replace aliases entirely (not merge) so deletions are persisted
existing.Aliases = aliases
cfg.Integrations[key] = existing
return save(cfg)
}
func listIntegrations() ([]integration, error) {
cfg, err := load()
if err != nil {
return nil, err
}
result := make([]integration, 0, len(cfg.Integrations))
for _, integrationConfig := range cfg.Integrations {
result = append(result, *integrationConfig)
}
return result, nil
}

View File

@@ -0,0 +1,641 @@
package config
import (
"errors"
"os"
"path/filepath"
"testing"
)
func TestSetAliases_CloudModel(t *testing.T) {
// Test the SetAliases logic by checking the alias map behavior
aliases := map[string]string{
"primary": "kimi-k2.5:cloud",
"fast": "kimi-k2.5:cloud",
}
// Verify fast is set (cloud model behavior)
if aliases["fast"] == "" {
t.Error("cloud model should have fast alias set")
}
if aliases["fast"] != aliases["primary"] {
t.Errorf("fast should equal primary for auto-set, got fast=%q primary=%q", aliases["fast"], aliases["primary"])
}
}
func TestSetAliases_LocalModel(t *testing.T) {
aliases := map[string]string{
"primary": "llama3.2:latest",
}
// Simulate local model behavior: fast should be empty
delete(aliases, "fast")
if aliases["fast"] != "" {
t.Error("local model should have empty fast alias")
}
}
func TestSaveAliases_ReplacesNotMerges(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
// First save with both primary and fast
initial := map[string]string{
"primary": "cloud-model",
"fast": "cloud-model",
}
if err := SaveAliases("claude", initial); err != nil {
t.Fatalf("failed to save initial aliases: %v", err)
}
// Verify both are saved
loaded, err := LoadIntegration("claude")
if err != nil {
t.Fatalf("failed to load: %v", err)
}
if loaded.Aliases["fast"] != "cloud-model" {
t.Errorf("expected fast=cloud-model, got %q", loaded.Aliases["fast"])
}
// Now save without fast (simulating switch to local model)
updated := map[string]string{
"primary": "local-model",
// fast intentionally missing
}
if err := SaveAliases("claude", updated); err != nil {
t.Fatalf("failed to save updated aliases: %v", err)
}
// Verify fast is GONE (not merged/preserved)
loaded, err = LoadIntegration("claude")
if err != nil {
t.Fatalf("failed to load after update: %v", err)
}
if loaded.Aliases["fast"] != "" {
t.Errorf("fast should be removed after saving without it, got %q", loaded.Aliases["fast"])
}
if loaded.Aliases["primary"] != "local-model" {
t.Errorf("primary should be updated to local-model, got %q", loaded.Aliases["primary"])
}
}
func TestSaveAliases_PreservesModels(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
// First save integration with models
if err := SaveIntegration("claude", []string{"model1", "model2"}); err != nil {
t.Fatalf("failed to save integration: %v", err)
}
// Then update aliases
aliases := map[string]string{"primary": "new-model"}
if err := SaveAliases("claude", aliases); err != nil {
t.Fatalf("failed to save aliases: %v", err)
}
// Verify models are preserved
loaded, err := LoadIntegration("claude")
if err != nil {
t.Fatalf("failed to load: %v", err)
}
if len(loaded.Models) != 2 || loaded.Models[0] != "model1" {
t.Errorf("models should be preserved, got %v", loaded.Models)
}
}
// TestSaveAliases_EmptyMap clears all aliases
func TestSaveAliases_EmptyMap(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
// Save with aliases
if err := SaveAliases("claude", map[string]string{"primary": "model", "fast": "model"}); err != nil {
t.Fatalf("failed to save: %v", err)
}
// Save empty map
if err := SaveAliases("claude", map[string]string{}); err != nil {
t.Fatalf("failed to save empty: %v", err)
}
loaded, err := LoadIntegration("claude")
if err != nil {
t.Fatalf("failed to load: %v", err)
}
if len(loaded.Aliases) != 0 {
t.Errorf("aliases should be empty, got %v", loaded.Aliases)
}
}
// TestSaveAliases_NilMap handles nil gracefully
func TestSaveAliases_NilMap(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
// Save with aliases first
if err := SaveAliases("claude", map[string]string{"primary": "model"}); err != nil {
t.Fatalf("failed to save: %v", err)
}
// Save nil map - should clear aliases
if err := SaveAliases("claude", nil); err != nil {
t.Fatalf("failed to save nil: %v", err)
}
loaded, err := LoadIntegration("claude")
if err != nil {
t.Fatalf("failed to load: %v", err)
}
if len(loaded.Aliases) > 0 {
t.Errorf("aliases should be nil or empty, got %v", loaded.Aliases)
}
}
// TestSaveAliases_EmptyAppName returns error
func TestSaveAliases_EmptyAppName(t *testing.T) {
err := SaveAliases("", map[string]string{"primary": "model"})
if err == nil {
t.Error("expected error for empty app name")
}
}
func TestSaveAliases_CaseInsensitive(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
if err := SaveAliases("Claude", map[string]string{"primary": "model1"}); err != nil {
t.Fatalf("failed to save: %v", err)
}
// Load with different case
loaded, err := LoadIntegration("claude")
if err != nil {
t.Fatalf("failed to load: %v", err)
}
if loaded.Aliases["primary"] != "model1" {
t.Errorf("expected primary=model1, got %q", loaded.Aliases["primary"])
}
// Update with different case
if err := SaveAliases("CLAUDE", map[string]string{"primary": "model2"}); err != nil {
t.Fatalf("failed to update: %v", err)
}
loaded, err = LoadIntegration("claude")
if err != nil {
t.Fatalf("failed to load after update: %v", err)
}
if loaded.Aliases["primary"] != "model2" {
t.Errorf("expected primary=model2, got %q", loaded.Aliases["primary"])
}
}
// TestSaveAliases_CreatesIntegration creates integration if it doesn't exist
func TestSaveAliases_CreatesIntegration(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
// Save aliases for non-existent integration
if err := SaveAliases("newintegration", map[string]string{"primary": "model"}); err != nil {
t.Fatalf("failed to save: %v", err)
}
loaded, err := LoadIntegration("newintegration")
if err != nil {
t.Fatalf("failed to load: %v", err)
}
if loaded.Aliases["primary"] != "model" {
t.Errorf("expected primary=model, got %q", loaded.Aliases["primary"])
}
}
func TestConfigureAliases_AliasMap(t *testing.T) {
t.Run("cloud model auto-sets fast to primary", func(t *testing.T) {
aliases := make(map[string]string)
aliases["primary"] = "cloud-model"
// Simulate cloud model behavior
isCloud := true
if isCloud {
if aliases["fast"] == "" {
aliases["fast"] = aliases["primary"]
}
}
if aliases["fast"] != "cloud-model" {
t.Errorf("expected fast=cloud-model, got %q", aliases["fast"])
}
})
t.Run("cloud model preserves custom fast", func(t *testing.T) {
aliases := map[string]string{
"primary": "cloud-model",
"fast": "custom-fast-model",
}
// Simulate cloud model behavior - should preserve existing fast
isCloud := true
if isCloud {
if aliases["fast"] == "" {
aliases["fast"] = aliases["primary"]
}
}
if aliases["fast"] != "custom-fast-model" {
t.Errorf("expected fast=custom-fast-model (preserved), got %q", aliases["fast"])
}
})
t.Run("local model clears fast", func(t *testing.T) {
aliases := map[string]string{
"primary": "local-model",
"fast": "should-be-cleared",
}
// Simulate local model behavior
isCloud := false
if !isCloud {
delete(aliases, "fast")
}
if aliases["fast"] != "" {
t.Errorf("expected fast to be cleared, got %q", aliases["fast"])
}
})
t.Run("switching cloud to local clears fast", func(t *testing.T) {
// Start with cloud config
aliases := map[string]string{
"primary": "cloud-model",
"fast": "cloud-model",
}
// Switch to local
aliases["primary"] = "local-model"
isCloud := false
if !isCloud {
delete(aliases, "fast")
}
if aliases["fast"] != "" {
t.Errorf("fast should be cleared when switching to local, got %q", aliases["fast"])
}
if aliases["primary"] != "local-model" {
t.Errorf("primary should be updated, got %q", aliases["primary"])
}
})
t.Run("switching local to cloud sets fast", func(t *testing.T) {
// Start with local config (no fast)
aliases := map[string]string{
"primary": "local-model",
}
// Switch to cloud
aliases["primary"] = "cloud-model"
isCloud := true
if isCloud {
if aliases["fast"] == "" {
aliases["fast"] = aliases["primary"]
}
}
if aliases["fast"] != "cloud-model" {
t.Errorf("fast should be set when switching to cloud, got %q", aliases["fast"])
}
})
}
func TestSetAliases_PrefixMapping(t *testing.T) {
// This tests the expected mapping without needing a real client
aliases := map[string]string{
"primary": "my-cloud-model",
"fast": "my-fast-model",
}
expectedMappings := map[string]string{
"claude-sonnet-": aliases["primary"],
"claude-haiku-": aliases["fast"],
}
if expectedMappings["claude-sonnet-"] != "my-cloud-model" {
t.Errorf("claude-sonnet- should map to primary")
}
if expectedMappings["claude-haiku-"] != "my-fast-model" {
t.Errorf("claude-haiku- should map to fast")
}
}
func TestSetAliases_LocalDeletesPrefixes(t *testing.T) {
aliases := map[string]string{
"primary": "local-model",
// fast is empty/missing - indicates local model
}
prefixesToDelete := []string{"claude-sonnet-", "claude-haiku-"}
// Verify the logic: when fast is empty, we should delete
if aliases["fast"] != "" {
t.Error("fast should be empty for local model")
}
// Verify we have the right prefixes to delete
if len(prefixesToDelete) != 2 {
t.Errorf("expected 2 prefixes to delete, got %d", len(prefixesToDelete))
}
}
// TestAtomicUpdate_ServerFailsConfigNotSaved simulates atomic update behavior
func TestAtomicUpdate_ServerFailsConfigNotSaved(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
// Simulate: server fails, config should NOT be saved
serverErr := errors.New("server unavailable")
if serverErr == nil {
t.Error("config should NOT be saved when server fails")
}
}
// TestAtomicUpdate_ServerSucceedsConfigSaved simulates successful atomic update
func TestAtomicUpdate_ServerSucceedsConfigSaved(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
// Simulate: server succeeds, config should be saved
var serverErr error
if serverErr != nil {
t.Fatal("server should succeed")
}
if err := SaveAliases("claude", map[string]string{"primary": "model"}); err != nil {
t.Fatalf("saveAliases failed: %v", err)
}
// Verify it was actually saved
loaded, err := LoadIntegration("claude")
if err != nil {
t.Fatalf("failed to load: %v", err)
}
if loaded.Aliases["primary"] != "model" {
t.Errorf("expected primary=model, got %q", loaded.Aliases["primary"])
}
}
func TestConfigFile_PreservesUnknownFields(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
// Write config with extra fields
configPath := filepath.Join(tmpDir, ".ollama", "config.json")
os.MkdirAll(filepath.Dir(configPath), 0o755)
// Note: Our config struct only has Integrations, so top-level unknown fields
// won't be preserved by our current implementation. This test documents that.
initialConfig := `{
"integrations": {
"claude": {
"models": ["model1"],
"aliases": {"primary": "model1"},
"unknownField": "should be lost"
}
},
"topLevelUnknown": "will be lost"
}`
os.WriteFile(configPath, []byte(initialConfig), 0o644)
// Update aliases
if err := SaveAliases("claude", map[string]string{"primary": "model2"}); err != nil {
t.Fatalf("failed to save: %v", err)
}
// Read raw file to check
data, _ := os.ReadFile(configPath)
content := string(data)
// models should be preserved
if !contains(content, "model1") {
t.Error("models should be preserved")
}
// primary should be updated
if !contains(content, "model2") {
t.Error("primary should be updated to model2")
}
}
func contains(s, substr string) bool {
return len(s) >= len(substr) && (s == substr || len(s) > 0 && containsHelper(s, substr))
}
func containsHelper(s, substr string) bool {
for i := 0; i <= len(s)-len(substr); i++ {
if s[i:i+len(substr)] == substr {
return true
}
}
return false
}
func TestModelNameEdgeCases(t *testing.T) {
testCases := []struct {
name string
model string
}{
{"simple", "llama3.2"},
{"with tag", "llama3.2:latest"},
{"with cloud tag", "kimi-k2.5:cloud"},
{"with namespace", "library/llama3.2"},
{"with dots", "glm-4.7-flash"},
{"with numbers", "qwen3:8b"},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
aliases := map[string]string{"primary": tc.model}
if err := SaveAliases("claude", aliases); err != nil {
t.Fatalf("failed to save model %q: %v", tc.model, err)
}
loaded, err := LoadIntegration("claude")
if err != nil {
t.Fatalf("failed to load: %v", err)
}
if loaded.Aliases["primary"] != tc.model {
t.Errorf("expected primary=%q, got %q", tc.model, loaded.Aliases["primary"])
}
})
}
}
func TestSwitchingScenarios(t *testing.T) {
t.Run("cloud to local removes fast", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
// Initial cloud config
if err := SaveAliases("claude", map[string]string{
"primary": "cloud-model",
"fast": "cloud-model",
}); err != nil {
t.Fatal(err)
}
// Switch to local (no fast)
if err := SaveAliases("claude", map[string]string{
"primary": "local-model",
}); err != nil {
t.Fatal(err)
}
loaded, _ := LoadIntegration("claude")
if loaded.Aliases["fast"] != "" {
t.Errorf("fast should be removed, got %q", loaded.Aliases["fast"])
}
if loaded.Aliases["primary"] != "local-model" {
t.Errorf("primary should be local-model, got %q", loaded.Aliases["primary"])
}
})
t.Run("local to cloud adds fast", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
// Initial local config
if err := SaveAliases("claude", map[string]string{
"primary": "local-model",
}); err != nil {
t.Fatal(err)
}
// Switch to cloud (with fast)
if err := SaveAliases("claude", map[string]string{
"primary": "cloud-model",
"fast": "cloud-model",
}); err != nil {
t.Fatal(err)
}
loaded, _ := LoadIntegration("claude")
if loaded.Aliases["fast"] != "cloud-model" {
t.Errorf("fast should be cloud-model, got %q", loaded.Aliases["fast"])
}
})
t.Run("cloud to different cloud updates both", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
// Initial cloud config
if err := SaveAliases("claude", map[string]string{
"primary": "cloud-model-1",
"fast": "cloud-model-1",
}); err != nil {
t.Fatal(err)
}
// Switch to different cloud
if err := SaveAliases("claude", map[string]string{
"primary": "cloud-model-2",
"fast": "cloud-model-2",
}); err != nil {
t.Fatal(err)
}
loaded, _ := LoadIntegration("claude")
if loaded.Aliases["primary"] != "cloud-model-2" {
t.Errorf("primary should be cloud-model-2, got %q", loaded.Aliases["primary"])
}
if loaded.Aliases["fast"] != "cloud-model-2" {
t.Errorf("fast should be cloud-model-2, got %q", loaded.Aliases["fast"])
}
})
}
func TestModelsAndAliasesMustStayInSync(t *testing.T) {
t.Run("saveAliases followed by saveIntegration keeps them in sync", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
// Save aliases with one model
if err := SaveAliases("claude", map[string]string{"primary": "model-a"}); err != nil {
t.Fatal(err)
}
// Save integration with same model (this is the pattern we use)
if err := SaveIntegration("claude", []string{"model-a"}); err != nil {
t.Fatal(err)
}
loaded, _ := LoadIntegration("claude")
if loaded.Aliases["primary"] != loaded.Models[0] {
t.Errorf("aliases.primary (%q) != models[0] (%q)", loaded.Aliases["primary"], loaded.Models[0])
}
})
t.Run("out of sync config is detectable", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
// Simulate out-of-sync state (like manual edit or bug)
if err := SaveIntegration("claude", []string{"old-model"}); err != nil {
t.Fatal(err)
}
if err := SaveAliases("claude", map[string]string{"primary": "new-model"}); err != nil {
t.Fatal(err)
}
loaded, _ := LoadIntegration("claude")
// They should be different (this is the bug state)
if loaded.Models[0] == loaded.Aliases["primary"] {
t.Error("expected out-of-sync state for this test")
}
// The fix: when updating aliases, also update models
if err := SaveIntegration("claude", []string{loaded.Aliases["primary"]}); err != nil {
t.Fatal(err)
}
loaded, _ = LoadIntegration("claude")
if loaded.Models[0] != loaded.Aliases["primary"] {
t.Errorf("after fix: models[0] (%q) should equal aliases.primary (%q)",
loaded.Models[0], loaded.Aliases["primary"])
}
})
t.Run("updating primary alias updates models too", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
// Initial state
if err := SaveIntegration("claude", []string{"initial-model"}); err != nil {
t.Fatal(err)
}
if err := SaveAliases("claude", map[string]string{"primary": "initial-model"}); err != nil {
t.Fatal(err)
}
// Update aliases AND models together
newAliases := map[string]string{"primary": "updated-model"}
if err := SaveAliases("claude", newAliases); err != nil {
t.Fatal(err)
}
if err := SaveIntegration("claude", []string{newAliases["primary"]}); err != nil {
t.Fatal(err)
}
loaded, _ := LoadIntegration("claude")
if loaded.Models[0] != "updated-model" {
t.Errorf("models[0] should be updated-model, got %q", loaded.Models[0])
}
if loaded.Aliases["primary"] != "updated-model" {
t.Errorf("aliases.primary should be updated-model, got %q", loaded.Aliases["primary"])
}
})
}

530
cmd/config/config_test.go Normal file
View File

@@ -0,0 +1,530 @@
package config
import (
"os"
"path/filepath"
"strings"
"testing"
)
// setTestHome sets both HOME (Unix) and USERPROFILE (Windows) for cross-platform tests
func setTestHome(t *testing.T, dir string) {
t.Setenv("HOME", dir)
t.Setenv("TMPDIR", dir)
t.Setenv("USERPROFILE", dir)
}
func TestIntegrationConfig(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
t.Run("save and load round-trip", func(t *testing.T) {
models := []string{"llama3.2", "mistral", "qwen2.5"}
if err := SaveIntegration("claude", models); err != nil {
t.Fatal(err)
}
config, err := LoadIntegration("claude")
if err != nil {
t.Fatal(err)
}
if len(config.Models) != len(models) {
t.Errorf("expected %d models, got %d", len(models), len(config.Models))
}
for i, m := range models {
if config.Models[i] != m {
t.Errorf("model %d: expected %s, got %s", i, m, config.Models[i])
}
}
})
t.Run("save and load aliases", func(t *testing.T) {
models := []string{"llama3.2"}
if err := SaveIntegration("claude", models); err != nil {
t.Fatal(err)
}
aliases := map[string]string{
"primary": "llama3.2:70b",
"fast": "llama3.2:8b",
}
if err := SaveAliases("claude", aliases); err != nil {
t.Fatal(err)
}
config, err := LoadIntegration("claude")
if err != nil {
t.Fatal(err)
}
if config.Aliases == nil {
t.Fatal("expected aliases to be saved")
}
for k, v := range aliases {
if config.Aliases[k] != v {
t.Errorf("alias %s: expected %s, got %s", k, v, config.Aliases[k])
}
}
})
t.Run("saveIntegration preserves aliases", func(t *testing.T) {
if err := SaveIntegration("claude", []string{"model-a"}); err != nil {
t.Fatal(err)
}
if err := SaveAliases("claude", map[string]string{"primary": "model-a", "fast": "model-small"}); err != nil {
t.Fatal(err)
}
if err := SaveIntegration("claude", []string{"model-b"}); err != nil {
t.Fatal(err)
}
config, err := LoadIntegration("claude")
if err != nil {
t.Fatal(err)
}
if config.Aliases["primary"] != "model-a" {
t.Errorf("expected aliases to be preserved, got %v", config.Aliases)
}
})
t.Run("defaultModel returns first model", func(t *testing.T) {
SaveIntegration("codex", []string{"model-a", "model-b"})
config, _ := LoadIntegration("codex")
defaultModel := ""
if len(config.Models) > 0 {
defaultModel = config.Models[0]
}
if defaultModel != "model-a" {
t.Errorf("expected model-a, got %s", defaultModel)
}
})
t.Run("defaultModel returns empty for no models", func(t *testing.T) {
config := &integration{Models: []string{}}
defaultModel := ""
if len(config.Models) > 0 {
defaultModel = config.Models[0]
}
if defaultModel != "" {
t.Errorf("expected empty string, got %s", defaultModel)
}
})
t.Run("app name is case-insensitive", func(t *testing.T) {
SaveIntegration("Claude", []string{"model-x"})
config, err := LoadIntegration("claude")
if err != nil {
t.Fatal(err)
}
defaultModel := ""
if len(config.Models) > 0 {
defaultModel = config.Models[0]
}
if defaultModel != "model-x" {
t.Errorf("expected model-x, got %s", defaultModel)
}
})
t.Run("multiple integrations in single file", func(t *testing.T) {
SaveIntegration("app1", []string{"model-1"})
SaveIntegration("app2", []string{"model-2"})
config1, _ := LoadIntegration("app1")
config2, _ := LoadIntegration("app2")
defaultModel1 := ""
if len(config1.Models) > 0 {
defaultModel1 = config1.Models[0]
}
defaultModel2 := ""
if len(config2.Models) > 0 {
defaultModel2 = config2.Models[0]
}
if defaultModel1 != "model-1" {
t.Errorf("expected model-1, got %s", defaultModel1)
}
if defaultModel2 != "model-2" {
t.Errorf("expected model-2, got %s", defaultModel2)
}
})
}
func TestListIntegrations(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
t.Run("returns empty when no integrations", func(t *testing.T) {
configs, err := listIntegrations()
if err != nil {
t.Fatal(err)
}
if len(configs) != 0 {
t.Errorf("expected 0 integrations, got %d", len(configs))
}
})
t.Run("returns all saved integrations", func(t *testing.T) {
SaveIntegration("claude", []string{"model-1"})
SaveIntegration("droid", []string{"model-2"})
configs, err := listIntegrations()
if err != nil {
t.Fatal(err)
}
if len(configs) != 2 {
t.Errorf("expected 2 integrations, got %d", len(configs))
}
})
}
func TestLoadIntegration_CorruptedJSON(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
dir := filepath.Join(tmpDir, ".ollama")
os.MkdirAll(dir, 0o755)
os.WriteFile(filepath.Join(dir, "config.json"), []byte(`{corrupted json`), 0o644)
_, err := LoadIntegration("test")
if err == nil {
t.Error("expected error for nonexistent integration in corrupted file")
}
}
func TestSaveIntegration_NilModels(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
if err := SaveIntegration("test", nil); err != nil {
t.Fatalf("saveIntegration with nil models failed: %v", err)
}
config, err := LoadIntegration("test")
if err != nil {
t.Fatalf("loadIntegration failed: %v", err)
}
if config.Models == nil {
// nil is acceptable
} else if len(config.Models) != 0 {
t.Errorf("expected empty or nil models, got %v", config.Models)
}
}
func TestSaveIntegration_EmptyAppName(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
err := SaveIntegration("", []string{"model"})
if err == nil {
t.Error("expected error for empty app name, got nil")
}
if err != nil && !strings.Contains(err.Error(), "app name cannot be empty") {
t.Errorf("expected 'app name cannot be empty' error, got: %v", err)
}
}
func TestLoadIntegration_NonexistentIntegration(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
_, err := LoadIntegration("nonexistent")
if err == nil {
t.Error("expected error for nonexistent integration, got nil")
}
if !os.IsNotExist(err) {
t.Logf("error type is os.ErrNotExist as expected: %v", err)
}
}
func TestConfigPath(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
path, err := configPath()
if err != nil {
t.Fatal(err)
}
expected := filepath.Join(tmpDir, ".ollama", "config.json")
if path != expected {
t.Errorf("expected %s, got %s", expected, path)
}
}
func TestLoad(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
t.Run("returns empty config when file does not exist", func(t *testing.T) {
cfg, err := load()
if err != nil {
t.Fatal(err)
}
if cfg == nil {
t.Fatal("expected non-nil config")
}
if cfg.Integrations == nil {
t.Error("expected non-nil Integrations map")
}
if len(cfg.Integrations) != 0 {
t.Errorf("expected empty Integrations, got %d", len(cfg.Integrations))
}
})
t.Run("loads existing config", func(t *testing.T) {
path, _ := configPath()
os.MkdirAll(filepath.Dir(path), 0o755)
os.WriteFile(path, []byte(`{"integrations":{"test":{"models":["model-a"]}}}`), 0o644)
cfg, err := load()
if err != nil {
t.Fatal(err)
}
if cfg.Integrations["test"] == nil {
t.Fatal("expected test integration")
}
if len(cfg.Integrations["test"].Models) != 1 {
t.Errorf("expected 1 model, got %d", len(cfg.Integrations["test"].Models))
}
})
t.Run("returns error for corrupted JSON", func(t *testing.T) {
path, _ := configPath()
os.MkdirAll(filepath.Dir(path), 0o755)
os.WriteFile(path, []byte(`{corrupted`), 0o644)
_, err := load()
if err == nil {
t.Error("expected error for corrupted JSON")
}
})
}
func TestMigrateConfig(t *testing.T) {
t.Run("migrates legacy file to new location", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
legacyDir := filepath.Join(tmpDir, ".ollama", "config")
os.MkdirAll(legacyDir, 0o755)
data := []byte(`{"integrations":{"claude":{"models":["llama3.2"]}}}`)
os.WriteFile(filepath.Join(legacyDir, "config.json"), data, 0o644)
migrated, err := migrateConfig()
if err != nil {
t.Fatal(err)
}
if !migrated {
t.Fatal("expected migration to occur")
}
newPath, _ := configPath()
got, err := os.ReadFile(newPath)
if err != nil {
t.Fatalf("new config not found: %v", err)
}
if string(got) != string(data) {
t.Errorf("content mismatch: got %s", got)
}
if _, err := os.Stat(filepath.Join(legacyDir, "config.json")); !os.IsNotExist(err) {
t.Error("legacy file should have been removed")
}
if _, err := os.Stat(legacyDir); !os.IsNotExist(err) {
t.Error("legacy directory should have been removed")
}
})
t.Run("no-op when no legacy file exists", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
migrated, err := migrateConfig()
if err != nil {
t.Fatal(err)
}
if migrated {
t.Error("expected no migration")
}
})
t.Run("skips corrupt legacy file", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
legacyDir := filepath.Join(tmpDir, ".ollama", "config")
os.MkdirAll(legacyDir, 0o755)
os.WriteFile(filepath.Join(legacyDir, "config.json"), []byte(`{corrupt`), 0o644)
migrated, err := migrateConfig()
if err != nil {
t.Fatal(err)
}
if migrated {
t.Error("should not migrate corrupt file")
}
if _, err := os.Stat(filepath.Join(legacyDir, "config.json")); os.IsNotExist(err) {
t.Error("corrupt legacy file should not have been deleted")
}
})
t.Run("new path takes precedence over legacy", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
legacyDir := filepath.Join(tmpDir, ".ollama", "config")
os.MkdirAll(legacyDir, 0o755)
os.WriteFile(filepath.Join(legacyDir, "config.json"), []byte(`{"integrations":{"old":{"models":["old-model"]}}}`), 0o644)
newDir := filepath.Join(tmpDir, ".ollama")
os.WriteFile(filepath.Join(newDir, "config.json"), []byte(`{"integrations":{"new":{"models":["new-model"]}}}`), 0o644)
cfg, err := load()
if err != nil {
t.Fatal(err)
}
if _, ok := cfg.Integrations["new"]; !ok {
t.Error("expected new-path integration to be loaded")
}
if _, ok := cfg.Integrations["old"]; ok {
t.Error("legacy integration should not have been loaded")
}
})
t.Run("idempotent when called twice", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
legacyDir := filepath.Join(tmpDir, ".ollama", "config")
os.MkdirAll(legacyDir, 0o755)
os.WriteFile(filepath.Join(legacyDir, "config.json"), []byte(`{"integrations":{}}`), 0o644)
if _, err := migrateConfig(); err != nil {
t.Fatal(err)
}
migrated, err := migrateConfig()
if err != nil {
t.Fatal(err)
}
if migrated {
t.Error("second migration should be a no-op")
}
})
t.Run("legacy directory preserved if not empty", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
legacyDir := filepath.Join(tmpDir, ".ollama", "config")
os.MkdirAll(legacyDir, 0o755)
os.WriteFile(filepath.Join(legacyDir, "config.json"), []byte(`{"integrations":{}}`), 0o644)
os.WriteFile(filepath.Join(legacyDir, "other-file.txt"), []byte("keep me"), 0o644)
if _, err := migrateConfig(); err != nil {
t.Fatal(err)
}
if _, err := os.Stat(legacyDir); os.IsNotExist(err) {
t.Error("directory with other files should not have been removed")
}
if _, err := os.Stat(filepath.Join(legacyDir, "other-file.txt")); os.IsNotExist(err) {
t.Error("other files in legacy directory should be untouched")
}
})
t.Run("save writes to new path after migration", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
legacyDir := filepath.Join(tmpDir, ".ollama", "config")
os.MkdirAll(legacyDir, 0o755)
os.WriteFile(filepath.Join(legacyDir, "config.json"), []byte(`{"integrations":{"claude":{"models":["llama3.2"]}}}`), 0o644)
// load triggers migration, then save should write to new path
if err := SaveIntegration("codex", []string{"qwen2.5"}); err != nil {
t.Fatal(err)
}
newPath := filepath.Join(tmpDir, ".ollama", "config.json")
if _, err := os.Stat(newPath); os.IsNotExist(err) {
t.Error("save should write to new path")
}
// old path should not be recreated
if _, err := os.Stat(filepath.Join(legacyDir, "config.json")); !os.IsNotExist(err) {
t.Error("save should not recreate legacy path")
}
})
t.Run("load triggers migration transparently", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
legacyDir := filepath.Join(tmpDir, ".ollama", "config")
os.MkdirAll(legacyDir, 0o755)
os.WriteFile(filepath.Join(legacyDir, "config.json"), []byte(`{"integrations":{"claude":{"models":["llama3.2"]}}}`), 0o644)
cfg, err := load()
if err != nil {
t.Fatal(err)
}
if cfg.Integrations["claude"] == nil || cfg.Integrations["claude"].Models[0] != "llama3.2" {
t.Error("migration via load() did not preserve data")
}
})
}
func TestSave(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
t.Run("creates config file", func(t *testing.T) {
cfg := &config{
Integrations: map[string]*integration{
"test": {Models: []string{"model-a", "model-b"}},
},
}
if err := save(cfg); err != nil {
t.Fatal(err)
}
path, _ := configPath()
if _, err := os.Stat(path); os.IsNotExist(err) {
t.Error("config file was not created")
}
})
t.Run("round-trip preserves data", func(t *testing.T) {
cfg := &config{
Integrations: map[string]*integration{
"claude": {Models: []string{"llama3.2", "mistral"}},
"codex": {Models: []string{"qwen2.5"}},
},
}
if err := save(cfg); err != nil {
t.Fatal(err)
}
loaded, err := load()
if err != nil {
t.Fatal(err)
}
if len(loaded.Integrations) != 2 {
t.Errorf("expected 2 integrations, got %d", len(loaded.Integrations))
}
if loaded.Integrations["claude"] == nil {
t.Error("missing claude integration")
}
if len(loaded.Integrations["claude"].Models) != 2 {
t.Errorf("expected 2 models for claude, got %d", len(loaded.Integrations["claude"].Models))
}
})
}

5
cmd/editor_unix.go Normal file
View File

@@ -0,0 +1,5 @@
//go:build !windows
package cmd
const defaultEditor = "vi"

5
cmd/editor_windows.go Normal file
View File

@@ -0,0 +1,5 @@
//go:build windows
package cmd
const defaultEditor = "edit"

View File

@@ -7,6 +7,7 @@ import (
"io" "io"
"net/http" "net/http"
"os" "os"
"os/exec"
"path/filepath" "path/filepath"
"regexp" "regexp"
"slices" "slices"
@@ -16,6 +17,7 @@ import (
"github.com/ollama/ollama/api" "github.com/ollama/ollama/api"
"github.com/ollama/ollama/envconfig" "github.com/ollama/ollama/envconfig"
"github.com/ollama/ollama/internal/modelref"
"github.com/ollama/ollama/readline" "github.com/ollama/ollama/readline"
"github.com/ollama/ollama/types/errtypes" "github.com/ollama/ollama/types/errtypes"
"github.com/ollama/ollama/types/model" "github.com/ollama/ollama/types/model"
@@ -45,7 +47,7 @@ func generateInteractive(cmd *cobra.Command, opts runOptions) error {
fmt.Fprintln(os.Stderr, "Use \"\"\" to begin a multi-line message.") fmt.Fprintln(os.Stderr, "Use \"\"\" to begin a multi-line message.")
if opts.MultiModal { if opts.MultiModal {
fmt.Fprintf(os.Stderr, "Use %s to include .jpg, .png, or .webp images.\n", filepath.FromSlash("/path/to/file")) fmt.Fprintf(os.Stderr, "Use %s to include .jpg, .png, .webp images, or .wav audio files.\n", filepath.FromSlash("/path/to/file"))
} }
fmt.Fprintln(os.Stderr, "") fmt.Fprintln(os.Stderr, "")
@@ -79,6 +81,7 @@ func generateInteractive(cmd *cobra.Command, opts runOptions) error {
fmt.Fprintln(os.Stderr, " Ctrl + w Delete the word before the cursor") fmt.Fprintln(os.Stderr, " Ctrl + w Delete the word before the cursor")
fmt.Fprintln(os.Stderr, "") fmt.Fprintln(os.Stderr, "")
fmt.Fprintln(os.Stderr, " Ctrl + l Clear the screen") fmt.Fprintln(os.Stderr, " Ctrl + l Clear the screen")
fmt.Fprintln(os.Stderr, " Ctrl + g Open default editor to compose a prompt")
fmt.Fprintln(os.Stderr, " Ctrl + c Stop the model from responding") fmt.Fprintln(os.Stderr, " Ctrl + c Stop the model from responding")
fmt.Fprintln(os.Stderr, " Ctrl + d Exit ollama (/bye)") fmt.Fprintln(os.Stderr, " Ctrl + d Exit ollama (/bye)")
fmt.Fprintln(os.Stderr, "") fmt.Fprintln(os.Stderr, "")
@@ -147,6 +150,18 @@ func generateInteractive(cmd *cobra.Command, opts runOptions) error {
scanner.Prompt.UseAlt = false scanner.Prompt.UseAlt = false
sb.Reset() sb.Reset()
continue
case errors.Is(err, readline.ErrEditPrompt):
sb.Reset()
content, err := editInExternalEditor(line)
if err != nil {
fmt.Fprintf(os.Stderr, "error: %v\n", err)
continue
}
if strings.TrimSpace(content) == "" {
continue
}
scanner.Prefill = content
continue continue
case err != nil: case err != nil:
return err return err
@@ -159,6 +174,7 @@ func generateInteractive(cmd *cobra.Command, opts runOptions) error {
sb.WriteString(before) sb.WriteString(before)
if !ok { if !ok {
fmt.Fprintln(&sb) fmt.Fprintln(&sb)
scanner.Prompt.UseAlt = true
continue continue
} }
@@ -198,10 +214,17 @@ func generateInteractive(cmd *cobra.Command, opts runOptions) error {
} }
origOpts := opts.Copy() origOpts := opts.Copy()
client, err := api.ClientFromEnvironment()
if err != nil {
fmt.Println("error: couldn't connect to ollama server")
return err
}
opts.Model = args[1] opts.Model = args[1]
opts.Messages = []api.Message{} opts.Messages = []api.Message{}
opts.LoadedMessages = nil
fmt.Printf("Loading model '%s'\n", opts.Model) fmt.Printf("Loading model '%s'\n", opts.Model)
opts.Think, err = inferThinkingOption(nil, &opts, thinkExplicitlySet) info, err := client.Show(cmd.Context(), &api.ShowRequest{Model: opts.Model})
if err != nil { if err != nil {
if strings.Contains(err.Error(), "not found") { if strings.Contains(err.Error(), "not found") {
fmt.Printf("Couldn't find model '%s'\n", opts.Model) fmt.Printf("Couldn't find model '%s'\n", opts.Model)
@@ -210,6 +233,11 @@ func generateInteractive(cmd *cobra.Command, opts runOptions) error {
} }
return err return err
} }
applyShowResponseToRunOptions(&opts, info)
opts.Think, err = inferThinkingOption(&info.Capabilities, &opts, thinkExplicitlySet)
if err != nil {
return err
}
if err := loadOrUnloadModel(cmd, &opts); err != nil { if err := loadOrUnloadModel(cmd, &opts); err != nil {
if strings.Contains(err.Error(), "not found") { if strings.Contains(err.Error(), "not found") {
fmt.Printf("Couldn't find model '%s'\n", opts.Model) fmt.Printf("Couldn't find model '%s'\n", opts.Model)
@@ -525,6 +553,13 @@ func NewCreateRequest(name string, opts runOptions) *api.CreateRequest {
parentModel = "" parentModel = ""
} }
// Preserve explicit cloud intent for sessions started with `:cloud`.
// Cloud model metadata can return a source-less parent_model (for example
// "qwen3.5"), which would otherwise make `/save` create a local derivative.
if modelref.HasExplicitCloudSource(opts.Model) && !modelref.HasExplicitCloudSource(parentModel) {
parentModel = ""
}
req := &api.CreateRequest{ req := &api.CreateRequest{
Model: name, Model: name,
From: cmp.Or(parentModel, opts.Model), From: cmp.Or(parentModel, opts.Model),
@@ -538,8 +573,10 @@ func NewCreateRequest(name string, opts runOptions) *api.CreateRequest {
req.Parameters = opts.Options req.Parameters = opts.Options
} }
if len(opts.Messages) > 0 { messages := slices.Clone(opts.LoadedMessages)
req.Messages = opts.Messages messages = append(messages, opts.Messages...)
if len(messages) > 0 {
req.Messages = messages
} }
return req return req
@@ -569,7 +606,7 @@ func extractFileNames(input string) []string {
// Regex to match file paths starting with optional drive letter, / ./ \ or .\ and include escaped or unescaped spaces (\ or %20) // Regex to match file paths starting with optional drive letter, / ./ \ or .\ and include escaped or unescaped spaces (\ or %20)
// and followed by more characters and a file extension // and followed by more characters and a file extension
// This will capture non filename strings, but we'll check for file existence to remove mismatches // This will capture non filename strings, but we'll check for file existence to remove mismatches
regexPattern := `(?:[a-zA-Z]:)?(?:\./|/|\\)[\S\\ ]+?\.(?i:jpg|jpeg|png|webp)\b` regexPattern := `(?:[a-zA-Z]:)?(?:\./|/|\\)[\S\\ ]+?\.(?i:jpg|jpeg|png|webp|wav)\b`
re := regexp.MustCompile(regexPattern) re := regexp.MustCompile(regexPattern)
return re.FindAllString(input, -1) return re.FindAllString(input, -1)
@@ -585,10 +622,16 @@ func extractFileData(input string) (string, []api.ImageData, error) {
if errors.Is(err, os.ErrNotExist) { if errors.Is(err, os.ErrNotExist) {
continue continue
} else if err != nil { } else if err != nil {
fmt.Fprintf(os.Stderr, "Couldn't process image: %q\n", err) fmt.Fprintf(os.Stderr, "Couldn't process file: %q\n", err)
return "", imgs, err return "", imgs, err
} }
ext := strings.ToLower(filepath.Ext(nfp))
switch ext {
case ".wav":
fmt.Fprintf(os.Stderr, "Added audio '%s'\n", nfp)
default:
fmt.Fprintf(os.Stderr, "Added image '%s'\n", nfp) fmt.Fprintf(os.Stderr, "Added image '%s'\n", nfp)
}
input = strings.ReplaceAll(input, "'"+nfp+"'", "") input = strings.ReplaceAll(input, "'"+nfp+"'", "")
input = strings.ReplaceAll(input, "'"+fp+"'", "") input = strings.ReplaceAll(input, "'"+fp+"'", "")
input = strings.ReplaceAll(input, fp, "") input = strings.ReplaceAll(input, fp, "")
@@ -597,6 +640,57 @@ func extractFileData(input string) (string, []api.ImageData, error) {
return strings.TrimSpace(input), imgs, nil return strings.TrimSpace(input), imgs, nil
} }
func editInExternalEditor(content string) (string, error) {
editor := envconfig.Editor()
if editor == "" {
editor = os.Getenv("VISUAL")
}
if editor == "" {
editor = os.Getenv("EDITOR")
}
if editor == "" {
editor = defaultEditor
}
// Check that the editor binary exists
name := strings.Fields(editor)[0]
if _, err := exec.LookPath(name); err != nil {
return "", fmt.Errorf("editor %q not found, set OLLAMA_EDITOR to the path of your preferred editor", name)
}
tmpFile, err := os.CreateTemp("", "ollama-prompt-*.txt")
if err != nil {
return "", fmt.Errorf("creating temp file: %w", err)
}
defer os.Remove(tmpFile.Name())
if content != "" {
if _, err := tmpFile.WriteString(content); err != nil {
tmpFile.Close()
return "", fmt.Errorf("writing to temp file: %w", err)
}
}
tmpFile.Close()
args := strings.Fields(editor)
args = append(args, tmpFile.Name())
cmd := exec.Command(args[0], args[1:]...)
cmd.Stdin = os.Stdin
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
if err := cmd.Run(); err != nil {
return "", fmt.Errorf("editor exited with error: %w", err)
}
data, err := os.ReadFile(tmpFile.Name())
if err != nil {
return "", fmt.Errorf("reading temp file: %w", err)
}
return strings.TrimRight(string(data), "\n"), nil
}
func getImageData(filePath string) ([]byte, error) { func getImageData(filePath string) ([]byte, error) {
file, err := os.Open(filePath) file, err := os.Open(filePath)
if err != nil { if err != nil {
@@ -611,9 +705,9 @@ func getImageData(filePath string) ([]byte, error) {
} }
contentType := http.DetectContentType(buf) contentType := http.DetectContentType(buf)
allowedTypes := []string{"image/jpeg", "image/jpg", "image/png", "image/webp"} allowedTypes := []string{"image/jpeg", "image/jpg", "image/png", "image/webp", "audio/wave"}
if !slices.Contains(allowedTypes, contentType) { if !slices.Contains(allowedTypes, contentType) {
return nil, fmt.Errorf("invalid image type: %s", contentType) return nil, fmt.Errorf("invalid file type: %s", contentType)
} }
info, err := file.Stat() info, err := file.Stat()
@@ -621,8 +715,7 @@ func getImageData(filePath string) ([]byte, error) {
return nil, err return nil, err
} }
// Check if the file size exceeds 100MB var maxSize int64 = 100 * 1024 * 1024 // 100MB
var maxSize int64 = 100 * 1024 * 1024 // 100MB in bytes
if info.Size() > maxSize { if info.Size() > maxSize {
return nil, errors.New("file size exceeds maximum limit (100MB)") return nil, errors.New("file size exceeds maximum limit (100MB)")
} }

View File

@@ -84,3 +84,33 @@ func TestExtractFileDataRemovesQuotedFilepath(t *testing.T) {
assert.Len(t, imgs, 1) assert.Len(t, imgs, 1)
assert.Equal(t, cleaned, "before after") assert.Equal(t, cleaned, "before after")
} }
func TestExtractFileDataWAV(t *testing.T) {
dir := t.TempDir()
fp := filepath.Join(dir, "sample.wav")
data := make([]byte, 600)
copy(data[:44], []byte{
'R', 'I', 'F', 'F',
0x58, 0x02, 0x00, 0x00, // file size - 8
'W', 'A', 'V', 'E',
'f', 'm', 't', ' ',
0x10, 0x00, 0x00, 0x00, // fmt chunk size
0x01, 0x00, // PCM
0x01, 0x00, // mono
0x80, 0x3e, 0x00, 0x00, // 16000 Hz
0x00, 0x7d, 0x00, 0x00, // byte rate
0x02, 0x00, // block align
0x10, 0x00, // 16-bit
'd', 'a', 't', 'a',
0x34, 0x02, 0x00, 0x00, // data size
})
if err := os.WriteFile(fp, data, 0o600); err != nil {
t.Fatalf("failed to write test audio: %v", err)
}
input := "before " + fp + " after"
cleaned, imgs, err := extractFileData(input)
assert.NoError(t, err)
assert.Len(t, imgs, 1)
assert.Equal(t, "before after", cleaned)
}

View File

@@ -0,0 +1,103 @@
// Package fileutil provides small shared helpers for reading JSON files
// and writing config files with backup-on-overwrite semantics.
package fileutil
import (
"bytes"
"encoding/json"
"fmt"
"os"
"path/filepath"
"time"
)
// ReadJSON reads a JSON object file into a generic map.
func ReadJSON(path string) (map[string]any, error) {
data, err := os.ReadFile(path)
if err != nil {
return nil, err
}
var result map[string]any
if err := json.Unmarshal(data, &result); err != nil {
return nil, err
}
return result, nil
}
func copyFile(src, dst string) error {
info, err := os.Stat(src)
if err != nil {
return err
}
data, err := os.ReadFile(src)
if err != nil {
return err
}
return os.WriteFile(dst, data, info.Mode().Perm())
}
// BackupDir returns the shared backup directory used before overwriting files.
func BackupDir() string {
return filepath.Join(os.TempDir(), "ollama-backups")
}
func backupToTmp(srcPath string) (string, error) {
dir := BackupDir()
if err := os.MkdirAll(dir, 0o755); err != nil {
return "", err
}
backupPath := filepath.Join(dir, fmt.Sprintf("%s.%d", filepath.Base(srcPath), time.Now().Unix()))
if err := copyFile(srcPath, backupPath); err != nil {
return "", err
}
return backupPath, nil
}
// WriteWithBackup writes data to path via temp file + rename, backing up any existing file first.
func WriteWithBackup(path string, data []byte) error {
var backupPath string
// backup must be created before any writes to the target file
if existingContent, err := os.ReadFile(path); err == nil {
if !bytes.Equal(existingContent, data) {
backupPath, err = backupToTmp(path)
if err != nil {
return fmt.Errorf("backup failed: %w", err)
}
}
} else if !os.IsNotExist(err) {
return fmt.Errorf("read existing file: %w", err)
}
dir := filepath.Dir(path)
tmp, err := os.CreateTemp(dir, ".tmp-*")
if err != nil {
return fmt.Errorf("create temp failed: %w", err)
}
tmpPath := tmp.Name()
if _, err := tmp.Write(data); err != nil {
_ = tmp.Close()
_ = os.Remove(tmpPath)
return fmt.Errorf("write failed: %w", err)
}
if err := tmp.Sync(); err != nil {
_ = tmp.Close()
_ = os.Remove(tmpPath)
return fmt.Errorf("sync failed: %w", err)
}
if err := tmp.Close(); err != nil {
_ = os.Remove(tmpPath)
return fmt.Errorf("close failed: %w", err)
}
if err := os.Rename(tmpPath, path); err != nil {
_ = os.Remove(tmpPath)
if backupPath != "" {
_ = copyFile(backupPath, path)
}
return fmt.Errorf("rename failed: %w", err)
}
return nil
}

View File

@@ -0,0 +1,522 @@
package fileutil
import (
"encoding/json"
"fmt"
"os"
"path/filepath"
"runtime"
"testing"
)
func TestMain(m *testing.M) {
tmpRoot, err := os.MkdirTemp("", "fileutil-test-*")
if err != nil {
panic(err)
}
if err := os.Setenv("TMPDIR", tmpRoot); err != nil {
panic(err)
}
code := m.Run()
_ = os.RemoveAll(tmpRoot)
os.Exit(code)
}
func mustMarshal(t *testing.T, v any) []byte {
t.Helper()
data, err := json.MarshalIndent(v, "", " ")
if err != nil {
t.Fatal(err)
}
return data
}
func isolatedTempDir(t *testing.T) string {
t.Helper()
return t.TempDir()
}
func TestWriteWithBackup(t *testing.T) {
tmpDir := isolatedTempDir(t)
t.Run("creates file", func(t *testing.T) {
path := filepath.Join(tmpDir, "new.json")
data := mustMarshal(t, map[string]string{"key": "value"})
if err := WriteWithBackup(path, data); err != nil {
t.Fatal(err)
}
content, err := os.ReadFile(path)
if err != nil {
t.Fatal(err)
}
var result map[string]string
if err := json.Unmarshal(content, &result); err != nil {
t.Fatal(err)
}
if result["key"] != "value" {
t.Errorf("expected value, got %s", result["key"])
}
})
t.Run("creates backup in the temp backup directory", func(t *testing.T) {
path := filepath.Join(tmpDir, "backup.json")
os.WriteFile(path, []byte(`{"original": true}`), 0o644)
data := mustMarshal(t, map[string]bool{"updated": true})
if err := WriteWithBackup(path, data); err != nil {
t.Fatal(err)
}
entries, err := os.ReadDir(BackupDir())
if err != nil {
t.Fatal("backup directory not created")
}
var foundBackup bool
for _, entry := range entries {
if filepath.Ext(entry.Name()) != ".json" {
name := entry.Name()
if len(name) > len("backup.json.") && name[:len("backup.json.")] == "backup.json." {
backupPath := filepath.Join(BackupDir(), name)
backup, err := os.ReadFile(backupPath)
if err == nil {
var backupData map[string]bool
json.Unmarshal(backup, &backupData)
if backupData["original"] {
foundBackup = true
os.Remove(backupPath)
break
}
}
}
}
}
if !foundBackup {
t.Error("backup file not created in backup directory")
}
current, _ := os.ReadFile(path)
var currentData map[string]bool
json.Unmarshal(current, &currentData)
if !currentData["updated"] {
t.Error("file doesn't contain updated data")
}
})
t.Run("no backup for new file", func(t *testing.T) {
path := filepath.Join(tmpDir, "nobak.json")
data := mustMarshal(t, map[string]string{"new": "file"})
if err := WriteWithBackup(path, data); err != nil {
t.Fatal(err)
}
entries, _ := os.ReadDir(BackupDir())
for _, entry := range entries {
if len(entry.Name()) > len("nobak.json.") && entry.Name()[:len("nobak.json.")] == "nobak.json." {
t.Error("backup should not exist for new file")
}
}
})
t.Run("no backup when content unchanged", func(t *testing.T) {
path := filepath.Join(tmpDir, "unchanged.json")
data := mustMarshal(t, map[string]string{"key": "value"})
if err := WriteWithBackup(path, data); err != nil {
t.Fatal(err)
}
entries1, _ := os.ReadDir(BackupDir())
countBefore := 0
for _, e := range entries1 {
if len(e.Name()) > len("unchanged.json.") && e.Name()[:len("unchanged.json.")] == "unchanged.json." {
countBefore++
}
}
if err := WriteWithBackup(path, data); err != nil {
t.Fatal(err)
}
entries2, _ := os.ReadDir(BackupDir())
countAfter := 0
for _, e := range entries2 {
if len(e.Name()) > len("unchanged.json.") && e.Name()[:len("unchanged.json.")] == "unchanged.json." {
countAfter++
}
}
if countAfter != countBefore {
t.Errorf("backup was created when content unchanged (before=%d, after=%d)", countBefore, countAfter)
}
})
t.Run("backup filename contains unix timestamp", func(t *testing.T) {
path := filepath.Join(tmpDir, "timestamped.json")
os.WriteFile(path, []byte(`{"v": 1}`), 0o644)
data := mustMarshal(t, map[string]int{"v": 2})
if err := WriteWithBackup(path, data); err != nil {
t.Fatal(err)
}
entries, _ := os.ReadDir(BackupDir())
var found bool
for _, entry := range entries {
name := entry.Name()
if len(name) > len("timestamped.json.") && name[:len("timestamped.json.")] == "timestamped.json." {
timestamp := name[len("timestamped.json."):]
for _, c := range timestamp {
if c < '0' || c > '9' {
t.Errorf("backup filename timestamp contains non-numeric character: %s", name)
}
}
found = true
os.Remove(filepath.Join(BackupDir(), name))
break
}
}
if !found {
t.Error("backup file with timestamp not found")
}
})
}
// Edge case tests for files.go
// TestWriteWithBackup_FailsIfBackupFails documents critical behavior: if backup fails, we must not proceed.
// User could lose their config with no way to recover.
func TestWriteWithBackup_FailsIfBackupFails(t *testing.T) {
if runtime.GOOS == "windows" {
t.Skip("permission tests unreliable on Windows")
}
tmpDir := isolatedTempDir(t)
path := filepath.Join(tmpDir, "config.json")
// Create original file
originalContent := []byte(`{"original": true}`)
os.WriteFile(path, originalContent, 0o644)
// Make backup directory read-only to force backup failure
backupDir := BackupDir()
os.MkdirAll(backupDir, 0o755)
os.Chmod(backupDir, 0o444) // Read-only
defer os.Chmod(backupDir, 0o755)
newContent := []byte(`{"updated": true}`)
err := WriteWithBackup(path, newContent)
// Should fail because backup couldn't be created
if err == nil {
t.Error("expected error when backup fails, got nil")
}
// Original file should be preserved
current, _ := os.ReadFile(path)
if string(current) != string(originalContent) {
t.Errorf("original file was modified despite backup failure: got %s", string(current))
}
}
// TestWriteWithBackup_PermissionDenied verifies clear error when target file has wrong permissions.
// Common issue when config owned by root or wrong perms.
func TestWriteWithBackup_PermissionDenied(t *testing.T) {
if runtime.GOOS == "windows" {
t.Skip("permission tests unreliable on Windows")
}
tmpDir := isolatedTempDir(t)
// Create a read-only directory
readOnlyDir := filepath.Join(tmpDir, "readonly")
os.MkdirAll(readOnlyDir, 0o755)
os.Chmod(readOnlyDir, 0o444)
defer os.Chmod(readOnlyDir, 0o755)
path := filepath.Join(readOnlyDir, "config.json")
err := WriteWithBackup(path, []byte(`{"test": true}`))
if err == nil {
t.Error("expected permission error, got nil")
}
}
// TestWriteWithBackup_DirectoryDoesNotExist verifies behavior when target directory doesn't exist.
// writeWithBackup doesn't create directories - caller is responsible.
func TestWriteWithBackup_DirectoryDoesNotExist(t *testing.T) {
tmpDir := isolatedTempDir(t)
path := filepath.Join(tmpDir, "nonexistent", "subdir", "config.json")
err := WriteWithBackup(path, []byte(`{"test": true}`))
// Should fail because directory doesn't exist
if err == nil {
t.Error("expected error for nonexistent directory, got nil")
}
}
// TestWriteWithBackup_SymlinkTarget documents behavior when target is a symlink.
// Documents what happens if user symlinks their config file.
func TestWriteWithBackup_SymlinkTarget(t *testing.T) {
if runtime.GOOS == "windows" {
t.Skip("symlink tests may require admin on Windows")
}
tmpDir := isolatedTempDir(t)
realFile := filepath.Join(tmpDir, "real.json")
symlink := filepath.Join(tmpDir, "link.json")
// Create real file and symlink
os.WriteFile(realFile, []byte(`{"v": 1}`), 0o644)
os.Symlink(realFile, symlink)
// Write through symlink
err := WriteWithBackup(symlink, []byte(`{"v": 2}`))
if err != nil {
t.Fatalf("writeWithBackup through symlink failed: %v", err)
}
// The real file should be updated (symlink followed for temp file creation)
content, _ := os.ReadFile(symlink)
if string(content) != `{"v": 2}` {
t.Errorf("symlink target not updated correctly: got %s", string(content))
}
}
// TestBackupToTmp_SpecialCharsInFilename verifies backup works with special characters.
// User may have config files with unusual names.
func TestBackupToTmp_SpecialCharsInFilename(t *testing.T) {
tmpDir := isolatedTempDir(t)
// File with spaces and special chars
path := filepath.Join(tmpDir, "my config (backup).json")
os.WriteFile(path, []byte(`{"test": true}`), 0o644)
backupPath, err := backupToTmp(path)
if err != nil {
t.Fatalf("backupToTmp with special chars failed: %v", err)
}
// Verify backup exists and has correct content
content, err := os.ReadFile(backupPath)
if err != nil {
t.Fatalf("could not read backup: %v", err)
}
if string(content) != `{"test": true}` {
t.Errorf("backup content mismatch: got %s", string(content))
}
os.Remove(backupPath)
}
// TestCopyFile_PreservesPermissions verifies that copyFile preserves file permissions.
func TestCopyFile_PreservesPermissions(t *testing.T) {
if runtime.GOOS == "windows" {
t.Skip("permission preservation tests unreliable on Windows")
}
tmpDir := isolatedTempDir(t)
src := filepath.Join(tmpDir, "src.json")
dst := filepath.Join(tmpDir, "dst.json")
// Create source with specific permissions
os.WriteFile(src, []byte(`{"test": true}`), 0o600)
err := copyFile(src, dst)
if err != nil {
t.Fatalf("copyFile failed: %v", err)
}
srcInfo, _ := os.Stat(src)
dstInfo, _ := os.Stat(dst)
if srcInfo.Mode().Perm() != dstInfo.Mode().Perm() {
t.Errorf("permissions not preserved: src=%v, dst=%v", srcInfo.Mode().Perm(), dstInfo.Mode().Perm())
}
}
// TestCopyFile_SourceNotFound verifies clear error when source doesn't exist.
func TestCopyFile_SourceNotFound(t *testing.T) {
tmpDir := isolatedTempDir(t)
src := filepath.Join(tmpDir, "nonexistent.json")
dst := filepath.Join(tmpDir, "dst.json")
err := copyFile(src, dst)
if err == nil {
t.Error("expected error for nonexistent source, got nil")
}
}
// TestWriteWithBackup_TargetIsDirectory verifies error when path points to a directory.
func TestWriteWithBackup_TargetIsDirectory(t *testing.T) {
tmpDir := isolatedTempDir(t)
dirPath := filepath.Join(tmpDir, "actualdir")
os.MkdirAll(dirPath, 0o755)
err := WriteWithBackup(dirPath, []byte(`{"test": true}`))
if err == nil {
t.Error("expected error when target is a directory, got nil")
}
}
// TestWriteWithBackup_EmptyData verifies writing zero bytes works correctly.
func TestWriteWithBackup_EmptyData(t *testing.T) {
tmpDir := isolatedTempDir(t)
path := filepath.Join(tmpDir, "empty.json")
err := WriteWithBackup(path, []byte{})
if err != nil {
t.Fatalf("writeWithBackup with empty data failed: %v", err)
}
content, err := os.ReadFile(path)
if err != nil {
t.Fatalf("could not read file: %v", err)
}
if len(content) != 0 {
t.Errorf("expected empty file, got %d bytes", len(content))
}
}
// TestWriteWithBackup_FileUnreadableButDirWritable verifies behavior when existing file
// cannot be read (for backup comparison) but directory is writable.
func TestWriteWithBackup_FileUnreadableButDirWritable(t *testing.T) {
if runtime.GOOS == "windows" {
t.Skip("permission tests unreliable on Windows")
}
tmpDir := isolatedTempDir(t)
path := filepath.Join(tmpDir, "unreadable.json")
// Create file and make it unreadable
os.WriteFile(path, []byte(`{"original": true}`), 0o644)
os.Chmod(path, 0o000)
defer os.Chmod(path, 0o644)
// Should fail because we can't read the file to compare/backup
err := WriteWithBackup(path, []byte(`{"updated": true}`))
if err == nil {
t.Error("expected error when file is unreadable, got nil")
}
}
// TestWriteWithBackup_RapidSuccessiveWrites verifies backup works with multiple writes
// within the same second (timestamp collision scenario).
func TestWriteWithBackup_RapidSuccessiveWrites(t *testing.T) {
tmpDir := isolatedTempDir(t)
path := filepath.Join(tmpDir, "rapid.json")
// Create initial file
os.WriteFile(path, []byte(`{"v": 0}`), 0o644)
// Rapid successive writes
for i := 1; i <= 3; i++ {
data := []byte(fmt.Sprintf(`{"v": %d}`, i))
if err := WriteWithBackup(path, data); err != nil {
t.Fatalf("write %d failed: %v", i, err)
}
}
// Verify final content
content, _ := os.ReadFile(path)
if string(content) != `{"v": 3}` {
t.Errorf("expected final content {\"v\": 3}, got %s", string(content))
}
// Verify at least one backup exists
entries, _ := os.ReadDir(BackupDir())
var backupCount int
for _, e := range entries {
if len(e.Name()) > len("rapid.json.") && e.Name()[:len("rapid.json.")] == "rapid.json." {
backupCount++
}
}
if backupCount == 0 {
t.Error("expected at least one backup file from rapid writes")
}
}
// TestWriteWithBackup_BackupDirIsFile verifies error when backup directory path is a file.
func TestWriteWithBackup_BackupDirIsFile(t *testing.T) {
if runtime.GOOS == "windows" {
t.Skip("test modifies system temp directory")
}
tmpDir := isolatedTempDir(t)
// Create a file at the backup directory path
backupPath := BackupDir()
// Clean up any existing directory first
os.RemoveAll(backupPath)
// Create a file instead of directory
os.WriteFile(backupPath, []byte("not a directory"), 0o644)
defer func() {
os.Remove(backupPath)
os.MkdirAll(backupPath, 0o755)
}()
path := filepath.Join(tmpDir, "test.json")
os.WriteFile(path, []byte(`{"original": true}`), 0o644)
err := WriteWithBackup(path, []byte(`{"updated": true}`))
if err == nil {
t.Error("expected error when backup dir is a file, got nil")
}
}
// TestWriteWithBackup_NoOrphanTempFiles verifies temp files are cleaned up on failure.
func TestWriteWithBackup_NoOrphanTempFiles(t *testing.T) {
if runtime.GOOS == "windows" {
t.Skip("permission tests unreliable on Windows")
}
tmpDir := isolatedTempDir(t)
// Count existing temp files
countTempFiles := func() int {
entries, _ := os.ReadDir(tmpDir)
count := 0
for _, e := range entries {
if len(e.Name()) > 4 && e.Name()[:4] == ".tmp" {
count++
}
}
return count
}
before := countTempFiles()
// Create a file, then make directory read-only to cause rename failure
path := filepath.Join(tmpDir, "orphan.json")
os.WriteFile(path, []byte(`{"v": 1}`), 0o644)
// Make a subdirectory and try to write there after making parent read-only
subDir := filepath.Join(tmpDir, "subdir")
os.MkdirAll(subDir, 0o755)
subPath := filepath.Join(subDir, "config.json")
os.WriteFile(subPath, []byte(`{"v": 1}`), 0o644)
// Make subdir read-only after creating temp file would succeed but rename would fail
// This is tricky to test - the temp file is created in the same dir, so if we can't
// rename, we also couldn't create. Let's just verify normal failure cleanup works.
// Force a failure by making the target a directory
badPath := filepath.Join(tmpDir, "isdir")
os.MkdirAll(badPath, 0o755)
_ = WriteWithBackup(badPath, []byte(`{"test": true}`))
after := countTempFiles()
if after > before {
t.Errorf("orphan temp files left behind: before=%d, after=%d", before, after)
}
}

87
cmd/launch/claude.go Normal file
View File

@@ -0,0 +1,87 @@
package launch
import (
"fmt"
"os"
"os/exec"
"path/filepath"
"runtime"
"strconv"
"github.com/ollama/ollama/envconfig"
)
// Claude implements Runner for Claude Code integration.
type Claude struct{}
func (c *Claude) String() string { return "Claude Code" }
func (c *Claude) args(model string, extra []string) []string {
var args []string
if model != "" {
args = append(args, "--model", model)
}
args = append(args, extra...)
return args
}
func (c *Claude) findPath() (string, error) {
if p, err := exec.LookPath("claude"); err == nil {
return p, nil
}
home, err := os.UserHomeDir()
if err != nil {
return "", err
}
name := "claude"
if runtime.GOOS == "windows" {
name = "claude.exe"
}
fallback := filepath.Join(home, ".claude", "local", name)
if _, err := os.Stat(fallback); err != nil {
return "", err
}
return fallback, nil
}
func (c *Claude) Run(model string, args []string) error {
claudePath, err := c.findPath()
if err != nil {
return fmt.Errorf("claude is not installed, install from https://code.claude.com/docs/en/quickstart")
}
cmd := exec.Command(claudePath, c.args(model, args)...)
cmd.Stdin = os.Stdin
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
env := append(os.Environ(),
"ANTHROPIC_BASE_URL="+envconfig.Host().String(),
"ANTHROPIC_API_KEY=",
"ANTHROPIC_AUTH_TOKEN=ollama",
"CLAUDE_CODE_ATTRIBUTION_HEADER=0",
)
env = append(env, c.modelEnvVars(model)...)
cmd.Env = env
return cmd.Run()
}
// modelEnvVars returns Claude Code env vars that route all model tiers through Ollama.
func (c *Claude) modelEnvVars(model string) []string {
env := []string{
"ANTHROPIC_DEFAULT_OPUS_MODEL=" + model,
"ANTHROPIC_DEFAULT_SONNET_MODEL=" + model,
"ANTHROPIC_DEFAULT_HAIKU_MODEL=" + model,
"CLAUDE_CODE_SUBAGENT_MODEL=" + model,
}
if isCloudModelName(model) {
if l, ok := lookupCloudModelLimit(model); ok {
env = append(env, "CLAUDE_CODE_AUTO_COMPACT_WINDOW="+strconv.Itoa(l.Context))
}
}
return env
}

171
cmd/launch/claude_test.go Normal file
View File

@@ -0,0 +1,171 @@
package launch
import (
"os"
"path/filepath"
"runtime"
"slices"
"strings"
"testing"
)
func TestClaudeIntegration(t *testing.T) {
c := &Claude{}
t.Run("String", func(t *testing.T) {
if got := c.String(); got != "Claude Code" {
t.Errorf("String() = %q, want %q", got, "Claude Code")
}
})
t.Run("implements Runner", func(t *testing.T) {
var _ Runner = c
})
}
func TestClaudeFindPath(t *testing.T) {
c := &Claude{}
t.Run("finds claude in PATH", func(t *testing.T) {
tmpDir := t.TempDir()
name := "claude"
if runtime.GOOS == "windows" {
name = "claude.exe"
}
fakeBin := filepath.Join(tmpDir, name)
os.WriteFile(fakeBin, []byte("#!/bin/sh\n"), 0o755)
t.Setenv("PATH", tmpDir)
got, err := c.findPath()
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if got != fakeBin {
t.Errorf("findPath() = %q, want %q", got, fakeBin)
}
})
t.Run("falls back to ~/.claude/local/claude", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
t.Setenv("PATH", t.TempDir()) // empty dir, no claude binary
name := "claude"
if runtime.GOOS == "windows" {
name = "claude.exe"
}
fallback := filepath.Join(tmpDir, ".claude", "local", name)
os.MkdirAll(filepath.Dir(fallback), 0o755)
os.WriteFile(fallback, []byte("#!/bin/sh\n"), 0o755)
got, err := c.findPath()
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if got != fallback {
t.Errorf("findPath() = %q, want %q", got, fallback)
}
})
t.Run("returns error when neither PATH nor fallback exists", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
t.Setenv("PATH", t.TempDir()) // empty dir, no claude binary
_, err := c.findPath()
if err == nil {
t.Fatal("expected error, got nil")
}
})
}
func TestClaudeArgs(t *testing.T) {
c := &Claude{}
tests := []struct {
name string
model string
args []string
want []string
}{
{"with model", "llama3.2", nil, []string{"--model", "llama3.2"}},
{"empty model", "", nil, nil},
{"with model and verbose", "llama3.2", []string{"--verbose"}, []string{"--model", "llama3.2", "--verbose"}},
{"empty model with help", "", []string{"--help"}, []string{"--help"}},
{"with allowed tools", "llama3.2", []string{"--allowedTools", "Read,Write,Bash"}, []string{"--model", "llama3.2", "--allowedTools", "Read,Write,Bash"}},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := c.args(tt.model, tt.args)
if !slices.Equal(got, tt.want) {
t.Errorf("args(%q, %v) = %v, want %v", tt.model, tt.args, got, tt.want)
}
})
}
}
func TestClaudeModelEnvVars(t *testing.T) {
c := &Claude{}
envMap := func(envs []string) map[string]string {
m := make(map[string]string)
for _, e := range envs {
k, v, _ := strings.Cut(e, "=")
m[k] = v
}
return m
}
t.Run("maps all Claude model env vars to the provided model", func(t *testing.T) {
got := envMap(c.modelEnvVars("llama3.2"))
if got["ANTHROPIC_DEFAULT_OPUS_MODEL"] != "llama3.2" {
t.Errorf("OPUS = %q, want llama3.2", got["ANTHROPIC_DEFAULT_OPUS_MODEL"])
}
if got["ANTHROPIC_DEFAULT_SONNET_MODEL"] != "llama3.2" {
t.Errorf("SONNET = %q, want llama3.2", got["ANTHROPIC_DEFAULT_SONNET_MODEL"])
}
if got["ANTHROPIC_DEFAULT_HAIKU_MODEL"] != "llama3.2" {
t.Errorf("HAIKU = %q, want llama3.2", got["ANTHROPIC_DEFAULT_HAIKU_MODEL"])
}
if got["CLAUDE_CODE_SUBAGENT_MODEL"] != "llama3.2" {
t.Errorf("SUBAGENT = %q, want llama3.2", got["CLAUDE_CODE_SUBAGENT_MODEL"])
}
if got["CLAUDE_CODE_AUTO_COMPACT_WINDOW"] != "" {
t.Errorf("AUTO_COMPACT_WINDOW = %q, want empty for local models", got["CLAUDE_CODE_AUTO_COMPACT_WINDOW"])
}
})
t.Run("supports empty model", func(t *testing.T) {
got := envMap(c.modelEnvVars(""))
if got["ANTHROPIC_DEFAULT_OPUS_MODEL"] != "" {
t.Errorf("OPUS = %q, want empty", got["ANTHROPIC_DEFAULT_OPUS_MODEL"])
}
if got["ANTHROPIC_DEFAULT_SONNET_MODEL"] != "" {
t.Errorf("SONNET = %q, want empty", got["ANTHROPIC_DEFAULT_SONNET_MODEL"])
}
if got["ANTHROPIC_DEFAULT_HAIKU_MODEL"] != "" {
t.Errorf("HAIKU = %q, want empty", got["ANTHROPIC_DEFAULT_HAIKU_MODEL"])
}
if got["CLAUDE_CODE_SUBAGENT_MODEL"] != "" {
t.Errorf("SUBAGENT = %q, want empty", got["CLAUDE_CODE_SUBAGENT_MODEL"])
}
if got["CLAUDE_CODE_AUTO_COMPACT_WINDOW"] != "" {
t.Errorf("AUTO_COMPACT_WINDOW = %q, want empty", got["CLAUDE_CODE_AUTO_COMPACT_WINDOW"])
}
})
t.Run("sets auto compact window for known cloud models", func(t *testing.T) {
got := envMap(c.modelEnvVars("glm-5:cloud"))
if got["CLAUDE_CODE_AUTO_COMPACT_WINDOW"] != "202752" {
t.Errorf("AUTO_COMPACT_WINDOW = %q, want 202752", got["CLAUDE_CODE_AUTO_COMPACT_WINDOW"])
}
})
t.Run("does not set auto compact window for unknown cloud models", func(t *testing.T) {
got := envMap(c.modelEnvVars("unknown-model:cloud"))
if got["CLAUDE_CODE_AUTO_COMPACT_WINDOW"] != "" {
t.Errorf("AUTO_COMPACT_WINDOW = %q, want empty", got["CLAUDE_CODE_AUTO_COMPACT_WINDOW"])
}
})
}

104
cmd/launch/cline.go Normal file
View File

@@ -0,0 +1,104 @@
package launch
import (
"encoding/json"
"fmt"
"os"
"os/exec"
"path/filepath"
"github.com/ollama/ollama/cmd/internal/fileutil"
"github.com/ollama/ollama/envconfig"
)
// Cline implements Runner and Editor for the Cline CLI integration
type Cline struct{}
func (c *Cline) String() string { return "Cline" }
func (c *Cline) Run(model string, args []string) error {
if _, err := exec.LookPath("cline"); err != nil {
return fmt.Errorf("cline is not installed, install with: npm install -g cline")
}
cmd := exec.Command("cline", args...)
cmd.Stdin = os.Stdin
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
return cmd.Run()
}
func (c *Cline) Paths() []string {
home, err := os.UserHomeDir()
if err != nil {
return nil
}
p := filepath.Join(home, ".cline", "data", "globalState.json")
if _, err := os.Stat(p); err == nil {
return []string{p}
}
return nil
}
func (c *Cline) Edit(models []string) error {
if len(models) == 0 {
return nil
}
home, err := os.UserHomeDir()
if err != nil {
return err
}
configPath := filepath.Join(home, ".cline", "data", "globalState.json")
if err := os.MkdirAll(filepath.Dir(configPath), 0o755); err != nil {
return err
}
config := make(map[string]any)
if data, err := os.ReadFile(configPath); err == nil {
if err := json.Unmarshal(data, &config); err != nil {
return fmt.Errorf("failed to parse config: %w, at: %s", err, configPath)
}
}
// Set Ollama as the provider for both act and plan modes
baseURL := envconfig.Host().String()
config["ollamaBaseUrl"] = baseURL
config["actModeApiProvider"] = "ollama"
config["actModeOllamaModelId"] = models[0]
config["actModeOllamaBaseUrl"] = baseURL
config["planModeApiProvider"] = "ollama"
config["planModeOllamaModelId"] = models[0]
config["planModeOllamaBaseUrl"] = baseURL
config["welcomeViewCompleted"] = true
data, err := json.MarshalIndent(config, "", " ")
if err != nil {
return err
}
return fileutil.WriteWithBackup(configPath, data)
}
func (c *Cline) Models() []string {
home, err := os.UserHomeDir()
if err != nil {
return nil
}
config, err := fileutil.ReadJSON(filepath.Join(home, ".cline", "data", "globalState.json"))
if err != nil {
return nil
}
if config["actModeApiProvider"] != "ollama" {
return nil
}
modelID, _ := config["actModeOllamaModelId"].(string)
if modelID == "" {
return nil
}
return []string{modelID}
}

204
cmd/launch/cline_test.go Normal file
View File

@@ -0,0 +1,204 @@
package launch
import (
"encoding/json"
"os"
"path/filepath"
"testing"
)
func TestClineIntegration(t *testing.T) {
c := &Cline{}
t.Run("String", func(t *testing.T) {
if got := c.String(); got != "Cline" {
t.Errorf("String() = %q, want %q", got, "Cline")
}
})
t.Run("implements Runner", func(t *testing.T) {
var _ Runner = c
})
t.Run("implements Editor", func(t *testing.T) {
var _ Editor = c
})
}
func TestClineEdit(t *testing.T) {
c := &Cline{}
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
configDir := filepath.Join(tmpDir, ".cline", "data")
configPath := filepath.Join(configDir, "globalState.json")
readConfig := func() map[string]any {
data, _ := os.ReadFile(configPath)
var config map[string]any
json.Unmarshal(data, &config)
return config
}
t.Run("creates config from scratch", func(t *testing.T) {
os.RemoveAll(filepath.Join(tmpDir, ".cline"))
if err := c.Edit([]string{"kimi-k2.5:cloud"}); err != nil {
t.Fatal(err)
}
config := readConfig()
if config["actModeApiProvider"] != "ollama" {
t.Errorf("actModeApiProvider = %v, want ollama", config["actModeApiProvider"])
}
if config["actModeOllamaModelId"] != "kimi-k2.5:cloud" {
t.Errorf("actModeOllamaModelId = %v, want kimi-k2.5:cloud", config["actModeOllamaModelId"])
}
if config["planModeApiProvider"] != "ollama" {
t.Errorf("planModeApiProvider = %v, want ollama", config["planModeApiProvider"])
}
if config["planModeOllamaModelId"] != "kimi-k2.5:cloud" {
t.Errorf("planModeOllamaModelId = %v, want kimi-k2.5:cloud", config["planModeOllamaModelId"])
}
if config["welcomeViewCompleted"] != true {
t.Errorf("welcomeViewCompleted = %v, want true", config["welcomeViewCompleted"])
}
})
t.Run("preserves existing fields", func(t *testing.T) {
os.RemoveAll(filepath.Join(tmpDir, ".cline"))
os.MkdirAll(configDir, 0o755)
existing := map[string]any{
"remoteRulesToggles": map[string]any{},
"remoteWorkflowToggles": map[string]any{},
"customSetting": "keep-me",
}
data, _ := json.Marshal(existing)
os.WriteFile(configPath, data, 0o644)
if err := c.Edit([]string{"glm-5:cloud"}); err != nil {
t.Fatal(err)
}
config := readConfig()
if config["customSetting"] != "keep-me" {
t.Errorf("customSetting was not preserved")
}
if config["actModeOllamaModelId"] != "glm-5:cloud" {
t.Errorf("actModeOllamaModelId = %v, want glm-5:cloud", config["actModeOllamaModelId"])
}
})
t.Run("updates model on re-edit", func(t *testing.T) {
os.RemoveAll(filepath.Join(tmpDir, ".cline"))
if err := c.Edit([]string{"kimi-k2.5:cloud"}); err != nil {
t.Fatal(err)
}
if err := c.Edit([]string{"glm-5:cloud"}); err != nil {
t.Fatal(err)
}
config := readConfig()
if config["actModeOllamaModelId"] != "glm-5:cloud" {
t.Errorf("actModeOllamaModelId = %v, want glm-5:cloud", config["actModeOllamaModelId"])
}
if config["planModeOllamaModelId"] != "glm-5:cloud" {
t.Errorf("planModeOllamaModelId = %v, want glm-5:cloud", config["planModeOllamaModelId"])
}
})
t.Run("empty models is no-op", func(t *testing.T) {
os.RemoveAll(filepath.Join(tmpDir, ".cline"))
if err := c.Edit(nil); err != nil {
t.Fatal(err)
}
if _, err := os.Stat(configPath); !os.IsNotExist(err) {
t.Error("expected no config file to be created for empty models")
}
})
t.Run("uses first model as primary", func(t *testing.T) {
os.RemoveAll(filepath.Join(tmpDir, ".cline"))
if err := c.Edit([]string{"kimi-k2.5:cloud", "glm-5:cloud"}); err != nil {
t.Fatal(err)
}
config := readConfig()
if config["actModeOllamaModelId"] != "kimi-k2.5:cloud" {
t.Errorf("actModeOllamaModelId = %v, want kimi-k2.5:cloud (first model)", config["actModeOllamaModelId"])
}
})
}
func TestClineModels(t *testing.T) {
c := &Cline{}
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
configDir := filepath.Join(tmpDir, ".cline", "data")
configPath := filepath.Join(configDir, "globalState.json")
t.Run("returns nil when no config", func(t *testing.T) {
if models := c.Models(); models != nil {
t.Errorf("Models() = %v, want nil", models)
}
})
t.Run("returns nil when provider is not ollama", func(t *testing.T) {
os.MkdirAll(configDir, 0o755)
config := map[string]any{
"actModeApiProvider": "anthropic",
"actModeOllamaModelId": "some-model",
}
data, _ := json.Marshal(config)
os.WriteFile(configPath, data, 0o644)
if models := c.Models(); models != nil {
t.Errorf("Models() = %v, want nil", models)
}
})
t.Run("returns model when ollama is configured", func(t *testing.T) {
os.MkdirAll(configDir, 0o755)
config := map[string]any{
"actModeApiProvider": "ollama",
"actModeOllamaModelId": "kimi-k2.5:cloud",
}
data, _ := json.Marshal(config)
os.WriteFile(configPath, data, 0o644)
models := c.Models()
if len(models) != 1 || models[0] != "kimi-k2.5:cloud" {
t.Errorf("Models() = %v, want [kimi-k2.5:cloud]", models)
}
})
}
func TestClinePaths(t *testing.T) {
c := &Cline{}
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
t.Run("returns nil when no config exists", func(t *testing.T) {
if paths := c.Paths(); paths != nil {
t.Errorf("Paths() = %v, want nil", paths)
}
})
t.Run("returns path when config exists", func(t *testing.T) {
configDir := filepath.Join(tmpDir, ".cline", "data")
os.MkdirAll(configDir, 0o755)
configPath := filepath.Join(configDir, "globalState.json")
os.WriteFile(configPath, []byte("{}"), 0o644)
paths := c.Paths()
if len(paths) != 1 || paths[0] != configPath {
t.Errorf("Paths() = %v, want [%s]", paths, configPath)
}
})
}

148
cmd/launch/codex.go Normal file
View File

@@ -0,0 +1,148 @@
package launch
import (
"fmt"
"os"
"os/exec"
"path/filepath"
"strings"
"github.com/ollama/ollama/envconfig"
"golang.org/x/mod/semver"
)
// Codex implements Runner for Codex integration
type Codex struct{}
func (c *Codex) String() string { return "Codex" }
const codexProfileName = "ollama-launch"
func (c *Codex) args(model string, extra []string) []string {
args := []string{"--profile", codexProfileName}
if model != "" {
args = append(args, "-m", model)
}
args = append(args, extra...)
return args
}
func (c *Codex) Run(model string, args []string) error {
if err := checkCodexVersion(); err != nil {
return err
}
if err := ensureCodexConfig(); err != nil {
return fmt.Errorf("failed to configure codex: %w", err)
}
cmd := exec.Command("codex", c.args(model, args)...)
cmd.Stdin = os.Stdin
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
cmd.Env = append(os.Environ(),
"OPENAI_API_KEY=ollama",
)
return cmd.Run()
}
// ensureCodexConfig writes a [profiles.ollama-launch] section to ~/.codex/config.toml
// with openai_base_url pointing to the local Ollama server.
func ensureCodexConfig() error {
home, err := os.UserHomeDir()
if err != nil {
return err
}
codexDir := filepath.Join(home, ".codex")
if err := os.MkdirAll(codexDir, 0o755); err != nil {
return err
}
configPath := filepath.Join(codexDir, "config.toml")
return writeCodexProfile(configPath)
}
// writeCodexProfile ensures ~/.codex/config.toml has the ollama-launch profile
// and model provider sections with the correct base URL.
func writeCodexProfile(configPath string) error {
baseURL := envconfig.Host().String() + "/v1/"
sections := []struct {
header string
lines []string
}{
{
header: fmt.Sprintf("[profiles.%s]", codexProfileName),
lines: []string{
fmt.Sprintf("openai_base_url = %q", baseURL),
`forced_login_method = "api"`,
fmt.Sprintf("model_provider = %q", codexProfileName),
},
},
{
header: fmt.Sprintf("[model_providers.%s]", codexProfileName),
lines: []string{
`name = "Ollama"`,
fmt.Sprintf("base_url = %q", baseURL),
},
},
}
content, readErr := os.ReadFile(configPath)
text := ""
if readErr == nil {
text = string(content)
}
for _, s := range sections {
block := strings.Join(append([]string{s.header}, s.lines...), "\n") + "\n"
if idx := strings.Index(text, s.header); idx >= 0 {
// Replace the existing section up to the next section header.
rest := text[idx+len(s.header):]
if endIdx := strings.Index(rest, "\n["); endIdx >= 0 {
text = text[:idx] + block + rest[endIdx+1:]
} else {
text = text[:idx] + block
}
} else {
// Append the section.
if text != "" && !strings.HasSuffix(text, "\n") {
text += "\n"
}
if text != "" {
text += "\n"
}
text += block
}
}
return os.WriteFile(configPath, []byte(text), 0o644)
}
func checkCodexVersion() error {
if _, err := exec.LookPath("codex"); err != nil {
return fmt.Errorf("codex is not installed, install with: npm install -g @openai/codex")
}
out, err := exec.Command("codex", "--version").Output()
if err != nil {
return fmt.Errorf("failed to get codex version: %w", err)
}
// Parse output like "codex-cli 0.87.0"
fields := strings.Fields(strings.TrimSpace(string(out)))
if len(fields) < 2 {
return fmt.Errorf("unexpected codex version output: %s", string(out))
}
version := "v" + fields[len(fields)-1]
minVersion := "v0.81.0"
if semver.Compare(version, minVersion) < 0 {
return fmt.Errorf("codex version %s is too old, minimum required is %s, update with: npm update -g @openai/codex", fields[len(fields)-1], "0.81.0")
}
return nil
}

229
cmd/launch/codex_test.go Normal file
View File

@@ -0,0 +1,229 @@
package launch
import (
"os"
"path/filepath"
"slices"
"strings"
"testing"
)
func TestCodexArgs(t *testing.T) {
c := &Codex{}
tests := []struct {
name string
model string
args []string
want []string
}{
{"with model", "llama3.2", nil, []string{"--profile", "ollama-launch", "-m", "llama3.2"}},
{"empty model", "", nil, []string{"--profile", "ollama-launch"}},
{"with model and extra args", "qwen3.5", []string{"-p", "myprofile"}, []string{"--profile", "ollama-launch", "-m", "qwen3.5", "-p", "myprofile"}},
{"with sandbox flag", "llama3.2", []string{"--sandbox", "workspace-write"}, []string{"--profile", "ollama-launch", "-m", "llama3.2", "--sandbox", "workspace-write"}},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := c.args(tt.model, tt.args)
if !slices.Equal(got, tt.want) {
t.Errorf("args(%q, %v) = %v, want %v", tt.model, tt.args, got, tt.want)
}
})
}
}
func TestWriteCodexProfile(t *testing.T) {
t.Run("creates new file when none exists", func(t *testing.T) {
tmpDir := t.TempDir()
configPath := filepath.Join(tmpDir, "config.toml")
if err := writeCodexProfile(configPath); err != nil {
t.Fatal(err)
}
data, err := os.ReadFile(configPath)
if err != nil {
t.Fatal(err)
}
content := string(data)
if !strings.Contains(content, "[profiles.ollama-launch]") {
t.Error("missing [profiles.ollama-launch] header")
}
if !strings.Contains(content, "openai_base_url") {
t.Error("missing openai_base_url key")
}
if !strings.Contains(content, "/v1/") {
t.Error("missing /v1/ suffix in base URL")
}
if !strings.Contains(content, `forced_login_method = "api"`) {
t.Error("missing forced_login_method key")
}
if !strings.Contains(content, `model_provider = "ollama-launch"`) {
t.Error("missing model_provider key")
}
if !strings.Contains(content, "[model_providers.ollama-launch]") {
t.Error("missing [model_providers.ollama-launch] section")
}
if !strings.Contains(content, `name = "Ollama"`) {
t.Error("missing model provider name")
}
})
t.Run("appends profile to existing file without profile", func(t *testing.T) {
tmpDir := t.TempDir()
configPath := filepath.Join(tmpDir, "config.toml")
existing := "[some_other_section]\nkey = \"value\"\n"
os.WriteFile(configPath, []byte(existing), 0o644)
if err := writeCodexProfile(configPath); err != nil {
t.Fatal(err)
}
data, _ := os.ReadFile(configPath)
content := string(data)
if !strings.Contains(content, "[some_other_section]") {
t.Error("existing section was removed")
}
if !strings.Contains(content, "[profiles.ollama-launch]") {
t.Error("missing [profiles.ollama-launch] header")
}
})
t.Run("replaces existing profile section", func(t *testing.T) {
tmpDir := t.TempDir()
configPath := filepath.Join(tmpDir, "config.toml")
existing := "[profiles.ollama-launch]\nopenai_base_url = \"http://old:1234/v1/\"\n\n[model_providers.ollama-launch]\nname = \"Ollama\"\nbase_url = \"http://old:1234/v1/\"\n"
os.WriteFile(configPath, []byte(existing), 0o644)
if err := writeCodexProfile(configPath); err != nil {
t.Fatal(err)
}
data, _ := os.ReadFile(configPath)
content := string(data)
if strings.Contains(content, "old:1234") {
t.Error("old URL was not replaced")
}
if strings.Count(content, "[profiles.ollama-launch]") != 1 {
t.Errorf("expected exactly one [profiles.ollama-launch] section, got %d", strings.Count(content, "[profiles.ollama-launch]"))
}
if strings.Count(content, "[model_providers.ollama-launch]") != 1 {
t.Errorf("expected exactly one [model_providers.ollama-launch] section, got %d", strings.Count(content, "[model_providers.ollama-launch]"))
}
})
t.Run("replaces profile while preserving following sections", func(t *testing.T) {
tmpDir := t.TempDir()
configPath := filepath.Join(tmpDir, "config.toml")
existing := "[profiles.ollama-launch]\nopenai_base_url = \"http://old:1234/v1/\"\n[another_section]\nfoo = \"bar\"\n"
os.WriteFile(configPath, []byte(existing), 0o644)
if err := writeCodexProfile(configPath); err != nil {
t.Fatal(err)
}
data, _ := os.ReadFile(configPath)
content := string(data)
if strings.Contains(content, "old:1234") {
t.Error("old URL was not replaced")
}
if !strings.Contains(content, "[another_section]") {
t.Error("following section was removed")
}
if !strings.Contains(content, "foo = \"bar\"") {
t.Error("following section content was removed")
}
})
t.Run("appends newline to file not ending with newline", func(t *testing.T) {
tmpDir := t.TempDir()
configPath := filepath.Join(tmpDir, "config.toml")
existing := "[other]\nkey = \"val\""
os.WriteFile(configPath, []byte(existing), 0o644)
if err := writeCodexProfile(configPath); err != nil {
t.Fatal(err)
}
data, _ := os.ReadFile(configPath)
content := string(data)
if !strings.Contains(content, "[profiles.ollama-launch]") {
t.Error("missing [profiles.ollama-launch] header")
}
// Should not have double blank lines from missing trailing newline
if strings.Contains(content, "\n\n\n") {
t.Error("unexpected triple newline in output")
}
})
t.Run("uses custom OLLAMA_HOST", func(t *testing.T) {
t.Setenv("OLLAMA_HOST", "http://myhost:9999")
tmpDir := t.TempDir()
configPath := filepath.Join(tmpDir, "config.toml")
if err := writeCodexProfile(configPath); err != nil {
t.Fatal(err)
}
data, _ := os.ReadFile(configPath)
content := string(data)
if !strings.Contains(content, "myhost:9999/v1/") {
t.Errorf("expected custom host in URL, got:\n%s", content)
}
})
}
func TestEnsureCodexConfig(t *testing.T) {
t.Run("creates .codex dir and config.toml", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
if err := ensureCodexConfig(); err != nil {
t.Fatal(err)
}
configPath := filepath.Join(tmpDir, ".codex", "config.toml")
data, err := os.ReadFile(configPath)
if err != nil {
t.Fatalf("config.toml not created: %v", err)
}
content := string(data)
if !strings.Contains(content, "[profiles.ollama-launch]") {
t.Error("missing [profiles.ollama-launch] header")
}
if !strings.Contains(content, "openai_base_url") {
t.Error("missing openai_base_url key")
}
})
t.Run("is idempotent", func(t *testing.T) {
tmpDir := t.TempDir()
setTestHome(t, tmpDir)
if err := ensureCodexConfig(); err != nil {
t.Fatal(err)
}
if err := ensureCodexConfig(); err != nil {
t.Fatal(err)
}
configPath := filepath.Join(tmpDir, ".codex", "config.toml")
data, _ := os.ReadFile(configPath)
content := string(data)
if strings.Count(content, "[profiles.ollama-launch]") != 1 {
t.Errorf("expected exactly one [profiles.ollama-launch] section after two calls, got %d", strings.Count(content, "[profiles.ollama-launch]"))
}
if strings.Count(content, "[model_providers.ollama-launch]") != 1 {
t.Errorf("expected exactly one [model_providers.ollama-launch] section after two calls, got %d", strings.Count(content, "[model_providers.ollama-launch]"))
}
})
}

598
cmd/launch/command_test.go Normal file
View File

@@ -0,0 +1,598 @@
package launch
import (
"bytes"
"fmt"
"io"
"net/http"
"net/http/httptest"
"os"
"strings"
"testing"
"github.com/google/go-cmp/cmp"
"github.com/ollama/ollama/cmd/config"
"github.com/spf13/cobra"
)
func captureStderr(t *testing.T, fn func()) string {
t.Helper()
oldStderr := os.Stderr
r, w, err := os.Pipe()
if err != nil {
t.Fatalf("failed to create stderr pipe: %v", err)
}
os.Stderr = w
defer func() {
os.Stderr = oldStderr
}()
done := make(chan string, 1)
go func() {
var buf bytes.Buffer
_, _ = io.Copy(&buf, r)
done <- buf.String()
}()
fn()
_ = w.Close()
return <-done
}
func TestLaunchCmd(t *testing.T) {
mockCheck := func(cmd *cobra.Command, args []string) error {
return nil
}
mockTUI := func(cmd *cobra.Command) {}
cmd := LaunchCmd(mockCheck, mockTUI)
t.Run("command structure", func(t *testing.T) {
if cmd.Use != "launch [INTEGRATION] [-- [EXTRA_ARGS...]]" {
t.Errorf("Use = %q, want %q", cmd.Use, "launch [INTEGRATION] [-- [EXTRA_ARGS...]]")
}
if cmd.Short == "" {
t.Error("Short description should not be empty")
}
if cmd.Long == "" {
t.Error("Long description should not be empty")
}
})
t.Run("flags exist", func(t *testing.T) {
if cmd.Flags().Lookup("model") == nil {
t.Error("--model flag should exist")
}
if cmd.Flags().Lookup("config") == nil {
t.Error("--config flag should exist")
}
if cmd.Flags().Lookup("yes") == nil {
t.Error("--yes flag should exist")
}
})
t.Run("PreRunE is set", func(t *testing.T) {
if cmd.PreRunE == nil {
t.Error("PreRunE should be set to checkServerHeartbeat")
}
})
}
func TestLaunchCmdTUICallback(t *testing.T) {
mockCheck := func(cmd *cobra.Command, args []string) error {
return nil
}
t.Run("no args calls TUI", func(t *testing.T) {
tuiCalled := false
mockTUI := func(cmd *cobra.Command) {
tuiCalled = true
}
cmd := LaunchCmd(mockCheck, mockTUI)
cmd.SetArgs([]string{})
_ = cmd.Execute()
if !tuiCalled {
t.Error("TUI callback should be called when no args provided")
}
})
t.Run("integration arg bypasses TUI", func(t *testing.T) {
srv := httptest.NewServer(http.NotFoundHandler())
defer srv.Close()
t.Setenv("OLLAMA_HOST", srv.URL)
tuiCalled := false
mockTUI := func(cmd *cobra.Command) {
tuiCalled = true
}
cmd := LaunchCmd(mockCheck, mockTUI)
cmd.SetArgs([]string{"claude"})
_ = cmd.Execute()
if tuiCalled {
t.Error("TUI callback should NOT be called when integration arg provided")
}
})
t.Run("--model flag without integration returns error", func(t *testing.T) {
tuiCalled := false
mockTUI := func(cmd *cobra.Command) {
tuiCalled = true
}
cmd := LaunchCmd(mockCheck, mockTUI)
cmd.SetArgs([]string{"--model", "test-model"})
err := cmd.Execute()
if err == nil {
t.Fatal("expected --model without an integration to fail")
}
if !strings.Contains(err.Error(), "require an integration name") {
t.Fatalf("expected integration-name guidance, got %v", err)
}
if tuiCalled {
t.Error("TUI callback should NOT be called when --model is provided without an integration")
}
})
t.Run("--config flag without integration returns error", func(t *testing.T) {
tuiCalled := false
mockTUI := func(cmd *cobra.Command) {
tuiCalled = true
}
cmd := LaunchCmd(mockCheck, mockTUI)
cmd.SetArgs([]string{"--config"})
err := cmd.Execute()
if err == nil {
t.Fatal("expected --config without an integration to fail")
}
if !strings.Contains(err.Error(), "require an integration name") {
t.Fatalf("expected integration-name guidance, got %v", err)
}
if tuiCalled {
t.Error("TUI callback should NOT be called when --config is provided without an integration")
}
})
t.Run("--yes flag without integration returns error", func(t *testing.T) {
tuiCalled := false
mockTUI := func(cmd *cobra.Command) {
tuiCalled = true
}
cmd := LaunchCmd(mockCheck, mockTUI)
cmd.SetArgs([]string{"--yes"})
err := cmd.Execute()
if err == nil {
t.Fatal("expected --yes without an integration to fail")
}
if !strings.Contains(err.Error(), "require an integration name") {
t.Fatalf("expected integration-name guidance, got %v", err)
}
if tuiCalled {
t.Error("TUI callback should NOT be called when --yes is provided without an integration")
}
})
t.Run("extra args without integration return error", func(t *testing.T) {
tuiCalled := false
mockTUI := func(cmd *cobra.Command) {
tuiCalled = true
}
cmd := LaunchCmd(mockCheck, mockTUI)
cmd.SetArgs([]string{"--model", "test-model", "--", "--sandbox", "workspace-write"})
err := cmd.Execute()
if err == nil {
t.Fatal("expected flags and extra args without an integration to fail")
}
if !strings.Contains(err.Error(), "require an integration name") {
t.Fatalf("expected integration-name guidance, got %v", err)
}
if tuiCalled {
t.Error("TUI callback should NOT be called when flags or extra args are provided without an integration")
}
})
}
func TestLaunchCmdNilHeartbeat(t *testing.T) {
cmd := LaunchCmd(nil, nil)
if cmd == nil {
t.Fatal("LaunchCmd returned nil")
}
if cmd.PreRunE != nil {
t.Log("Note: PreRunE is set even when nil is passed (acceptable)")
}
}
func TestLaunchCmdModelFlagFiltersDisabledCloudFromSavedConfig(t *testing.T) {
tmpDir := t.TempDir()
setLaunchTestHome(t, tmpDir)
if err := config.SaveIntegration("stubeditor", []string{"glm-5:cloud"}); err != nil {
t.Fatalf("failed to seed saved config: %v", err)
}
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/api/status":
fmt.Fprintf(w, `{"cloud":{"disabled":true,"source":"config"}}`)
case "/api/show":
fmt.Fprintf(w, `{"model":"llama3.2"}`)
default:
w.WriteHeader(http.StatusNotFound)
}
}))
defer srv.Close()
t.Setenv("OLLAMA_HOST", srv.URL)
stub := &launcherEditorRunner{}
restore := OverrideIntegration("stubeditor", stub)
defer restore()
cmd := LaunchCmd(func(cmd *cobra.Command, args []string) error { return nil }, func(cmd *cobra.Command) {})
cmd.SetArgs([]string{"stubeditor", "--model", "llama3.2"})
if err := cmd.Execute(); err != nil {
t.Fatalf("launch command failed: %v", err)
}
saved, err := config.LoadIntegration("stubeditor")
if err != nil {
t.Fatalf("failed to reload integration config: %v", err)
}
if diff := cmp.Diff([]string{"llama3.2"}, saved.Models); diff != "" {
t.Fatalf("saved models mismatch (-want +got):\n%s", diff)
}
if diff := cmp.Diff([][]string{{"llama3.2"}}, stub.edited); diff != "" {
t.Fatalf("editor models mismatch (-want +got):\n%s", diff)
}
if stub.ranModel != "llama3.2" {
t.Fatalf("expected launch to run with llama3.2, got %q", stub.ranModel)
}
}
func TestLaunchCmdModelFlagClearsDisabledCloudOverride(t *testing.T) {
tmpDir := t.TempDir()
setLaunchTestHome(t, tmpDir)
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/api/status":
fmt.Fprintf(w, `{"cloud":{"disabled":true,"source":"config"}}`)
case "/api/tags":
fmt.Fprint(w, `{"models":[{"name":"llama3.2"}]}`)
case "/api/show":
fmt.Fprint(w, `{"model":"llama3.2"}`)
default:
w.WriteHeader(http.StatusNotFound)
}
}))
defer srv.Close()
t.Setenv("OLLAMA_HOST", srv.URL)
stub := &launcherSingleRunner{}
restore := OverrideIntegration("stubapp", stub)
defer restore()
oldSelector := DefaultSingleSelector
defer func() { DefaultSingleSelector = oldSelector }()
var selectorCalls int
var gotCurrent string
DefaultSingleSelector = func(title string, items []ModelItem, current string) (string, error) {
selectorCalls++
gotCurrent = current
return "llama3.2", nil
}
cmd := LaunchCmd(func(cmd *cobra.Command, args []string) error { return nil }, func(cmd *cobra.Command) {})
cmd.SetArgs([]string{"stubapp", "--model", "glm-5:cloud"})
stderr := captureStderr(t, func() {
if err := cmd.Execute(); err != nil {
t.Fatalf("launch command failed: %v", err)
}
})
if selectorCalls != 1 {
t.Fatalf("expected disabled cloud override to fall back to selector, got %d calls", selectorCalls)
}
if gotCurrent != "" {
t.Fatalf("expected disabled override to be cleared before selection, got current %q", gotCurrent)
}
if stub.ranModel != "llama3.2" {
t.Fatalf("expected launch to run with replacement local model, got %q", stub.ranModel)
}
if !strings.Contains(stderr, "Warning: ignoring --model glm-5:cloud because cloud is disabled") {
t.Fatalf("expected disabled-cloud warning, got stderr: %q", stderr)
}
saved, err := config.LoadIntegration("stubapp")
if err != nil {
t.Fatalf("failed to reload integration config: %v", err)
}
if diff := cmp.Diff([]string{"llama3.2"}, saved.Models); diff != "" {
t.Fatalf("saved models mismatch (-want +got):\n%s", diff)
}
}
func TestLaunchCmdYes_AutoConfirmsLaunchPromptPath(t *testing.T) {
tmpDir := t.TempDir()
setLaunchTestHome(t, tmpDir)
withLauncherHooks(t)
withInteractiveSession(t, false)
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/api/show":
fmt.Fprint(w, `{"model":"llama3.2"}`)
case "/api/status":
w.WriteHeader(http.StatusNotFound)
fmt.Fprint(w, `{"error":"not found"}`)
default:
w.WriteHeader(http.StatusNotFound)
}
}))
defer srv.Close()
t.Setenv("OLLAMA_HOST", srv.URL)
stub := &launcherEditorRunner{paths: []string{"/tmp/stubeditor.json"}}
restore := OverrideIntegration("stubeditor", stub)
defer restore()
DefaultConfirmPrompt = func(prompt string, options ConfirmOptions) (bool, error) {
t.Fatalf("unexpected prompt with --yes: %q", prompt)
return false, nil
}
cmd := LaunchCmd(func(cmd *cobra.Command, args []string) error { return nil }, func(cmd *cobra.Command) {})
cmd.SetArgs([]string{"stubeditor", "--model", "llama3.2", "--yes"})
if err := cmd.Execute(); err != nil {
t.Fatalf("launch command with --yes failed: %v", err)
}
if diff := cmp.Diff([][]string{{"llama3.2"}}, stub.edited); diff != "" {
t.Fatalf("editor models mismatch (-want +got):\n%s", diff)
}
if stub.ranModel != "llama3.2" {
t.Fatalf("expected launch to run with llama3.2, got %q", stub.ranModel)
}
}
func TestLaunchCmdHeadlessWithYes_AutoPullsMissingLocalModel(t *testing.T) {
tmpDir := t.TempDir()
setLaunchTestHome(t, tmpDir)
withLauncherHooks(t)
withInteractiveSession(t, false)
var pullCalled bool
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/api/show":
w.WriteHeader(http.StatusNotFound)
fmt.Fprint(w, `{"error":"model not found"}`)
case "/api/pull":
pullCalled = true
w.WriteHeader(http.StatusOK)
fmt.Fprint(w, `{"status":"success"}`)
default:
w.WriteHeader(http.StatusNotFound)
}
}))
defer srv.Close()
t.Setenv("OLLAMA_HOST", srv.URL)
stub := &launcherSingleRunner{}
restore := OverrideIntegration("stubapp", stub)
defer restore()
DefaultConfirmPrompt = func(prompt string, options ConfirmOptions) (bool, error) {
t.Fatalf("unexpected prompt with --yes in headless autopull path: %q", prompt)
return false, nil
}
cmd := LaunchCmd(func(cmd *cobra.Command, args []string) error { return nil }, func(cmd *cobra.Command) {})
cmd.SetArgs([]string{"stubapp", "--model", "missing-model", "--yes"})
if err := cmd.Execute(); err != nil {
t.Fatalf("launch command with --yes failed: %v", err)
}
if !pullCalled {
t.Fatal("expected missing local model to be auto-pulled with --yes in headless mode")
}
if stub.ranModel != "missing-model" {
t.Fatalf("expected launch to run with pulled model, got %q", stub.ranModel)
}
}
func TestLaunchCmdHeadlessWithoutYes_ReturnsActionableConfirmError(t *testing.T) {
tmpDir := t.TempDir()
setLaunchTestHome(t, tmpDir)
withLauncherHooks(t)
withInteractiveSession(t, false)
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/api/show":
fmt.Fprint(w, `{"model":"llama3.2"}`)
case "/api/status":
w.WriteHeader(http.StatusNotFound)
fmt.Fprint(w, `{"error":"not found"}`)
default:
w.WriteHeader(http.StatusNotFound)
}
}))
defer srv.Close()
t.Setenv("OLLAMA_HOST", srv.URL)
stub := &launcherEditorRunner{paths: []string{"/tmp/stubeditor.json"}}
restore := OverrideIntegration("stubeditor", stub)
defer restore()
DefaultConfirmPrompt = func(prompt string, options ConfirmOptions) (bool, error) {
t.Fatalf("unexpected prompt in headless non-yes mode: %q", prompt)
return false, nil
}
cmd := LaunchCmd(func(cmd *cobra.Command, args []string) error { return nil }, func(cmd *cobra.Command) {})
cmd.SetArgs([]string{"stubeditor", "--model", "llama3.2"})
err := cmd.Execute()
if err == nil {
t.Fatal("expected launch command to fail without --yes in headless mode")
}
if !strings.Contains(err.Error(), "re-run with --yes") {
t.Fatalf("expected actionable --yes guidance, got %v", err)
}
if len(stub.edited) != 0 {
t.Fatalf("expected no editor writes when confirmation is blocked, got %v", stub.edited)
}
if stub.ranModel != "" {
t.Fatalf("expected launch to abort before run, got %q", stub.ranModel)
}
}
func TestLaunchCmdIntegrationArgPromptsForModelWithSavedSelection(t *testing.T) {
tmpDir := t.TempDir()
setLaunchTestHome(t, tmpDir)
if err := config.SaveIntegration("stubapp", []string{"llama3.2"}); err != nil {
t.Fatalf("failed to seed saved config: %v", err)
}
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/api/tags":
fmt.Fprint(w, `{"models":[{"name":"llama3.2"},{"name":"qwen3:8b"}]}`)
case "/api/show":
fmt.Fprint(w, `{"model":"qwen3:8b"}`)
default:
w.WriteHeader(http.StatusNotFound)
}
}))
defer srv.Close()
t.Setenv("OLLAMA_HOST", srv.URL)
stub := &launcherSingleRunner{}
restore := OverrideIntegration("stubapp", stub)
defer restore()
oldSelector := DefaultSingleSelector
defer func() { DefaultSingleSelector = oldSelector }()
var gotCurrent string
DefaultSingleSelector = func(title string, items []ModelItem, current string) (string, error) {
gotCurrent = current
return "qwen3:8b", nil
}
cmd := LaunchCmd(func(cmd *cobra.Command, args []string) error { return nil }, func(cmd *cobra.Command) {})
cmd.SetArgs([]string{"stubapp"})
if err := cmd.Execute(); err != nil {
t.Fatalf("launch command failed: %v", err)
}
if gotCurrent != "llama3.2" {
t.Fatalf("expected selector current model to be saved model llama3.2, got %q", gotCurrent)
}
if stub.ranModel != "qwen3:8b" {
t.Fatalf("expected launch to run selected model qwen3:8b, got %q", stub.ranModel)
}
saved, err := config.LoadIntegration("stubapp")
if err != nil {
t.Fatalf("failed to reload integration config: %v", err)
}
if diff := cmp.Diff([]string{"qwen3:8b"}, saved.Models); diff != "" {
t.Fatalf("saved models mismatch (-want +got):\n%s", diff)
}
}
func TestLaunchCmdHeadlessYes_IntegrationRequiresModelEvenWhenSaved(t *testing.T) {
tmpDir := t.TempDir()
setLaunchTestHome(t, tmpDir)
withLauncherHooks(t)
withInteractiveSession(t, false)
if err := config.SaveIntegration("stubapp", []string{"llama3.2"}); err != nil {
t.Fatalf("failed to seed saved config: %v", err)
}
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/api/show":
fmt.Fprint(w, `{"model":"llama3.2"}`)
default:
w.WriteHeader(http.StatusNotFound)
}
}))
defer srv.Close()
t.Setenv("OLLAMA_HOST", srv.URL)
stub := &launcherSingleRunner{}
restore := OverrideIntegration("stubapp", stub)
defer restore()
oldSelector := DefaultSingleSelector
defer func() { DefaultSingleSelector = oldSelector }()
DefaultSingleSelector = func(title string, items []ModelItem, current string) (string, error) {
t.Fatal("selector should not be called for headless --yes saved-model launch")
return "", nil
}
cmd := LaunchCmd(func(cmd *cobra.Command, args []string) error { return nil }, func(cmd *cobra.Command) {})
cmd.SetArgs([]string{"stubapp", "--yes"})
err := cmd.Execute()
if err == nil {
t.Fatal("expected launch command to fail when --yes is used headlessly without --model")
}
if !strings.Contains(err.Error(), "requires --model <model>") {
t.Fatalf("expected actionable --model guidance, got %v", err)
}
if stub.ranModel != "" {
t.Fatalf("expected launch to abort before run, got %q", stub.ranModel)
}
}
func TestLaunchCmdHeadlessYes_IntegrationWithoutSavedModelReturnsError(t *testing.T) {
tmpDir := t.TempDir()
setLaunchTestHome(t, tmpDir)
withLauncherHooks(t)
withInteractiveSession(t, false)
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusNotFound)
}))
defer srv.Close()
t.Setenv("OLLAMA_HOST", srv.URL)
stub := &launcherSingleRunner{}
restore := OverrideIntegration("stubapp", stub)
defer restore()
oldSelector := DefaultSingleSelector
defer func() { DefaultSingleSelector = oldSelector }()
DefaultSingleSelector = func(title string, items []ModelItem, current string) (string, error) {
t.Fatal("selector should not be called for headless --yes without saved model")
return "", nil
}
cmd := LaunchCmd(func(cmd *cobra.Command, args []string) error { return nil }, func(cmd *cobra.Command) {})
cmd.SetArgs([]string{"stubapp", "--yes"})
err := cmd.Execute()
if err == nil {
t.Fatal("expected launch command to fail when --yes is used headlessly without --model")
}
if !strings.Contains(err.Error(), "requires --model <model>") {
t.Fatalf("expected actionable --model guidance, got %v", err)
}
if stub.ranModel != "" {
t.Fatalf("expected launch to abort before run, got %q", stub.ranModel)
}
}

Some files were not shown because too many files have changed in this diff Show More