ollama

mirror of https://github.com/ollama/ollama.git synced 2026-04-24 01:35:49 +02:00

Author	SHA1	Message	Date
Bruce MacDonald	9e190ac4d9	api: return structured error on unauthorized push This commit implements a structured error response system for the Ollama API, replacing ad-hoc error handling and string parsing with proper error types and codes through a new ErrorResponse struct. Instead of relying on regex to parse error messages for SSH keys, the API now passes this data in a structured format with standardized fields for error messages, codes, and additional data. This structured approach makes the API more maintainable and reliable while improving the developer experience by enabling programmatic error handling, consistent error formats, and better error documentation.	2024-12-12 11:49:36 -08:00
Bruce MacDonald	ea90ee7415	Update cmd.go	2024-11-27 15:22:27 -08:00
Bruce MacDonald	40134c6587	server: show user feedback when key is anonymous When an ollama key is not registered with any account on ollama.com this is not obvious. In the current CLI an error message that the user is not authorized is displayed. This change brings back previous behavior to show the user their key and where they should add it. It protects against adding unexpected keys by checking that the key is available locally. A follow-up change should add structured errors from the API. This change just relies on a known error message.	2024-11-27 15:01:12 -08:00
frob	30e88d7f31	cmd: don't submit svg files as images for now (#7830 )	2024-11-25 16:43:29 -08:00
Bruce MacDonald	a210ec74d2	cmd: print location of model after pushing (#7695 ) After a user pushes their model it is not clear what to do next. Add a link to the output of `ollama push` that tells the user where their model can now be found.	2024-11-25 09:40:16 -08:00
Bruce MacDonald	7b5585b9cb	server: remove out of date anonymous access check (#7785 ) In the past the ollama.com server would return a JWT that contained information about the user being authenticated. This was used to return different error messages to the user. This is no longer possible since the token used to authenticate does not contain information about the user anymore. Removing this code that no longer works. Follow up changes will improve the error messages returned here, but good to clean up first.	2024-11-22 11:57:35 -08:00
Daniel Hiltgen	d88972ea48	Be quiet when redirecting output (#7360 ) This avoids emitting the progress indicators to stderr, and the interactive prompts to the output file or pipe. Running "ollama run model > out.txt" now exits immediately, and "echo hello | ollama run model > out.txt" produces zero stderr output and a typical response in out.txt	2024-11-22 08:04:54 -08:00
湛露先生	eaaf5d309d	cmd: delete duplicated call to sb.Reset() (#7308 ) Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>	2024-11-21 11:20:48 -08:00
Blake Mizerany	67691e410d	cmd: preserve exact bytes when displaying template/system layers (#7586 )	2024-11-13 23:53:30 -08:00
Daniel Hiltgen	35ec7f079f	Fix unicode output on windows with redirect to file (#7358 ) If we're not writing out to a terminal, avoid setting the console mode on windows, which corrupts the output file.	2024-10-25 13:43:16 -07:00
Patrick Devine	d78fb62056	default to "FROM ." if a Modelfile isn't present (#7250 )	2024-10-22 13:32:24 -07:00
Patrick Devine	c7cb0f0602	image processing for llama3.2 (#6963 ) Co-authored-by: jmorganca <jmorganca@gmail.com> Co-authored-by: Michael Yang <mxyng@pm.me> Co-authored-by: Jesse Gross <jesse@ollama.com>	2024-10-18 16:12:35 -07:00
Jesse Gross	7fe3902552	cli: Send all images in conversation history Currently the CLI only sends images from the most recent image-containing message. This prevents doing things like sending one message with an image and then a follow-up message with a second image and asking for comparison based on additional information not present in any text that was output. It's possible that some models have a problem with this but the CLI is not the right place to do this since any adjustments are model-specific and should affect all clients. Both llava:34b and minicpm-v do reasonable things with multiple images in the history.	2024-10-10 11:21:51 -07:00
Alex Mavrogiannis	f40bb398f6	Stop model before deletion if loaded (fixed #6957 ) (#7050 )	2024-10-01 15:45:43 -07:00
Patrick Devine	abed273de3	add "stop" command (#6739 )	2024-09-11 16:36:21 -07:00
Michael Yang	ecab6f1cc5	refactor show output fixes line wrapping on long texts	2024-09-11 14:23:09 -07:00
Daniel Hiltgen	6719097649	llm: make load time stall duration configurable via OLLAMA_LOAD_TIMEOUT With the new very large parameter models, some users are willing to wait for a very long time for models to load.	2024-09-05 14:00:08 -07:00
Daniel Hiltgen	b05c9e83d9	Introduce GPU Overhead env var (#5922 ) Provide a mechanism for users to set aside an amount of VRAM on each GPU to make room for other applications they want to start after Ollama, or work around memory prediction bugs	2024-09-05 13:46:35 -07:00
Vimal Kumar	5f7b4a5e30	fix(cmd): show info may have nil ModelInfo (#6579 )	2024-08-31 21:12:17 -07:00
Patrick Devine	0c819e167b	convert safetensor adapters into GGUF (#6327 )	2024-08-23 11:29:56 -07:00
Michael Yang	beb49eef65	create bert models from cli	2024-08-20 17:27:34 -07:00
longtao	0a8d6ea86d	Fix typo and improve readability (#5964 ) * Fix typo and improve readability Summary: * Rename updatAvailableMenuID to updateAvailableMenuID * Replace unused cmd parameter with _ in RunServer function * Fix typos in comments (cherry picked from commit 5b8715f0b04773369e8eb1f9e6737995a0ab3ba7) * Update api/client.go Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> --------- Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>	2024-08-13 17:54:19 -07:00
Josh	f7e3b9190f	cmd: spinner progress for transfer model data (#6100 )	2024-08-12 11:46:32 -07:00
Michael Yang	b732beba6a	lint	2024-08-01 17:06:06 -07:00
Michael Yang	c4c84b7a0d	Merge pull request #5196 from ollama/mxyng/messages-2 include modelfile messages	2024-07-31 10:18:17 -07:00
Michael Yang	5c1912769e	Merge pull request #5473 from ollama/mxyng/environ fix: environ lookup	2024-07-31 10:18:05 -07:00
Daniel Hiltgen	1a83581a8e	Merge pull request #5895 from dhiltgen/sched_faq Better explain multi-gpu behavior	2024-07-29 14:25:41 -07:00
Michael Yang	38d9036b59	Merge pull request #5992 from ollama/mxyng/save fix: model save	2024-07-29 09:53:19 -07:00
Tibor Schmidt	f3d7a481b7	feat: add support for min_p (resolve #1142 ) (#1825 )	2024-07-27 14:37:40 -07:00
Michael Yang	a250c2cb13	display messages	2024-07-26 13:39:57 -07:00
Michael Yang	3d9de805b7	fix: model save stop parameter is saved as a slice which is incompatible with modelfile parsing	2024-07-26 13:23:06 -07:00
Michael Yang	15af558423	include modelfile messages	2024-07-26 11:40:11 -07:00
Daniel Hiltgen	830fdd2715	Better explain multi-gpu behavior	2024-07-23 15:16:38 -07:00
Michael Yang	55cd3ddcca	bool	2024-07-22 11:27:21 -07:00
Michael Yang	4f1afd575d	host	2024-07-22 11:25:30 -07:00
Daniel Hiltgen	cc269ba094	Remove no longer supported max vram var The OLLAMA_MAX_VRAM env var was a temporary workaround for OOM scenarios. With Concurrency this was no longer wired up, and the simplistic value doesn't map to multi-GPU setups. Users can still set `num_gpu` to limit memory usage to avoid OOM if we get our predictions wrong.	2024-07-22 09:08:11 -07:00
Patrick Devine	057d31861e	remove template (#5655 )	2024-07-13 20:56:24 -07:00
Patrick Devine	23ebbaa46e	Revert "remove template from tests" This reverts commit `9ac0a7a50b`.	2024-07-12 15:47:17 -07:00
Patrick Devine	9ac0a7a50b	remove template from tests	2024-07-12 15:41:31 -07:00
royjhan	5f034f5b63	Include Show Info in Interactive (#5342 )	2024-06-28 13:15:52 -07:00
royjhan	b910fa9010	Ollama Show: Check for Projector Type (#5307 ) * Check exists projtype * Maintain Ordering	2024-06-28 11:30:16 -07:00
Michael Yang	123a722a6f	zip: prevent extracting files into parent dirs (#5314 )	2024-06-26 21:38:21 -07:00
Blake Mizerany	2aa91a937b	cmd: defer stating model info until necessary (#5248 ) This commit changes the 'ollama run' command to defer fetching model information until it really needs it. That is, when in interactive mode. It also removes one such case where the model information is fetched in duplicate, just before calling generateInteractive and then again, first thing, in generateInteractive. This positively impacts the performance of the command: ; time ./before run llama3 'hi' Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat? ./before run llama3 'hi' 0.02s user 0.01s system 2% cpu 1.168 total ; time ./before run llama3 'hi' Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat? ./before run llama3 'hi' 0.02s user 0.01s system 2% cpu 1.220 total ; time ./before run llama3 'hi' Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat? ./before run llama3 'hi' 0.02s user 0.01s system 2% cpu 1.217 total ; time ./after run llama3 'hi' Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat? ./after run llama3 'hi' 0.02s user 0.01s system 4% cpu 0.652 total ; time ./after run llama3 'hi' Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat? ./after run llama3 'hi' 0.01s user 0.01s system 5% cpu 0.498 total ; time ./after run llama3 'hi' Hi! It's nice to meet you. Is there something I can help you with or would you like to chat? ./after run llama3 'hi' 0.01s user 0.01s system 3% cpu 0.479 total ; time ./after run llama3 'hi' Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat? ./after run llama3 'hi' 0.02s user 0.01s system 5% cpu 0.507 total ; time ./after run llama3 'hi' Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat? ./after run llama3 'hi' 0.02s user 0.01s system 5% cpu 0.507 total	2024-06-24 20:14:03 -07:00
royjhan	fedf71635e	Extend api/show and ollama show to return more model info (#4881 ) * API Show Extended * Initial Draft of Information Co-Authored-By: Patrick Devine <pdevine@sonic.net> * Clean Up * Descriptive arg error messages and other fixes * Second Draft of Show with Projectors Included * Remove Chat Template * Touches * Prevent wrapping from files * Verbose functionality * Docs * Address Feedback * Lint * Resolve Conflicts * Function Name * Tests for api/show model info * Show Test File * Add Projector Test * Clean routes * Projector Check * Move Show Test * Touches * Doc update --------- Co-authored-by: Patrick Devine <pdevine@sonic.net>	2024-06-19 14:19:02 -07:00
Patrick Devine	c69bc19e46	move OLLAMA_HOST to envconfig (#5009 )	2024-06-12 18:48:16 -04:00
Michael Yang	201d853fdf	nolintlint	2024-06-04 11:13:30 -07:00
Michael Yang	e40145a39d	lint	2024-06-04 11:13:30 -07:00
Michael Yang	8ffb51749f	nolintlint	2024-06-04 11:13:30 -07:00
Michael Yang	04f3c12bb7	replace x/exp/slices with slices	2024-06-04 11:13:30 -07:00
Josh Yan	914f68f021	replaced duplicate call with variable	2024-05-30 10:38:07 -07:00


282 Commits