Commit Graph

16 Commits

Author SHA1 Message Date
Inforithmics
37206cdf32 remove debug code 2025-10-05 20:56:21 +02:00
Inforithmics
218e57974f print out unknown library 2025-10-05 17:04:12 +02:00
Inforithmics
6bef63b0f9 fix format 2025-10-04 21:45:06 +02:00
Inforithmics
f8551bc631 merge fixes 2025-10-04 21:28:15 +02:00
Inforithmics
1e46db8748 fixed build 2025-10-04 15:44:23 +02:00
Inforithmics
c4d8c75e54 merge fixes 2025-10-04 15:27:52 +02:00
Inforithmics
294b179688 merge fixes 2025-10-04 15:20:33 +02:00
Inforithmics
ac6ba7d44b Merge remote-tracking branch 'upstream/main' into VulkanV3Update 2025-10-04 14:53:59 +02:00
Daniel Hiltgen
e4340667e3 Workaround broken NVIDIA iGPU free VRAM data (#12490)
The CUDA APIs for reporting free VRAM are useless on NVIDIA iGPU
systems as they only return the kernel's actual free memory and ignore
buff/cache allocations, which on a typical system will quickly fill up
most of the free system memory.  As a result, we incorrectly think
there's very little available for GPU allocations.
2025-10-03 12:17:21 -07:00
Daniel Hiltgen
05a43e078a fix panic on bootstrapDevices (#12475)
Wrong index variable was used.
2025-10-01 17:39:29 -07:00
Daniel Hiltgen
bc8909fb38 Use runners for GPU discovery (#12090)
This revamps how we discover GPUs in the system by leveraging the Ollama
runner.  This should eliminate inconsistency between our GPU discovery and the
runner's capabilities at runtime, particularly for cases where we try to filter
out unsupported GPUs.  Now the runner does that implicitly based on the actual
device list.  In some cases free VRAM reporting can be unreliable, which can
lead to scheduling mistakes, so this also includes a patch to leverage more
reliable VRAM reporting libraries if available.

Automatic workarounds have been removed as only one GPU leveraged this, which
is now documented. This GPU will soon fall off the support matrix with the next
ROCm bump.

Additional cleanup of the scheduler and discovery packages can be done in the
future once we have switched on the new memory management code, and removed
support for the llama runner.
2025-10-01 15:12:32 -07:00
Daniel Hiltgen
5f9f312bdb fix - give bootstrapping more time on slow systems 2025-09-24 16:25:56 -07:00
Daniel Hiltgen
2689357890 fix index bug 2025-09-24 12:22:46 -07:00
Daniel Hiltgen
c86af47ac0 WIP - wire up Vulkan with the new engine based discovery
Not a complete implementation - free VRAM is better, but not accurate on
Windows
2025-09-24 10:49:39 -07:00
Daniel Hiltgen
3566fe0e7b timing info for runner 2025-09-21 13:53:24 -07:00
Daniel Hiltgen
f761292516 Use runners for GPU discovery
This revamps how we discover GPUs in the system by leveraging the Ollama
runner.  This should eliminate inconsistency between our GPU discovery and the
runner's capabilities at runtime, particularly for cases where we try to filter
out unsupported GPUs.  Now the runner does that implicitly based on the actual
device list.  In some cases free VRAM reporting can be unreliable, which can
lead to scheduling mistakes, so this also includes a patch to leverage more
reliable VRAM reporting libraries if available.

Automatic workarounds have been removed as only one GPU leveraged this, which
is now documented. This GPU will soon fall off the support matrix with the next
ROCm bump.

Additional cleanup of the scheduler and discovery packages can be done in the
future once we have switched on the new memory management code, and removed
support for the llama runner.
2025-09-21 13:53:24 -07:00
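
The message on commit e4340667e3 describes free-memory figures that ignore buff/cache allocations. The sketch below illustrates that gap on Linux and is not the code from that commit. It assumes `/proc/meminfo`-style text is an acceptable source, and the helper names `parse_meminfo` and `estimate_free_bytes` are hypothetical. It prefers the kernel's `MemAvailable` estimate, which counts reclaimable cache, over the raw `MemFree` value.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into a {field: KiB} mapping."""
    fields: dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def estimate_free_bytes(text: str) -> int:
    """Estimate memory usable for new allocations, in bytes.

    Hypothetical helper: prefers MemAvailable, which accounts for
    reclaimable page cache, instead of MemFree alone.
    """
    fields = parse_meminfo(text)
    if "MemAvailable" in fields:
        kib = fields["MemAvailable"]
    else:
        # Rough fallback (an assumption of this sketch): treat buffers and
        # cache as reclaimable. This can overestimate what is really usable.
        kib = (fields.get("MemFree", 0)
               + fields.get("Buffers", 0)
               + fields.get("Cached", 0))
    return kib * 1024


# Made-up sample values, for illustration only.
SAMPLE = """MemTotal:       16000000 kB
MemFree:          500000 kB
MemAvailable:   12000000 kB
Buffers:          300000 kB
Cached:         11000000 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print("MemFree bytes:     ", info["MemFree"] * 1024)
    print("Estimated free:    ", estimate_free_bytes(SAMPLE))
```

With the sample values, `MemFree` alone reports about 0.5 GB while the estimate is about 12 GB, which is the kind of underestimate the commit message describes. On a real system the text would come from reading `/proc/meminfo`.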