Not a complete implementation: free VRAM is the better measure, but it is not accurate on Windows
github.com/jmorganca/ollama
github.com/ollama/ollama