GGML update to ec98e2002 (#13451)

* Revert "add support for NVIDIA Nemotron 3 Nano"

This reverts commit e7d2ae9d69.

* GGML update to 380b4c984

Remove MaskBatchPadding as GGML_KQ_MASK_PAD is no longer present (no
padding required)

* update to c45f89d55

* ec98e2002

solar pro needed more adjusting - needs verification

* review comments
This commit is contained in:
Daniel Hiltgen
2025-12-17 13:13:55 -08:00
committed by GitHub
parent 1c094038bc
commit 49a9c9ba6a
127 changed files with 8128 additions and 6710 deletions

View File

@@ -596,7 +596,7 @@ static void llama_model_quantize_impl(const std::string & fname_inp, const std::
}
std::vector<std::string> splits = {};
llama_model_loader ml(fname_inp, splits, use_mmap, /*check_tensors*/ true, kv_overrides, nullptr);
llama_model_loader ml(fname_inp, splits, use_mmap, /*check_tensors*/ true, /*no_alloc*/ false, kv_overrides, nullptr);
ml.init_mappings(false); // no prefetching
llama_model model(llama_model_default_params());