mirror of
https://github.com/ollama/ollama.git
synced 2026-04-18 22:54:13 +02:00
cmd: add eval command for lightweight model evals
50
cmd/eval/README.md
Normal file
@@ -0,0 +1,50 @@
# eval

Evaluation tool for testing Ollama models.

## Usage

Run all tests:

```bash
go run . -model llama3.2:latest
```

Run a specific suite:

```bash
go run . -model llama3.2:latest -suite tool-calling-basic -v
```

List available suites:

```bash
go run . -list
```

## Adding Tests

Edit `suites.go` to add new test suites. Each test needs:

- `Name`: test identifier
- `Prompt`: what to send to the model
- `Check`: function to validate the response

Example:

```go
{
    Name:   "my-test",
    Prompt: "What is 2+2?",
    Check:  Contains("4"),
}
```
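
The entry above is only a fragment, so here is a minimal sketch of how it might sit inside a suite and be evaluated. The `Response`, `Check`, `Test`, and `Suite` types, the `runSuite` helper, and the canned reply are all assumptions made for illustration; the actual definitions in `suites.go` may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Response is a hypothetical stand-in for what a check receives.
type Response struct {
	Content   string
	ToolCalls []string
}

// Check is an assumed signature for a validation function.
type Check func(Response) bool

// Test mirrors the three fields listed in this README.
type Test struct {
	Name   string
	Prompt string
	Check  Check
}

// Suite is a hypothetical named group of tests.
type Suite struct {
	Name  string
	Tests []Test
}

// Contains sketches the README's Contains check: a substring match.
func Contains(s string) Check {
	return func(r Response) bool { return strings.Contains(r.Content, s) }
}

// runSuite sends each prompt through ask, applies the test's Check,
// and returns the number of tests that passed.
func runSuite(s Suite, ask func(prompt string) Response) int {
	passed := 0
	for _, t := range s.Tests {
		ok := t.Check(ask(t.Prompt))
		fmt.Printf("%s/%s: pass=%v\n", s.Name, t.Name, ok)
		if ok {
			passed++
		}
	}
	return passed
}

func main() {
	suite := Suite{
		Name: "my-suite",
		Tests: []Test{
			{
				Name:   "my-test",
				Prompt: "What is 2+2?",
				Check:  Contains("4"),
			},
		},
	}
	// A canned reply stands in for a real model call.
	fake := func(prompt string) Response { return Response{Content: "2+2 is 4."} }
	fmt.Println("passed:", runSuite(suite, fake))
}
```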

Available check functions:

- `HasResponse()` - response is non-empty
- `Contains(s)` - response contains the substring `s`
- `CallsTool(name)` - model called the tool `name`
- `NoTools()` - model called no tools
- `MinTools(n)` - model called at least `n` tools
- `All(checks...)` - all given checks pass
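
To make the semantics of these checks concrete, the following is a rough sketch of how they could be written as composable functions. The `Response` type and the `Check` signature are assumptions for illustration and are not taken from the tool's source; only the function names and their described behavior come from the list above.

```go
package main

import (
	"fmt"
	"strings"
)

// Response is a hypothetical stand-in for a model reply.
type Response struct {
	Content   string
	ToolCalls []string // names of the tools the model called
}

// Check is an assumed signature: it reports whether a response passes.
type Check func(Response) bool

// HasResponse passes when the response text is non-empty.
func HasResponse() Check {
	return func(r Response) bool { return r.Content != "" }
}

// Contains passes when the response text contains s.
func Contains(s string) Check {
	return func(r Response) bool { return strings.Contains(r.Content, s) }
}

// CallsTool passes when the model called the named tool at least once.
func CallsTool(name string) Check {
	return func(r Response) bool {
		for _, t := range r.ToolCalls {
			if t == name {
				return true
			}
		}
		return false
	}
}

// NoTools passes when the model called no tools.
func NoTools() Check {
	return func(r Response) bool { return len(r.ToolCalls) == 0 }
}

// MinTools passes when the model called at least n tools.
func MinTools(n int) Check {
	return func(r Response) bool { return len(r.ToolCalls) >= n }
}

// All passes only when every given check passes.
func All(checks ...Check) Check {
	return func(r Response) bool {
		for _, c := range checks {
			if !c(r) {
				return false
			}
		}
		return true
	}
}

func main() {
	r := Response{
		Content:   "It is sunny in Paris.",
		ToolCalls: []string{"get_weather"}, // hypothetical tool name
	}
	check := All(HasResponse(), CallsTool("get_weather"), MinTools(1))
	fmt.Println(check(r))     // true
	fmt.Println(NoTools()(r)) // false
}
```

Because each function returns a `Check`, `All` can nest other checks freely, which is one plausible reason for exposing checks as constructors rather than plain predicates.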