Add image generation documentation

- Add image generation capability page with API usage examples
- Add image-generation to docs.json navigation
- Update openapi.yaml with image generation request/response fields
  - Request: width, height, steps
  - Response: image, completed, total
2026-04-23 17:29:54 +02:00 · 2026-01-22 14:09:58 -08:00
3 changed files with 233 additions and 0 deletions
--- a/docs/capabilities/image-generation.mdx
+++ b/docs/capabilities/image-generation.mdx
@@ -0,0 +1,205 @@
 ---
 title: Image Generation
 ---
 <Warning>
 Image generation is experimental and currently only available on macOS. This feature may change in future versions.
 </Warning>
 Image generation models create images from text prompts. Ollama supports diffusion-based image generation models through both Ollama's API and OpenAI-compatible endpoints.
 ## Usage
 <Tabs>
  <Tab title="CLI">
    ```shell
    ollama run x/z-image-turbo "a sunset over mountains"
    ```
    The generated image will be saved to the current directory.
  </Tab>
  <Tab title="cURL">
    ```shell
    curl http://localhost:11434/api/generate -d '{
      "model": "x/z-image-turbo",
      "prompt": "a sunset over mountains",
      "stream": false
    }'
    ```
  </Tab>
  <Tab title="Python">
    ```python
    import ollama
    import base64
    response = ollama.generate(
        model='x/z-image-turbo',
        prompt='a sunset over mountains',
    )
    # Save the generated image
    with open('output.png', 'wb') as f:
        f.write(base64.b64decode(response['image']))
    print('Image saved to output.png')
    ```
  </Tab>
  <Tab title="JavaScript">
    ```javascript
    import ollama from 'ollama'
    import { writeFileSync } from 'fs'
    const response = await ollama.generate({
      model: 'x/z-image-turbo',
      prompt: 'a sunset over mountains',
    })
    // Save the generated image
    const imageBuffer = Buffer.from(response.image, 'base64')
    writeFileSync('output.png', imageBuffer)
    console.log('Image saved to output.png')
    ```
  </Tab>
 </Tabs>
 ### Response
 The response includes an `image` field containing the base64-encoded image data:
 ```json
 {
  "model": "x/z-image-turbo",
  "created_at": "2024-01-15T10:30:15.000000Z",
  "image": "iVBORw0KGgoAAAANSUhEUg...",
  "done": true,
  "done_reason": "stop",
  "total_duration": 15000000000,
  "load_duration": 2000000000
 }
 ```
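
As a rough sketch of how a client might handle this response, the snippet below decodes the `image` field and checks for the PNG signature before writing the file. The helper name `save_image` is hypothetical, and the PNG format is an assumption drawn from the sample payload (its `iVBORw0KGgo` prefix is the base64 form of the PNG signature), not something the response schema states.

```python
import base64
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def save_image(response: dict, path: str) -> bytes:
    """Decode the base64 `image` field and write it to `path` (hypothetical helper)."""
    encoded = response.get("image")
    if not encoded:
        raise ValueError("response has no 'image' field")
    data = base64.b64decode(encoded)
    # Assumption: the model returns PNG data; warn rather than fail otherwise.
    if not data.startswith(PNG_SIGNATURE):
        print("warning: data does not start with the PNG signature")
    Path(path).write_bytes(data)
    return data
```

Returning the decoded bytes lets the caller inspect or reuse the image without reading the file back.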
 ## Image dimensions
 Customize the output image size using the `width` and `height` parameters:
 <Tabs>
  <Tab title="cURL">
    ```shell
    curl http://localhost:11434/api/generate -d '{
      "model": "x/z-image-turbo",
      "prompt": "a portrait of a robot artist",
      "width": 768,
      "height": 1024,
      "stream": false
    }'
    ```
  </Tab>
  <Tab title="Python">
    ```python
    import ollama
    response = ollama.generate(
        model='x/z-image-turbo',
        prompt='a portrait of a robot artist',
        width=768,
        height=1024,
    )
    ```
  </Tab>
  <Tab title="JavaScript">
    ```javascript
    import ollama from 'ollama'
    const response = await ollama.generate({
      model: 'x/z-image-turbo',
      prompt: 'a portrait of a robot artist',
      width: 768,
      height: 1024,
    })
    ```
  </Tab>
 </Tabs>
 ## Streaming progress
 When streaming is enabled (the default for the REST API), progress updates are sent during image generation:
 ```json
 {
  "model": "x/z-image-turbo",
  "created_at": "2024-01-15T10:30:00.000000Z",
  "completed": 5,
  "total": 20,
  "done": false
 }
 ```
 The `completed` and `total` fields indicate the current progress through the diffusion steps.
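
One way to turn these updates into a progress readout is sketched below. It assumes the stream arrives as newline-delimited JSON objects shaped like the example above, and that some events (such as the final one) may omit the progress fields; `progress_percent` is a hypothetical helper, not part of any Ollama library.

```python
import json
from typing import Iterable, Iterator


def progress_percent(lines: Iterable[str]) -> Iterator[float]:
    """Yield percent complete for each streamed progress update."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        completed, total = event.get("completed"), event.get("total")
        # Assumption: events without progress fields can simply be skipped.
        if completed is None or not total:
            continue
        yield 100.0 * completed / total


# Sample lines modeled on the progress update shown above.
sample = [
    '{"model": "x/z-image-turbo", "completed": 5, "total": 20, "done": false}',
    '{"model": "x/z-image-turbo", "completed": 20, "total": 20, "done": false}',
]
for pct in progress_percent(sample):
    print(f"{pct:.0f}%")
```

The function takes any iterable of lines, so it could be fed from an HTTP response body or, as here, from canned samples.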
 ## Parameters
 | Parameter | Description | Default |
 |-----------|-------------|---------|
 | `prompt` | Text description of the image to generate | Required |
 | `width` | Width of the generated image in pixels | Model default |
 | `height` | Height of the generated image in pixels | Model default |
 | `steps` | Number of diffusion steps | Model default |
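
The table can be read as "send only what you want to override." A minimal sketch of a request builder that follows this rule is shown below; `build_request` is a hypothetical helper, and the positive-integer check is an assumption rather than documented server behavior.

```python
from typing import Optional


def build_request(
    model: str,
    prompt: str,
    width: Optional[int] = None,
    height: Optional[int] = None,
    steps: Optional[int] = None,
    stream: bool = False,
) -> dict:
    """Build an /api/generate request body, omitting unset optional fields."""
    if not prompt:
        raise ValueError("prompt is required")
    payload = {"model": model, "prompt": prompt, "stream": stream}
    optional = {"width": width, "height": height, "steps": steps}
    for name, value in optional.items():
        if value is None:
            continue  # leave unset so the model default applies
        if value <= 0:
            # Assumption: sizes and step counts must be positive integers.
            raise ValueError(f"{name} must be a positive integer")
        payload[name] = value
    return payload


print(build_request("x/z-image-turbo", "a portrait of a robot artist", width=768, height=1024))
```

Because `steps` is not passed in the usage line, it is left out of the payload and the model default applies.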
 ## OpenAI compatibility
 Image generation is also available through the OpenAI-compatible `/v1/images/generations` endpoint:
 <Tabs>
  <Tab title="cURL">
    ```shell
    curl http://localhost:11434/v1/images/generations \
      -H "Content-Type: application/json" \
      -d '{
        "model": "x/z-image-turbo",
        "prompt": "a sunset over mountains",
        "size": "1024x1024",
        "response_format": "b64_json"
      }'
    ```
  </Tab>
  <Tab title="Python">
    ```python
    from openai import OpenAI
    client = OpenAI(
        base_url='http://localhost:11434/v1/',
        api_key='ollama',  # required but ignored
    )
    response = client.images.generate(
        model='x/z-image-turbo',
        prompt='a sunset over mountains',
        size='1024x1024',
        response_format='b64_json',
    )
    print(response.data[0].b64_json[:50] + '...')
    ```
  </Tab>
  <Tab title="JavaScript">
    ```javascript
    import OpenAI from 'openai'
    const openai = new OpenAI({
      baseURL: 'http://localhost:11434/v1/',
      apiKey: 'ollama', // required but ignored
    })
    const response = await openai.images.generate({
      model: 'x/z-image-turbo',
      prompt: 'a sunset over mountains',
      size: '1024x1024',
      response_format: 'b64_json',
    })
    console.log(response.data[0].b64_json.slice(0, 50) + '...')
    ```
  </Tab>
 </Tabs>
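
The `size` value in these examples follows the `WIDTHxHEIGHT` form, while the native API takes separate `width` and `height` fields. The sketch below shows one way a client could convert between the two; `parse_size` is a hypothetical helper, and how Ollama itself interprets `size` is not specified here.

```python
from typing import Tuple


def parse_size(size: str) -> Tuple[int, int]:
    """Split a 'WIDTHxHEIGHT' string into integer width and height."""
    width_text, sep, height_text = size.lower().partition("x")
    if not sep or not width_text.isdigit() or not height_text.isdigit():
        raise ValueError(f"expected WIDTHxHEIGHT, got {size!r}")
    return int(width_text), int(height_text)


width, height = parse_size("1024x1024")
print(width, height)
```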
 See [OpenAI compatibility](/api/openai-compatibility#v1imagesgenerations-experimental) for more details.
--- a/docs/docs.json
+++ b/docs/docs.json
@@ -93,6 +93,7 @@
              "/capabilities/thinking",
              "/capabilities/structured-outputs",
              "/capabilities/vision",
              "/capabilities/image-generation",
              "/capabilities/embeddings",
              "/capabilities/tool-calling",
              "/capabilities/web-search"
--- a/docs/openapi.yaml
+++ b/docs/openapi.yaml
@@ -117,6 +117,15 @@ components:
        top_logprobs:
          type: integer
          description: Number of most likely tokens to return at each token position when logprobs are enabled
        width:
          type: integer
          description: (Experimental) Width of the generated image in pixels. For image generation models only.
        height:
          type: integer
          description: (Experimental) Height of the generated image in pixels. For image generation models only.
        steps:
          type: integer
          description: (Experimental) Number of diffusion steps. For image generation models only.
    GenerateResponse:
      type: object
      properties:
@@ -161,6 +170,15 @@ components:
          items:
            $ref: "#/components/schemas/Logprob"
          description: Log probability information for the generated tokens when logprobs are enabled
        image:
          type: string
          description: (Experimental) Base64-encoded generated image data. For image generation models only.
        completed:
          type: integer
          description: (Experimental) Number of completed diffusion steps. For image generation streaming progress.
        total:
          type: integer
          description: (Experimental) Total number of diffusion steps. For image generation streaming progress.
    GenerateStreamEvent:
      type: object
      properties:
@@ -200,6 +218,15 @@ components:
        eval_duration:
          type: integer
          description: Time spent generating tokens in nanoseconds
        image:
          type: string
          description: (Experimental) Base64-encoded generated image data. For image generation models only.
        completed:
          type: integer
          description: (Experimental) Number of completed diffusion steps. For image generation streaming progress.
        total:
          type: integer
          description: (Experimental) Total number of diffusion steps. For image generation streaming progress.
    ChatMessage:
      type: object
      required: [role, content]