Ollama Dart Client


Dart client for the Ollama API to run LLMs locally (OpenAI gpt-oss, DeepSeek-R1, Gemma 3, Llama 4, and more).

Features

Generation & Streaming

  • ✅ Text generation (generate)
  • ✅ Chat completions (chat)
  • ✅ Streaming support with NDJSON
  • ✅ Tool/function calling
  • ✅ Thinking mode (reasoning)
  • ✅ Structured output (JSON mode and JSON schema)
  • ✅ Multimodal support (images)
  • ✅ Context memory for conversation continuity

Embeddings

  • ✅ Generate embeddings (embed)

Model Management

  • ✅ List local models (list)
  • ✅ Show model details (show)
  • ✅ Pull models from library (pull)
  • ✅ Push models to library (push)
  • ✅ Create models from Modelfile (create)
  • ✅ Copy models (copy)
  • ✅ Delete models (delete)
  • ✅ List running models (ps)
  • ✅ Get server version (version)

Why choose this client?

  • ✅ Type-safe with sealed classes
  • ✅ Minimal dependencies (http, logging only)
  • ✅ Works on all compilation targets (native, web, WASM)
  • ✅ Interceptor-driven architecture
  • ✅ Comprehensive error handling
  • ✅ Automatic retry with exponential backoff
  • ✅ NDJSON streaming support
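
The retry and error-handling points above can be sketched together as follows. The sketch uses only types shown elsewhere in this README (`OllamaConfig`, `RetryPolicy`); the generic `on Exception` catch is a placeholder, since the package's concrete exception types are not documented here.

```dart
import 'package:ollama_dart/ollama_dart.dart';

void main() async {
  // Retries are configured once on the client; the RetryPolicy backs off
  // exponentially between attempts (see the Configuration section).
  final client = OllamaClient(
    config: OllamaConfig(
      retryPolicy: RetryPolicy(
        maxRetries: 3,
        initialDelay: Duration(seconds: 1),
      ),
    ),
  );

  try {
    final response = await client.chat.create(
      request: ChatRequest(
        model: 'gpt-oss',
        messages: [ChatMessage.user('Hello!')],
      ),
    );
    print(response.message?.content);
  } on Exception catch (e) {
    // Placeholder: catch the package's specific exception type here if one
    // is exposed; a plain Exception catch keeps the sketch API-agnostic.
    print('Request failed after retries: $e');
  } finally {
    client.close();
  }
}
```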

Quickstart

import 'package:ollama_dart/ollama_dart.dart';

void main() async {
  final client = OllamaClient();

  // Chat completion
  final response = await client.chat.create(
    request: ChatRequest(
      model: 'gpt-oss',
      messages: [
        ChatMessage.user('Hello, how are you?'),
      ],
    ),
  );

  print(response.message?.content);

  client.close();
}

Installation

dependencies:
  ollama_dart: ^x.y.z

Configuration

import 'package:ollama_dart/ollama_dart.dart';

// From environment variables (reads OLLAMA_HOST, defaults to localhost:11434)
final client = OllamaClient.fromEnvironment();

// Or with explicit configuration
final clientWithConfig = OllamaClient(
  config: OllamaConfig(
    baseUrl: 'http://localhost:11434',  // Default Ollama server
    timeout: Duration(minutes: 5),
    retryPolicy: RetryPolicy(
      maxRetries: 3,
      initialDelay: Duration(seconds: 1),
    ),
  ),
);

Authentication (for remote Ollama servers):

final client = OllamaClient(
  config: OllamaConfig(
    baseUrl: 'https://my-ollama-server.example.com',
    authProvider: BearerTokenProvider('YOUR_TOKEN'),
  ),
);

Usage

Chat Completions

import 'package:ollama_dart/ollama_dart.dart';

final client = OllamaClient();

final response = await client.chat.create(
  request: ChatRequest(
    model: 'gpt-oss',
    messages: [
      ChatMessage.system('You are a helpful assistant.'),
      ChatMessage.user('What is the capital of France?'),
    ],
  ),
);

print(response.message?.content);
client.close();
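
The feature list also mentions multimodal support (images). A possible sketch follows, assuming `ChatMessage.user` accepts base64-encoded images via an `images` parameter; that parameter name is an assumption mirroring the Ollama REST API, so check the package API before relying on it.

```dart
import 'dart:convert';
import 'dart:io';

import 'package:ollama_dart/ollama_dart.dart';

void main() async {
  final client = OllamaClient();

  // The Ollama REST API accepts base64-encoded images per message;
  // the exact Dart surface may differ from this sketch.
  final imageBytes = await File('photo.jpg').readAsBytes();

  final response = await client.chat.create(
    request: ChatRequest(
      model: 'gemma3', // a multimodal-capable model
      messages: [
        ChatMessage.user(
          'What is in this picture?',
          images: [base64Encode(imageBytes)], // hypothetical parameter
        ),
      ],
    ),
  );

  print(response.message?.content);
  client.close();
}
```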

Streaming

import 'dart:io';

import 'package:ollama_dart/ollama_dart.dart';

final client = OllamaClient();

final stream = client.chat.createStream(
  request: ChatRequest(
    model: 'gpt-oss',
    messages: [
      ChatMessage.user('Tell me a story.'),
    ],
  ),
);

await for (final chunk in stream) {
  stdout.write(chunk.message?.content ?? '');
}

client.close();

Tool Calling

import 'package:ollama_dart/ollama_dart.dart';

final client = OllamaClient();

final response = await client.chat.create(
  request: ChatRequest(
    model: 'gpt-oss',
    messages: [
      ChatMessage.user('What is the weather in Paris?'),
    ],
    tools: [
      ToolDefinition(
        type: ToolType.function,
        function: ToolFunction(
          name: 'get_weather',
          description: 'Get the current weather for a location',
          parameters: {
            'type': 'object',
            'properties': {
              'location': {'type': 'string', 'description': 'City name'},
            },
            'required': ['location'],
          },
        ),
      ),
    ],
  ),
);

if (response.message?.toolCalls != null) {
  for (final toolCall in response.message!.toolCalls!) {
    print('Tool: ${toolCall.function?.name}');
    print('Args: ${toolCall.function?.arguments}');
  }
}

client.close();
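
Structured output (JSON mode) is listed among the features but not shown above. A minimal sketch, assuming `ChatRequest` accepts a `format` parameter mirroring the REST API's `"format": "json"`; the package may instead expose an enum or a JSON-schema variant, so treat the parameter as hypothetical.

```dart
import 'dart:convert';

import 'package:ollama_dart/ollama_dart.dart';

void main() async {
  final client = OllamaClient();

  final response = await client.chat.create(
    request: ChatRequest(
      model: 'gpt-oss',
      format: 'json', // hypothetical parameter; mirrors the REST API field
      messages: [
        ChatMessage.user(
          'List the three largest French cities as JSON with keys '
          '"name" and "population".',
        ),
      ],
    ),
  );

  // With JSON mode the response body should parse cleanly.
  final parsed = jsonDecode(response.message?.content ?? '{}');
  print(parsed);
  client.close();
}
```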

Text Generation

import 'package:ollama_dart/ollama_dart.dart';

final client = OllamaClient();

final result = await client.completions.generate(
  request: GenerateRequest(
    model: 'gpt-oss',
    prompt: 'Complete this: The capital of France is',
  ),
);

print(result.response);
client.close();

Embeddings

import 'package:ollama_dart/ollama_dart.dart';

final client = OllamaClient();

final response = await client.embeddings.create(
  request: EmbedRequest(
    model: 'nomic-embed-text',
    input: 'The quick brown fox jumps over the lazy dog.',
  ),
);

print(response.embeddings);
client.close();

Model Management

import 'package:ollama_dart/ollama_dart.dart';

final client = OllamaClient();

// List models
final models = await client.models.list();
for (final model in models.models ?? []) {
  print('${model.name}: ${model.size}');
}

// Pull a model
await for (final progress in client.models.pullStream(
  request: PullRequest(model: 'gpt-oss'),
)) {
  print('${progress.status}: ${progress.completed}/${progress.total}');
}

// Show model details
final info = await client.models.show(
  request: ShowRequest(model: 'gpt-oss'),
);
print(info.license);

// List running models
final running = await client.models.ps();
for (final model in running.models ?? []) {
  print('Running: ${model.model}');
}

// Get server version
final version = await client.version.get();
print('Ollama version: ${version.version}');

client.close();

Examples

See the example/ directory for comprehensive examples:

  1. ollama_dart_example.dart - Basic usage
  2. chat_example.dart - Chat completions
  3. streaming_example.dart - Streaming responses
  4. tool_calling_example.dart - Function calling
  5. embeddings_example.dart - Generate embeddings
  6. models_example.dart - Model management

API Coverage

This client implements 100% of the Ollama REST API:

Chat Resource (client.chat)

  • create - Generate a chat completion
  • createStream - Generate a streaming chat completion

Completions Resource (client.completions)

  • generate - Generate a text completion
  • generateStream - Generate a streaming text completion

Embeddings Resource (client.embeddings)

  • create - Generate embeddings for text

Models Resource (client.models)

  • list - List local models (GET /api/tags)
  • show - Show model details (POST /api/show)
  • create / createStream - Create a model from Modelfile (POST /api/create)
  • copy - Copy a model (POST /api/copy)
  • delete - Delete a model (DELETE /api/delete)
  • pull / pullStream - Pull a model from library (POST /api/pull)
  • push / pushStream - Push a model to library (POST /api/push)
  • ps - List running models (GET /api/ps)

Version Resource (client.version)

  • get - Get server version (GET /api/version)

If these packages are useful to you or your company, please sponsor the project. Development and maintenance are provided to the community for free, but integration tests against real APIs and the tooling required to build and verify releases still have real costs. Your support, at any level, helps keep these packages maintained and free for the Dart & Flutter community.

License

ollama_dart is licensed under the MIT License.
