onde_inference 0.1.4

On-device LLM inference for Flutter & Dart. Run Qwen 2.5 models locally with Metal on iOS and macOS, CPU on Android and desktop. No cloud, no API key.

0.1.4

  • Added Qwen 3 4B GGUF model (bartowski/Qwen_Qwen3-4B-GGUF) with full OpenAI-compatible tool calling support.
  • Added GgufModelConfig.qwen3_4b() constructor and registered it in the supported model list.
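
For readers unfamiliar with the format, "OpenAI-compatible tool calling" generally means tools are described as JSON objects with a type of "function" and a JSON Schema for the arguments. The sketch below builds one such definition in plain Dart. It does not use onde_inference at all: how the package accepts tool definitions is not stated in this changelog, and the helper functionTool and the get_weather tool are hypothetical names chosen for illustration.

```dart
import 'dart:convert';

/// Builds one tool definition in the OpenAI chat-completions "tools" format.
/// [parameters] is a JSON Schema object describing the function arguments.
/// Hypothetical helper, not part of onde_inference.
Map<String, dynamic> functionTool({
  required String name,
  required String description,
  required Map<String, dynamic> parameters,
}) {
  return {
    'type': 'function',
    'function': {
      'name': name,
      'description': description,
      'parameters': parameters,
    },
  };
}

/// Illustrative tool only; the name and schema are assumptions.
Map<String, dynamic> weatherTool() => functionTool(
      name: 'get_weather',
      description: 'Look up the current weather for a city.',
      parameters: {
        'type': 'object',
        'properties': {
          'city': {'type': 'string', 'description': 'City name'},
        },
        'required': ['city'],
      },
    );

void main() {
  // Serialise the tool list as it would appear in an OpenAI-style request body.
  print(jsonEncode({
    'tools': [weatherTool()],
  }));
}
```

A model that supports this convention replies with the name of the function to call and a JSON string of arguments, which the app then validates and executes itself.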

0.1.3

  • Platform: Added support for watchOS and visionOS.

0.1.2

  • Engine: Switched all platform-specific mistralrs and mistralrs-core dependencies to the setoelkahfi/mistral.rs fork (branch fix/all-platform-fixes), picking up cross-platform stability fixes ahead of the upstream merge.
  • License: Dual-licensed under MIT OR Apache-2.0; added LICENSE-APACHE alongside the existing LICENSE-MIT for pub.dev compliance.
  • Dependencies: Upgraded freezed_annotation to ^3.1.0 and freezed to ^3.2.5.
  • Removed a stale ignore_for_file directive from the generated Flutter Rust Bridge glue code.

0.1.1

  • CI/CD: release-sdk-dart.yml publishes onde_inference to pub.dev on tag push.
  • Added copyright headers to all hand-written source files (engine.dart, types.dart, dart_test.dart, iOS and macOS Swift plugin classes).
  • Rewrote the example app README with branding, platform notes, and an SDK quick reference.
  • android/local.properties gitignored; local SDK paths no longer pollute diffs.

0.1.0

  • Initial MVP release.
  • Multi-turn chat inference with Qwen 2.5 1.5B and 3B GGUF Q4_K_M models.
  • Streaming token delivery via Dart Stream<StreamChunk>, so tokens can be displayed as they are generated.
  • Metal acceleration on iOS and macOS (Apple silicon and Intel).
  • CPU inference on Android, Linux, and Windows.
  • Platform-aware default model selection (1.5B on iOS / Android, 3B on macOS / Linux / Windows).
  • Conversation history management: history(), clearHistory(), pushHistory().
  • One-shot generate() API that does not affect the conversation history.
  • Configurable sampling: temperature, top-p, top-k, min-p, max tokens, frequency and presence penalties.
  • Built-in sampling presets: SamplingConfig.defaultConfig(), SamplingConfig.deterministic(), SamplingConfig.mobile().
  • EngineInfo snapshot: status, loaded model name, approximate memory, and history length.
  • OndeInference static helper namespace for library initialisation and model / sampling config factories.
  • Compilation stub (frb_generated_stub.dart) so the package compiles before the native Rust bridge is built.
  • Powered by flutter_rust_bridge v2 and the Onde Rust engine.
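
The streaming item in the list above can be illustrated with a small, self-contained sketch of consuming a Stream<StreamChunk> in Dart. The StreamChunk class here is a local stand-in with assumed fields (delta and done); the real class shipped by the package may look different. The collectReply helper and the fake stream are likewise hypothetical and do not call the engine.

```dart
/// Hypothetical stand-in for the package's StreamChunk; the real class may
/// expose different fields.
class StreamChunk {
  const StreamChunk(this.delta, {this.done = false});

  /// Text produced since the previous chunk (assumed field).
  final String delta;

  /// True on the final chunk of a reply (assumed field).
  final bool done;
}

/// Accumulates chunks into the full reply, calling [onPartial] after each
/// chunk so a UI can repaint with the text generated so far.
Future<String> collectReply(
  Stream<StreamChunk> chunks, {
  void Function(String partial)? onPartial,
}) async {
  final buffer = StringBuffer();
  await for (final chunk in chunks) {
    buffer.write(chunk.delta);
    onPartial?.call(buffer.toString());
    if (chunk.done) break; // Stop listening once the reply is complete.
  }
  return buffer.toString();
}

Future<void> main() async {
  // A fake stream stands in for the engine's output.
  final fake = Stream.fromIterable(const [
    StreamChunk('Hello'),
    StreamChunk(', '),
    StreamChunk('world', done: true),
  ]);
  final reply = await collectReply(fake, onPartial: print);
  print('final: $reply');
}
```

In a Flutter app the onPartial callback would typically call setState or update a notifier, so the message bubble grows as tokens arrive instead of appearing all at once.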

Publisher

ondeinference.com (verified publisher)

License

unknown (license)

Dependencies

flutter, flutter_rust_bridge, freezed_annotation
