170+ AI Models, Running Locally

Private AI for iPhone, iPad, Mac, and Vision Pro

Run Llama, Gemma, DeepSeek, and 170+ other models on your own hardware. Multi-agent teams, Knowledge Libraries, voice transcription with speaker ID, text-to-speech. No accounts, no tracking, no data leaves your device.

170+ Local Models · Multi-Agent Teams · 100% Private · Works Offline

Available for iPhone, iPad, Mac, and Vision Pro

On Device AI Mac Chat
On Device AI iPhone Voice Note
On Device AI iPad Library
On Device AI Teams
Offline

Unparalleled Freedom & Power

Designed for power users who demand native performance, choice, and absolute privacy.

170+ Local Models & 19 Cloud APIs

Never get locked into a single ecosystem. Run Llama, DeepSeek, and Gemma directly on your device with dual GGUF and MLX engines. Need massive reasoning capabilities? Instantly switch to OpenAI, Anthropic, or 19+ other cloud APIs when connected.

Multi-Agent Teams

Orchestrate specialized AI agents that collaborate autonomously. Assign distinct roles and models for complex tasks.

Knowledge Libraries

Your documents, your memory. Import PDFs and notes to create focused contexts. The AI retrieves answers securely, entirely on your device.

Pure Native Ecosystem

One seamless app across macOS, iOS, iPadOS, and visionOS. Use your Mac as a local inference server while interacting effortlessly from your iPhone. No sluggish web wrappers or Electron apps, just lightning-fast native SwiftUI performance on every platform.

What Can On Device AI Do?

Powerful AI features right on your device, with no network dependency. Say goodbye to privacy concerns.

Chat Flows participant setup on Mac Chat Flows multi-agent result on Mac

Subagents & Workflow Automation

Multiply your productivity. Deploy specialized agents (e.g., Researcher, Analyst, Writer) that consult each other to tackle complex, multi-step projects across your devices.

Voice note recording screen on Mac Voice note transcription and diarization on iPhone

Voice Notes & Transcription

Never miss a detail. Quickly capture, transcribe, and summarize meetings with fast, accurate on-device transcription. Real-time speaker identification (diarization) keeps your notes perfectly organized.

Web search answer with cited results on Mac Tool calling web search workflow on iPad

Active Tools & Web Search

Equip your AI with live knowledge. Let your agents fetch real-time financial news, run calculators, and search the web directly from your iPad or Mac, all with full privacy.

Text to speech playback controls on Mac Text to speech history screen on Mac

Professional Text-to-Speech

Generate natural, human-like speech with the powerful Kokoro engine. Create lifelike audio offline, right on your machine, to narrate PDFs, books, or articles without bandwidth limits.

Knowledge Library file analysis view on Mac Knowledge Library AI Chat

Personal Knowledge Libraries & Files

Build dedicated offline memory spaces for your projects. Import PDFs and notes, and let your AI securely read, search, and synthesize answers directly from your documents.


Your Data. Your Choice.

Run leading models locally with zero data leakage, or opt-in to cloud providers when you need maximum power.

Traditional Cloud AI

Your Data → Cloud Server → AI Response
  • Privacy risk — data leaves your device
  • Limited control over your data
  • Internet connection required

On-Device AI

Your Data → Local AI → Instant Response
  • 100% private — data never leaves your device
  • Fully customizable models and roles
  • Works completely offline

Why Is On Device AI More Private?

All processing stays within your device boundary. No accounts, no tracking, no data collection.

  • Zero Data Leakage

    Your conversations, documents, and voice recordings never leave your device during AI processing.

  • Complete Anonymity

    No accounts required, no telemetry, no usage tracking. You own your data completely.

  • Offline First

    Core features work without internet. Cloud providers are optional and require your explicit consent.

Voice Input (STT) → Local AI Processing → Text & Speech Output (TTS)
External network access: blocked

Spatial Intelligence on Vision Pro

Immerse yourself in your AI workflows with breathtaking native visionOS support.

Web search tool calling interface on Apple Vision Pro

Available on All Apple Platforms

One app, optimized for every device in your ecosystem.

iOS

iPhone & iPad

  • Voice transcription on the go
  • Siri Shortcuts integration
  • Camera & Vision model support
  • Background recording

macOS

Apple Silicon Macs

  • Maximum performance & large context
  • Menu bar & global hotkeys
  • IM integration (Discord, Slack, Telegram)
  • Serve as remote inference server

visionOS

Apple Vision Pro

  • Immersive AI experience
  • Spatial computing ready
  • Hand gesture controls
  • Private AI in mixed reality
Download on the App Store

Free download · No ads · Privacy first

Frequently Asked Questions

What is On Device AI?
On Device AI runs 170+ language models (Llama 3, Gemma 3, Qwen 3, DeepSeek, Phi-4, Mistral) directly on your iPhone, iPad, Mac, or Vision Pro using GGUF and MLX inference engines. All processing happens on your device with zero data collection. No internet required for core features.

Does On Device AI work offline?
Yes. Once you download a model, voice transcription, document analysis, AI chat, Knowledge Libraries, vision models, and text-to-speech all work without internet. Cloud AI providers are available as opt-in additions but disabled by default.

Which AI models can I run?
170+ models including Llama 3, Gemma 3, Qwen 3, DeepSeek, Phi-4, and Mistral. You can also import custom GGUF models from Hugging Face. The app uses both llama.cpp (GGUF) and MLX inference engines for broad compatibility on Apple Silicon.

How do Multi-Agent Teams work?
You deploy specialized AI agents that collaborate on a single task. A Researcher gathers information, an Analyst evaluates it, and a Writer produces the output. Each agent can use a different model and role. The whole workflow runs locally on your device.
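
The Researcher → Analyst → Writer hand-off can be sketched in a few lines of Python. This is a conceptual illustration only: the `run_agent` helper and its prompts are hypothetical stand-ins, not On Device AI's actual API (the app configures teams visually, with each role backed by a local model).

```python
# Conceptual sketch of a multi-agent pipeline. `run_agent` is a
# hypothetical stand-in for invoking a local model with a
# role-specific system prompt; here it just tags and forwards text.
def run_agent(role: str, instructions: str, context: str) -> str:
    return f"[{role}] {instructions}: {context}"

def team_workflow(task: str) -> str:
    # Each stage consumes the previous stage's output.
    research = run_agent("Researcher", "gather facts for", task)
    analysis = run_agent("Analyst", "evaluate", research)
    return run_agent("Writer", "draft a summary of", analysis)

print(team_workflow("local AI privacy"))
```

The point of the sketch is the data flow: each agent's output becomes the next agent's context, and any stage could be served by a different local model.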

Which Apple devices are supported?
iPhone 14 and newer, iPad mini 7th gen and newer, iPad Air 5th gen and newer, iPad Pro 3rd gen and newer, iPad 11th gen and newer, all Apple Silicon Macs, and Apple Vision Pro. Devices with 6GB+ RAM provide optimal performance.

Can I also use cloud AI providers?
Yes. On Device AI supports 19+ cloud providers including OpenAI, Anthropic Claude, Google Gemini, Groq, OpenRouter, Nvidia, LM Studio, and Ollama. Cloud is off by default and requires your explicit configuration. You can mix local and cloud models in the same conversation.

What are Knowledge Libraries?
Project-specific document stores. Import PDFs, notes, web captures, and images into a Library. The AI retrieves relevant passages and answers questions from those documents using on-device embeddings. Your files never leave your device.
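
Embedding-based retrieval of this kind boils down to ranking document chunks by vector similarity to the query. The toy sketch below uses tiny hand-made three-dimensional "embeddings" as stand-ins for a real on-device embedding model; the `library` data and `retrieve` helper are hypothetical illustrations, not the app's internals.

```python
# Toy Library-style retrieval: rank chunks by cosine similarity to a
# query embedding and answer from the best match. Real embeddings have
# hundreds of dimensions; these 3-D vectors are illustrative only.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# (chunk text, embedding) pairs, as if produced from imported PDFs/notes.
library = [
    ("Invoices are due within 30 days.", [0.9, 0.1, 0.0]),
    ("The device supports offline TTS.", [0.1, 0.8, 0.3]),
]

def retrieve(query_embedding):
    # Return the chunk whose embedding is most similar to the query.
    return max(library, key=lambda item: cosine(query_embedding, item[1]))[0]

print(retrieve([0.85, 0.2, 0.05]))  # → Invoices are due within 30 days.
```

Because both the embeddings and the similarity search live on the device, a query never has to touch a server.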

Can it analyze images?
Yes. Local vision models using both MLX and GGUF engines, plus cloud vision APIs as opt-in. Analyze photos, screenshots, diagrams, and documents with OCR text extraction, all processed on your device.

Is On Device AI more private than ChatGPT?
Yes. ChatGPT sends your data to OpenAI's servers for processing. On Device AI runs models directly on your hardware. No data collection, no accounts, no analytics, no telemetry. Your conversations never leave your device during local processing. ChatGPT has stronger frontier reasoning capabilities, but if privacy matters to you, local processing is the only real guarantee.

How does On Device AI work on Mac?
On Device AI runs 170+ models on Apple Silicon Macs with both GGUF and MLX engines. Mac-specific features include menu bar access, global hotkeys, IM integration for Discord/Slack/Telegram, and the ability to serve as a remote inference server for your iPhone and iPad.

Does voice transcription identify different speakers?
Yes. Voice transcription includes speaker diarization, which identifies who said what during meetings, interviews, and conversations. It runs on-device using a Whisper-based speech-to-text engine. No audio is sent externally.

Is there text-to-speech?
Yes. The Kokoro TTS engine generates natural-sounding speech offline on your device. You can read aloud PDFs, articles, or AI responses without internet. Multiple voice options are available.

What is the difference between GGUF and MLX models?
GGUF models use llama.cpp and work across a wide range of hardware with broad model compatibility. MLX models are optimized for Apple Silicon and can offer better memory efficiency. On Device AI supports both, so you can pick the right engine for your device and workload.

Can it power chatbots in Discord, Slack, or Telegram?
On Device AI includes IM chatbot integration for Discord, Slack, and Telegram on macOS. Your Mac acts as the inference server, so messages are processed locally on your hardware rather than on a cloud server.

How is On Device AI different from cloud AI, and what does it cost?
Unlike cloud-based AI that sends your prompts to remote servers, On Device AI runs entirely on your hardware. No data collection. No analytics. No backend. No internet required. The core features are free forever. If you need Pro functions, you can choose what works best for you: a monthly or annual subscription, or a one-time purchase to own it outright. Every option is priced lower than a ChatGPT subscription.

Get in touch

Questions, bug reports, feature ideas — all welcome.

[email protected]