Cloud Providers
While On Device AI is designed for 100% offline operation, you can optionally connect to cloud AI providers when you need access to larger models or additional capabilities. Cloud is always opt-in and off by default.
Supported Providers
On Device AI supports the following cloud and local server providers:
- OpenAI: GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo, and more
- Anthropic: Claude 3 Opus, Sonnet, Haiku, and Claude 4 models
- Google Gemini: Gemini Pro, Gemini Flash, and other models
- Mistral: Access Mistral models via OpenAI-compatible chat and models endpoints
- Groq: Ultra-fast inference for Llama, Mixtral, and Gemma models
- xAI: Access to the powerful Grok family of models
- OpenRouter: Access to hundreds of models through a single API
- Nvidia: Access Nvidia's NIM microservices and high-performance models
- AWS Bedrock: Access Amazon Bedrock foundation models via SigV4-authenticated requests
- Z.ai (Zhipu GLM): Access GLM-4, GLM-5 and other Zhipu AI models
- Opencode Zen: Access multiple frontier models (GPT, Claude, Gemini) through a unified API
- Qwen Portal: Access Alibaba's Qwen models via API key or OAuth
- Kimi (Moonshot AI): Access Moonshot's Kimi long-context models
- LM Studio: Connect to locally-hosted models running on your Mac or PC
- Ollama: Connect to locally-hosted Ollama server
Setting Up a Provider
- Open Settings → Cloud Providers
Navigate to the Cloud Providers section in app settings.
- Select a provider
Choose the provider you want to connect to.
- Enter your credentials
For most providers, paste your API key from the provider's dashboard. For AWS Bedrock, enter your Access Key ID and Secret Access Key. For Qwen Portal, you can also use an OAuth refresh token. All credentials are stored securely in your device's Keychain — never in plain text.
- Select a model
Browse available models from the provider and select the one you want to use. For providers without automatic model listing (Bedrock, Kimi), you can enter a model ID manually.
When using cloud providers, your conversation data is transmitted to the provider's servers. The app does not control how providers handle your data. Review each provider's privacy policy before use.
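For most hosted providers, the API key you paste is used as a standard Bearer token on an OpenAI-compatible chat completions endpoint. As a hedged illustration of what such a request looks like under the hood (the base URL, key, and model name below are placeholders, and this is a sketch, not the app's actual implementation):

```python
import json
import urllib.request

def build_chat_request(base_url: str, api_key: str, model: str, prompt: str):
    """Build an OpenAI-compatible chat completions request (constructed only, not sent)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # The app reads the key from the Keychain; "sk-..." is a placeholder.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("https://api.openai.com/v1", "sk-...", "gpt-4o", "Hello")
```

The same shape works for any of the OpenAI-compatible providers above (Mistral, Groq, OpenRouter, and others); only the base URL and model name change.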
AWS Bedrock
AWS Bedrock requires AWS credentials rather than a simple API key:
- Set your AWS Region
Enter the AWS region where Bedrock is enabled (e.g. us-east-1, us-west-2). This is saved in your app configuration, not in the Keychain.
- Enter AWS Credentials
Tap Enter Credentials and provide your AWS Access Key ID and Secret Access Key. An optional session token is supported for temporary credentials (AWS STS).
- Enter a Bedrock Model ID
Use the manual model entry field to type the Bedrock model ID, e.g. anthropic.claude-3-sonnet-20240229-v1:0 or amazon.titan-text-premier-v1:0.
Requests to Bedrock are authenticated using AWS Signature Version 4 (SigV4) — signed directly on your device. No credentials are ever transmitted to any proxy server.
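The core of SigV4 is a chained HMAC key derivation over your secret key, the date, the region, and the service, followed by a final HMAC over a canonical "string to sign". A simplified sketch of that derivation step (the full protocol also canonicalizes the request per the AWS specification; this is illustrative, not the app's code):

```python
import hashlib
import hmac

def sigv4_signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key via the HMAC chain: date -> region -> service."""
    def _hmac(key: bytes, msg: str) -> bytes:
        return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

    k_date = _hmac(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, "aws4_request")

def sigv4_signature(signing_key: bytes, string_to_sign: str) -> str:
    """Final hex signature that goes into the request's Authorization header."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()
```

Because the derivation uses only your secret key and public request metadata, it can run entirely on-device, which is why no proxy is needed.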
Qwen Portal
Qwen Portal supports two authentication modes:
- API Key: Use a standard API key sent in the Authorization: Bearer header.
- OAuth Refresh Token: Provide an OAuth refresh token. The app automatically exchanges it for a short-lived access token before each request, without requiring you to manually refresh.
Select your preferred authentication mode in Settings → Cloud Providers → Qwen Portal before entering your credentials.
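The refresh-token mode follows the standard OAuth 2.0 refresh_token grant: the stored refresh token is posted to a token endpoint and a short-lived access token comes back. A hedged sketch of that exchange request (the token URL below is a placeholder, not Qwen Portal's documented endpoint):

```python
import urllib.parse
import urllib.request

def build_refresh_request(token_url: str, refresh_token: str) -> urllib.request.Request:
    """Build a standard OAuth 2.0 refresh_token grant request (constructed only, not sent)."""
    body = urllib.parse.urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
    }).encode("utf-8")
    return urllib.request.Request(
        token_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

# Placeholder endpoint and token for illustration only.
req = build_refresh_request("https://example.com/oauth/token", "rt-123")
```

The access token returned by the real endpoint is then used as the Bearer token on subsequent chat requests.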
Switching Between Local & Cloud
You can switch between local and cloud models at any time, even within the same conversation:
- Open the model picker
- Cloud models appear alongside local models, clearly labeled with the provider name
- Select any model to switch — the conversation context is maintained
When you switch from a local model to a cloud model mid-conversation, your conversation history is sent to the cloud provider. Consider starting a new conversation if you have sensitive content.
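Because the context is maintained, switching models effectively replays the same message history against the newly selected backend. A simplified sketch of why the warning above matters (model names are illustrative):

```python
def payload_for(model: str, history: list) -> dict:
    """Same conversation history, different target model/provider."""
    return {"model": model, "messages": history}

history = [
    {"role": "user", "content": "Summarize this note."},
    {"role": "assistant", "content": "Here is a summary."},
]

local = payload_for("llama-3.2-3b", history)  # stays on device
cloud = payload_for("gpt-4o", history)        # entire history leaves the device
```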
Privacy Considerations
On Device AI is designed with privacy first:
- Cloud is always opt-in: No data is ever sent to any server unless you explicitly configure and select a cloud provider
- API keys stored in Keychain: Your credentials are stored using Apple's secure Keychain, not in UserDefaults or plain text
- Direct connection: Data goes directly from your device to the provider — we don't proxy or store anything
- Clear indicators: The UI clearly shows when you're using a cloud model vs. a local one
Local Servers (Ollama & LM Studio)
Ollama and LM Studio are special cases — they run AI models on your own hardware (Mac, PC, or server) rather than in the cloud. This gives you the power of larger models while maintaining privacy:
- Ollama: Set the server URL (default: http://localhost:11434) in Settings
- LM Studio: Set the server URL (default: http://localhost:1234) in Settings
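As a concrete illustration, a chat request to a local Ollama server goes to its /api/chat endpoint on port 11434; the app handles this for you, and the model name below is illustrative:

```python
import json
import urllib.request

def ollama_chat_request(prompt: str, model: str = "llama3.2",
                        base_url: str = "http://localhost:11434"):
    """Build a request against Ollama's /api/chat endpoint (constructed only, not sent)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for a single JSON response instead of a stream
    }
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = ollama_chat_request("Hello")
```

LM Studio instead exposes an OpenAI-compatible endpoint on port 1234, so requests to it look like the Bearer-token example shown earlier, just pointed at http://localhost:1234/v1.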
Your data stays on your local network when using these providers. This is a great option for running larger models on a powerful Mac while chatting from your iPhone.
On Device AI can also serve as a remote inference server itself (macOS). Connect your iPhone to your Mac running On Device AI for the best of both worlds.