Cloud Providers
While On Device AI is designed to work completely offline, sometimes you just need a bigger brain. When you're tackling massive coding projects or need extreme reasoning, you can easily plug into cloud AI platforms. The cloud is always opt-in and turned off by default.
Supported Providers
On Device AI supports connecting to the following platforms:
- OpenAI: GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo
- Anthropic: Claude 3.5 Sonnet, Opus, Haiku
- Google Gemini: Gemini 1.5 Pro, Flash
- Mistral: Large, NeMo, Mathstral
- Groq: Ultra-fast inference
- xAI: Grok family
- Nvidia, OpenRouter, AWS Bedrock, Z.ai, Opencode Zen, Qwen Portal, Kimi, Hugging Face
- Cloudflare AI Gateway
- Microsoft Foundry
- GitHub Models
- Local Servers: LM Studio, Ollama
Setting Up (The Folder View)
Configuring a dozen different API settings can quickly turn into a mess. That's why your providers are organized neatly into three simple folders in the settings:
- Open Settings → Cloud Providers
- Browse the Folders
You'll see three categories: Connected, Local Engines, and Cloud Platforms.
- Pick Your Flavor
Open the Cloud Platforms folder and choose a service (like OpenAI or Anthropic). Paste your API key straight from their dashboard.
- Select a Model
The app instantly talks to the provider and lists the available models right there. Choose your favorite, and the provider automatically jumps up into your Connected folder so it's always easy to find.
When using cloud providers, your conversation data leaves your device and goes to their servers. We don't sit in the middle or log your chats, but you should still review each provider's privacy policy before pasting your key.
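Under the hood, the model-listing step above typically hits an OpenAI-compatible `GET /v1/models` endpoint, which most of these providers expose. A minimal sketch of building that request (the function name and placeholder key are illustrative, not the app's actual code):

```python
import urllib.request

def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build the GET request an OpenAI-compatible client uses to list models."""
    return urllib.request.Request(
        f"{base_url}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )

# The provider answers with JSON like {"data": [{"id": "gpt-4o"}, ...]},
# which is what populates the model picker.
req = build_models_request("https://api.openai.com", "sk-...")
```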
AWS Bedrock
AWS Bedrock requires AWS credentials rather than a standard API key. It's built for enterprise security.
- Set your AWS Region
Enter the region where Bedrock is active (e.g. us-east-1).
- Enter AWS Credentials
Tap Enter Credentials. Provide your AWS Access Key ID, Secret Access Key, and an optional session token if you're using temporary credentials.
- Enter a Bedrock Model ID
Since AWS doesn't auto-list models through an easy endpoint, you'll need to manually type the Bedrock model ID (like anthropic.claude-3-sonnet-20240229-v1:0).
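For reference, here's a rough sketch of the request body an Anthropic model on Bedrock expects (the helper name is illustrative; note the model ID is passed in the API call itself, not in the body):

```python
import json

def build_bedrock_body(prompt: str, max_tokens: int = 512) -> str:
    """Anthropic-on-Bedrock message body. With boto3 this would be sent via
    boto3.client("bedrock-runtime", region_name="us-east-1").invoke_model(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0", body=...)."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })
```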
Qwen Portal
Alibaba's Qwen Portal is flexible and supports two ways to plug in:
- API Key: The standard method if you have a traditional key.
- OAuth Refresh Token: A much smoother option. Drop in your refresh token, and the app seamlessly handles getting fresh access tokens for you behind the scenes before every chat.
Select your preferred authentication mode in Settings → Cloud Providers → Qwen Portal before entering your credentials.
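The refresh-token exchange is standard OAuth 2.0: before each chat, the app trades the long-lived refresh token for a short-lived access token. A sketch of the payload such an exchange sends (Qwen's actual token endpoint and client_id are provider-specific and not shown here):

```python
def build_refresh_payload(refresh_token: str, client_id: str) -> dict:
    """Standard OAuth 2.0 refresh-token grant (RFC 6749 §6).
    Posted as form data to the provider's token endpoint; the response
    contains a fresh access_token used for the next chat request."""
    return {
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
    }
```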
Switching Between Local & Cloud
You can switch between local and cloud models at any time, even within the same conversation:
- Open the model picker
- Cloud models appear alongside local models, clearly labeled with the provider name
- Select any model to switch — the conversation context is maintained
When you switch from a local model to a cloud model mid-conversation, your conversation history is sent to the cloud provider. Consider starting a new conversation if you have sensitive content.
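Conceptually, switching providers mid-conversation just means sending the same message history with a different model identifier, which is also why that history travels to the cloud when you switch. A minimal sketch using the common chat-completions schema (model IDs are illustrative):

```python
def build_chat_payload(model: str, messages: list) -> dict:
    """Same conversation history, different model id."""
    return {"model": model, "messages": messages}

history = [
    {"role": "user", "content": "Summarize this file."},
    {"role": "assistant", "content": "It configures the build."},
]
local = build_chat_payload("llama3.1:8b", history)  # local model
cloud = build_chat_payload("gpt-4o", history)       # cloud model: same history is sent
```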
Privacy Considerations
On Device AI is designed with privacy first:
- Cloud is always opt-in: No data is ever sent to any server unless you explicitly configure and select a cloud provider
- API keys stored in Keychain: Your credentials are stored using Apple's secure Keychain, not in UserDefaults or plain text
- Direct connection: Data goes directly from your device to the provider — we don't proxy or store anything
- Clear indicators: The UI clearly shows when you're using a cloud model vs. a local one
Local Servers (Ollama & LM Studio)
Ollama and LM Studio are special cases — they run AI models on your own hardware (Mac, PC, or server) rather than in the cloud. This gives you the power of larger models while maintaining privacy:
- Ollama: Set the server URL (default: http://localhost:11434) in Settings
- LM Studio: Set the server URL (default: http://localhost:1234) in Settings
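The two servers speak slightly different dialects: Ollama has its own native API, while LM Studio exposes an OpenAI-compatible server. A small sketch of the chat endpoints each one serves (helper names are illustrative):

```python
def ollama_chat_url(base: str = "http://localhost:11434") -> str:
    # Ollama's native chat endpoint
    return f"{base}/api/chat"

def lmstudio_chat_url(base: str = "http://localhost:1234") -> str:
    # LM Studio serves the OpenAI-compatible chat-completions route
    return f"{base}/v1/chat/completions"
```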
Your data stays on your local network when using these providers. This is a great option for running larger models on a powerful Mac while chatting from your iPhone.
On Device AI can also serve as a remote inference server itself (macOS). Connect your iPhone to your Mac running On Device AI for the best of both worlds.