The privacy problem with team chat AI
Most AI chatbots for Discord, Slack, and Telegram work by sending every message to a cloud service. That means your team conversations, internal discussions, and direct messages pass through third-party AI providers where they may be logged, analyzed, or used for model training.
For teams handling sensitive information—legal discussions, medical data, financial planning, proprietary code—this creates a real problem. You want the productivity benefits of an AI assistant, but you cannot accept the privacy tradeoff.
Self-hosted alternatives exist, but they usually require running dedicated servers, managing infrastructure, and dealing with deployment complexity. On Device AI takes a different approach: the AI runs directly on your Mac, processing messages locally without requiring server setup or cloud dependencies.
What IM Communication actually does
IM Communication is a macOS-only feature that connects On Device AI to Discord, Slack, or Telegram using standard bot APIs. You create a bot on your chosen platform, paste the authentication token into On Device AI's settings, and toggle the connection on.
Once connected, the bot listens for messages in channels and direct messages. When someone sends a message, On Device AI processes it using the local model you have loaded—whether that is a downloaded GGUF or MLX model running entirely on your Mac, or a cloud API model using your own credentials.
The feature is macOS-only because it relies on background processing and system-level integration that are only practical on the Mac. iPhone and iPad users can still use On Device AI's main conversation features, but IM Communication requires a Mac.
How the Autopilot works
The Autopilot system processes messages one at a time through a sequential queue. You can configure the queue size from 5 to 50 messages in the settings. When the queue fills up, the system sends a polite busy-reply and waits 30 seconds before accepting new overflow messages.
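The queue behavior above can be sketched roughly as follows. This is an illustrative model, not On Device AI's actual code: the class and method names are invented, and the exact overflow semantics (one busy-reply, then a 30-second cooldown during which overflow is ignored) are an assumption based on the description.

```python
import time
from collections import deque

class AutopilotQueue:
    """Hypothetical sketch of a bounded, sequential message queue
    with a busy-reply cooldown. Names are illustrative only."""

    def __init__(self, max_size=20, busy_cooldown=30.0, clock=time.monotonic):
        if not 5 <= max_size <= 50:
            # Mirrors the configurable range from the settings panel.
            raise ValueError("queue size is configurable from 5 to 50")
        self.queue = deque()
        self.max_size = max_size
        self.busy_cooldown = busy_cooldown  # seconds between busy-replies
        self.clock = clock                  # injectable for testing
        self._busy_until = 0.0

    def enqueue(self, message):
        """Return 'queued', 'busy' (caller sends the polite busy-reply),
        or 'dropped' (overflow during the cooldown window)."""
        if len(self.queue) < self.max_size:
            self.queue.append(message)
            return "queued"
        now = self.clock()
        if now >= self._busy_until:
            self._busy_until = now + self.busy_cooldown
            return "busy"
        return "dropped"

    def next_message(self):
        """Messages are processed one at a time, in arrival order."""
        return self.queue.popleft() if self.queue else None
```

The injectable clock makes the 30-second cooldown testable without real waiting; smaller queue sizes trade buffering for faster busy-replies, as the configuration section below discusses.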
Each Discord channel, Slack conversation, or Telegram chat maintains its own context in memory. The bot remembers prior messages from that specific conversation, so follow-up questions work naturally. Different conversations do not share context, which prevents information leakage between unrelated discussions.
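The per-conversation isolation described above amounts to keying context by conversation identity. A minimal sketch, assuming a simple (provider, conversation ID) key; the actual data structures in the app are unknown:

```python
from collections import defaultdict

class ConversationContexts:
    """Illustrative sketch: each (provider, conversation_id) pair keeps
    an isolated history, so unrelated chats never share context."""

    def __init__(self):
        self._contexts = defaultdict(list)  # key -> list of (role, text)

    def record(self, provider, conversation_id, role, text):
        self._contexts[(provider, conversation_id)].append((role, text))

    def history(self, provider, conversation_id):
        # Only this conversation's messages are ever handed to the model.
        return list(self._contexts[(provider, conversation_id)])
```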
The system ignores messages sent by the bot itself, preventing infinite loops. It also skips command messages (more on those below) so they do not pollute the conversation context.
For Telegram direct messages specifically, the bot can stream partial responses as drafts while generating the answer. This gives users real-time feedback during longer responses. The feature is limited to Telegram DMs to avoid rate-limiting issues in group chats.
You can optionally enable history persistence, which rebuilds conversation context from saved messages when you restart the app. This is useful if you want the bot to remember prior discussions across sessions. History is capped at 500 messages per conversation to prevent unbounded storage growth.
Control the assistant with chat commands
In direct messages, you can control the AI using commands that start with an exclamation mark. Type !help to see the full list. Commands let you switch models, enable or disable tools, list available models, and reset the conversation—all without leaving the chat interface.
For example, !model list shows all downloaded local models and configured cloud providers. !model switch gpt-4o switches to GPT-4o if you have OpenAI credentials configured. !new archives the current conversation and starts fresh.
Commands only work in direct messages, not in group channels. This prevents accidental command execution when someone types an exclamation-prefixed message in a public discussion. Command messages and their responses are excluded from conversation history and transcripts, keeping the context clean.
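The DM-only dispatch can be sketched as a small router. The command names mirror the ones in the article (!help, !model list, !model switch, !new); the return values and routing logic are illustrative assumptions, not the app's actual implementation:

```python
def handle_message(text, is_direct_message):
    """Hedged sketch of DM-only command dispatch. Messages routed as
    ('chat', ...) go to the model and into history; ('command', ...)
    results are excluded from context, as the article describes."""
    if not (is_direct_message and text.startswith("!")):
        return ("chat", text)            # normal message: goes to the model
    parts = text[1:].split()
    if not parts:
        return ("command", "unknown")    # a bare "!" is not a command
    cmd, args = parts[0], parts[1:]
    if cmd == "help":
        return ("command", "help")
    if cmd == "model" and args[:1] == ["list"]:
        return ("command", "model-list")
    if cmd == "model" and args[:1] == ["switch"] and len(args) == 2:
        return ("command", f"switch:{args[1]}")
    if cmd == "new":
        return ("command", "new-conversation")
    return ("command", "unknown")
```

Note how an exclamation-prefixed message in a group channel falls through to the normal chat path, which is exactly the accidental-execution guard described above.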
Why on-device processing changes everything
When you use a local model, messages never leave your Mac for AI processing. The inference happens entirely on your hardware using the same on-device models you use in the main On Device AI app. Bot authentication tokens are stored in the macOS system Keychain, not in plain text configuration files.
This means you can run an AI assistant in your team chat without sending every message to OpenAI, Anthropic, or another cloud provider. For teams working with confidential information, this is the difference between using AI and not using it at all.
If you do choose to use a cloud API model (by configuring your own API keys), the messages go directly from your Mac to that provider using your credentials. You control which provider sees what, and you are not locked into a specific AI service.
Because the bot runs on your Mac, it works offline when using local models. No internet connection is required for the AI inference itself—only for the bot to connect to Discord, Slack, or Telegram.
Supported providers and setup
On Device AI supports three messaging platforms: Discord, Slack, and Telegram. Each uses standard bot APIs that you configure through the platform's developer portal.
For Discord, you create a bot application, enable the necessary Gateway intents, and copy the bot token. The connector uses Discord's Gateway WebSocket API to receive messages in real time.
For Slack, you create a Slack app, enable Socket Mode, and configure the required OAuth scopes. The connector uses Slack's Socket Mode and Web API to handle messages and send replies.
For Telegram, you create a bot through BotFather and copy the HTTP API token. The connector uses long polling to receive updates from the Telegram Bot API.
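Long polling works by repeatedly asking the Telegram Bot API for updates after the last update ID you have seen, so each update is delivered exactly once. A minimal sketch of that loop, with the real getUpdates HTTP call replaced by an injected `fetch_updates(offset)` function so the logic is self-contained:

```python
def run_long_poll(fetch_updates, handle, batches=3):
    """Sketch of Telegram-style long polling. `fetch_updates(offset)`
    stands in for the real getUpdates call, which blocks until updates
    arrive or a timeout elapses. `batches` bounds the loop for testing;
    a real connector would loop until disconnected."""
    offset = 0
    for _ in range(batches):
        updates = fetch_updates(offset)
        for update in updates:
            handle(update["text"])
            # Acknowledge by advancing past the highest processed ID.
            offset = update["update_id"] + 1
    return offset
```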
All credentials are stored securely in the macOS Keychain using the same encryption and security mechanisms as your saved passwords. The app never stores tokens in plain text or in user-accessible configuration files.
View IM conversations inside On Device AI
IM conversations appear as first-class items in the On Device AI conversation list on macOS. When you select an IM conversation, the main chat view displays the full transcript with provider badges, timestamps, and direction indicators showing whether each message was inbound or outbound.
The UI visually distinguishes IM messages from local chat messages, so you always know which conversation type you are viewing. You can browse IM transcripts even when the Autopilot is disabled, which is useful for reviewing past interactions without the bot actively responding.
IM history is persisted to the local database with a 500-message cap per conversation. Older messages are automatically trimmed to prevent unbounded storage growth. The in-memory UI display shows up to 100 recent messages and resets when you restart the app, but the full persisted history remains available.
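The 500-message cap amounts to keeping only the newest messages per conversation. A trivial sketch of that trimming policy (the app's actual trimming presumably happens at the database layer; this just shows the invariant):

```python
HISTORY_CAP = 500  # per-conversation cap described above

def trim_history(messages, cap=HISTORY_CAP):
    """Keep only the newest `cap` messages from an oldest-first list.
    Illustrative sketch, not the app's storage code."""
    return messages[-cap:] if len(messages) > cap else messages
```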
Configuration options
The IM Integration settings panel gives you control over how the bot behaves. You can toggle IM connectivity globally, enable or disable the Autopilot, and configure individual providers independently.
The queue size setting lets you choose how many messages can wait for processing before the system sends busy-replies. Smaller queues (5-10) respond faster to overload but may send more busy-replies. Larger queues (30-50) buffer more messages but may delay responses during high traffic.
You can customize the system prompt that the bot uses when generating replies. This lets you define the assistant's personality, expertise, and response style specifically for IM interactions. The prompt is stored in a JSON configuration file and persists across app restarts.
IM-specific reasoning controls let you enable or disable the model's thinking process for IM replies. If enabled, you can choose whether to include the thinking content in the outbound message or strip it before sending. This is useful for models that support chain-of-thought reasoning but where you only want to send the final answer to the chat.
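The strip-before-sending option can be sketched as a simple filter. This assumes the model wraps its chain-of-thought in `<think>...</think>` tags, which is a common convention for reasoning models but not necessarily On Device AI's exact format:

```python
import re

# Assumption: reasoning is delimited by <think> tags in the raw output.
_THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def prepare_im_reply(raw, include_thinking=False):
    """Sketch of the IM reasoning toggle: either forward the model's
    full output, or strip the thinking block and send only the answer."""
    if include_thinking:
        return raw.strip()
    return _THINK_RE.sub("", raw).strip()
```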
The history reuse toggle controls whether the app rebuilds conversation context from persisted history when you restart. When enabled, the bot remembers prior discussions. When disabled, each app launch starts with fresh context.
Privacy and security considerations
Messages are transmitted between your Mac and the messaging platform using the platform's standard bot APIs. This is the same data flow as any other bot integration—Discord, Slack, and Telegram receive the messages as part of normal platform operation.
The AI processing happens on your Mac when using local models. If you configure a cloud API model, messages are sent to that provider using your API credentials. You control which provider processes which conversations.
Bot tokens are stored in the macOS Keychain, which uses hardware-backed encryption on Macs with a Secure Enclave. Conversation context may be stored in memory during active sessions or persisted to the local database based on your settings. No conversation data is sent to On Device AI's servers—the app does not have a backend service.
Messages sent through Discord, Slack, or Telegram are subject to those platforms' privacy policies. The platforms can see message content as part of their normal operation. On Device AI does not add additional third-party data sharing beyond what the messaging platform itself requires.
Who this is for
This feature is designed for small teams who want a private AI assistant in their existing chat infrastructure without running dedicated servers or sending messages to cloud AI providers.
Developers can use it to add a coding assistant to their Discord or Slack workspace that answers questions about documentation, explains code snippets, or helps debug issues—all while keeping proprietary code discussions private.
Privacy-conscious users who already rely on On Device AI for local AI processing can now extend that same privacy model to their team chat. If you trust On Device AI with your documents and conversations, you can trust it with your IM messages using the same on-device processing.
Anyone who wants AI benefits in their chat but cannot accept the privacy tradeoff of cloud-based chatbots will find this useful. The feature gives you the option to run AI locally while still participating in team discussions on popular messaging platforms.