Tool Calling

On Device AI lets compatible models call built-in tools during a conversation — searching the web, fetching pages, making HTTP requests, running calculations, and more. The AI decides when to use a tool automatically; you control which tools are available and their default parameters.

How It Works

When tool calling is enabled, the app injects tool definitions into the model's context. The model can then respond with a tool call instead of (or before) a text answer. The app executes the tool, feeds the result back to the model, and the model continues generating its final response.

  1. User sends a message
  2. Model decides a tool is needed and emits a tool call
  3. App executes the tool (network, calculation, memory lookup, etc.)
  4. Tool result is injected back into the conversation
  5. Model generates its final answer using the result

Multiple tool calls can happen in a single turn, and the loop can repeat up to 8 times before the model is forced to produce a final answer.
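The loop above can be sketched in a few lines. This is an illustrative sketch, not the app's actual internals: the message shapes and the `run_model` / `execute_tool` callables are assumptions made for the example.

```python
# Sketch of the tool-calling loop described above. The message/reply shapes
# and callable names are illustrative assumptions, not the app's real API.
MAX_ROUNDS = 8  # the app forces a final answer after 8 tool rounds

def run_turn(messages, run_model, execute_tool):
    for _ in range(MAX_ROUNDS):
        reply = run_model(messages)
        if reply["type"] == "text":          # model answered directly
            return reply["text"]
        # model emitted a tool call: execute it and feed the result back
        result = execute_tool(reply["name"], reply["arguments"])
        messages.append({"role": "tool", "name": reply["name"], "content": result})
    # round limit reached: force a final answer from accumulated results
    reply = run_model(messages + [{"role": "system",
                                   "content": "Answer now using the tool results above."}])
    return reply["text"]
```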

ℹ️ Loop Reliability

The app includes several safeguards to keep the tool loop healthy:

  - Duplicate calls within the same response are suppressed.
  - If the model gets stuck repeating identical calls, a forced final-answer pass synthesises a response from the accumulated results.
  - If all tool calls in a round fail, the model is instructed to explain the errors rather than silently stopping.
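Two of these safeguards can be sketched briefly. The call representation (a dict with `name` and `arguments`) is an assumption for illustration:

```python
import json

# Illustrative sketch of two safeguards: suppressing duplicate calls within
# one response, and detecting a model stuck repeating the same call.
def call_key(call):
    """Stable identity for a tool call: name plus canonicalised arguments."""
    return (call["name"], json.dumps(call["arguments"], sort_keys=True))

def dedupe_calls(calls):
    """Drop tool calls that repeat an identical (name, arguments) pair."""
    seen, unique = set(), []
    for call in calls:
        key = call_key(call)
        if key not in seen:
            seen.add(key)
            unique.append(call)
    return unique

def is_stuck(history, call, window=2):
    """True if the same call was already made in the last `window` rounds."""
    return call_key(call) in history[-window:]
```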

ℹ️ Model Compatibility

Tool calling works best with models that have been fine-tuned for function calling (e.g. Llama 3, Qwen 2.5, Mistral, and most instruction-tuned models). GGUF models use text-based tool call detection; MLX and API models use native structured tool call APIs.
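Text-based detection generally means scanning the model's plain-text output for a tool-call object. A hedged sketch, assuming a `{"name": ..., "arguments": ...}` JSON convention (the app's exact detection format is not documented here):

```python
import json
import re

# Hedged sketch of text-based tool call detection for models without native
# structured tool APIs. The JSON shape assumed here is one common convention.
TOOL_CALL_RE = re.compile(r"\{.*\}", re.DOTALL)

def detect_tool_call(text):
    """Return (name, arguments) if the text contains a tool call, else None."""
    match = TOOL_CALL_RE.search(text)
    if not match:
        return None
    try:
        obj = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if isinstance(obj, dict) and "name" in obj and "arguments" in obj:
        return obj["name"], obj["arguments"]
    return None
```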

Enabling Tools

Go to Settings → Tool Calling to see all available tools. Each tool card shows:

You can also toggle tool calling on or off per-conversation from the model picker in the chat header.

Role & Agent Tools

You can configure a specific set of default tools for an AI Role or a Chat Flow Participant. This is useful for creating specialized agents that only have access to relevant tools (e.g., a "Researcher" role that only has Web Search, or a "Math Tutor" that only has Calculator).

Effective Tools: A tool is only available to a role/agent if it is also enabled in Global Settings and you have the required access (e.g. Pro subscription).

Default Parameters

Some tools expose optional parameters that the model may omit. You can set app-wide defaults for these in the Default Params section of each tool card in Settings → Tool Calling.

For tools with a Local / API split, you can set separate defaults depending on whether you are using a local model or a cloud API provider — useful when you want a shorter timeout for fast cloud calls and a longer one for local inference.

Defaults are persisted across app restarts. Clear a field to revert to the built-in default.
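The precedence implied here, where a model-provided value beats a saved default, which beats the built-in default, can be sketched as a dictionary merge. The key names and function signature are illustrative assumptions:

```python
# Sketch of filling model-omitted parameters from saved defaults, with a
# separate default set for local vs. API models. Names are illustrative.
BUILT_IN = {"timeout": 30}  # the built-in default used when a field is cleared

def effective_params(model_args, user_defaults, is_local):
    """Later sources win: built-in < saved default (local/api) < model-provided."""
    saved = user_defaults.get("local" if is_local else "api", {})
    return {**BUILT_IN, **saved, **model_args}
```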

Tool Order

The order of tools in the Settings list is the order they are presented to the model. Drag and drop tool cards to reorder them. The same order is reflected in the per-session tool selector in the chat header.

Tools Reference

Web Search

Tool name: web_search

Searches the web for a query, fetches the top result pages, and uses a sub-agent pass to synthesise a structured answer with source citations.

Parameters

All parameters except query are configurable in Settings → Tool Calling → Web Search → Default Params.

Web Fetch

Tool name: web_fetch

Fetches the content of a single URL and returns it as markdown text. Useful when the model already knows the URL it needs to read.
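The core of such a tool is reducing fetched HTML to readable text. A minimal illustrative extractor using only the standard library; a real markdown converter, as the app uses, would also preserve headings, links, and lists:

```python
from html.parser import HTMLParser

# Illustrative sketch of the HTML-to-text step of a fetch tool. The app
# returns markdown; this minimal version extracts visible text only.
class TextExtractor(HTMLParser):
    SKIP = {"script", "style"}  # tags whose contents are never visible text

    def __init__(self):
        super().__init__()
        self.parts, self.skipping = [], 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skipping += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skipping -= 1

    def handle_data(self, data):
        if not self.skipping and data.strip():
            self.parts.append(data.strip())

def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    return "\n".join(parser.parts)
```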

Parameters

Calculator

Tool name: calculator

Evaluates mathematical expressions and returns the numeric result. Handles arithmetic, percentages, and basic algebra. No configurable defaults.
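A common way to evaluate expressions safely, without `eval`, is to walk a parsed AST against a whitelist of operators. A sketch of that approach; the app's actual calculator implementation (including its percentage and algebra handling) is not documented here:

```python
import ast
import operator

# Hedged sketch of a safe arithmetic evaluator: only whitelisted operators
# on numeric literals are allowed, so arbitrary code can never run.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv,
       ast.Pow: operator.pow, ast.USub: operator.neg}

def calculate(expression):
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval"))
```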

Parameters

Global Memory

Tool name: global_memory

Stores and retrieves personal preferences, facts, and notes that persist across all conversations. The model can write new memories or read existing ones to personalize its responses over time.
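A minimal sketch of what a persistent memory store could look like; the file path, schema, and class name are assumptions for illustration, not the app's actual storage format:

```python
import json
from pathlib import Path

# Illustrative sketch of a persistent key-value memory store. The app's real
# schema and file location are not documented here.
class GlobalMemory:
    def __init__(self, path):
        self.path = Path(path)
        self.items = json.loads(self.path.read_text()) if self.path.exists() else {}

    def write(self, key, value):
        self.items[key] = value
        self.path.write_text(json.dumps(self.items))  # persists across sessions

    def read(self, key):
        return self.items.get(key)
```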

Parameters

💡 Tip

Tell the model to remember something — for example, "Remember that I prefer concise answers" — and it will store that preference in Global Memory for future conversations.

Task Planner (PRO)

Tool name: task_planner

Creates and updates a task plan so the model can break complex requests into ordered steps and track progress. Plans are scoped to each conversation — starting a new conversation begins with an empty plan.

ℹ️ Conversation Scope

Each conversation has its own task plan. Plans are not shared across different conversations and are not persisted across app restarts. This ensures a clean slate for each new conversation.

Operations

Parameters

When enabled, Task Planner injects a compact plan snapshot into the prompt context to keep the model aligned with current progress. Task status is simple: pending or done.
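The behaviour described above, a per-conversation plan with pending/done statuses and a compact snapshot for the prompt context, can be sketched as:

```python
# Sketch of a per-conversation task plan. Statuses are just "pending" or
# "done", and snapshot() produces the compact view injected into the prompt.
class TaskPlan:
    def __init__(self):
        self.steps = []  # list of [description, status] pairs

    def add(self, description):
        self.steps.append([description, "pending"])

    def complete(self, index):
        self.steps[index][1] = "done"

    def snapshot(self):
        """Compact one-line-per-step view for the prompt context."""
        return "\n".join(f"[{'x' if status == 'done' else ' '}] {desc}"
                         for desc, status in self.steps)
```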

HTTP Request

Tool name: http_request

Makes arbitrary HTTP requests to any endpoint. The model can call REST APIs, send data to webhooks, fetch JSON from external services, or download files — all from within the conversation.

⚠️ Privacy Notice

The HTTP Request tool can transmit conversation data to external services. Only enable it when you trust the model and the endpoints it may call. Review requests in the tool call log before sensitive data leaves your device.

Parameters

Response Format

The tool always returns a structured JSON object:

Redirect Handling

Redirects are followed automatically up to 5 hops. 301/302 responses to POST requests and all 303 responses are converted to GET on redirect (standard browser behaviour). If the redirect limit is exceeded, the tool returns a structured error.
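The method-conversion rules can be captured in a small helper; `next_request` is a hypothetical name used for illustration:

```python
# Sketch of the redirect rules described above: 303 always converts to GET,
# 301/302 convert POST to GET (browser behaviour), other statuses such as
# 307/308 preserve the original method.
MAX_REDIRECTS = 5  # hops allowed before the tool returns a structured error

def next_request(method, status):
    """Return the HTTP method to use when following a redirect."""
    if status == 303:
        return "GET"
    if status in (301, 302) and method == "POST":
        return "GET"
    return method
```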

Download Mode

When download: true, the response bytes are saved to the app's private storage directory (OnDevice-AI/HTTPDownloads/). Each download is assigned a UUID and indexed in a persistent JSON file so the model can reference it in follow-up turns. If a response body cannot be decoded as UTF-8 text, the tool automatically suggests retrying with download: true.

Configurable Default Timeout

Go to Settings → Tool Calling → HTTP Request → Default Params to set separate timeout defaults for local models and API providers. The built-in default is 30 seconds for both.