Tool Calling

On Device AI lets compatible models call built-in tools during a conversation — searching the web, fetching pages, making HTTP requests, running calculations, and more. The AI decides when to use a tool automatically; you control which tools are available and their default parameters.

How It Works

When tool calling is enabled, the app injects tool definitions into the model's context. The model can then respond with a tool call instead of (or before) a text answer. The app executes the tool, feeds the result back to the model, and the model continues generating its final response.

  1. User sends a message
  2. Model decides a tool is needed and emits a tool call
  3. App executes the tool (network, calculation, memory lookup, etc.)
  4. Tool result is injected back into the conversation
  5. Model generates its final answer using the result

Multiple tool calls can happen in a single turn, and the loop can repeat up to 8 times before the model is forced to produce a final answer.
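The loop above can be sketched in a few lines. This is an illustrative sketch, not the app's actual internals: the message shapes and the `run_model` / `execute_tool` callables are assumptions made for the example.

```python
# Sketch of the tool-calling loop described above. The message/reply shapes
# and callable names are illustrative assumptions, not the app's real API.
MAX_ROUNDS = 8  # the app forces a final answer after 8 tool rounds

def run_turn(messages, run_model, execute_tool):
    for _ in range(MAX_ROUNDS):
        reply = run_model(messages)
        if reply["type"] == "text":          # model answered directly
            return reply["text"]
        # model emitted a tool call: execute it and feed the result back
        result = execute_tool(reply["name"], reply["arguments"])
        messages.append({"role": "tool", "name": reply["name"], "content": result})
    # round limit reached: force a final answer from accumulated results
    reply = run_model(messages + [{"role": "system",
                                   "content": "Answer now using the tool results above."}])
    return reply["text"]
```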

ℹ️ Loop Reliability

The app includes several safeguards to keep the tool loop healthy:

  - Duplicate calls within the same response are suppressed.
  - If the model gets stuck repeating identical calls, a forced final-answer pass synthesises a response from the accumulated results.
  - If all tool calls in a round fail, the model is instructed to explain the errors rather than silently stopping.
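Two of these safeguards can be sketched briefly. The call representation (a dict with `name` and `arguments`) is an assumption for illustration:

```python
import json

# Illustrative sketch of two safeguards: suppressing duplicate calls within
# one response, and detecting a model stuck repeating the same call.
def call_key(call):
    """Stable identity for a tool call: name plus canonicalised arguments."""
    return (call["name"], json.dumps(call["arguments"], sort_keys=True))

def dedupe_calls(calls):
    """Drop tool calls that repeat an identical (name, arguments) pair."""
    seen, unique = set(), []
    for call in calls:
        key = call_key(call)
        if key not in seen:
            seen.add(key)
            unique.append(call)
    return unique

def is_stuck(history, call, window=2):
    """True if the same call was already made in the last `window` rounds."""
    return call_key(call) in history[-window:]
```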

ℹ️ Model Compatibility

Tool calling works best with models that have been fine-tuned for function calling (e.g. Llama 3, Qwen 2.5, Mistral, and most instruction-tuned models). GGUF models use text-based tool call detection; MLX and API models use native structured tool call APIs.
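Text-based detection generally means scanning the model's plain-text output for a tool-call object. A hedged sketch, assuming a `{"name": ..., "arguments": ...}` JSON convention (the app's exact detection format is not documented here):

```python
import json
import re

# Hedged sketch of text-based tool call detection for models without native
# structured tool APIs. The JSON shape assumed here is one common convention.
TOOL_CALL_RE = re.compile(r"\{.*\}", re.DOTALL)

def detect_tool_call(text):
    """Return (name, arguments) if the text contains a tool call, else None."""
    match = TOOL_CALL_RE.search(text)
    if not match:
        return None
    try:
        obj = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if isinstance(obj, dict) and "name" in obj and "arguments" in obj:
        return obj["name"], obj["arguments"]
    return None
```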

Enabling Tools

Go to Settings → Tool Calling to see all available tools. Each tool card shows:

You can also toggle tool calling on or off per-conversation from the model picker in the chat header.

Role & Agent Tools

You can configure a specific set of default tools for an AI Role or a Chat Flow Participant. This is useful for creating specialized agents that only have access to relevant tools (e.g., a "Researcher" role that only has Web Search, or a "Math Tutor" that only has Calculator).

Effective Tools: A tool is only available to a role/agent if it is also enabled in Global Settings and you have the required access (e.g. Pro subscription).

Default Parameters

Some tools expose optional parameters that the model may omit. You can set app-wide defaults for these in the Default Params section of each tool card in Settings → Tool Calling.

For tools with a Local / API split, you can set separate defaults depending on whether you are using a local model or a cloud API provider — useful when you want a shorter timeout for fast cloud calls and a longer one for local inference.

Defaults are persisted across app restarts. Clear a field to revert to the built-in default.
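The precedence implied here, where a model-provided value beats a saved default, which beats the built-in default, can be sketched as a dictionary merge. The key names and function signature are illustrative assumptions:

```python
# Sketch of filling model-omitted parameters from saved defaults, with a
# separate default set for local vs. API models. Names are illustrative.
BUILT_IN = {"timeout": 30}  # the built-in default used when a field is cleared

def effective_params(model_args, user_defaults, is_local):
    """Later sources win: built-in < saved default (local/api) < model-provided."""
    saved = user_defaults.get("local" if is_local else "api", {})
    return {**BUILT_IN, **saved, **model_args}
```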

Tool Order

The order of tools in the Settings list is the order they are presented to the model. Drag and drop tool cards to reorder them. The same order is reflected in the per-session tool selector in the chat header.

Tools Reference

Web Search

Tool name: web_search

Searches the web for a query, fetches the top result pages, and uses a sub-agent pass to synthesise a structured answer with source citations.

Parameters

All parameters except query are configurable in Settings → Tool Calling → Web Search → Default Params.

Web Fetch

Tool name: web_fetch

Fetches the content of a single URL and returns it as markdown text. Useful when the model already knows the URL it needs to read.
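The core of such a tool is reducing fetched HTML to readable text. A minimal illustrative extractor using only the standard library; a real markdown converter, as the app uses, would also preserve headings, links, and lists:

```python
from html.parser import HTMLParser

# Illustrative sketch of the HTML-to-text step of a fetch tool. The app
# returns markdown; this minimal version extracts visible text only.
class TextExtractor(HTMLParser):
    SKIP = {"script", "style"}  # tags whose contents are never visible text

    def __init__(self):
        super().__init__()
        self.parts, self.skipping = [], 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skipping += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skipping -= 1

    def handle_data(self, data):
        if not self.skipping and data.strip():
            self.parts.append(data.strip())

def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    return "\n".join(parser.parts)
```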

Parameters

Calculator

Tool name: calculator

Evaluates mathematical expressions and returns the numeric result. Handles arithmetic, percentages, and basic algebra. No configurable defaults.
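A common way to evaluate expressions safely, without `eval`, is to walk a parsed AST against a whitelist of operators. A sketch of that approach; the app's actual calculator implementation (including its percentage and algebra handling) is not documented here:

```python
import ast
import operator

# Hedged sketch of a safe arithmetic evaluator: only whitelisted operators
# on numeric literals are allowed, so arbitrary code can never run.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv,
       ast.Pow: operator.pow, ast.USub: operator.neg}

def calculate(expression):
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval"))
```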

Parameters

Global Memory

Tool name: global_memory

Stores and retrieves personal preferences, facts, and notes that persist across all conversations. The model can write new memories or read existing ones to personalize its responses over time.
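A minimal sketch of what a persistent memory store could look like; the file path, schema, and class name are assumptions for illustration, not the app's actual storage format:

```python
import json
from pathlib import Path

# Illustrative sketch of a persistent key-value memory store. The app's real
# schema and file location are not documented here.
class GlobalMemory:
    def __init__(self, path):
        self.path = Path(path)
        self.items = json.loads(self.path.read_text()) if self.path.exists() else {}

    def write(self, key, value):
        self.items[key] = value
        self.path.write_text(json.dumps(self.items))  # persists across sessions

    def read(self, key):
        return self.items.get(key)
```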

Parameters

💡 Tip

Tell the model to remember something — for example, "Remember that I prefer concise answers" — and it will store that preference in Global Memory for future conversations.

Task Planner (PRO)

Tool name: task_planner

Creates and updates a task plan so the model can break complex requests into ordered steps and track progress. Plans are scoped to each conversation — starting a new conversation begins with an empty plan.

ℹ️ Conversation Scope

Each conversation has its own task plan. Plans are not shared across different conversations and are not persisted across app restarts. This ensures a clean slate for each new conversation.

Operations

Parameters

When enabled, Task Planner injects a compact plan snapshot into the prompt context to keep the model aligned with current progress. Task status is simple: pending or done.
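The behaviour described above, a per-conversation plan with pending/done statuses and a compact snapshot for the prompt context, can be sketched as:

```python
# Sketch of a per-conversation task plan. Statuses are just "pending" or
# "done", and snapshot() produces the compact view injected into the prompt.
class TaskPlan:
    def __init__(self):
        self.steps = []  # list of [description, status] pairs

    def add(self, description):
        self.steps.append([description, "pending"])

    def complete(self, index):
        self.steps[index][1] = "done"

    def snapshot(self):
        """Compact one-line-per-step view for the prompt context."""
        return "\n".join(f"[{'x' if status == 'done' else ' '}] {desc}"
                         for desc, status in self.steps)
```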

HTTP Request

Tool name: http_request

Makes arbitrary HTTP requests to any endpoint. The model can call REST APIs, send data to webhooks, fetch JSON from external services, or download files — all from within the conversation.

⚠️ Privacy Notice

The HTTP Request tool can transmit conversation data to external services. Only enable it when you trust the model and the endpoints it may call. Review requests in the tool call log before sensitive data leaves your device.

Parameters

Response Format

The tool always returns a structured JSON object:

Redirect Handling

Redirects are followed automatically up to 5 hops. 301/302 responses to POST requests and all 303 responses are converted to GET on redirect (standard browser behaviour). If the redirect limit is exceeded, the tool returns a structured error.
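The method-conversion rules can be captured in a small helper; `next_request` is a hypothetical name used for illustration:

```python
# Sketch of the redirect rules described above: 303 always converts to GET,
# 301/302 convert POST to GET (browser behaviour), other statuses such as
# 307/308 preserve the original method.
MAX_REDIRECTS = 5  # hops allowed before the tool returns a structured error

def next_request(method, status):
    """Return the HTTP method to use when following a redirect."""
    if status == 303:
        return "GET"
    if status in (301, 302) and method == "POST":
        return "GET"
    return method
```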

Download Mode

When download: true, the response bytes are saved to the app's private storage directory (OnDevice-AI/HTTPDownloads/). Each download is assigned a UUID and indexed in a persistent JSON file so the model can reference it in follow-up turns. If a response body cannot be decoded as UTF-8 text, the tool automatically suggests retrying with download: true.

Configurable Default Timeout

Go to Settings → Tool Calling → HTTP Request → Default Params to set separate timeout defaults for local models and API providers. The built-in default is 30 seconds for both.