What is On Device AI's Browser Agent?

Browser Agent is a Pro feature in On Device AI that lets the app's private AI open a real webpage, click buttons, fill forms, and pull information back for you on Mac, iPad, and iPhone. The browser stays visible the whole time and you can take over whenever you want.

How does the Browser Agent handle logins, CAPTCHAs, and 2FA?

When the agent hits a login screen, CAPTCHA, or two-factor prompt, it pauses and hands the browser to you. You sign in yourself, like you always would, then tap Hand Back to AI and it continues the task. Your password and credentials stay on your device.

Is this a private AI browser tool or a cloud service?

It is fully private. On Device AI runs the AI and the browser on your Apple device. Pages, cookies, and screenshots stay on your machine, unless you've explicitly added a cloud model provider with your own API key.

How is this different from cloud-based browser agents?

Most browser agents send the page you are on to a remote service so a cloud model can read it. On Device AI keeps both the AI and the browser on your Apple device, so by default nothing about the site you visit leaves it. The one exception is a cloud model provider you have added yourself with your own API key.

Which Apple devices support the private AI Browser Agent?

Browser Agent is available on Mac, iPad, and iPhone. It is a Pro feature; free users see an upgrade prompt the first time they try to run it.

What can On Device AI's Browser Agent actually do on a webpage?

It can open a page, click buttons and links, type into forms, pick from dropdowns, scroll, go back, wait for content to load, and ask you to take over for logins. When the task is done, it sends a short summary back to your chat.

On Device AI Browser Agent: Private AI Web Automation for Mac & iPad

2026's browser automation problem

A lot of new AI tools promise to browse the web for you. Most of them work the same way under the hood: the page you are on, the cookies in your session, and the site's content all get sent off to a remote service so a cloud AI can read them and decide what to do next.

That's fine for public marketing pages. It gets uncomfortable the moment the site is your email, your company dashboard, your banking portal, or anything behind a login. "Let a cloud service read what's on my screen" is a very different trust decision than "let an app on my own device read it for me."

What On Device AI's Browser Agent does

Ask your On Device AI chat to do something on the web. Open an article and summarize the comments. Pull the prices off a competitor page. Check your team's status dashboard. Fill in a form you've been putting off.

When the task needs a real webpage, the app opens a full-screen browser view. You watch it work. It reads the page, decides what to do next, and takes one small step at a time: click this link, type in this field, pick this option, scroll down, go back. When it has the answer, it closes the loop and hands a short summary back to your chat.

The whole thing runs on your Apple device. There's no separate sign-in, no cloud browser waiting in a data center, no remote copy of the page.

Simple, predictable actions

The agent is deliberately kept to a short list of things a person would do on a webpage:

Click a button or a link
Type into a text field
Pick an option from a dropdown
Scroll the page or scroll to a specific item
Open a URL or go back
Wait a moment for the page to load
Pause and ask you to take over
Stop when the task is done and report back

It cannot run arbitrary code, open extra windows, or start downloads. The short list is what makes the browser feel calm instead of chaotic, and it's also what makes a small, fast, on-device AI a good fit for the job.
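To make the "short list" idea concrete, here is a minimal sketch of an action allowlist in Python. It is not On Device AI's actual code: the names `Action`, `Step`, and `parse_step`, and the dictionary shape of a proposed step, are all assumptions made up for illustration. The point is only that anything outside the list gets rejected before it can touch the page.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    """Hypothetical names for the eight actions listed above."""
    CLICK = "click"        # click a button or a link
    TYPE = "type"          # type into a text field
    SELECT = "select"      # pick an option from a dropdown
    SCROLL = "scroll"      # scroll the page or to a specific item
    NAVIGATE = "navigate"  # open a URL or go back
    WAIT = "wait"          # wait a moment for the page to load
    HANDOFF = "handoff"    # pause and ask the user to take over
    DONE = "done"          # stop and report back


@dataclass(frozen=True)
class Step:
    action: Action
    target: Optional[str] = None  # e.g. a button label (assumed field)
    value: Optional[str] = None   # e.g. text to type (assumed field)


def parse_step(raw: dict) -> Step:
    """Turn a model-proposed step into a Step, rejecting anything off the list."""
    name = raw.get("action")
    try:
        action = Action(name)
    except ValueError:
        raise ValueError(f"action not allowed: {name!r}") from None
    return Step(action, raw.get("target"), raw.get("value"))
```

With this shape, `parse_step({"action": "click", "target": "Submit"})` yields a valid step, while a proposal such as `{"action": "download"}` raises an error instead of running.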

Logins, CAPTCHAs, and anything you'd rather do yourself

Most browser agents try to log in on your behalf. They ask for your password, stash it somewhere, and hope nothing breaks. That's exactly the situation a private AI should avoid.

On Device AI does the opposite. When the agent reaches a login screen, a CAPTCHA, a two-factor prompt, or a page you clearly don't want automated, it simply stops and hands the browser to you. A Hand Back to AI button appears. You sign in the normal way, in the same browser window, and tap Hand Back to AI when you're ready. The agent picks up from there.

You can also drop in a short note while you're there: "pick the second result", "skip the popup", "the date format is DD/MM/YYYY". The agent will use that hint on the next step, and it stays scoped to the browser task instead of cluttering your chat.
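The pause-and-hand-back flow can be pictured as a tiny two-state machine. The sketch below is an assumption about the shape of that flow, not the app's implementation; `Mode`, `BrowserSession`, and its methods are hypothetical names. It shows control passing to the user, coming back with an optional note, and the note being consumed by exactly one following step.

```python
from enum import Enum, auto
from typing import Optional


class Mode(Enum):
    AGENT = auto()  # the agent is driving the browser
    USER = auto()   # the user has the browser (login, CAPTCHA, 2FA, ...)


class BrowserSession:
    """Hypothetical model of who controls the browser at any moment."""

    def __init__(self) -> None:
        self.mode = Mode.AGENT
        self._hint: Optional[str] = None

    def pause_for_user(self) -> None:
        # Called when the agent reaches something it should not automate.
        self.mode = Mode.USER

    def hand_back(self, hint: Optional[str] = None) -> None:
        # The user taps "Hand Back to AI", optionally leaving a short note.
        if self.mode is not Mode.USER:
            raise RuntimeError("nothing to hand back: the agent already has control")
        self._hint = hint
        self.mode = Mode.AGENT

    def take_hint(self) -> Optional[str]:
        # The next agent step reads the note once; after that it is gone.
        hint, self._hint = self._hint, None
        return hint
```

Keeping the note on the session rather than in the chat history is one way to get the scoping described above: the hint lives and dies with the browser task.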

What you see while it runs

The browser screen isn't a black box. While the agent works:

The live webpage is front and center, the way any browser would look.
A small status panel tells you what the agent is doing right now, in plain language, such as "Clicking the Submit button" or "Waiting for the page to load".
Whatever the agent is about to interact with gets a brief highlight on the page so you can see exactly where it's looking.
A Take Over button is always available if you want to grab the browser yourself.
A Stop button ends the task instantly.
When the task finishes, the final page stays on screen so you can read, scroll, or copy anything you want.

The short version: it looks like someone else is using your browser, and you can politely ask for the mouse back whenever you want.

Calm behavior by design

On Device AI is careful not to let the agent wander. A few built-in behaviors keep each run sensible:

Clear stopping points. Every task has a reasonable upper bound on how long the agent will try. It won't loop on a broken page forever.
Notices when it's stuck. If the same step keeps failing, the agent stops and tells you, instead of pretending to make progress.
Honest failure. If a task can't be finished, it returns a clear explanation of how far it got and why it had to stop.
One browser at a time. The agent never opens a second hidden session behind your back.

These aren't cosmetic. They're what separates a browser agent that looks cool for five minutes from one that's useful during an actual workday.
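For readers who want to see what a step budget and stuck detection might look like, here is a small Python sketch. Everything in it is an assumption: the function `run_task`, the `Outcome` record, and the limits of 25 steps and 3 repeated failures are invented for illustration, and the post does not say which values the app uses.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Outcome:
    status: str       # "done", "stuck", or "step_limit"
    steps_taken: int  # steps attempted before stopping
    detail: str       # plain-language explanation for the user


def run_task(
    next_step: Callable[[], Optional[str]],  # returns None when the task is finished
    perform: Callable[[str], bool],          # returns True if the step succeeded
    max_steps: int = 25,                     # assumed upper bound
    max_same_failures: int = 3,              # assumed "stuck" threshold
) -> Outcome:
    last_failed: Optional[str] = None
    same_failures = 0
    for taken in range(max_steps):
        step = next_step()
        if step is None:
            return Outcome("done", taken, "task finished")
        if perform(step):
            last_failed, same_failures = None, 0
            continue
        # Count consecutive failures of the same step only.
        same_failures = same_failures + 1 if step == last_failed else 1
        last_failed = step
        if same_failures >= max_same_failures:
            return Outcome(
                "stuck", taken + 1,
                f"{step!r} failed {same_failures} times in a row",
            )
    return Outcome("step_limit", max_steps, f"stopped after {max_steps} steps")
```

Either way the loop ends, the caller gets back how far the run got and why it stopped, which is the "honest failure" behavior in miniature.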

Why a private browser agent matters in 2026

Cloud browser agents are having a moment. New tools launch every week promising to "give your AI eyes" with a single sign-up. Almost all of them share the same catch: the webpage you're on has to travel across the internet to somebody else's server before an AI can read it.

On Device AI is for a different kind of user: the one who wants an AI that can actually go look something up on a real site for them, fill a form, copy a result back into a report, and never send the page anywhere. For that person, the interesting comparison isn't which cloud service has the most features. It's simply: does my data leave my device, or not?

Tasks worth trying

Research behind a login. Ask the agent to pull your recent tickets, saved articles, or internal reports into a summary. You sign in when the browser pauses. The rest of the reading is the agent's job.

Tedious form-filling. Expense reports, vendor signup forms, repetitive status updates. Describe the task once, let the agent click through, and take over whenever you want.

Comparisons and extractions. "Open this competitor's pricing page and compare their tiers to what we shipped last month." Everything happens on your machine, using the private context you already built up in chat.

Checking in on dashboards. Status pages, monitoring views, admin consoles. Ask what broke overnight and let the agent click around for you. The browser stays visible so you can stop it the moment something looks off.

Privacy, as a feature

The short version of the privacy story:

Pages, cookies, and screenshots stay on your Apple device.
The AI that reads the page also runs on your device.
Logins stay in your hands. The agent never asks for your password.
Nothing about a browser task is shared with the chat unless it needs to be in the summary the agent hands back.

If you've configured a cloud AI provider with your own API key, that provider sees whatever gets sent to its model, same as in any other chat. The default experience is fully local.

Availability

Browser Agent is available on Mac, iPad, and iPhone. It's a Pro feature; free users can see it in the tool list and get an upgrade prompt the first time they try it. Existing Pro subscribers get it automatically with the latest On Device AI update.

Getting started

Open a conversation in On Device AI. Turn on the Browser Action tool in the tool list, then just ask for what you want in plain language:

"Open Hacker News, find the top post about local AI, and summarize the comments."
"Check our team's status page and tell me which services look unhealthy."
"Fill out this signup form using the details I gave you earlier."

The browser opens. You watch it work. You take over when you feel like it. When the task is done, the answer is waiting back in your chat, and nothing about the pages it visited went anywhere else.
