AI browser agents: what works, what fails, and how to use them now
Executive overview
Most software is built by specialists for specialists — too many menus, too much terminology, too little time. AI browser agents (ChatGPT Atlas, Claude, Gemini autobrowse) can now navigate interfaces and click buttons on your behalf.
These tools are early and fail constantly on complex tasks. The key constraint is context: each step consumes part of the model's context window, and mistakes start to appear beyond roughly 10–12 steps.
Keep every task under 10 steps. Use a planner AI to break big tasks into chunks, then pass those chunks to the browser agent one at a time.
How browser agents work
- The agent takes a screenshot, analyses it, acts, then repeats
- Each loop consumes context window space — intelligence degrades as steps accumulate
- At 5 steps: sharp. At 15: starting to slip. At 25: usually lost
- Sweet spot: 8–12 steps per chunk
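The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular agent is implemented: `observe`, `decide`, and `act` are hypothetical callbacks standing in for the screenshot, the model's analysis, and the click or keystroke, and `max_steps` is an assumed budget taken from the step counts above.

```python
from typing import Callable, Optional


def run_agent_loop(
    observe: Callable[[], str],
    decide: Callable[[str, list[str]], Optional[str]],
    act: Callable[[str], None],
    max_steps: int = 12,
) -> list[str]:
    """Observe, decide, act, repeat, stopping once the step budget is spent."""
    history: list[str] = []  # grows every loop, much like the context window
    for _ in range(max_steps):
        screen = observe()                # hypothetical: capture the current page
        action = decide(screen, history)  # hypothetical: model picks the next action
        if action is None:                # None signals that the task is finished
            break
        act(action)                       # hypothetical: perform the click or keystroke
        history.append(action)
    return history
```

The hard cap is the point of the sketch: the loop stops at the budget instead of letting a long task run into the range where accuracy drops.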
Three use cases that work today
- Admin tasks — extract data from dashboards (Stripe, Mercury, Workday) without knowing where the settings are; AI bridges plain-language requests to exact menu paths
- Technical setups — configure tools you've never used (e.g. Google Cloud credentials → Supabase); AI navigates specialist interfaces non-technical users can't parse
- Repetitive data entry — give the agent a source-of-truth document; it fills forms (vendor applications, insurance, health intake) without you retyping the same data
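For the data-entry case, the source-of-truth idea can be pictured as a simple lookup. The sketch below is an assumption about how one might structure it, not a description of what any agent does internally; the function name `fill_form` and the sample field labels are hypothetical.

```python
def fill_form(
    source_of_truth: dict[str, str], form_fields: list[str]
) -> tuple[dict[str, str], list[str]]:
    """Match form field labels against a source-of-truth record.

    Returns the fields that could be filled and the labels with no match.
    """
    # Compare labels case-insensitively and ignore surrounding whitespace.
    lookup = {key.strip().lower(): value for key, value in source_of_truth.items()}
    filled: dict[str, str] = {}
    missing: list[str] = []
    for field in form_fields:
        key = field.strip().lower()
        if key in lookup:
            filled[field] = lookup[key]
        else:
            missing.append(field)  # report the gap rather than guess a value
    return filled, missing


# Hypothetical record and form labels, for illustration only.
record = {"Legal name": "Acme Ltd", "Tax ID": "12-3456789"}
filled, missing = fill_form(record, ["legal name", "Tax ID", "Phone"])
```

Returning the unmatched labels separately reflects one reasonable design choice: a field the document does not cover is flagged for you to supply instead of being filled with an invented value.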
Bonus: automated discount code testing
- Have the agent find active discount codes for a purchase and test each one to identify the highest valid discount
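Once the agent has tried each code, choosing the winner is a small comparison. The sketch below assumes the agent reports its findings as a mapping from code to percentage discount, with `None` for codes that were rejected; that format, the function name, and the sample codes are all hypothetical.

```python
from typing import Optional


def best_discount(results: dict[str, Optional[float]]) -> Optional[tuple[str, float]]:
    """Return the valid code with the largest discount, or None if none worked."""
    # Drop codes the checkout rejected (recorded as None).
    valid = {code: pct for code, pct in results.items() if pct is not None}
    if not valid:
        return None
    code = max(valid, key=lambda c: valid[c])
    return code, valid[code]


# Hypothetical test results, for illustration only.
winner = best_discount({"SAVE10": 10.0, "OLD50": None, "SPRING20": 20.0})
```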
Two strategies for reliable results
- Break big into small: never give a 25-step task to a browser agent; split it into chunks of 5–8 steps and pass them sequentially
- Use a planner + executor split: ask a chat AI (ChatGPT, Claude, Gemini) to list the exact steps for your goal; copy chunks of those steps to the browser agent to execute
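Splitting a long plan into chunks is mechanical enough to script. The following is a minimal sketch, assuming the plan is already a list of step strings and taking 8 as the upper bound from the 5–8 range above; `chunk_steps` is a hypothetical helper name. It balances chunk sizes so the last chunk is not left with a single stray step.

```python
import math


def chunk_steps(steps: list[str], max_size: int = 8) -> list[list[str]]:
    """Split steps into the fewest chunks of at most max_size, evenly sized."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    if not steps:
        return []
    n_chunks = math.ceil(len(steps) / max_size)
    base, extra = divmod(len(steps), n_chunks)  # the first `extra` chunks get one more step
    chunks: list[list[str]] = []
    start = 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        chunks.append(steps[start:start + size])
        start += size
    return chunks


# A 25-step plan becomes four chunks of 7, 6, 6, and 6 steps.
plan = [f"step {n}" for n in range(1, 26)]
sizes = [len(chunk) for chunk in chunk_steps(plan)]
```

Each chunk would then be pasted into the browser agent in order, ideally as a fresh task so earlier steps do not take up context.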
Planner prompt template
Ask the planner: "What are the exact steps to [goal] using [tool A] and [tool B]? Break it down into specific actions."
Copy 5–8 steps at a time into the executor (Atlas, Comet, Claude extension, Gemini autobrowse).
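If you reuse the template often, it can be filled in programmatically. The sketch below builds the prompt from the template above and pulls numbered steps out of the planner's reply. It assumes the planner answers with a numbered list ("1." or "1)"), which is common but not guaranteed; `planner_prompt` and `parse_steps` are hypothetical helper names.

```python
import re


def planner_prompt(goal: str, tool_a: str, tool_b: str) -> str:
    """Fill the planner template with a goal and two tool names."""
    return (
        f"What are the exact steps to {goal} using {tool_a} and {tool_b}? "
        "Break it down into specific actions."
    )


def parse_steps(reply: str) -> list[str]:
    """Extract numbered lines ('1. ...' or '1) ...') from a planner reply."""
    steps: list[str] = []
    for line in reply.splitlines():
        match = re.match(r"\s*\d+[.)]\s+(.*\S)", line)
        if match:
            steps.append(match.group(1))  # keep the step text, drop the number
    return steps


# Hypothetical goal and reply, for illustration only.
prompt = planner_prompt("set up sign-in", "Google Cloud", "Supabase")
steps = parse_steps("Here you go:\n1. Open the console\n2) Create credentials\n\nDone")
```

Lines that are not numbered, such as the planner's introduction or closing remarks, are skipped, so only actionable steps reach the executor.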