Browser Automation
Automate web browsers with Playwright. Navigate pages, take screenshots, extract data, and use LLM vision to understand and interact with any website.
How It Works
Browser automation adds a browser tool type to workflows. Each step controls a headless Chromium instance — navigate, click, type, scroll, screenshot, and extract data. For AI-driven automation, the analyze action sends a screenshot to an LLM vision model that decides what to do next.
Tool Definition
{
"name": "browser",
"type": "browser",
"params": {
"timeout": 60000
}
}
Actions
launch
Start a browser session.
{
"id": "start",
"action": "browser",
"params": {
"action": "launch",
"headless": true,
"viewport": { "width": 800, "height": 600 }
}
}
navigate
Go to a URL.
{
"id": "go",
"action": "browser",
"params": {
"action": "navigate",
"url": "https://example.com",
"waitUntil": "domcontentloaded"
}
}
Returns { url, title }.
click
Click an element by CSS selector.
{
"id": "click_login",
"action": "browser",
"params": {
"action": "click",
"selector": "#login-button"
}
}
type
Type text into an input field.
{
"id": "enter_email",
"action": "browser",
"params": {
"action": "type",
"selector": "input[name='email']",
"text": "user@example.com"
}
}
scroll
Scroll the page up or down.
{
"id": "scroll_down",
"action": "browser",
"params": {
"action": "scroll",
"direction": "down",
"amount": 500
}
}
wait
Wait for an element to appear.
{
"id": "wait_results",
"action": "browser",
"params": {
"action": "wait",
"selector": ".results-loaded",
"timeout": 10000
}
}
screenshot
Capture the page as a compressed JPEG (~50-100KB).
{
"id": "capture",
"action": "browser",
"params": {
"action": "screenshot"
}
}
Returns { screenshot, success } where screenshot is a data:image/jpeg;base64,... string.
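Because the screenshot comes back as a data URL rather than raw bytes, saving it to disk needs one decoding step. The following is a minimal sketch for Node.js, assuming the string has the data:image/jpeg;base64,... shape described above. `decodeScreenshot` is a hypothetical helper name, not part of the browser tool.

```javascript
// Hypothetical helper: turn a screenshot data URL into raw image bytes.
function decodeScreenshot(dataUrl) {
  const match = /^data:(image\/[a-z0-9.+-]+);base64,(.+)$/i.exec(dataUrl);
  if (!match) {
    throw new Error("Expected a base64 image data URL");
  }
  return {
    mimeType: match[1],
    bytes: Buffer.from(match[2], "base64"),
  };
}
```

The returned `bytes` buffer can then be written to a file, for example with `fs.writeFileSync("page.jpg", bytes)`.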
extract
Extract text content from the page or a specific element.
{
"id": "get_prices",
"action": "browser",
"params": {
"action": "extract",
"selector": ".price-list"
}
}
Returns { content, success }.
html
Get raw HTML content.
{
"id": "get_html",
"action": "browser",
"params": {
"action": "html",
"selector": ".product-grid"
}
}
Returns { html, success }.
evaluate
Run JavaScript in the browser context.
{
"id": "get_data",
"action": "browser",
"params": {
"action": "evaluate",
"script": "Array.from(document.querySelectorAll('.item')).map(el => ({ name: el.textContent, href: el.href }))"
}
}
Returns { result, success }.
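Longer scripts are easier to read and test as a named function than as a one-line string. The sketch below rewrites the script from the example that way. `collectItems` is a hypothetical name, and the sketch assumes the script parameter accepts any JavaScript expression, which the one-liner suggests but this page does not state outright.

```javascript
// Hypothetical helper: same extraction as the one-liner, with tidier output.
function collectItems(doc) {
  return Array.from(doc.querySelectorAll(".item")).map((el) => ({
    name: (el.textContent || "").trim(),
    // href only exists on link elements; use null so the key survives JSON.
    href: el.href || null,
  }));
}

// Assumption: wrapping the function source in an immediately invoked
// expression yields a string usable as the "script" parameter.
const script = `(${collectItems.toString()})(document)`;
```

Keeping the logic in a function means it can be unit tested against a stub document before the string is placed in the workflow definition.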
analyze
Screenshot + LLM vision in one step. Sends the screenshot to your configured LLM provider and returns the AI's analysis.
{
"id": "understand_page",
"action": "browser",
"params": {
"action": "analyze",
"prompt": "What products are shown on this page? List their names and prices."
}
}
Returns { screenshot, analysis }.
close
End the browser session.
{
"id": "cleanup",
"action": "browser",
"params": {
"action": "close"
}
}
Example Workflow
A scraper that navigates to a URL, analyzes the page with AI vision, and extracts data:
{
"id": "browser-scrape",
"name": "Browser Scraper",
"tools": [
{ "name": "browser", "type": "browser" }
],
"workflows": [{
"name": "scrape_and_analyze",
"trigger": { "type": "manual" },
"steps": [
{
"id": "launch",
"action": "browser",
"params": { "action": "launch", "headless": true }
},
{
"id": "navigate",
"action": "browser",
"dependsOn": ["launch"],
"params": { "action": "navigate", "url": "{{url}}" }
},
{
"id": "analyze",
"action": "browser",
"dependsOn": ["navigate"],
"params": { "action": "analyze", "prompt": "{{prompt}}" }
},
{
"id": "extract",
"action": "browser",
"dependsOn": ["analyze"],
"params": { "action": "extract", "selector": "{{selector}}" },
"optional": true,
"defaultValue": { "content": "" }
},
{
"id": "close",
"action": "browser",
"dependsOn": ["extract"],
"params": { "action": "close" }
}
]
}]
}
Route Configuration
Expose browser automation as an API endpoint:
{
"path": "/browser/scrape",
"method": "post",
"requireAuth": true,
"authProvider": "firebase",
"subscribable": {
"enabled": true,
"queueName": "default",
"estimatedTime": "30s"
},
"integrations": {
"actions": [{
"type": "workflow",
"workflowId": "browser-scrape",
"input": {
"url": "{{body.url}}",
"prompt": "{{body.prompt}}",
"selector": "{{body.selector}}"
}
}]
}
}
Subscribable Streaming
With subscribable enabled, the endpoint returns immediately with subscription URLs:
curl -X POST /browser/scrape \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com", "prompt": "Extract all product info"}'
Response:
{
"accepted": true,
"jobId": "abc-123",
"estimatedTime": "30s",
"subscribe": {
"sse": "/queues/default/jobs/abc-123/subscribe",
"websocket": { "path": "/ws", "channel": "job:abc-123" },
"poll": "/queues/default/jobs/abc-123"
}
}
Subscribe via SSE to receive live updates:
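// Assumes the page contains <img id="browser-frame"> (hypothetical id)
const img = document.getElementById("browser-frame");
const eventSource = new EventSource("/queues/default/jobs/abc-123/subscribe");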
eventSource.onmessage = (e) => {
const data = JSON.parse(e.data);
if (data.screenshot) {
// Render live browser frame (~50-100KB JPEG)
img.src = data.screenshot;
}
if (data.analysis) {
// Show AI analysis
console.log(data.analysis);
}
};
Session Isolation
Each workflow execution gets its own Playwright browser context with separate cookies, storage, and state. No data leaks between sessions.
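In Playwright terms, this kind of isolation maps onto `browser.newContext()`, which creates a fresh context with its own cookies and storage. The following is a minimal sketch of the pattern, not the tool's actual implementation. `withIsolatedContext` is a hypothetical helper, and `browser` is assumed to be a Playwright Browser instance such as the one returned by `chromium.launch()`.

```javascript
// Hypothetical helper: run one task inside its own browser context and
// always dispose of the context afterwards, even if the task throws.
async function withIsolatedContext(browser, task) {
  const context = await browser.newContext({
    viewport: { width: 800, height: 600 },
  });
  try {
    const page = await context.newPage();
    return await task(page);
  } finally {
    await context.close(); // discards this run's cookies and storage
  }
}
```

Two calls to `withIsolatedContext` on the same browser do not share a login session, because each call works in a separate context.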
Vision Loop
For autonomous multi-step browsing, chain actions in a loop pattern:
1. screenshot: capture current page
2. analyze: LLM decides what to do next
3. Execute the action (click, type, scroll)
4. Repeat until goal is met
The analyze action is a shortcut that combines steps 1 and 2. Use evaluate or extract when you know the CSS selectors to skip LLM costs entirely.
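The loop can be sketched as a small driver function. This is an illustration under stated assumptions rather than the tool's implementation: `runVisionLoop`, `analyze`, and `execute` are hypothetical names, and the sketch assumes the LLM's analysis has already been parsed into an object shaped like `{ done, action }`, a format you would have to request in your prompt.

```javascript
// Hypothetical driver for the screenshot -> analyze -> act loop.
// analyze(goal, history): resolves to { done: boolean, action?: object }
// execute(action): performs one browser action (click, type, scroll)
async function runVisionLoop({ goal, analyze, execute, maxSteps = 10 }) {
  const history = [];
  for (let step = 0; step < maxSteps; step++) {
    // Steps 1 and 2: capture the page and let the LLM pick the next move.
    const decision = await analyze(goal, history);
    if (decision.done) {
      return { success: true, steps: history };
    }
    // Step 3: carry out the chosen action, then go around again (step 4).
    await execute(decision.action);
    history.push(decision.action);
  }
  // The step cap stops a confused model from looping indefinitely.
  return { success: false, steps: history };
}
```

Capping the number of iterations with `maxSteps` bounds both the run time and the number of LLM calls when the goal is never reached.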
Setup
npm install playwright
npx playwright install chromium