
Browser automation

When a page needs a real browser (JavaScript rendering, modern frameworks, login-aware pages), Han AI drives headless Chromium via Playwright on the VPS.

What it does

Loads a page in a real browser, waits for it to render, and returns either the visible page content or a clean structured scrape.

Field              Value
Schema names       browse, scrape_clean
Powered by         Playwright plus Chromium
Browser path       /var/hanai/playwright-browsers
API key required   No
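
As a rough illustration of the load, wait, and return cycle described above, the following is a minimal sketch using Playwright's Python sync API. It is not Han AI's actual implementation: the function names, the choice of the "networkidle" wait condition, and passing the browser path through the PLAYWRIGHT_BROWSERS_PATH environment variable are all assumptions made for the example.

```python
import os

# Path taken from the table above. Handing it to Playwright through the
# PLAYWRIGHT_BROWSERS_PATH environment variable is an assumption of this sketch.
BROWSERS_PATH = "/var/hanai/playwright-browsers"


def read_visible_text(page, url: str, timeout_ms: int = 30_000) -> str:
    """Navigate an existing Playwright page to `url` and return its visible text."""
    # "networkidle" waits until network activity has settled, which is one
    # simple way to let a JavaScript-rendered page finish loading.
    page.goto(url, wait_until="networkidle", timeout=timeout_ms)
    return page.inner_text("body")


def browse(url: str) -> str:
    """Hypothetical stand-in for the `browse` tool: one page load per call."""
    # Imported lazily so this module still loads where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    # Must be set before Playwright starts so it can find its Chromium build.
    os.environ.setdefault("PLAYWRIGHT_BROWSERS_PATH", BROWSERS_PATH)

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            return read_visible_text(browser.new_page(), url)
        finally:
            browser.close()
```

Launching and closing the browser on every call keeps the sketch simple. A long-lived service would more likely reuse one browser across requests.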

When Han AI uses it

  • A static page fetch returned mostly empty markup because the page is a single-page app.
  • A site requires interaction (a click, a scroll, an accept) before its content is visible.
  • A scrape needs to be clean of nav chrome, ads, and pop-overs.
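
The first and third triggers above can be approximated with the standard library alone. The sketch below extracts visible text from static HTML while skipping scripts and navigation chrome, then flags a page as needing a browser when little text remains. The skipped-tag list, the 200-character threshold, and the function names are illustrative assumptions, not Han AI's actual rules.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as non-content. This list is an assumption
# chosen for illustration.
SKIPPED_TAGS = {"script", "style", "noscript", "template", "nav", "header", "footer", "aside"}


class _TextExtractor(HTMLParser):
    """Collects text that sits outside any skipped tag."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.chunks.append(text)


def clean_text(html: str) -> str:
    """Return the page's text with scripts, styles, and nav chrome removed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def needs_browser(html: str, min_chars: int = 200) -> bool:
    """Heuristic: a static fetch with very little visible text is probably a
    JavaScript-rendered shell that needs a real browser to fill in."""
    return len(clean_text(html)) < min_chars
```

A single-page app shell such as `<div id="root"></div>` plus a script bundle yields no visible text and trips the heuristic, while a server-rendered article does not.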

Examples

  • “Pull the live listings from that broker’s portal page.”
  • “Get the current room availability from the property management site.”
  • “Open this page, accept the cookie banner, and read me the article body.”
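
The third request, for instance, might reduce to something like the sketch below, which operates on a Playwright sync-API page that has already been navigated. The button label "Accept" and the "article" selector are hypothetical defaults; real sites vary, and nothing here reflects how Han AI actually chooses them.

```python
def accept_and_read(page, accept_label: str = "Accept", body_selector: str = "article") -> str:
    """Dismiss a consent banner if one is present, then return the article text.

    `page` is expected to be a Playwright sync-API Page. The label and the
    selector are illustrative assumptions that differ from site to site.
    """
    banner_button = page.get_by_role("button", name=accept_label)
    # count() does not wait, so a banner that appears late would be missed.
    # Clicking only when the button exists keeps banner-free pages working.
    if banner_button.count() > 0:
        banner_button.first.click()
    return page.locator(body_selector).first.inner_text()
```

Taking the page as a parameter keeps the interaction step separate from browser start-up, so the same function can be exercised against a stub page in tests.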

Limits

  • Slower than a plain page fetch: seconds, not milliseconds.
  • Heavily protected sites (Cloudflare, hCaptcha, fingerprinting) may still block headless Chromium.
  • Long-running interactive flows should be modeled as recurring tasks rather than packed into a single live turn.

Why this stack

Playwright with Chromium is a widely used open-source browser automation stack. Running it on your VPS means dynamic pages can be read without paying for a hosted scraping service, and without your queries being logged by a third party.

See also