Browser Harness: Let LLMs Drive Your Real Browser with Self-Healing CDP Automation

Browser Harness connects an LLM directly to your real browser: no wrappers, no abstractions, just a thin WebSocket layer over Chrome's native DevTools Protocol (CDP). It is an open-source project from the browser-use team that lets AI agents operate in your actual browser environment rather than a simulated one.

Unlike traditional browser automation tools that offer a fixed API surface, Browser Harness is designed to be self-healing: when an agent encounters something the harness doesn't know how to handle, it writes the missing helper directly into `agent_helpers.py` and keeps going. Over time, the harness accumulates domain knowledge, and those improvements persist for every future run.
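To make the self-healing idea concrete, here is a minimal sketch of appending a missing helper to a helpers file and reloading it. It is not the project's actual code: `ensure_helper`, `load_helpers`, and the `normalize_url` helper are hypothetical names, and the file lives in a temporary directory rather than in the real harness.

```python
import importlib.util
import tempfile
from pathlib import Path


def ensure_helper(helpers_file, name, source):
    """Append `source` to helpers_file unless a function called `name` is already defined."""
    existing = helpers_file.read_text() if helpers_file.exists() else ""
    if f"def {name}(" not in existing:
        helpers_file.write_text(existing + "\n" + source.strip() + "\n")


def load_helpers(helpers_file):
    """Import the helpers file fresh so newly written functions become callable."""
    spec = importlib.util.spec_from_file_location("agent_helpers", helpers_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Hypothetical helper an agent might write after hitting an unhandled case.
NEW_HELPER = '''
def normalize_url(url):
    """Add a scheme when the agent is given a bare hostname."""
    return url if "://" in url else "https://" + url
'''

helpers_path = Path(tempfile.mkdtemp()) / "agent_helpers.py"
ensure_helper(helpers_path, "normalize_url", NEW_HELPER)
ensure_helper(helpers_path, "normalize_url", NEW_HELPER)  # second call is a no-op
helpers = load_helpers(helpers_path)
```

Because the helper is written to disk rather than kept in memory, a later run that loads the same file gets it for free, which is the persistence the paragraph above describes.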

With a codebase of roughly 1,000 lines across four core files, it's remarkably lean. It works locally with Chrome's remote debugging port, or scales out to Browser Use Cloud for distributed, stealthy headless deployment.
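The "thin WebSocket layer" over CDP comes down to exchanging JSON messages with the browser. The sketch below only builds and matches those messages, without opening a connection. `build_command` and `is_reply_to` are hypothetical names for illustration, not functions from the harness. `Page.navigate` is a standard CDP method.

```python
import itertools
import json

# Each CDP command is a JSON object with an integer id, a method name, and params.
_next_id = itertools.count(1)


def build_command(method, params=None):
    """Serialize one CDP command, ready to send over the DevTools WebSocket."""
    return json.dumps({"id": next(_next_id), "method": method, "params": params or {}})


def is_reply_to(raw_message, command_id):
    """Return True if a raw browser message is the reply to command_id.

    Events pushed by the browser carry a "method" but no "id",
    so they never match a pending command.
    """
    return json.loads(raw_message).get("id") == command_id


navigate = build_command("Page.navigate", {"url": "https://example.com"})
```

Matching replies by `id` is what lets a client keep several commands in flight on a single socket while still receiving unsolicited events.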

Self-healing CDP harness: Agents write their own browser helper code on the fly. Skills are stored as folders under `agent-workspace/domain-skills/`, which gives the agent lasting knowledge about specific websites and workflows.

Domain Skills framework: Community-contributed per-site playbooks (`github/`, `linkedin/`, `amazon/`, etc.) let the agent reuse best-practice selectors and flows across runs. Set `BH_DOMAIN_SKILLS=1` to enable them.

Cross-platform with cloud fallback: Works on Linux, macOS, and Windows (with proper Chrome profile handling and TCP transport fallback). Browser Use Cloud offers a free tier with 3 concurrent browsers, CAPTCHA solving, and proxy rotation, with no credit card required.
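As an illustration of how per-site playbooks could be selected, the sketch below maps a URL to a folder under a skills root and honors the `BH_DOMAIN_SKILLS` switch. The lookup rule (folder name equals the label before the top-level domain) and the function name `find_skill_dir` are assumptions made for this example; the harness may resolve skills differently.

```python
import os
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def find_skill_dir(url, skills_root, env=os.environ):
    """Return the skill folder matching the URL's site, or None.

    Assumption: a folder such as github/ applies to github.com and its subdomains.
    """
    if env.get("BH_DOMAIN_SKILLS") != "1":
        return None  # skills are opt-in
    host = urlparse(url).hostname or ""
    labels = host.split(".")
    # Assumption: the site name is the label just before the top-level domain.
    # This is too naive for suffixes like .co.uk, which is acceptable for a sketch.
    site = labels[-2] if len(labels) >= 2 else host
    candidate = Path(skills_root) / site
    return candidate if candidate.is_dir() else None


# Demo layout in a temporary directory, standing in for agent-workspace/domain-skills/.
skills_root = Path(tempfile.mkdtemp()) / "domain-skills"
(skills_root / "github").mkdir(parents=True)
match = find_skill_dir("https://gist.github.com/someone", skills_root, {"BH_DOMAIN_SKILLS": "1"})
```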

The project has generated active technical discussion on GitHub, particularly around platform compatibility and browser automation edge cases. Here are two stand-out conversations from the Issues tracker:

The harness originally assumed Unix sockets and `/tmp` paths everywhere, which broke the control channel on Windows Python builds. The community collaborated on a cross-platform daemon transport with an authenticated TCP fallback, plus a reliable Windows Chrome setup that opens the inspect page via the omnibox rather than a direct `chrome://` URL launch. One participant noted: "The agent-generated fix now uses `os.path.normpath` for cross-platform path handling — no more hardcoded forward slashes breaking on Windows."
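The transport decision discussed in that thread can be sketched as follows: use a Unix socket where the platform supports one, otherwise fall back to loopback TCP guarded by a shared token. This is a hedged illustration, not the merged fix. `choose_transport`, `token_ok`, and the socket file name are hypothetical.

```python
import hmac
import os
import secrets
import socket
import tempfile


def choose_transport(has_unix_sockets=hasattr(socket, "AF_UNIX")):
    """Describe the control channel: Unix socket if available, else authenticated TCP."""
    if has_unix_sockets:
        # tempfile.gettempdir() avoids hardcoding /tmp.
        path = os.path.join(tempfile.gettempdir(), "harness-demo.sock")
        return {"kind": "unix", "address": path, "token": None}
    # Port 0 asks the OS for a free port at bind time. The random token
    # must be presented by every client before the daemon accepts commands.
    return {"kind": "tcp", "address": ("127.0.0.1", 0), "token": secrets.token_hex(16)}


def token_ok(expected, presented):
    """Compare tokens in constant time to avoid leaking information via timing."""
    return hmac.compare_digest(expected.encode(), presented.encode())


tcp = choose_transport(has_unix_sockets=False)
```

The token matters on TCP because any local process can connect to a loopback port, whereas a Unix socket can be protected with file permissions.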

Chrome 147 silently blocks `--remote-debugging-port` when the default user data directory is in use, causing the "Allow remote debugging?" dialog to appear on every restart. A contributor diagnosed the root cause in Chrome's `browser_process_impl.cc` (`IsUsingDefaultDataDirectory()` check) and implemented platform-specific workarounds: Linux uses `CHROME_CONFIG_HOME` to trick Chrome into thinking it's not using the default directory (zero data copying), while macOS/Windows copy the profile once to a persistent temp location and refresh only when the real profile has newer files.
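The macOS/Windows workaround (copy once, refresh only when the real profile has newer files) can be approximated by comparing modification times. The sketch below is an assumption about how such a check might look. `newest_mtime` and `sync_profile` are hypothetical names, and the demo uses temporary directories rather than a real Chrome profile.

```python
import os
import shutil
import tempfile
from pathlib import Path


def newest_mtime(root):
    """Latest modification time of any file under root (0.0 if there are none)."""
    return max((p.stat().st_mtime for p in Path(root).rglob("*") if p.is_file()), default=0.0)


def sync_profile(real_profile, copy_dir):
    """Copy the profile on first use, then refresh only if the real one is newer.

    Returns True when a copy was made.
    """
    copy_dir = Path(copy_dir)
    if copy_dir.exists() and newest_mtime(real_profile) <= newest_mtime(copy_dir):
        return False
    # copytree preserves modification times, so an unchanged profile compares equal next time.
    shutil.copytree(real_profile, copy_dir, dirs_exist_ok=True)
    return True


real = Path(tempfile.mkdtemp())
(real / "Preferences").write_text("{}")
copy = Path(tempfile.mkdtemp()) / "profile-copy"

first = sync_profile(real, copy)   # no copy exists yet, so one is created
second = sync_profile(real, copy)  # nothing changed, so no copy is made
later = newest_mtime(real) + 10
os.utime(real / "Preferences", (later, later))  # simulate a newer file in the real profile
third = sync_profile(real, copy)   # the real profile is newer, so the copy is refreshed
```

A real implementation would also need to skip lock files and handle a profile that Chrome is actively writing to. Both are left out here.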

Browser Harness is a refreshingly minimal take on LLM-driven browser automation. Instead of trying to anticipate every DOM interaction, it lets the agent discover and codify the right approach in real time. The community is actively contributing domain skills and fixing cross-platform edge cases, which makes it increasingly viable as a general-purpose AI browser agent framework. It is worth exploring if you need agents that can reliably navigate complex, dynamic web interfaces.

@browser-use · github.com/browser-use/browser-harness · ★ 10,750
