
Browser Use Local

Automate browser actions locally via browser-use CLI/Python: open pages, click/type, screenshot, extract HTML/links, debug sessions, and capture login QR codes.
Author: fengjiajie · Source: clawhub
Category: Developer Tools · v1.0.0 (#latest) · 1 version · Key: required
★ 0 stars · 📥 2,551 downloads · 💾 283 installs

Overview

browser-use (local) playbook

Default constraints in this environment

  • Prefer browser-use (CLI/Python) over the OpenClaw browser tool here; the OpenClaw browser may fail if no supported system browser is present.
  • Use a persistent session for multi-step flows: --session <name> (the examples below use demo).

Quick CLI workflow (non-agent)

1) Open

browser-use --session demo open https://example.com

2) Inspect (sometimes state returns 0 elements on heavy/JS sites)

browser-use --session demo --json state | jq '.data | {url,title,elements:(.elements|length)}'
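When jq is not available, the same summary can be computed in Python. This is a minimal sketch that assumes the JSON shape implied by the jq filter above (a top-level data object holding url, title, and elements); the summarize_state name and the sample payload are illustrative and not part of browser-use.

```python
import json

def summarize_state(raw: str) -> dict:
    """Reduce `browser-use --json state` output to url, title and element count."""
    data = json.loads(raw).get("data") or {}
    elements = data.get("elements") or []
    return {
        "url": data.get("url"),
        "title": data.get("title"),
        "elements": len(elements),
    }

# Hypothetical payload with the shape the jq filter expects.
sample = '{"data": {"url": "https://example.com", "title": "Example Domain", "elements": []}}'
print(summarize_state(sample))
```

An elements count of 0 is the cue to fall back to the screenshot and get html steps.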

3) Screenshot (always works; best debugging primitive)

browser-use --session demo screenshot /home/node/.openclaw/workspace/page.png
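In a scripted flow it can help to confirm that the screenshot file was actually written before moving on. The sketch below only checks for the standard 8-byte PNG signature; it assumes the screenshot is saved as PNG (as the .png path suggests), and looks_like_png is a hypothetical helper name.

```python
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file

def looks_like_png(path: str) -> bool:
    """Return True if the file exists and starts with the PNG signature."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as f:
        return f.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE

print(looks_like_png("/home/node/.openclaw/workspace/page.png"))
```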

4) HTML for link discovery (works even when state is empty)

browser-use --session demo --json get html > /tmp/page_html.json
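python3 - <<'PY'
import json
import re

# Load the JSON written by `get html`; the markup is read from data.html.
with open('/tmp/page_html.json') as f:
    html = json.load(f).get('data', {}).get('html', '')

# Collect absolute URLs, then keep only those containing a keyword of interest.
urls = set(re.findall(r"https?://[^\s\"'<>]+", html))
keywords = ['demo', 'login', 'console', 'qr', 'qrcode']
for u in sorted(u for u in urls if any(k in u for k in keywords))[:200]:
    print(u)
PY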
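The regex above only finds absolute URLs. As a complement, the sketch below collects href and src attributes with the standard-library HTML parser and resolves relative paths against the page URL. LinkCollector, discover_links, and the sample markup are illustrative names and data, not part of the skill.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collect href/src attribute values, resolved against a base URL."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.add(urljoin(self.base_url, value))

def discover_links(html: str, base_url: str, keywords) -> list:
    """Return sorted links whose URL contains any of the keywords."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return sorted(u for u in collector.links if any(k in u for k in keywords))

# Hypothetical markup; in practice pass the html string loaded from page_html.json.
sample_html = (
    '<a href="/login">Sign in</a>'
    '<img src="img/qr.png">'
    '<a href="https://example.com/about">About</a>'
)
for url in discover_links(sample_html, "https://example.com/app/", ["login", "qr"]):
    print(url)
```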

5) Lightweight DOM queries via JS (useful when state is empty)

browser-use --session demo --json eval "location.href"
browser-use --session demo --json eval "document.title"

Agent workflow with OpenAI-compatible LLM (Moonshot/Kimi)

Use Python for Agent runs when the CLI run path requires Browser-Use cloud keys or when you need strict control over LLM parameters.

Minimal working Kimi example

Create .env (or export env vars) with:

  • OPENAI_API_KEY=...
  • OPENAI_BASE_URL=https://api.moonshot.cn/v1

Then run the bundled script:

source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
python /home/node/.openclaw/workspace/skills/browser-use-local/scripts/run_agent_kimi.py
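A quick pre-flight check can catch a missing key before the agent starts. This sketch assumes the two variables are exported in the current environment; if they live only in a .env file that the script loads itself, the check would need to read that file instead. missing_vars is a hypothetical helper name.

```python
import os

REQUIRED_VARS = ("OPENAI_API_KEY", "OPENAI_BASE_URL")

def missing_vars(env) -> list:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

missing = missing_vars(os.environ)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("Environment looks complete.")
```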

Kimi/Moonshot quirks observed in practice (fixes):

  • temperature must be 1 for kimi-k2.5.
  • frequency_penalty must be 0 for kimi-k2.5.
  • Moonshot can reject the strict JSON Schema used for structured output. Enable:
    • remove_defaults_from_schema=True
    • remove_min_items_from_schema=True
If you get a 400 error mentioning response_format.json_schema ... keyword 'default' is not allowed or min_items unsupported, those two flags are the first thing to set.
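To make the effect of those flags concrete, the sketch below shows the kind of transformation their names suggest: recursively dropping the default and minItems keywords from a JSON Schema before it is sent. This is an illustration under that assumption, not browser-use's actual implementation; strip_keywords and the sample schema are hypothetical.

```python
from typing import Any

KEYWORDS_TO_DROP = ("default", "minItems")
NAME_MAPS = ("properties", "$defs", "definitions")  # their keys are names, not keywords

def strip_keywords(node: Any, drop=KEYWORDS_TO_DROP, _names: bool = False) -> Any:
    """Return a copy of a JSON Schema with the given keywords removed."""
    if isinstance(node, list):
        return [strip_keywords(item, drop) for item in node]
    if not isinstance(node, dict):
        return node
    cleaned = {}
    for key, value in node.items():
        if not _names and key in drop:
            continue  # drop the unsupported keyword
        # Keys directly under "properties" etc. are field names, so keep them all.
        child_is_name_map = (not _names) and key in NAME_MAPS
        cleaned[key] = strip_keywords(value, drop, _names=child_is_name_map)
    return cleaned

schema = {
    "type": "object",
    "properties": {
        "default": {"type": "string", "default": "x"},  # a field that happens to be named "default"
        "tags": {"type": "array", "minItems": 1, "items": {"type": "string"}},
    },
}
print(strip_keywords(schema))
```

The field named default survives while the default and minItems keywords are removed, which is the distinction a naive key filter would miss.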

QR code extraction (login/demo pages)

Preferred order

1) Screenshot the page and crop candidate regions (fast, robust).

2) If HTML contains data:image/png;base64,..., extract and decode it.

Crop candidates

Use scripts/crop_candidates.py to generate multiple likely QR crops from a screenshot.

source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
python skills/browser-use-local/scripts/crop_candidates.py \
  --in /home/node/.openclaw/workspace/login.png \
  --outdir /home/node/.openclaw/workspace/qr_crops
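The internals of crop_candidates.py are not shown here. As one possible heuristic, the sketch below computes centered square crop boxes at several sizes, on the assumption that login QR codes tend to sit near the middle of the page. candidate_boxes and the percentages are hypothetical choices, not the bundled script's logic.

```python
def candidate_boxes(width: int, height: int, percents=(30, 50, 70)) -> list:
    """Return centered square boxes as (left, top, right, bottom) tuples."""
    boxes = []
    short_side = min(width, height)
    for pct in percents:
        side = short_side * pct // 100  # integer math avoids float rounding
        left = (width - side) // 2
        top = (height - side) // 2
        boxes.append((left, top, left + side, top + side))
    return boxes

# Example for a 1280x720 screenshot.
for box in candidate_boxes(1280, 720):
    print(box)
```

Each tuple uses the (left, upper, right, lower) order that Pillow's Image.crop accepts, so the boxes can be passed to it directly if Pillow is installed.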

Extract base64-embedded images from HTML

source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
browser-use --session demo --json get html > /tmp/page_html.json
python skills/browser-use-local/scripts/extract_data_images.py \
  --in /tmp/page_html.json \
  --outdir /home/node/.openclaw/workspace/data_imgs
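For reference, the core of this kind of extraction can be sketched in a few lines: find data:image URIs in the HTML string and base64-decode each payload. This is an assumption about the general technique rather than the contents of extract_data_images.py; the function name and sample markup are illustrative.

```python
import base64
import re

DATA_IMAGE_RE = re.compile(r"data:image/(png|jpeg|gif);base64,([A-Za-z0-9+/=]+)")

def extract_data_images(html: str) -> list:
    """Return (extension, raw_bytes) for each decodable base64 data-URI image."""
    images = []
    for ext, payload in DATA_IMAGE_RE.findall(html):
        try:
            images.append((ext, base64.b64decode(payload, validate=True)))
        except ValueError:
            continue  # skip truncated or malformed payloads
    return images

# Hypothetical markup; the payload is a stand-in, not a real image.
payload = base64.b64encode(b"\x89PNG\r\n\x1a\nfake").decode()
sample_html = f'<img src="data:image/png;base64,{payload}">'
for ext, raw in extract_data_images(sample_html):
    print(ext, len(raw), "bytes")
```

Writing each raw value to a file named with its extension yields images that can then be checked for a QR code.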

Troubleshooting

  • state shows elements: 0: use get html + regex discovery, plus screenshots; use eval to query DOM.
  • Page readiness timeout warnings: usually harmless; rely on screenshot + HTML.
  • CLI flag order: global flags go before the subcommand.
    • Works: browser-use --browser chromium --json open https://...
    • Fails: browser-use open https://... --browser chromium
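When driving the CLI from Python, a small builder keeps the flag order correct by construction. This sketch uses only the flags that appear in this playbook (--browser, --session, --json); build_cmd is a hypothetical helper, and the resulting list is meant to be passed to subprocess.run.

```python
import shlex

def build_cmd(subcommand, *args, session=None, json_output=False, browser=None) -> list:
    """Build a browser-use argv with global flags placed before the subcommand."""
    cmd = ["browser-use"]
    if browser:
        cmd += ["--browser", browser]
    if session:
        cmd += ["--session", session]
    if json_output:
        cmd.append("--json")
    cmd.append(subcommand)
    cmd.extend(args)
    return cmd

# Print the command line instead of running it.
print(shlex.join(build_cmd("open", "https://example.com", browser="chromium", json_output=True)))
```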

Version history

1 version in total

  • v1.0.0 (current)
    2026-03-28 20:01 · Safe

Security checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

suspicious
