← 返回
AI智能 Key 中文

image-reader

Image recognition and understanding tool. Uses a multimodal model (e.g. doubao-seed-2.0-pro, kimi-k2.5) to analyze image content and supports OCR text extrac...
图像识别与理解工具。使用多模态模型(如doubao-seed-2.0-pro、kimi-k2.5)分析图像内容,支持OCR文字提取。
simonjoe246
AI智能 clawhub v1.0.0 1 版本 98804.7 Key: 需要
★ 4
Stars
📥 4,301
下载
💾 386
安装
1
版本
#latest

概述

Image Reader Skill

Image recognition and understanding tool that leverages Doubao multimodal models to analyze image content.


Features

  • Text Extraction (OCR): Extract text from images, suitable for documents, screenshots, posters, menus, etc.
  • Image Description: Generate detailed descriptions of images, suitable for photos, illustrations, memes, UI screens, etc.
  • General Analysis: Automatically choose the best analysis strategy based on the image type.

API Configuration

ItemValue
------------
API Endpointhttps://ark.cn-beijing.volces.com/api/coding/v3
Modeldoubao-seed-2.0-pro
AuthenticationAPI Key (configured in config.yaml)

Usage

Command Line

# General analysis
python image_reader.py /path/to/image.png

# Extract text (OCR)
python image_reader.py /path/to/image.png -p "Extract all text from the image"

# Describe the image
python image_reader.py /path/to/image.png -p "Describe this image in detail"

OpenClaw Skill Invocation

Once installed, you can invoke it using natural language:

Analyze this image
Extract the text from the image
Describe this screenshot

Output

  • Text-heavy images: Returns all extracted text, preserving original formatting.
  • Non-text images: Returns a detailed scene description, including objects, people, colors, style, etc.
  • Mixed content: Provides both text extraction and a visual description.

Technical Details

  • Uses an OpenAI-compatible API to call Doubao multimodal models
  • Images are sent as base64-encoded data
  • The system prompt adapts to the image type to select the most appropriate analysis strategy

版本历史

共 1 个版本

  • v1.0.0 当前
    2026-03-29 23:40 安全 安全

安全检测

腾讯云安全 (Keen)

安全,无风险
查看报告

腾讯云安全 (Sanbu)

安全,无风险
查看报告

🔗 相关推荐

ai-intelligence

Proactive Agent

halthelobster
将AI智能体从任务执行者升级为主动预判需求、持续优化的智能伙伴。集成WAL协议、工作缓冲区、自主定时任务及实战验证模式。Hal Stack核心组件 🦞
★ 833 📥 212,776
ai-intelligence

self-improving agent

pskoett
捕获经验教训、错误和纠正,以实现持续改进。使用时机:(1)命令或操作意外失败;(2)用户纠正……
★ 4,055 📥 795,905
ai-intelligence

ontology

oswalpalash
类型化知识图谱,用于结构化智能体记忆与可组合技能。支持创建/查询实体(人员、项目、任务、事件、文档)及关联...
★ 709 📥 243,527