PDF Reader (Iyeque)

Extract text, search inside PDFs, and produce summaries.
iyeque
Productivity · clawhub · v1.1.0 · 1 version · 99712.1 · Key: not required
★ 5 stars · 📥 2,324 downloads · 💾 193 installs · #latest

Overview

PDF Reader Skill

The pdf-reader skill extracts text and retrieves metadata from PDF files using PyMuPDF (legacy import name: fitz).

Tool API

The skill provides two commands:

extract

Extracts plain text from the specified PDF file.

  • Parameters:
    • file_path (string, required): Path to the PDF file to extract text from.
    • --max_pages (integer, optional): Maximum number of pages to extract.

Usage:

python3 skills/pdf-reader/reader.py extract /path/to/document.pdf
python3 skills/pdf-reader/reader.py extract /path/to/document.pdf --max_pages 5

Output: Plain text content from the PDF.
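
For callers that drive the skill from Python, the invocation above could be wrapped roughly as follows. This is a minimal sketch: the helper names build_extract_command and run_extract are hypothetical, and it assumes the script writes the extracted text to standard output, which the listing implies but does not state.

```python
import subprocess
from typing import List, Optional

# Script path exactly as it appears in the usage lines above.
READER_SCRIPT = "skills/pdf-reader/reader.py"


def build_extract_command(file_path: str, max_pages: Optional[int] = None) -> List[str]:
    """Build the argument list for the `extract` command (hypothetical helper)."""
    if max_pages is not None and max_pages < 1:
        raise ValueError("max_pages must be a positive integer")
    command = ["python3", READER_SCRIPT, "extract", file_path]
    if max_pages is not None:
        command += ["--max_pages", str(max_pages)]
    return command


def run_extract(file_path: str, max_pages: Optional[int] = None) -> str:
    """Run the command and return stdout (assumed to carry the extracted text)."""
    result = subprocess.run(
        build_extract_command(file_path, max_pages),
        capture_output=True,
        text=True,
        check=False,  # inspect result.returncode / result.stderr yourself
    )
    return result.stdout
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.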

metadata

Retrieves metadata about the document.

  • Parameters:
    • file_path (string, required): Path to the PDF file.

Usage:

python3 skills/pdf-reader/reader.py metadata /path/to/document.pdf

Output: JSON object with PDF metadata including:

  • title: Document title
  • author: Document author
  • subject: Document subject
  • creator: Application that created the PDF
  • producer: PDF producer
  • creationDate: Creation date
  • modDate: Modification date
  • format: PDF format version
  • encryption: Encryption info (if any)
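
The creationDate and modDate fields use the PDF date string form (for example D:20240115120000), so a consumer usually has to convert them before comparing or displaying them. The sketch below shows one way to do that with the standard library only. The helper name parse_pdf_date is hypothetical, and the sample JSON contains only the fields shown in this listing.

```python
import json
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# PDF date strings look like D:YYYYMMDDHHmmSS, optionally followed by a
# timezone such as Z, +05'30' or -08'00'. Fields after the year are optional.
_PDF_DATE = re.compile(
    r"^(?:D:)?(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:(Z)|([+-])(\d{2})'?(\d{2})?'?)?$"
)


def parse_pdf_date(value: Optional[str]) -> Optional[datetime]:
    """Convert a PDF date string to a datetime, or None if it cannot be parsed."""
    if not value:
        return None
    match = _PDF_DATE.match(value.strip())
    if match is None:
        return None
    year, month, day, hour, minute, second, utc, sign, tz_h, tz_m = match.groups()
    tzinfo = None
    if utc:
        tzinfo = timezone.utc
    elif sign:
        offset = timedelta(hours=int(tz_h), minutes=int(tz_m or 0))
        tzinfo = timezone(offset if sign == "+" else -offset)
    try:
        return datetime(
            int(year), int(month or 1), int(day or 1),
            int(hour or 0), int(minute or 0), int(second or 0),
            tzinfo=tzinfo,
        )
    except ValueError:  # e.g. month 13 in a malformed string
        return None


# Sample limited to fields that appear in this listing.
sample_output = (
    '{"title": "Annual Report 2024", "author": "John Doe", '
    '"creationDate": "D:20240115120000"}'
)
metadata = json.loads(sample_output)
created = parse_pdf_date(metadata["creationDate"])
print(created.isoformat())  # 2024-01-15T12:00:00
```

Strings without a timezone suffix yield a naive datetime, since the PDF date alone does not say which zone was meant.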

Implementation Notes

  • Uses PyMuPDF (imported as pymupdf) for fast, reliable PDF processing
  • Opens encrypted PDFs that do not need a password; returns an error if a password is required
  • The --max_pages option caps how many pages are read, which keeps extraction from large PDFs fast
  • Returns structured JSON for metadata command
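
The notes above could map onto PyMuPDF calls roughly as in the following sketch. It is not the skill's actual source: the function names pages_to_read, extract_text, and read_metadata are hypothetical, and the code assumes a PyMuPDF release recent enough to be imported as pymupdf.

```python
from typing import Optional


def pages_to_read(page_count: int, max_pages: Optional[int] = None) -> int:
    """Number of pages to process: all of them, or at most max_pages."""
    if max_pages is None:
        return page_count
    return max(0, min(page_count, max_pages))


def extract_text(file_path: str, max_pages: Optional[int] = None) -> str:
    """Extract plain text page by page (requires PyMuPDF to be installed)."""
    import pymupdf  # imported lazily so pages_to_read works without it

    with pymupdf.open(file_path) as doc:
        if doc.needs_pass:
            raise PermissionError("PDF is encrypted and requires a password")
        limit = pages_to_read(doc.page_count, max_pages)
        # Only the first `limit` pages are loaded, so large files stay cheap.
        return "\n".join(doc[i].get_text() for i in range(limit))


def read_metadata(file_path: str) -> dict:
    """Return the document metadata dictionary (title, author, format, ...)."""
    import pymupdf

    with pymupdf.open(file_path) as doc:
        return dict(doc.metadata or {})
```

Keeping the page-limit arithmetic in its own function makes it easy to check without a PDF on hand: pages_to_read(10, 3) gives 3, and pages_to_read(2, 5) gives 2.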

Example

# Extract text from first 3 pages
python3 skills/pdf-reader/reader.py extract report.pdf --max_pages 3

# Get document metadata
python3 skills/pdf-reader/reader.py metadata report.pdf
# Output:
# {
#   "title": "Annual Report 2024",
#   "author": "John Doe",
#   "creationDate": "D:20240115120000",
#   ...
# }

Error Handling

  • Returns an error message if the file is not found or is not a valid PDF
  • Returns an error if the PDF is encrypted and requires a password
  • Gracefully handles corrupted or malformed PDFs
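
The listing does not say how errors are reported (exit code, stderr, or a JSON field), so a caller may want a cheap check of its own before invoking the skill. The sketch below is such a client-side pre-check, not the skill's behaviour. The name precheck_pdf is hypothetical, and looking for the %PDF- marker in the first 1024 bytes is a lenient heuristic, not full validation.

```python
import os
from typing import Optional

PDF_MAGIC = b"%PDF-"


def precheck_pdf(file_path: str) -> Optional[str]:
    """Return an error message, or None if the file looks like a PDF."""
    if not os.path.isfile(file_path):
        return f"File not found: {file_path}"
    with open(file_path, "rb") as handle:
        header = handle.read(1024)
    if PDF_MAGIC not in header:
        return f"Not a valid PDF: {file_path}"
    return None
```

A file that passes this check can still be corrupted or password-protected, so the errors listed above still need handling after the call.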

Version History

1 version in total

  • v1.1.0 (current)
    2026-03-29 01:36 · Safe · Safe

Security Scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
