
MinerU PDF Parser

An AI-Native skill for parsing PDF / Office / image files into clean Markdown with MinerU — a fast, zero-config document parser for AI agents. Works with NO...
tsekaluk
Category: Content creation · clawhub · v3.3.1 · 3 versions · 100000 · Key: not required
Stars: 0 · Downloads: 921 · Installs: 33 · Tag: #latest

Overview

MinerU PDF Parser

Parse PDF, Office (Word/PPT/Excel), and image files into clean Markdown, with LaTeX formulas, tables, images, and OCR. One zero-dependency script, two backends, automatic routing.

Zero-config quick start (no token, no install)

# Parse a local file or URL — the Agent API needs no login
python3 scripts/mineru.py paper.pdf

# Pipe the Markdown straight back to an agent
python3 scripts/mineru.py paper.pdf --stdout

# Machine-readable status for tool pipelines
python3 scripts/mineru.py paper.pdf --json

No pip install, no API key. The free Agent API handles files ≤ 10 MB / ≤ 20 pages.
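
For agents that call the script from Python rather than a shell, the quick-start command can be wrapped with `subprocess`. This is a minimal sketch, not part of the skill: the helper names `build_command`, `fits_agent_size_limit`, and `parse_to_markdown` are hypothetical, the code assumes it runs from the skill's root directory so that `scripts/mineru.py` resolves, and it assumes "10 MB" means 10 × 1024 × 1024 bytes.

```python
import subprocess
import sys
from pathlib import Path

# Size cap of the free Agent API as quoted in this README.
# Assumption: "10 MB" is read as 10 * 1024 * 1024 bytes; pages are not checked.
AGENT_MAX_BYTES = 10 * 1024 * 1024


def build_command(source: str, script: str = "scripts/mineru.py") -> list[str]:
    """Return the argv for a --stdout run (hypothetical helper)."""
    return [sys.executable, script, source, "--stdout"]


def fits_agent_size_limit(path: Path) -> bool:
    """True if a local file is within the assumed 10 MB size cap."""
    return path.stat().st_size <= AGENT_MAX_BYTES


def parse_to_markdown(source: str) -> str:
    """Run the CLI and return whatever it prints to stdout."""
    result = subprocess.run(
        build_command(source), capture_output=True, text=True, check=True
    )
    return result.stdout
```

With `check=True`, a non-zero exit from the script raises `subprocess.CalledProcessError`, so the caller can decide whether to retry or report the failure.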

Run with uv (zero-install, managed Python)

scripts/mineru.py carries PEP 723 inline metadata, so uv runs it directly with a uv-managed interpreter, no venv or pip install required:

uv run scripts/mineru.py paper.pdf --stdout       # zero-install run
uv run --no-project --with pytest pytest -q       # dev suite via uv

Power mode (token) — large files, batches, extra formats

export MINERU_TOKEN="..."          # https://mineru.net/apiManage/token

# Parallel batch a directory, resume on re-run
python3 scripts/mineru.py ./pdfs/ --output ./out/ --workers 8 --resume

# Export DOCX/HTML/LaTeX alongside Markdown (auto-routes to the Standard API)
python3 scripts/mineru.py report.pdf --format docx --format latex

When a token is set, the tool auto-routes: small single files still use the free Agent API; anything large (> 10 MB / > 20 pages), batched, or needing extra export formats uses the Standard API (≤ 200 MB / ≤ 200 pages). If the Agent API hits a size/page limit, it auto-escalates to the Standard API.
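
The routing rules can be sketched as a small decision function. This is an illustration of the behavior described above, not the script's actual code: the name `choose_api` is hypothetical, and the handling of the no-token case (batches fall back to the Agent API, oversized inputs and extra formats raise an error) is an assumption.

```python
from typing import Sequence

# Limits quoted in this README.
AGENT_MAX_MB, AGENT_MAX_PAGES = 10, 20
STANDARD_MAX_MB, STANDARD_MAX_PAGES = 200, 200


def choose_api(size_mb: float, pages: int, *, has_token: bool,
               batch: bool = False, extra_formats: Sequence[str] = ()) -> str:
    """Return "agent" or "standard" for one input (hypothetical helper)."""
    fits_agent = size_mb <= AGENT_MAX_MB and pages <= AGENT_MAX_PAGES
    if not has_token:
        # Assumption: without a token only the Agent API is available.
        if not fits_agent or extra_formats:
            raise ValueError("this input needs the Standard API (set MINERU_TOKEN)")
        return "agent"
    if fits_agent and not batch and not extra_formats:
        return "agent"
    if size_mb > STANDARD_MAX_MB or pages > STANDARD_MAX_PAGES:
        raise ValueError("input exceeds the Standard API limits")
    return "standard"
```

For example, `choose_api(50, 5, has_token=True)` selects the Standard API because the file is over 10 MB, while the same call with `size_mb=2` stays on the free Agent API.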

Supported modalities

Modality  Extensions                                 OCR
--------  -----------------------------------------  --------
PDF       .pdf                                       --ocr
Image     .png .jpg .jpeg .jp2 .webp .gif .bmp       built-in
Word      .doc .docx
Slides    .ppt .pptx
Sheet     .xls .xlsx
HTML      .html (Standard API, MinerU-HTML model)
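
A caller that wants to filter a directory before submitting it could mirror the table above as a lookup. This is a minimal sketch: `MODALITY_BY_EXTENSION` and `modality_for` are hypothetical names, and the script's own detection logic may differ.

```python
from pathlib import Path
from typing import Optional

# Extension-to-modality mapping copied from the table in this README.
MODALITY_BY_EXTENSION = {
    ".pdf": "PDF",
    **dict.fromkeys(
        (".png", ".jpg", ".jpeg", ".jp2", ".webp", ".gif", ".bmp"), "Image"
    ),
    **dict.fromkeys((".doc", ".docx"), "Word"),
    **dict.fromkeys((".ppt", ".pptx"), "Slides"),
    **dict.fromkeys((".xls", ".xlsx"), "Sheet"),
    ".html": "HTML",
}


def modality_for(filename: str) -> Optional[str]:
    """Return the modality for a file name, or None if it is not listed."""
    return MODALITY_BY_EXTENSION.get(Path(filename).suffix.lower())
```

Lowercasing the suffix makes `Scan.JPG` and `scan.jpg` resolve to the same modality.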

Common options

INPUT...          One or more files, a directory, or a URL
--output, -o      Output directory (default: ./output)
--api             auto | agent | standard   (default: auto)
--model           pipeline | vlm | MinerU-HTML  (default: vlm)
--format          docx | html | latex  (repeatable; forces Standard API)
--lang            OCR/document language (default: ch)
--ocr             Enable OCR for scanned documents
--pages           Page range, e.g. "1-10" or "2,4-6"
--workers, -w     Concurrent submit/upload/download slots (default: 8)
--resume          Skip inputs already parsed
--stdout          Print Markdown to stdout
--json            Print machine-readable status to stdout
--to SINK         Deliver into a content tool (repeatable); --list-sinks to enumerate
--obsidian PATH   Shortcut for --to obsidian with this vault
--engine          cloud | local | auto  (local/auto parse born-digital PDFs offline)
--split           Split oversized PDFs past the page caps, parse parts, merge (needs pypdf)
--chunk           Emit heading-aware RAG chunks (.chunks.json + --json)
--doctor          Environment self-check and exit
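
The `--pages` syntax shown above ("1-10" or "2,4-6") can be validated before a run. The following is a sketch of one way to expand such a spec; `parse_pages` is a hypothetical helper, and it assumes page numbers are 1-based and ranges are inclusive, which this README does not state explicitly.

```python
def parse_pages(spec: str) -> list[int]:
    """Expand a spec like "2,4-6" into a sorted list of page numbers."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.update(range(start, end + 1))  # inclusive upper bound
        else:
            pages.add(int(part))
    if any(page < 1 for page in pages):
        raise ValueError("page numbers are assumed to start at 1")
    return sorted(pages)
```

Malformed input such as `"a-b"` or an empty element fails in `int()` with a `ValueError`, so a single exception type covers every rejection.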

MCP server

Expose MinerU over MCP (zero-dependency stdio JSON-RPC) so an MCP host can call it:

python3 scripts/mineru_mcp.py

Tools: mineru_parse, mineru_parse_to (parse + deliver to sinks), mineru_list_sinks.
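
Over the stdio transport, an MCP host exchanges newline-delimited JSON-RPC 2.0 messages with the server. The sketch below builds a `tools/list` request, which a host can use to discover the input schema of each tool instead of guessing argument names. `jsonrpc_request` is a hypothetical helper, and a real session would first perform the `initialize` handshake defined by the MCP specification.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # monotonically increasing JSON-RPC ids


def jsonrpc_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    # json.dumps emits no raw newlines, so one message occupies exactly one line.
    return json.dumps(message) + "\n"


list_tools_line = jsonrpc_request("tools/list")
```

The resulting line would be written to the stdin of the `python3 scripts/mineru_mcp.py` process, and the response read back from its stdout.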

Deliver into your tools (--to)

Parse once and push the Markdown into content tools via each one's official path:

python3 scripts/mineru.py paper.pdf --to obsidian --to notion --to feishu

Targets: obsidian logseq siyuan notion linear yuque coda slack feishu confluence onenote ticktick dingtalk airtable wecom (all zero-dependency), plus roam and wps via optional extras. Each reads its config from env vars (run --list-sinks). Per-target auth, fidelity, and image notes: references/integrations.md.
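
When the sink list comes from user input, it can be checked against the target names listed above before the command is built. This is a minimal sketch: `KNOWN_SINKS` and `sink_args` are hypothetical names, and the authoritative list is whatever `--list-sinks` prints for the installed version.

```python
# Target names copied from this README; --list-sinks is the source of truth.
KNOWN_SINKS = frozenset(
    "obsidian logseq siyuan notion linear yuque coda slack feishu confluence "
    "onenote ticktick dingtalk airtable wecom roam wps".split()
)


def sink_args(sinks: list[str]) -> list[str]:
    """Turn sink names into repeated --to flags, rejecting unknown names."""
    unknown = [name for name in sinks if name not in KNOWN_SINKS]
    if unknown:
        raise ValueError(f"unknown sink(s): {', '.join(unknown)}")
    args: list[str] = []
    for name in sinks:
        args += ["--to", name]
    return args
```

The returned list can be appended to the argv of a `scripts/mineru.py` invocation.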

Output

output/
└── document-name/
    ├── document-name.md    # clean Markdown
    └── images/             # extracted figures (Standard API)
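
Given the layout above, a downstream tool can predict where the Markdown for an input will land. The sketch assumes that `document-name` is the input file's name without its extension; that mapping is not spelled out in this README, and URL inputs may be named differently. `expected_markdown_path` is a hypothetical helper.

```python
from pathlib import Path


def expected_markdown_path(source: str, output_dir: str = "./output") -> Path:
    """Return <output_dir>/<name>/<name>.md for a local input file."""
    # Assumption: the output folder is named after the input file's stem.
    name = Path(source).stem
    return Path(output_dir) / name / f"{name}.md"
```

For instance, `expected_markdown_path("pdfs/paper.pdf", "./out")` yields `out/paper/paper.md`.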

Performance (real, measured)

End-to-end latency for the official demo PDF via the free Agent API: cold ≈ 14 s · warm ≈ 13 s (submit → poll → download). Batches scale with --workers. Numbers come from the no-mock live benchmark in tests/test_live.py.

Testing

python3 -m pytest                      # fast unit suite (offline)
MINERU_LIVE=1 python3 -m pytest -m live -s   # real API + benchmark (no mocks)

API Reference

See references/api_reference.md. Official docs: https://mineru.net/apiManage/docs · Token: https://mineru.net/apiManage/token

Version history

3 versions in total

  • v3.3.1 (current)
    2026-06-03 12:43
  • v3.1.0
    2026-06-01 12:10
  • v2.1.0
    2026-03-30 00:04 (security check: safe)

