
Mistral OCR

Extract text, tables, and images from PDFs or images using Mistral OCR API and output in Markdown, JSON, or HTML formats.
Author: yzdame
Category: Content creation · Source: clawhub · Version: v1.0.4 (1 version) · API key: required
Stars: 5 · Downloads: 2,185 · Installs: 313
Tags: #document #latest #ocr #pdf

Overview

⚠️ Privacy Warning

IMPORTANT - READ BEFORE INSTALLING:

This skill uploads your files to Mistral's cloud servers for OCR processing.

Do NOT use with sensitive or confidential documents unless:

  • You trust Mistral's data handling policies
  • You have reviewed Mistral's privacy policy
  • You accept that file contents will be transmitted and processed remotely

For sensitive documents, use offline/local OCR tools instead.


Mistral OCR Skill

An OCR tool that converts PDF files and images into Markdown, JSON, or HTML using Mistral's OCR API.

Installation

# Clone or download this repository
git clone https://github.com/YZDame/Mistral-OCR-SKILL.git
cd Mistral-OCR-SKILL

# Install dependencies
pip install -r requirements.txt

🔑 API Key Setup (Required)

Get your API key:

👉 https://console.mistral.ai/home

Set the environment variable:

export MISTRAL_API_KEY=your_api_key
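
A script that depends on this variable can fail early with a clear message when it is missing. The following is a minimal sketch only; `get_api_key` is a hypothetical helper, not a function confirmed to exist in this skill's code.

```python
import os
import sys


def get_api_key(env=os.environ):
    """Return the Mistral API key from the environment, or exit with a hint.

    Hypothetical helper for illustration; the skill's own lookup may differ.
    """
    key = env.get("MISTRAL_API_KEY", "").strip()
    if not key:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit("MISTRAL_API_KEY is not set; run: export MISTRAL_API_KEY=your_api_key")
    return key
```

Passing the environment as a parameter keeps the check easy to exercise without touching the real shell environment.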

CLI Usage

cd scripts

# Process PDF to Markdown
python3 mistral_ocr.py -i input.pdf

# Process PDF to JSON
python3 mistral_ocr.py -i input.pdf -f json

# Specify output directory
python3 mistral_ocr.py -i input.pdf -o ~/my_ocr_results

Arguments

| Flag | Description |
|------|-------------|
| -i, --input | Input file path (required) |
| -f, --format | Output format: markdown/json/html (default: markdown) |
| -o, --output | Output directory |
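
For readers who want to see how these flags fit together, here is a rough `argparse` sketch that mirrors the arguments listed above. It is an illustration, not the actual source of `mistral_ocr.py`; in particular, the default of `None` for `--output` is an assumption, since the page does not state where results go when the flag is omitted.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="mistral_ocr.py",
        description="Convert a PDF or image to Markdown, JSON, or HTML.",
    )
    parser.add_argument("-i", "--input", required=True, help="Input file path")
    parser.add_argument(
        "-f", "--format",
        choices=["markdown", "json", "html"],
        default="markdown",
        help="Output format (default: markdown)",
    )
    # Assumption: no documented default, so None stands in for "not given".
    parser.add_argument("-o", "--output", default=None, help="Output directory")
    return parser


args = build_parser().parse_args(["-i", "input.pdf", "-f", "json"])
print(args.input, args.format, args.output)  # prints: input.pdf json None
```

Restricting `--format` with `choices` means an unsupported format is rejected at parse time rather than after a file has been uploaded.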

Data Privacy

What happens to your files:

  1. Files are uploaded to Mistral's OCR API
  2. Files are processed on Mistral servers
  3. Processing results are returned to you
  4. Files are not stored on Mistral servers (per Mistral policy)

For more details, see: https://mistral.ai/privacy-policy
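
To make the first step concrete, the sketch below builds a request body in which the whole file is embedded as a base64 data URI, which is one way a client can hand a document to a remote OCR service. This page does not show the skill's actual request code, so everything here is an assumption: `build_ocr_payload` is a hypothetical helper, and the field names (`model`, `document`, `document_url`, `image_url`) and the model name `mistral-ocr-latest` follow my understanding of Mistral's public OCR API and should be checked against the official reference. The skill may instead upload files through a separate file endpoint.

```python
import base64
import mimetypes
from pathlib import Path


def build_ocr_payload(path, model="mistral-ocr-latest"):
    """Build a JSON-serializable OCR request body for a local file.

    Hypothetical helper: field names are assumptions, not taken from the skill.
    Nothing is sent over the network here; the point is that the payload
    contains the full file contents.
    """
    data = Path(path).read_bytes()
    mime = mimetypes.guess_type(str(path))[0] or "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    data_uri = f"data:{mime};base64,{encoded}"
    if mime.startswith("image/"):
        document = {"type": "image_url", "image_url": data_uri}
    else:
        document = {"type": "document_url", "document_url": data_uri}
    return {"model": model, "document": document}
```

Because the entire file is encoded into the request, anything in the document is transmitted to the remote service, which is why the privacy warning at the top of this page applies to every run.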

License

MIT

Version History

1 version in total

  • v1.0.4 (current)
    2026-03-28 19:24, security status: safe

Security Scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
