
audio-transcribe-summarize

Transcribe audio/video files to text and generate structured summaries using SenseAudio ASR API. Use when the user asks to transcribe, summarize, or take not...
q1lin570

Overview

Audio/Video Transcription & Summarization

Transcribe audio/video files using the SenseASR API (api.senseaudio.cn), then summarize the content into structured notes.

{baseDir} refers to this skill's directory.

Prerequisites

  • Environment variable SENSEAUDIO_API_KEY configured (get your key at https://senseaudio.cn/platform/api-key)
  • Python 3.8+ with requests installed
  • For large files (>10MB): ffmpeg installed for splitting (macOS: brew install ffmpeg; Windows: download from ffmpeg.org and add it to PATH; Linux: apt install ffmpeg)
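
The prerequisites above can be verified before running anything. The following is a minimal sketch, not part of the skill itself: the function name `check_prerequisites` is hypothetical, and it only tests that the environment variable is non-empty, not that the key is valid.

```python
import importlib.util
import os
import shutil
import sys


def check_prerequisites(env=None):
    """Return a list of problems; an empty list means everything was found.

    `env` defaults to os.environ and can be overridden for testing.
    """
    env = os.environ if env is None else env
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8+ is required")
    if not env.get("SENSEAUDIO_API_KEY"):
        problems.append("SENSEAUDIO_API_KEY is not set")
    if importlib.util.find_spec("requests") is None:
        problems.append("the requests package is not installed")
    if shutil.which("ffmpeg") is None:
        # ffmpeg is only needed when a file exceeds the 10MB request limit.
        problems.append("ffmpeg not found on PATH (needed for files over 10MB)")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("warning:", problem)
```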

Quick Start

  1. Run the transcription script:
     python {baseDir}/scripts/transcribe.py <audio_file> [--model sense-asr-pro] [--language zh] [--speakers] [--sentiment] [--translate en]
  2. The script outputs a transcript .txt file alongside the source file
  3. Read the transcript and generate a summary (see Summary Format below)

Workflow

Step 1: Assess the Audio File

Check file size and format:

  • Supported formats: wav, mp3, ogg, flac, aac, m4a, mp4
  • Max file size per request: 10MB
  • If file > 10MB, the script auto-splits using ffmpeg
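
The checks in this step can be sketched as a small helper. This is an illustration under assumptions: `assess` is a hypothetical name, the format is judged by file extension only, and "10MB" is interpreted as 10 MiB (the exact threshold the script uses is not stated here).

```python
from pathlib import Path

SUPPORTED_FORMATS = {"wav", "mp3", "ogg", "flac", "aac", "m4a", "mp4"}
MAX_REQUEST_BYTES = 10 * 1024 * 1024  # assumption: 10MB means 10 MiB


def assess(path, size_bytes=None):
    """Report whether a file has a supported extension and needs splitting.

    `size_bytes` can be passed explicitly; otherwise it is read from disk.
    """
    path = Path(path)
    ext = path.suffix.lower().lstrip(".")
    if size_bytes is None:
        size_bytes = path.stat().st_size
    return {
        "format": ext,
        "supported": ext in SUPPORTED_FORMATS,
        "size_bytes": size_bytes,
        "needs_split": size_bytes > MAX_REQUEST_BYTES,
    }
```

A file flagged with `needs_split` requires ffmpeg to be installed, since the script relies on it for splitting.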

Step 2: Choose the Right Model

| Model | Use When |
|---|---|
| sense-asr-lite | Quick batch transcription, simple audio, cost-sensitive |
| sense-asr | General transcription, need speaker separation or timestamps |
| sense-asr-pro | High accuracy needed: meetings, interviews, complex audio |
| sense-asr-deepthink | Noisy audio, dialects, heavy jargon, speech-to-clean-text |

Default to sense-asr-pro for best quality.
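
One way to encode the guidance above is a lookup keyed by the main concern. The priority labels (`cost`, `general`, `accuracy`, `noisy`) and the function name are hypothetical shorthand for the "Use When" descriptions; only the model names come from this document.

```python
# Hypothetical priority labels mapped to the models listed above.
MODEL_BY_PRIORITY = {
    "cost": "sense-asr-lite",
    "general": "sense-asr",
    "accuracy": "sense-asr-pro",
    "noisy": "sense-asr-deepthink",
}


def choose_model(priority="accuracy"):
    """Return the model name for a priority; defaults to sense-asr-pro."""
    try:
        return MODEL_BY_PRIORITY[priority]
    except KeyError:
        valid = ", ".join(sorted(MODEL_BY_PRIORITY))
        raise ValueError(f"unknown priority {priority!r}; expected one of: {valid}")
```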

Step 3: Transcribe

Run the transcription script. Key options:

# Basic transcription
python {baseDir}/scripts/transcribe.py recording.mp3

# Meeting with multiple speakers + emotion
python {baseDir}/scripts/transcribe.py meeting.wav \
  --model sense-asr-pro \
  --speakers --max-speakers 4 \
  --sentiment \
  --timestamps segment

# Transcribe and translate to English
python {baseDir}/scripts/transcribe.py lecture.mp3 \
  --model sense-asr \
  --translate en
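
When the script is driven from Python rather than a shell, the command line can be assembled as an argument list. The sketch below uses only the flags shown in this document; the wrapper names `build_command` and `transcribe` are hypothetical, and `script` stands for the resolved path to `{baseDir}/scripts/transcribe.py`.

```python
import subprocess
import sys


def build_command(script, audio_file, model="sense-asr-pro", language=None,
                  speakers=False, max_speakers=None, sentiment=False,
                  timestamps=None, translate=None):
    """Build the argv list for transcribe.py from keyword options."""
    cmd = [sys.executable, str(script), str(audio_file), "--model", model]
    if language:
        cmd += ["--language", language]
    if speakers:
        cmd.append("--speakers")
        if max_speakers is not None:
            cmd += ["--max-speakers", str(max_speakers)]
    if sentiment:
        cmd.append("--sentiment")
    if timestamps:
        cmd += ["--timestamps", timestamps]
    if translate:
        cmd += ["--translate", translate]
    return cmd


def transcribe(script, audio_file, **options):
    """Run the script and raise CalledProcessError on a non-zero exit."""
    subprocess.run(build_command(script, audio_file, **options), check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with file names that contain spaces.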

Step 4: Summarize

After transcription, read the transcript file and produce a summary using the format below.

Summary Format

Generate summaries in this structure:

# [Title - inferred from content]

**Source**: filename.mp3
**Duration**: X min Y sec
**Date**: YYYY-MM-DD
**Speakers**: [if speaker diarization was used]

## Key Points
- Point 1
- Point 2
- ...

## Detailed Summary
[2-4 paragraph summary of the content organized by topic/chronology]

## Action Items
- [ ] Action item 1 (assigned to Speaker X, if applicable)
- [ ] Action item 2

## Notable Quotes
> "Direct quote from transcript" — Speaker X, [timestamp if available]

## Full Transcript
<details>
<summary>Click to expand full transcript</summary>

[Full transcript text here, with speaker labels and timestamps if available]

</details>

Adapt the template based on content type:

  • Meeting: emphasize action items, decisions, speaker contributions
  • Lecture/Talk: emphasize key concepts, learning points, structure
  • Interview: emphasize Q&A pairs, key responses
  • Podcast: emphasize topics discussed, interesting insights
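
The template and its adaptation can be illustrated with a small renderer that omits sections with no content. This is a sketch only: `render_summary` and its parameters are hypothetical, and the summary text itself still has to be written from the transcript.

```python
def render_summary(title, source, duration_sec, date, key_points,
                   detailed_summary, transcript, speakers=None,
                   action_items=(), quotes=()):
    """Render the summary structure as text, skipping empty optional sections."""
    minutes, seconds = divmod(int(duration_sec), 60)
    lines = [
        f"# {title}",
        "",
        f"**Source**: {source}",
        f"**Duration**: {minutes} min {seconds} sec",
        f"**Date**: {date}",
    ]
    if speakers:
        lines.append(f"**Speakers**: {', '.join(speakers)}")
    lines += ["", "## Key Points"] + [f"- {point}" for point in key_points]
    lines += ["", "## Detailed Summary", detailed_summary]
    if action_items:
        lines += ["", "## Action Items"] + [f"- [ ] {item}" for item in action_items]
    if quotes:
        # Each quote is expected to be pre-formatted with its attribution.
        lines += ["", "## Notable Quotes"] + [f"> {quote}" for quote in quotes]
    lines += [
        "",
        "## Full Transcript",
        "<details>",
        "<summary>Click to expand full transcript</summary>",
        "",
        transcript,
        "",
        "</details>",
    ]
    return "\n".join(lines)
```

For a lecture with no action items, leaving `action_items` empty drops that section, which matches the advice to emphasize only what suits the content type.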

API Reference

For full SenseASR API parameters and response formats, see api-reference.md.

Version History

1 version in total

  • v1.0.1 (current)
    2026-05-02 01:18, security scan: safe
