Use this skill to run voice interaction with the user's preferred timbre.

Environment variables:
- MOSI_API_KEY (required)
- MOSI_BASE_URL (optional, default https://studio.mosi.cn)

Always send:
- Authorization: Bearer header carrying the API key

Collect:
- text (required, what to speak)
- a voice source: voice_id (preferred when available), or reference_audio (public URL), or an audio clip to upload and clone

Optional:
- expected_duration_sec
- sampling_params:
  - max_new_tokens (default 512)
  - temperature (default 1.7)
  - top_p (default 0.8)
  - top_k (default 25)
- meta_info (default false)

Workflow:
1. If voice_id is available, use it directly.
2. To clone a voice from an audio clip:
   - POST /api/v1/files/upload with multipart field file.
   - POST /api/v1/voice/clone with file_id (or url).
   - Poll GET /api/v1/voices/{voice_id} until ACTIVE or timeout.
3. If a reference_audio URL is available, use it directly in TTS.
4. Call POST /v1/audio/tts with:
   - model: "moss-tts"
   - text
   - voice_id or reference_audio
5. Decode audio_data (base64) to WAV.

Report:
- duration_s and usage, when present
- the voice_id used

Error handling:
- 4010 or 4011: API key missing or invalid. Ask the user to fix MOSI_API_KEY.
- 4020: insufficient credits. Ask the user to recharge.
- 4029: rate limited. Retry with exponential backoff.
- 5002: invalid audio URL or decode failed. Ask the user for another clip.
- 5004: timeout. Shorten the text and retry.

Reuse the same voice_id for multi-turn voice chat to reduce latency.
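The request and error-handling rules above can be sketched in Python as follows. This is a minimal illustration, not the service's client library: the helper names (`auth_headers`, `build_tts_payload`, `save_audio`, `backoff_delays`) and the action labels in `ERROR_ACTIONS` are hypothetical. It also assumes that the Bearer token is the `MOSI_API_KEY` value, that `sampling_params` is sent as a nested object in the TTS body, and that the decoded `audio_data` bytes already form a complete WAV file. No network call is made; the payload would be sent to `POST /v1/audio/tts` with an HTTP client of your choice.

```python
import base64

DEFAULT_BASE_URL = "https://studio.mosi.cn"

# Defaults listed in the skill description.
DEFAULT_SAMPLING = {"max_new_tokens": 512, "temperature": 1.7, "top_p": 0.8, "top_k": 25}

# Error code -> suggested action (labels are illustrative, not API values).
ERROR_ACTIONS = {
    4010: "fix_api_key",
    4011: "fix_api_key",
    4020: "recharge",
    4029: "retry_with_backoff",
    5002: "ask_for_another_clip",
    5004: "shorten_text_and_retry",
}


def auth_headers(api_key):
    """Build the header sent with every request (assumes token = MOSI_API_KEY)."""
    return {"Authorization": f"Bearer {api_key}"}


def build_tts_payload(text, voice_id=None, reference_audio=None, sampling_params=None):
    """Build the JSON body for the TTS call, preferring voice_id over reference_audio."""
    if not text:
        raise ValueError("text is required")
    payload = {"model": "moss-tts", "text": text}
    if voice_id:
        payload["voice_id"] = voice_id
    elif reference_audio:
        payload["reference_audio"] = reference_audio
    else:
        raise ValueError("either voice_id or reference_audio is required")
    if sampling_params:
        # Assumption: overrides are merged over the documented defaults.
        payload["sampling_params"] = {**DEFAULT_SAMPLING, **sampling_params}
    return payload


def save_audio(audio_data_b64, path):
    """Decode base64 audio_data and write it to path; returns the byte count."""
    raw = base64.b64decode(audio_data_b64)
    with open(path, "wb") as f:
        f.write(raw)
    return len(raw)


def backoff_delays(retries=4, base=1.0, cap=30.0):
    """Exponential backoff schedule in seconds for code 4029 (rate limited)."""
    return [min(cap, base * 2 ** i) for i in range(retries)]
```

For example, `build_tts_payload("Hello", voice_id="v1")` yields a body containing only `model`, `text`, and `voice_id`, and a 4029 response would be retried after sleeping through each delay in `backoff_delays()` in turn.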