← 返回
AI智能 Key

Qwen

Build and route Qwen chat, coding, reasoning, and vision workflows across hosted and self-hosted endpoints with safer debugging.
在托管和自托管端点上构建并路由 Qwen 聊天、编码、推理和视觉工作流,使用更安全的调试。
ivangdavila
AI智能 clawhub v1.0.0 1 版本 99794.5 Key: 需要
★ 1
Stars
📥 951
下载
💾 62
安装
1
版本
#latest

概述

When to Use

User needs Qwen to work reliably for chat, coding, reasoning, structured outputs, or vision. Agent handles surface selection, live model verification, hosted-versus-local tradeoffs, and failure recovery before the workflow reaches production.

Architecture

Memory lives in ~/qwen/. If ~/qwen/ does not exist, run setup.md. See memory-template.md for structure.

~/qwen/
├── memory.md         # Status, activation rules, and deployment defaults
├── routes.md         # Preferred route per workload
├── servers.md        # Known local or hosted endpoints
├── experiments.md    # Prompt, parser, and latency notes
└── logs/             # Optional sanitized repro payloads

Quick Reference

Use the smallest file that resolves the blocker.

TopicFile
-------------
Setup processsetup.md
Memory templatememory-template.md
Hosted and local request patternsapi-patterns.md
Workload routing matrixrouting-matrix.md
Hosted versus self-hosted decisionsdeployment-paths.md
Tool-calling and structured output guardrailstool-calling.md
Debugging and recoverytroubleshooting.md

Requirements

  • curl and jq for minimal endpoint checks
  • Hosted Qwen usually needs a DASHSCOPE_API_KEY
  • Self-hosted Qwen may use Ollama, vLLM, SGLang, or another OpenAI-compatible server
  • Keep secrets in environment variables only

Core Rules

1. Lock the Surface Before Tuning the Model

  • Identify the real execution surface first: Alibaba Model Studio hosted API, another OpenAI-compatible provider, or a self-hosted server.
  • Most "Qwen issues" are actually endpoint, region, server, or chat-template issues rather than model quality issues.

2. Verify Live Availability Before Naming Any Model

  • Start with a /models or equivalent health check and copy the live model ID from the response.
  • Never trust stale screenshots, old blog posts, or remembered IDs for production routing.

3. Route by Workload, Not by Brand Loyalty

  • Split the request into one of these paths: fast chat, deep reasoning, coding agent, deterministic JSON, or vision.
  • Pick the smallest Qwen family and server path that can reliably do that job.

4. Treat Structured Output as a Separate Reliability Problem

  • If Qwen is feeding tools, JSON, or downstream writes, use strict schemas, low temperature, and parser validation before acting.
  • If the first pass is creative or reasoning-heavy, add a second deterministic normalization pass instead of forcing one prompt to do both.

5. Separate Model Problems From Server Problems

  • When behavior changes after migration, isolate the variable: model family, quantization, chat template, reasoning mode, parser, or backend.
  • Reproduce with one minimal payload before changing prompts, infrastructure, and business logic at the same time.

6. Compare Hosted and Self-Hosted Explicitly

  • Hosted Qwen usually wins on speed to first success and managed multimodal access.
  • Self-hosted Qwen only wins when privacy, local cost control, or offline use clearly outweigh operational overhead.

7. Ask Before Creating Persistent State

  • Work statelessly by default.
  • Only create ~/qwen/ notes, saved routes, or repro logs after the user wants continuity across Qwen tasks.

Common Traps

  • Treating "Qwen" as one interchangeable thing -> hosted APIs, Ollama, vLLM, and agent frameworks behave differently.
  • Hardcoding dated model IDs -> region and release cadence make old IDs fail fast.
  • Mixing free-form reasoning with strict JSON output -> parsing breaks when one prompt is asked to do both.
  • Blaming the model for local slowness -> Apple Silicon and Ollama often fail because of model size, quantization, or oversized context.
  • Migrating from another OpenAI-compatible backend without rechecking tool-calling -> parser and chat-template differences can break automation.

External Endpoints

Use only the smallest hosted endpoint that answers the current question.

EndpointData SentPurpose
------------------------------
https://dashscope.aliyuncs.com/compatible-mode/v1/modelsAuth header onlyMainland China model discovery
https://dashscope-intl.aliyuncs.com/compatible-mode/v1/modelsAuth header onlyInternational model discovery
https://dashscope-us.aliyuncs.com/compatible-mode/v1/modelsAuth header onlyUnited States model discovery
https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completionsPrompt messages and optionsHosted Qwen chat completions in Beijing region
https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completionsPrompt messages and optionsHosted Qwen chat completions in Singapore region
https://dashscope-us.aliyuncs.com/compatible-mode/v1/chat/completionsPrompt messages and optionsHosted Qwen chat completions in Virginia region

No other data is sent externally.

Security & Privacy

Data that leaves your machine:

  • Prompt content sent to Alibaba Cloud Model Studio when using hosted Qwen
  • Optional images or multimodal payloads sent to hosted Qwen vision endpoints when requested

Data that stays local:

  • Deployment preferences and routing notes in ~/qwen/ after user approval
  • Local server URLs, workload notes, and sanitized repro payloads kept for debugging

This skill does NOT:

  • Store API keys in markdown files
  • Send data to undeclared third-party endpoints
  • Assume local servers are safe to expose publicly
  • Modify its own skill files

Scope

This skill ONLY:

  • routes Qwen work across hosted and self-hosted execution surfaces
  • chooses model families for chat, coding, reasoning, vision, and automation
  • debugs migration, parser, latency, and endpoint problems
  • stores lightweight local notes only after user approval

This skill NEVER:

  • invent live model availability without checking
  • persist secrets in ~/qwen/
  • execute destructive downstream automation without validated output
  • pretend one backend's tool-calling behavior applies everywhere

Trust

Using hosted Qwen sends prompt data to Alibaba Cloud Model Studio.

Only install if you trust that service with your data, or keep Qwen fully self-hosted.

Related Skills

Install with clawhub install if user confirms:

  • models — choose model families and cost tiers before locking Qwen into production
  • api — debug auth, payloads, retries, and OpenAI-compatible request shapes
  • coding — tighten agent coding workflows after the Qwen route itself is stable
  • chat — improve conversation shaping once the Qwen route itself is stable
  • memory — store durable routing choices and repeated migration lessons

Feedback

  • If useful: clawhub star qwen
  • Stay updated: clawhub sync

版本历史

共 1 个版本

  • v1.0.0 当前
    2026-03-29 16:24 安全 安全

安全检测

腾讯云安全 (Keen)

安全,无风险
查看报告

腾讯云安全 (Sanbu)

安全,无风险
查看报告

🔗 相关推荐

ai-intelligence

self-improving agent

pskoett
捕获经验教训、错误及修正内容,以实现持续改进。适用于以下场景:(1)命令或操作意外失败;(2)用户纠正Claude(如“不,那不对……”“实际上……”);(3)用户请求的功能不存在;(4)外部API或工具出现故障;(5)Claude发现自身
★ 4,067 📥 802,831
ai-intelligence

ontology

oswalpalash
类型化知识图谱,用于结构化智能体记忆与可组合技能。适用于以下场景:创建/查询实体(人物、项目、任务、事件、文档)、关联相关对象、强制执行约束、将多步操作规划为图谱变换,或当技能需要共享状态时。触发关键词包括"记住""我知道关于什么""将X链
★ 716 📥 244,356
ai-intelligence

Proactive Agent

halthelobster
将AI智能体从任务执行者升级为主动预判需求、持续优化的智能伙伴。集成WAL协议、工作缓冲区、自主定时任务及实战验证模式。Hal Stack核心组件 🦞
★ 842 📥 213,713