CogDx Calibration Audit

Run a calibration audit on an AI agent's outputs via the Cerebratech CogDx API ($0.05 per call, credits accepted). Use when an agent's stated confidence doesn't match reality.
drkavner
Uncategorized clawhub v1.0.1 1 version 100000 Key: not required

Overview

CogDx Calibration Audit

External statistical audit of your confidence calibration. Pure math — no LLM backend. Results in < 2 minutes.

Cost

  • $0.05 per call (x402 payment on Base/USDC, or use credit balance)
  • Credits from /feedback submissions apply — first audit often effectively free
  • Payment address: Cerebratech.eth

When to Use

  • Confidence scores don't feel reliable
  • Downstream agents or humans keep second-guessing your outputs
  • Pre-deployment validation for high-stakes domains
  • After retraining (pair with /feedback to verify improvement)

API Call

1. Check whether you have a credit balance first:

GET https://api.cerebratech.ai/credits?wallet=your-agent-id

2. Run the audit:

POST https://api.cerebratech.ai/calibration_audit
Content-Type: application/json
X-PAYMENT: <x402-signature>  # or omit if using credits
X-WALLET: your-agent-id      # for credit balance payment
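
The two calls above can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `credits_url` and `audit_request` are hypothetical, and the rule that at least one of the wallet or the x402 signature must be supplied is an assumption drawn from the header comments. The request is built but not sent.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.cerebratech.ai"


def credits_url(wallet: str) -> str:
    """URL for step 1, the credit balance lookup (hypothetical helper)."""
    return f"{API_BASE}/credits?{urlencode({'wallet': wallet})}"


def audit_request(payload: dict,
                  wallet: Optional[str] = None,
                  x402_signature: Optional[str] = None) -> Request:
    """Build, but do not send, the POST for step 2 (hypothetical helper)."""
    # Assumption: the call needs either credits (X-WALLET) or an x402 payment.
    if not wallet and not x402_signature:
        raise ValueError("provide a wallet (credits) or an x402 signature")
    headers = {"Content-Type": "application/json"}
    if x402_signature:
        headers["X-PAYMENT"] = x402_signature
    if wallet:
        headers["X-WALLET"] = wallet
    # urllib normalizes header-name capitalization; HTTP header names are
    # case-insensitive, so this does not change their meaning.
    return Request(
        f"{API_BASE}/calibration_audit",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; how the service signals a missing payment or an empty credit balance is not shown here, so error handling is left to references/api.md.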

Minimum payload (at least 10 samples; only two are shown here for brevity):

{
  "agent_id": "your-agent-id",
  "sample_outputs": [
    {
      "prompt": "Is X true?",
      "response": "Yes",
      "stated_confidence": 0.92,
      "correct": true
    },
    {
      "prompt": "Will Y happen?",
      "response": "Likely",
      "stated_confidence": 0.75,
      "correct": false
    }
  ],
  "domain": "classification"
}

Recommended: 50–200 samples for reliable results.
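
Checking a payload locally before paying for a call can catch obvious mistakes. The sketch below enforces the 10-sample minimum and reports whether the sample count falls in the recommended 50–200 range. The function names are hypothetical, and the checks that `stated_confidence` lies in [0, 1] and that `correct` is a boolean are assumptions based on the example payload, not documented rules.

```python
MIN_SAMPLES = 10            # minimum stated above
RECOMMENDED = (50, 200)     # recommended range stated above
REQUIRED_KEYS = ("prompt", "response", "stated_confidence", "correct")


def build_payload(agent_id: str, samples: list, domain: str) -> dict:
    """Validate samples and assemble the request body (hypothetical helper)."""
    if len(samples) < MIN_SAMPLES:
        raise ValueError(f"need at least {MIN_SAMPLES} samples, got {len(samples)}")
    for i, sample in enumerate(samples):
        missing = [key for key in REQUIRED_KEYS if key not in sample]
        if missing:
            raise ValueError(f"sample {i} is missing {missing}")
        # Assumption: confidences are probabilities in [0, 1].
        if not 0.0 <= sample["stated_confidence"] <= 1.0:
            raise ValueError(f"sample {i}: stated_confidence outside [0, 1]")
        # Assumption: correctness is a JSON boolean, as in the example.
        if not isinstance(sample["correct"], bool):
            raise ValueError(f"sample {i}: correct must be true or false")
    return {"agent_id": agent_id, "sample_outputs": list(samples), "domain": domain}


def in_recommended_range(samples: list) -> bool:
    """True when the sample count is within the recommended 50-200 range."""
    return RECOMMENDED[0] <= len(samples) <= RECOMMENDED[1]
```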

Response

{
  "diagnosis_id": "cal_abc123",
  "calibration_score": 0.71,
  "overconfidence_index": 0.23,
  "underconfidence_index": 0.04,
  "brier_score": 0.18,
  "confidence_bands": [
    {
      "stated": "0.9-1.0",
      "actual_accuracy": 0.67,
      "sample_size": 23,
      "calibration_error": 0.28
    }
  ],
  "recommendations": [
    "Reduce confidence on high-stakes single-source claims",
    "Your 0.9+ band is overconfident by 28%. Retrain on 200 negative examples in this confidence range."
  ],
  "retrain_targets": {
    "distribution": "high_confidence_errors",
    "suggested_sample_count": 200,
    "domain_focus": "classification"
  }
}
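
To sanity-check a response against your own data, two of the figures can be approximated locally. The sketch below uses the standard Brier score (mean squared difference between stated confidence and the 0/1 outcome) and assumes that a band's `calibration_error` is the band midpoint minus its actual accuracy. That assumption matches the example above (0.95 − 0.67 = 0.28), but the service's exact formulas are not confirmed here, and both function names are hypothetical.

```python
def brier_score(samples: list) -> float:
    """Mean squared difference between stated confidence and the 0/1 outcome."""
    total = sum(
        (s["stated_confidence"] - (1.0 if s["correct"] else 0.0)) ** 2
        for s in samples
    )
    return total / len(samples)


def band_report(samples: list, low: float, high: float):
    """Accuracy and calibration error for samples with low <= confidence <= high."""
    in_band = [s for s in samples if low <= s["stated_confidence"] <= high]
    if not in_band:
        return None
    accuracy = sum(1 for s in in_band if s["correct"]) / len(in_band)
    # Assumption: calibration error = band midpoint minus observed accuracy.
    midpoint = (low + high) / 2
    return {
        "stated": f"{low}-{high}",
        "actual_accuracy": accuracy,
        "sample_size": len(in_band),
        "calibration_error": midpoint - accuracy,
    }
```

A positive `calibration_error` under this convention means the band is overconfident. Large differences between local numbers and the API's numbers are worth investigating before retraining.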

After the Audit

  1. Retrain on the retrain_targets distribution
  2. Wait 7 days, collect new outputs
  3. Run cogdx-feedback (free) to verify that the improvement transferred and to earn credits
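
The follow-up steps can be sketched as two small helpers. Both names are hypothetical, and two assumptions are baked in: the 7-day wait is counted from the day retraining finishes, and the `high_confidence_errors` distribution is read as incorrect answers with stated confidence of 0.9 or higher. Neither is confirmed by the service.

```python
from datetime import date, timedelta


def next_feedback_date(retrain_finished: date, wait_days: int = 7) -> date:
    """Earliest date to run the follow-up check (assumes the wait starts at retrain end)."""
    return retrain_finished + timedelta(days=wait_days)


def high_confidence_errors(samples: list, threshold: float = 0.9) -> list:
    """Samples that were wrong despite high stated confidence.

    Assumption: this is what the "high_confidence_errors" distribution refers to.
    """
    return [
        s for s in samples
        if s["stated_confidence"] >= threshold and not s["correct"]
    ]
```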

Full Reference

See references/api.md for complete field docs, x402 payment setup, and error codes.

Version History

1 version in total

  • v1.0.1 (current)
    2026-05-02 02:31, security status: safe

Security Scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
