MetriLLM

Find the best local LLM for your machine. Tests speed, quality and RAM fit, then tells you if a model is worth running on your hardware.
Author: thebluehouse75 · Version: v0.2.11 (clawhub) · API key: not required

Overview

MetriLLM — Find the Best LLM for Your Hardware

Test any local model and get a clear verdict: is it worth running on your machine?

Prerequisites

  1. Node.js 20+ — check with node -v
  2. Ollama or LM Studio installed and running
  3. MetriLLM CLI — install globally:
npm install -g metrillm
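
The Node.js requirement can also be checked from a script. The following is a minimal Python sketch, assuming `node -v` prints a version string such as `v20.11.0`. The helper names `parse_node_major` and `node_is_supported` are hypothetical and not part of MetriLLM.

```python
import re
import shutil
import subprocess

MIN_NODE_MAJOR = 20  # from the prerequisite "Node.js 20+"


def parse_node_major(version_text: str) -> int:
    """Extract the major version from `node -v` output such as 'v20.11.0'."""
    match = re.match(r"v?(\d+)\.", version_text.strip())
    if match is None:
        raise ValueError(f"unrecognized Node.js version: {version_text!r}")
    return int(match.group(1))


def node_is_supported() -> bool:
    """Return True if a `node` binary on PATH reports major version >= 20."""
    if shutil.which("node") is None:
        return False
    result = subprocess.run(
        ["node", "-v"], capture_output=True, text=True, check=True
    )
    return parse_node_major(result.stdout) >= MIN_NODE_MAJOR
```

Calling `node_is_supported()` before installing the CLI gives an early, readable failure instead of an npm error later.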

Usage

List available models

ollama list

Run a full benchmark

metrillm bench --model $ARGUMENTS --json

This measures:

  • Performance: tokens/second, time to first token, memory usage
  • Quality: reasoning, math, coding, instruction following, structured output, multilingual
  • Fitness verdict: EXCELLENT / GOOD / MARGINAL / NOT RECOMMENDED

Performance-only benchmark (faster)

metrillm bench --model $ARGUMENTS --perf-only --json

Skips quality evaluation — measures speed and memory only.
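
The two benchmark invocations above differ only in the `--perf-only` flag, so a wrapper script can build either one from a single helper. This is a sketch that uses only the flags shown in this document. The function names are hypothetical, and the assumption that `--json` writes the report to standard output is not confirmed by the source.

```python
import subprocess


def build_bench_command(
    model: str, perf_only: bool = False, json_output: bool = True
) -> list[str]:
    """Build the argv for `metrillm bench` with flags shown in this document."""
    command = ["metrillm", "bench", "--model", model]
    if perf_only:
        command.append("--perf-only")  # skip quality evaluation
    if json_output:
        command.append("--json")
    return command


def run_bench(model: str, perf_only: bool = False) -> str:
    """Run the benchmark and return raw stdout.

    Assumption: the JSON report is printed to stdout when --json is set.
    """
    result = subprocess.run(
        build_bench_command(model, perf_only),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with model names that contain colons or slashes.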

View previous results

ls ~/.metrillm/results/

Read any JSON file to see full benchmark details.
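
Saved results can be loaded in bulk instead of opened one at a time. The sketch below assumes only what is stated here: the files live in `~/.metrillm/results/` and contain JSON. The result schema is not documented in this section, so the code makes no assumptions about field names, and `load_results` is a hypothetical helper.

```python
import json
from pathlib import Path

# Directory named in this document.
RESULTS_DIR = Path.home() / ".metrillm" / "results"


def load_results(results_dir: Path = RESULTS_DIR) -> dict[str, object]:
    """Map each *.json file name in results_dir to its parsed content."""
    results: dict[str, object] = {}
    if not results_dir.is_dir():
        return results  # nothing benchmarked yet
    for path in sorted(results_dir.glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            results[path.name] = json.load(handle)
    return results
```

Printing the top-level keys of one loaded result is a quick way to discover the actual structure before writing any further tooling.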

Share to the public leaderboard

metrillm bench --model $ARGUMENTS --share

Uploads your result to the MetriLLM community leaderboard, an open, community-driven ranking of local LLM performance across real hardware. You can compare your results with those of other users and help the community find the best models for each setup. Shared data includes the model name, scores, and hardware specs (CPU, RAM, GPU). No personal data is sent.

Interpreting Results

Verdict          Score   Meaning
---------------  ------  ------------------------------
EXCELLENT        >= 80   Fast and accurate, a great fit
GOOD             >= 60   Solid, suitable for most tasks
MARGINAL         >= 40   Usable but with tradeoffs
NOT RECOMMENDED  < 40    Too slow or inaccurate
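
The verdict thresholds can be expressed as a small lookup. This is an illustrative Python sketch of the table, not MetriLLM's own implementation. It assumes the score is a single number, and `verdict_for` is a hypothetical name.

```python
def verdict_for(score: float) -> str:
    """Map a numeric score to the verdict labels from the table."""
    if score >= 80:
        return "EXCELLENT"
    if score >= 60:
        return "GOOD"
    if score >= 40:
        return "MARGINAL"
    return "NOT RECOMMENDED"
```

Checking from the highest threshold down means each branch only needs a lower bound.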

Key metrics to highlight:

  • tokensPerSecond > 30 = good for interactive use
  • ttft < 500ms = responsive
  • memoryUsedGB vs available RAM = will it fit?
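
These three rules of thumb can be applied together once the numbers are in hand. The sketch below takes the values as plain arguments because the layout of the JSON report is not documented here. It assumes `ttft` is in milliseconds, and it treats "fits" as memory used being strictly below available RAM, which is a simplification rather than MetriLLM's own rule.

```python
def assess_metrics(
    tokens_per_second: float,
    ttft_ms: float,
    memory_used_gb: float,
    available_ram_gb: float,
) -> dict[str, bool]:
    """Apply the three rule-of-thumb checks listed above."""
    return {
        "interactive": tokens_per_second > 30,  # good for interactive use
        "responsive": ttft_ms < 500,  # time to first token under 500 ms
        "fits_in_ram": memory_used_gb < available_ram_gb,  # assumed rule
    }
```

For example, `assess_metrics(42.0, 310.0, 5.2, 16.0)` reports all three checks as passing, while a model using 18 GB on the same 16 GB machine would fail `fits_in_ram`.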

Tips

  • Use --perf-only for quick tests
  • Close GPU-intensive apps before benchmarking
  • Benchmark duration varies depending on model speed and response length

Open Source

MetriLLM is free and open source (Apache 2.0). Contributions, issues, and feedback are welcome: github.com/MetriLLM/metrillm

Version History

1 version in total

  • v0.2.11 (current)
    2026-03-30 08:59 · Safe · Safe

Security Checks

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
