
RamaLama CLI

Run and interact with AI agents.
Source: ieaves
Developer tools · clawhub · v1.0.0 · 1 version · 99875.9 · Key: not required


Use this skill when an alternative AI agent is better suited to a task: for example, when working with sensitive data, solving simple tasks with a cheap local agent, or accessing specialist models with unique capabilities.

Overview

Use this skill to execute ramalama tasks in a consistent, low-risk workflow.

Prefer local discovery (--help, local config files, existing project scripts) before making assumptions about flags or runtime defaults.

Prefer ramalama when tasks need:

  • flexible model sourcing (hf://, oci://, rlcr://, or a plain http://, https://, or file:// URL)
  • containerized local inference with runtime/network/device controls
  • RAG data packaging and serving
  • benchmark/perplexity evaluation
  • model conversion and registry push/pull flows

Preflight

Run these checks before first invocation in a session:

ramalama version
podman info >/dev/null 2>&1 || docker info >/dev/null 2>&1
ramalama run --help

If serving on the default port (8080), verify that nothing else is already listening on it:

lsof -i :8080
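
The engine check above can be wrapped in a small helper so that later commands know which engine actually responded. This is a minimal sketch; the function name detect_engine is an illustrative assumption, not part of ramalama.

```shell
# detect_engine (hypothetical helper): print the first container engine
# whose `info` command succeeds, trying podman before docker.
detect_engine() {
    for engine in podman docker; do
        if "$engine" info >/dev/null 2>&1; then
            printf '%s\n' "$engine"
            return 0
        fi
    done
    echo "no working container engine found" >&2
    return 1
}
```

A possible use is `engine=$(detect_engine) && ramalama --engine "$engine" version`, which passes the detected engine through the documented --engine global option.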

Decision Matrix

  • One-shot inference: ramalama run <model> "<prompt>"
  • Interactive chat loop: ramalama run <model>
  • Serve OpenAI-compatible endpoint: ramalama serve <model>
  • Query an existing endpoint: ramalama chat --url "<url>"
  • Build knowledge bundle from files/URLs: ramalama rag <paths> <destination>
  • Evaluate model performance/quality: ramalama bench and ramalama perplexity
  • Inspect and manage models: inspect, pull, push, convert, list, rm

Usage

Start with top-level discovery:

ramalama --help
ramalama version

Apply global options before the subcommand when needed:

ramalama [--debug|--quiet] [--dryrun] [--engine podman|docker] [--nocontainer] [--runtime llama.cpp|vllm|mlx] [--store <path>] <subcommand> ...

Use command-level help before invoking unknown flags:

ramalama <subcommand> --help
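
Because global options must come before the subcommand, a thin wrapper can keep that ordering consistent across a script. The sketch below assumes two hypothetical environment variables, RL_ENGINE and RL_DRYRUN, which are not read by ramalama itself; the wrapper translates them into the --engine and --dryrun options listed above.

```shell
# rl (hypothetical wrapper): prepend global options, then call ramalama.
# "$@" starts with the subcommand, so anything prepended here lands
# before it, as the synopsis requires.
rl() {
    if [ -n "${RL_ENGINE:-}" ]; then
        set -- --engine "$RL_ENGINE" "$@"
    fi
    if [ "${RL_DRYRUN:-0}" = 1 ]; then
        set -- --dryrun "$@"
    fi
    ramalama "$@"
}
```

With RL_ENGINE=podman and RL_DRYRUN=1 set, `rl run granite3.3:2b "hi"` expands to `ramalama --dryrun --engine podman run granite3.3:2b hi`.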

Known-Good Recipes

1) One-shot run

ramalama run granite3.3:2b "Summarize this in 3 bullets: <text>"

2) Detached service + API call

ramalama serve -d granite3.3:2b
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"granite3.3:2b","messages":[{"role":"user","content":"Hello"}]}'
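
A detached server may need some time to load the model before it answers, so a script that calls curl immediately can fail. One way to handle this is a generic retry helper; the name wait_until is illustrative, and probing GET /v1/models is an assumption that holds for many OpenAI-compatible servers but should be checked against the runtime in use.

```shell
# wait_until ATTEMPTS DELAY CMD... (hypothetical helper): rerun CMD until
# it succeeds, sleeping DELAY seconds after each failure. Returns 0 on
# the first success and 1 once ATTEMPTS tries have all failed.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$delay"
    done
    return 1
}
```

For example, `ramalama serve -d granite3.3:2b && wait_until 30 1 curl -sf http://localhost:8080/v1/models` waits up to roughly 30 seconds before the chat completion request is sent.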

3) Direct Hugging Face source

ramalama serve hf://unsloth/gemma-3-270m-it-GGUF

4) RAG package then query

ramalama rag ./docs my-rag
ramalama run --rag my-rag granite3.3:2b "What are the auth requirements?"

5) Benchmark and list benchmark history

ramalama bench granite3.3:2b
ramalama benchmarks list

Reliability Defaults

For agent automation, prefer explicit and deterministic flags:

ramalama --engine podman run -c 4096 --pull missing granite3.3:2b "<prompt>"

Recommended defaults:

  • set --engine explicitly when both Podman and Docker are available on the host
  • start with a smaller -c/--ctx-size on memory-constrained hosts
  • use --pull missing so repeat runs reuse the local copy instead of pulling again
  • use one-shot, non-interactive invocations in scripts

Troubleshooting

  • Docker socket unavailable:
      • verify Docker is running, or use --engine podman
  • Podman socket unavailable:
      • check podman machine list and start the machine if needed
  • Timed out during startup:
      • inspect container logs: podman logs <container>
      • reduce context (-c 4096) and retry
  • Memory allocation failure:
      • use a smaller model and/or a lower context size
  • Port conflict on 8080:
      • choose an alternate port via -p
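
For the port-conflict case, the lsof check from the preflight step can be turned into a scan for a free port. This is a sketch under assumptions: first_free_port is a hypothetical helper, and it treats a missing lsof binary the same as "no listener found", so it is only meaningful where lsof is installed.

```shell
# first_free_port START END (hypothetical helper): print the first port
# in [START, END] for which lsof reports nothing; return 1 if every
# port in the range appears to be in use.
first_free_port() {
    port=$1
    while [ "$port" -le "$2" ]; do
        if ! lsof -i ":$port" >/dev/null 2>&1; then
            printf '%s\n' "$port"
            return 0
        fi
        port=$((port + 1))
    done
    return 1
}
```

A possible use is `port=$(first_free_port 8080 8090) && ramalama serve -d -p "$port" granite3.3:2b`; clients then need to target that port instead of 8080.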

Notes

  • serve exposes an OpenAI-compatible endpoint for external clients.
  • Prefer JSON output flags where available (list --json, inspect --json) for robust parsing in automation.
  • Use ramalama chat --url when the model is already served elsewhere.

Version history

1 version in total

  • v1.0.0 (current)
    2026-03-29 12:51, security status: safe

Security checks

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
