
Hermes Agent Config Templates: 5 Copy-Paste Ready Setups

By Jena

Stop configuring from scratch. These are five production-tested Hermes Agent configs. Copy the one that matches your situation, fill in your tokens, and you’re done.

Which config should I use?
  • Just me, personal use → Config 1
  • Small team (2–10 people) → Config 2
  • Team in production → Config 3
  • Enterprise / Kubernetes → Config 4
  • Cost-conscious with quality fallback → Config 5

Config 1: Personal Setup (CLI + Ollama, Fully Local)

Use when: Solo developer, privacy-first, zero API costs.

Requirements: 8GB RAM, Ollama installed, 4GB disk for models.

Monthly cost: $0

```yaml
# ~/.hermes/config.yml
llm:
  provider: "ollama"
  endpoint: "http://localhost:11434"
  model: "mistral"
  streaming: true
  context_window: 4096
  temperature: 0.7

platforms:
  cli:
    enabled: true

memory:
  skill_generation: true
  skill_auto_save: true
  conversation_retention: 365 # days

performance:
  max_concurrent_tasks: 1
  cache_embeddings: true
```

Setup steps:

```bash
# 1. Install Ollama (macOS via Homebrew; see ollama.com for other platforms)
brew install ollama

# 2. Start the server in the background (or run it in a separate terminal),
#    then pull the model. Note: `ollama serve` blocks if run in the foreground.
ollama serve &
ollama pull mistral

# 3. Install Hermes
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

# 4. Save the config above to ~/.hermes/config.yml

# 5. Start
hermes
```
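
If Hermes starts but can't reach the model, the usual culprit is Ollama not running or the model not pulled. A quick sanity check against Ollama's REST API (the `GET /api/tags` endpoint lists pulled models):

```shell
# Succeeds only if `ollama serve` is up and mistral has been pulled
if curl -s http://localhost:11434/api/tags | grep -q '"mistral'; then
  echo "ollama ready"
else
  echo "not ready - start ollama serve and run: ollama pull mistral"
fi
```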

Config 2: Small Team (Discord + Ollama, Local Server)

Use when: 2–10 person team, single server, want free inference.

Requirements: 16GB RAM server, NVIDIA GPU (optional), Ollama, Discord bot token.

Monthly cost: ~$30–50 (server electricity/hosting), $0 for AI

```yaml
# ~/.hermes/config.yml
llm:
  provider: "ollama"
  endpoint: "http://localhost:11434"
  model: "mistral:q4"
  streaming: true
  context_window: 8192
  temperature: 0.7
  num_gpu: 1 # Remove if no GPU

platforms:
  discord:
    token: ${DISCORD_BOT_TOKEN}
    learn_from_conversations: true
    allowed_servers:
      - ${DISCORD_SERVER_ID}
  telegram:
    token: ${TELEGRAM_BOT_TOKEN}
    learn_from_chats: true

memory:
  skill_generation: true
  skill_auto_save: true
  skill_versioning: true
  conversation_retention: 180

performance:
  max_concurrent_tasks: 4
  cache_embeddings: true

security:
  log_sensitive: false
```

Environment file:

```bash
# ~/.hermes/.env
DISCORD_BOT_TOKEN=your_discord_token_here
DISCORD_SERVER_ID=your_server_id_here
TELEGRAM_BOT_TOKEN=your_telegram_token_here
```

**Never hardcode tokens.** Always use environment variables, and add `.env` to `.gitignore` if you’re version-controlling your config.
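
With the tokens in `~/.hermes/.env`, a POSIX shell can export them all in one shot before launch. This is a generic shell pattern, not a Hermes-specific feature:

```shell
# Keep the secrets file out of version control (run once)
grep -qx '.env' .gitignore 2>/dev/null || echo '.env' >> .gitignore

# `set -a` exports every variable assigned while it is active, so
# sourcing the file exports each TOKEN=value line in one shot;
# start hermes afterwards and it inherits the tokens
if [ -f ~/.hermes/.env ]; then
  set -a
  . ~/.hermes/.env
  set +a
fi
```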


Config 3: Production Team (Multi-Platform + Cloud LLM)

Use when: 10–100 person team, cloud infrastructure, need highest quality responses.

Requirements: VPS/cloud server (4 vCPU, 8GB RAM min), cloud LLM API key.

Monthly cost: $50–200 (server) + $20–100 (LLM API usage)

```yaml
# ~/.hermes/config.yml
llm:
  provider: "openai"
  api_key: ${OPENAI_API_KEY}
  model: "gpt-4"
  fallback_model: "gpt-3.5-turbo" # Cost fallback
  streaming: true
  context_window: 16000
  temperature: 0.7

  # Cost controls
  cost_limit:
    daily_usd: 50
    monthly_usd: 1000
  fallback_trigger:
    on_cost_exceed: true

platforms:
  discord:
    token: ${DISCORD_BOT_TOKEN}
    learn_from_conversations: true

  slack:
    token: ${SLACK_BOT_TOKEN}
    channels:
      - "dev-team"
      - "ops"
      - "general"
    learn_from_messages: true

  telegram:
    token: ${TELEGRAM_BOT_TOKEN}
    learn_from_chats: true

memory:
  skill_generation: true
  skill_versioning: true
  skill_auto_refinement: true
  prune_old_skills: true
  prune_threshold_days: 90
  conversation_retention: 180
  summarize_old_conversations: true

performance:
  max_concurrent_tasks: 8
  queue_strategy: "priority"
  cache_embeddings: true
  cache_ttl: 86400

monitoring:
  log_level: "info"
  log_to_file: true
  log_path: "/var/log/hermes/hermes.log"

security:
  log_sensitive: false
  token_rotation_days: 90
```

Config 4: Enterprise (Kubernetes + Monitoring)

Use when: 100+ users, high availability, full observability needed.

Requirements: Kubernetes cluster, PostgreSQL for shared memory, Prometheus/Grafana.

Monthly cost: $200–1000+ (depends on scale)

```yaml
# ~/.hermes/config.yml (deployed via ConfigMap)
llm:
  provider: "openai"
  api_key: ${OPENAI_API_KEY}
  model: "gpt-4"
  fallback_model: "gpt-3.5-turbo"
  context_window: 16000
  streaming: true

platforms:
  slack:
    token: ${SLACK_BOT_TOKEN}
    workspaces:
      - ${SLACK_WORKSPACE_1}
      - ${SLACK_WORKSPACE_2}
    learn_from_messages: true

  discord:
    token: ${DISCORD_BOT_TOKEN}
    learn_from_conversations: true

memory:
  backend: "postgresql"
  connection: ${DATABASE_URL}
  replication: true
  skill_versioning: true
  prune_threshold_days: 90

performance:
  max_concurrent_tasks: 16
  queue_strategy: "priority"
  cache_backend: "redis"
  cache_connection: ${REDIS_URL}
  cache_ttl: 86400

monitoring:
  prometheus:
    enabled: true
    port: 9090
  log_level: "info"
  log_format: "json"
  track_latency: true
  track_skill_usage: true

security:
  log_sensitive: false
  audit_log: true
  audit_log_path: "/var/log/hermes/audit.log"
```

Kubernetes deployment snippet:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hermes-agent
  namespace: ai-tools
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hermes-agent
  template:
    metadata:
      labels:
        app: hermes-agent # must match spec.selector.matchLabels
    spec:
      containers:
        - name: hermes
          image: ghcr.io/nousresearch/hermes-agent:latest
          envFrom:
            - secretRef:
                name: hermes-secrets
          resources:
            requests:
              memory: "4Gi"
              cpu: "1"
            limits:
              memory: "8Gi"
              cpu: "2"
          volumeMounts:
            - name: config
              mountPath: /root/.hermes/config.yml
              subPath: config.yml
      volumes:
        - name: config
          configMap:
            name: hermes-config
```
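
The Deployment references a `hermes-secrets` Secret and a `hermes-config` ConfigMap that aren't shown above. Assuming the config and `.env` files from the earlier sections (the local file paths and manifest filename here are illustrative), they can be created like this:

```shell
kubectl create namespace ai-tools

# Config file -> ConfigMap, mounted at /root/.hermes/config.yml
kubectl -n ai-tools create configmap hermes-config \
  --from-file=config.yml=./config.yml

# Token file -> Secret, consumed via envFrom/secretRef
kubectl -n ai-tools create secret generic hermes-secrets \
  --from-env-file=./.env

kubectl -n ai-tools apply -f hermes-deployment.yml
```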

Config 5: Hybrid (Local Ollama Primary, Cloud Fallback)

Use when: Cost-sensitive but need quality headroom. Best of both worlds.

Requirements: Machine with 16GB+ RAM, GPU preferred, OpenAI API key for fallback.

Monthly cost: $5–20 (minimal cloud fallback usage)

```yaml
# ~/.hermes/config.yml
llm:
  # Primary: free local inference
  provider: "ollama"
  endpoint: "http://localhost:11434"
  model: "mistral:q4"
  streaming: true
  context_window: 8192
  timeout: 30 # seconds before fallback

  # Fallback: cloud for complex tasks
  fallback:
    provider: "openai"
    api_key: ${OPENAI_API_KEY}
    model: "gpt-4"
    trigger_on_timeout: true
    trigger_on_complexity: "high"
    cost_limit:
      daily_usd: 5
      monthly_usd: 20

platforms:
  discord:
    token: ${DISCORD_BOT_TOKEN}
    learn_from_conversations: true
  cli:
    enabled: true

memory:
  skill_generation: true
  skill_versioning: true
  prune_threshold_days: 90

performance:
  max_concurrent_tasks: 4
  cache_embeddings: true
  preload_models:
    - "mistral:q4" # Keep loaded in memory
```

**Hybrid cost breakdown.** In typical usage, Ollama handles ~90% of requests ($0) and OpenAI picks up the remaining ~10% (~$2–5/month), so you get GPT-4 quality on the hard requests at near-zero baseline cost.
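
That range is easy to sanity-check with a back-of-envelope model. The traffic and pricing numbers below are illustrative assumptions, not figures from Hermes or OpenAI:

```shell
# 30 requests/day, 10% falling back to the cloud at ~1.5K tokens each,
# priced at an assumed $0.03 per 1K tokens
awk 'BEGIN {
  requests_per_day = 30
  fallback_rate    = 0.10
  tokens_per_req   = 1500
  usd_per_1k_tok   = 0.03
  cloud_reqs = requests_per_day * 30 * fallback_rate
  printf "%.2f\n", cloud_reqs * tokens_per_req / 1000 * usd_per_1k_tok
}'
```

which prints `4.05` — comfortably inside the $2–5 range quoted above.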


Quick Token Setup Reference

Every config that uses platforms needs these tokens. Here’s where to get each:

| Platform | Where to get the token | Config key |
| --- | --- | --- |
| Discord | discord.com/developers → Bot tab | `DISCORD_BOT_TOKEN` |
| Slack | api.slack.com/apps → OAuth tokens | `SLACK_BOT_TOKEN` |
| Telegram | @BotFather on Telegram → /newbot | `TELEGRAM_BOT_TOKEN` |
| OpenAI | platform.openai.com → API keys | `OPENAI_API_KEY` |
| Anthropic | console.anthropic.com → API keys | `ANTHROPIC_API_KEY` |

Store all of these in environment variables, never in the config file itself.
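
Since a missing variable typically surfaces only as a platform connection failure at runtime, a small preflight check before launch can save debugging time. This is a generic shell sketch, not a built-in Hermes command:

```shell
# Return non-zero and name the first missing variable, if any
check_tokens() {
  for var in "$@"; do
    if [ -z "$(printenv "$var")" ]; then
      echo "missing: $var" >&2
      return 1
    fi
  done
}

# Example for Config 3's platforms; adjust the list to your config
check_tokens DISCORD_BOT_TOKEN SLACK_BOT_TOKEN TELEGRAM_BOT_TOKEN OPENAI_API_KEY \
  || echo "set the missing token(s) before starting hermes"
```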