# Herma AI — Intelligent LLM Router

> Herma is an AI gateway that routes queries to the optimal model for each task. Same quality as frontier models, 60-90% cheaper.

## What Herma Does

- Routes AI queries to the best model based on task complexity
- OpenAI-compatible API — change two lines of code to switch
- Maintains frontier-level quality while reducing costs by 60-90%
- Supports all major models: Claude, GPT-4, Gemini, DeepSeek, Mistral

## Pricing

- $2 per million input tokens
- $8 per million output tokens
- No subscriptions, no minimums
- Pay-as-you-go with free $1.00 starting credit

## How Routing Works

- Classifies queries into categories: coding, analysis, creative, math, factual, chat
- Estimates difficulty: easy, medium, hard
- Routes to the cheapest model that maintains quality for that specific task
- Hard tasks (system design, formal verification) always use frontier models
- Simple tasks (factual lookups, basic coding) use cost-effective models

## Quality Validation

- Benchmarked against Claude Opus 4.6 on 8 established benchmarks
- MMLU: 98.2% of Opus quality
- ARC-Challenge: 100.7% of Opus quality
- GSM8K: 100.0% of Opus quality
- HumanEval+: 102.1% of Opus quality
- MBPP+: 105.8% of Opus quality

## API Compatibility

- Drop-in replacement for the OpenAI SDK
- Works with LangChain, Vercel AI SDK, LlamaIndex
- Streaming support
- Tool/function calling support

## Environment Variables

- HERMA_API_KEY — your API key (starts with hk-)
- HERMA_BASE_URL — base URL (default: https://api.hermaai.com/v1)

## Quick Start

Python:

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.hermaai.com/v1",
    api_key=os.environ["HERMA_API_KEY"],
)

response = client.chat.completions.create(
    model="herma-auto",
    messages=[{"role": "user", "content": "Hello!"}],
)
```

Node.js:

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.HERMA_API_KEY,
  baseURL: "https://api.hermaai.com/v1",
});

const response = await client.chat.completions.create({
  model: "herma-auto",
  messages: [{ role: "user", content: "Hello!" }],
});
```

LangChain (Python):

```python
import os

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="herma-auto",
    openai_api_key=os.environ["HERMA_API_KEY"],
    openai_api_base="https://api.hermaai.com/v1",
)
```

Vercel AI SDK:

```javascript
import { createOpenAI } from "@ai-sdk/openai";
import { generateText } from "ai";

const herma = createOpenAI({
  apiKey: process.env.HERMA_API_KEY,
  baseURL: "https://api.hermaai.com/v1",
});

const { text } = await generateText({
  model: herma("herma-auto"),
  prompt: "Hello!",
});
```

LlamaIndex (Python):

```python
import os

from llama_index.llms.openai import OpenAI as LlamaOpenAI

llm = LlamaOpenAI(
    model="herma-auto",
    api_key=os.environ["HERMA_API_KEY"],
    api_base="https://api.hermaai.com/v1",
)
```

## API Endpoints

- POST /v1/chat/completions — Chat completion requests (OpenAI-compatible)
- GET /v1/models — List available models (herma-auto + upstream)
- POST /v1/classify — Test the router: see classification, model selection, savings (free, no auth)

## Open-Source

- herma-eval: Benchmark toolkit for LLM routers (pip install herma-eval)
- 7 benchmarks: GSM8K, HumanEval+, MMLU, MBPP+, ARC-Challenge, LiveCodeBench, Aider Polyglot
- https://github.com/Nikobar5/herma-eval

## AI Coding Tool Integration

Drop a configuration file into your project root so your AI coding assistant automatically uses Herma for all LLM calls:

- Claude Code (CLAUDE.md): https://hermaai.com/integration/CLAUDE.md
- Cursor (.cursorrules): https://hermaai.com/integration/cursor-rules.txt
- Windsurf (.windsurfrules): https://hermaai.com/integration/windsurf-rules.txt
- Codex / Devin / other agents (AGENTS.md): https://hermaai.com/integration/AGENTS.md
- Environment variables (.env.example): https://hermaai.com/integration/.env.example

## Links

- Website: https://hermaai.com
- Documentation: https://hermaai.com/docs
- Blog: https://hermaai.com/blog
- Benchmarking Methodology: https://hermaai.com/blog/how-we-benchmark
- Try it free: https://hermaai.com/demo
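
## Example: Estimating Request Cost

The flat per-token pricing above ($2 per million input tokens, $8 per million output tokens) makes per-request cost easy to compute. The sketch below is illustrative: the helper name and token counts are made up for the example, not part of the Herma API.

```python
# Herma list prices from the Pricing section above.
INPUT_PRICE_PER_M = 2.00   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 8.00  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single request at list prices."""
    return (
        (input_tokens / 1_000_000) * INPUT_PRICE_PER_M
        + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
    )

# A request with 100k input tokens and 20k output tokens:
# 0.1 * $2 + 0.02 * $8 = $0.36
print(f"${estimate_cost(100_000, 20_000):.2f}")
```

At these rates the free $1.00 starting credit covers roughly 500k input tokens or 125k output tokens.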
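
## Example: The Routing Policy in Miniature

The flow described in "How Routing Works" — classify the query, estimate difficulty, then pick the cheapest model that holds quality — can be sketched as a small decision table. The category and difficulty labels come from the documentation above; the tier names returned here are placeholders, since Herma's actual classifier and model pool are internal.

```python
# Illustrative sketch of the routing policy: classify -> estimate
# difficulty -> route to the cheapest tier that maintains quality.
# The returned tier names are made-up placeholders, not real models.
CATEGORIES = {"coding", "analysis", "creative", "math", "factual", "chat"}
DIFFICULTIES = ("easy", "medium", "hard")

def route(category: str, difficulty: str) -> str:
    """Map a classified query to a model tier."""
    if category not in CATEGORIES or difficulty not in DIFFICULTIES:
        raise ValueError("unknown category or difficulty")
    # Hard tasks (system design, formal verification) always go frontier.
    if difficulty == "hard":
        return "frontier-model"
    # Simple tasks (factual lookups, basic coding) use cost-effective models.
    if difficulty == "easy":
        return "cost-effective-model"
    return "mid-tier-model"

print(route("factual", "easy"))  # cost-effective-model
print(route("coding", "hard"))   # frontier-model
```

To see what the real router decides for a given query, POST it to the free, unauthenticated /v1/classify endpoint listed under API Endpoints.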