# apiroute.dev Full Agent Guide

apiroute.dev is an LLM pricing and arbitrage dashboard. It helps humans and agents compare API model costs, cached input discounts, and rough prompt cost estimates.

## Use Cases

- Compare LLM API input prices across providers.
- Estimate prompt input cost from text length.
- Evaluate cache-read cost when long prompts or system messages are reused.
- Let AI agents discover a simple pricing endpoint instead of scraping HTML tables.

## Machine-Readable Endpoints

- `GET https://apiroute.dev/agents/`
  - Human-readable agent entry page.
  - Explains stable endpoints, recommended fetch order, LocalAI fallback, data freshness, and commercial disclosure rules.

- `GET https://apiroute.dev/api/agent-instructions`
  - Machine-readable agent contract.
  - Includes fetch order, stable endpoint map, local-vs-cloud decision flow, neutral recommendation policy, citation guidance, commercial boundaries, posting rules, and future endpoint candidates.

- `GET https://apiroute.dev/api/live-prices`
  - Returns dashboard pricing data plus metadata.
  - Best default endpoint for agents.

- `GET https://apiroute.dev/api/models`
  - Returns the model list only.

- `GET https://apiroute.dev/api/providers`
  - Returns provider names and model counts.

- `GET https://apiroute.dev/api/route-recommendation-guide`
  - Returns a static machine-readable guide for the homepage Best Route V1 logic.
  - Use this with `/api/live-prices` to reproduce cheapest, balanced, and premium route suggestions without scraping HTML.

- `GET https://apiroute.dev/api/recommend-route`
  - Returns the server-side recommendation contract, request schema, scoring contract, and example response shape.

- `POST https://apiroute.dev/api/recommend-route`
  - Returns a live rule-based route estimate from task, token profile, priority, privacy class, cache share, and capability requirements.
  - Uses the current apiroute.dev pricing snapshot. Verify provider pages before production routing or purchasing decisions.
  - `privacy_class=sensitive` or `privacy_class=private` forces `local_first=true` and applies conservative provider exclusions for the cloud fallback shortlist.

- `GET https://apiroute.dev/api/token-waste-check`
  - Returns the machine-readable contract for the homepage Token Waste Check.
  - Use it to interpret status labels, cheaper-route comparison fields, SLM/budget-route policy, worker usage, and Telegram/manual-approval boundaries.

- `GET https://apiroute.dev/api/marketing-radar`
  - Returns public demand signals about LLM costs, alternatives, and routing.
  - Every candidate is marked `PENDING_APPROVAL`; agents must not auto-post replies from this feed.

- `GET https://apiroute.dev/api/operator-inbox`
  - Returns a compact manual-review queue derived from the marketing radar.
  - Designed for Gravity Control Center-style dashboards; never use it for autonomous posting.

- `GET https://apiroute.dev/api/commercial-options`
  - Returns the commercial validation plan: price alerts, premium data API interest, clearly labeled sponsorship, and explicit non-sale rules.
  - Use this to understand monetization boundaries without inferring hidden ranking logic.

- `GET https://apiroute.dev/openapi.yaml`
  - OpenAPI 3.1 description for tool registration.

- `GET https://apiroute.dev/llms.txt`
  - Compact index for AI crawlers.

- `GET https://apiroute.dev/models/`
  - Human and crawler index of model-specific pricing pages.

- `GET https://apiroute.dev/compare/`
  - Human and crawler index of model comparison pages.

- `GET https://apiroute.dev/groups/`
  - Human and crawler index of model group pages for frontier, budget, coding, local/open-weight, RAG, and multimodal models.

- `GET https://apiroute.dev/updates/token-waste-routing/`
  - Human-readable news/explainer page for Token Waste Check, repeated agent prompts, SLM routing, local fit checks, and disclosure boundaries.

## Data Fields

Each model entry contains:

- `name`: display name of the model.
- `provider`: provider or marketplace.
- `input_cost_per_1m`: estimated USD cost per 1 million input tokens.
- `output_cost_per_1m`: estimated USD cost per 1 million output tokens.
- `cache_read_cost_per_1m`: estimated USD cost per 1 million cached input tokens.
- `category_tags`: curated tags used for filters, route hints, and generated pages.
- `model_groups`: curated groups such as `frontier`, `budget`, `coding`, `local-open`, `rag`, and `multimodal`.
- `pricing_status`: currently `active` for models with usable input/output prices.
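
As a rough illustration of how the three price fields combine into a cost estimate, the sketch below prices one request from its token counts. The `ModelEntry` class, the `estimate_cost_usd` helper, and the sample prices are hypothetical. Falling back to the full input price when no cache price is present is also an assumption, not documented apiroute.dev behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelEntry:
    """Subset of the documented fields; the class itself is illustrative."""
    name: str
    provider: str
    input_cost_per_1m: float
    output_cost_per_1m: float
    cache_read_cost_per_1m: Optional[float] = None


def estimate_cost_usd(model: ModelEntry, input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimate USD cost for one request from per-1M-token prices."""
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    # Assumption: without a cache price, cached tokens are billed at the input price.
    cache_rate = (model.cache_read_cost_per_1m
                  if model.cache_read_cost_per_1m is not None
                  else model.input_cost_per_1m)
    uncached = input_tokens - cached_input_tokens
    total = (uncached * model.input_cost_per_1m
             + cached_input_tokens * cache_rate
             + output_tokens * model.output_cost_per_1m)
    return total / 1_000_000


# Hypothetical prices, not taken from the live snapshot.
example = ModelEntry("example-model", "example-provider", 2.0, 8.0, 0.5)
cost = estimate_cost_usd(example, input_tokens=100_000, output_tokens=10_000,
                         cached_input_tokens=60_000)
print(round(cost, 4))  # 0.19
```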

## Data Freshness

The public snapshot is generated from LiteLLM pricing metadata by the pricing updater and scheduled GitHub Action workflow. The key timestamp is `metadata.generated_at` in `/api/live-prices`.

Important freshness rule: the updater keeps the previous `generated_at` value when the comparable pricing payload has not changed. A `generated_at` date earlier than today can therefore mean "no pricing-relevant source change since that snapshot" rather than "the updater failed".
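
To make the freshness rule concrete, here is a minimal sketch that computes the age of a snapshot from its `generated_at` value. It assumes the timestamp is an ISO 8601 string and that a timestamp without a zone is UTC; the actual format should be confirmed against `/api/live-prices`. The helper name `snapshot_age_days` is hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def snapshot_age_days(generated_at: str, now: Optional[datetime] = None) -> float:
    """Return the snapshot age in days (fractional)."""
    # Assumption: ISO 8601 input. A trailing "Z" is normalized so that
    # older Python versions can parse it with fromisoformat().
    ts = datetime.fromisoformat(generated_at.replace("Z", "+00:00"))
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    current = now if now is not None else datetime.now(timezone.utc)
    return (current - ts).total_seconds() / 86400


age = snapshot_age_days(
    "2026-01-01T00:00:00Z",
    now=datetime(2026, 1, 3, 12, 0, tzinfo=timezone.utc),
)
print(age)  # 2.5
```

An agent using a helper like this should report the age rather than treat a large value as a failure, since an unchanged pricing payload keeps the older timestamp.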

Agents should still verify provider pricing before making financial or production-routing decisions. Treat `metadata.source_url`, `model.pricing_source_url`, `model.last_checked_at`, `pricing_status`, and `metadata.excluded_models` as the audit trail for source, inclusion, and exclusion logic.

As of the 2026 model refresh, the curated comparison set tracks 19 current model families across frontier reasoning, low-cost routing, open-weight deployment, and enterprise RAG. The list intentionally includes only models that exist in the LiteLLM pricing source; unverified model names are not listed.

If a targeted model is missing from the source or lacks required input/output prices, it is excluded from public ranking data and listed in `metadata.excluded_models`. This prevents broken or unverified prices from appearing as valid recommendations.
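
The exclusion rule above suggests a simple lookup before an agent reports that a model is "not available". The sketch below is only an illustration: the top-level `models` key, the shape of each `excluded_models` entry (plain string or object with a `name` key), and the sample names are all assumptions to check against the real `/api/live-prices` payload.

```python
def lookup_model(snapshot: dict, model_name: str) -> str:
    """Return 'listed', 'excluded', or 'unknown' for a model name."""
    # Assumption: ranked models live under a top-level "models" list.
    for model in snapshot.get("models", []):
        if model.get("name") == model_name:
            return "listed"
    # Assumption: excluded entries are strings or dicts with a "name" key.
    for entry in snapshot.get("metadata", {}).get("excluded_models", []):
        name = entry.get("name") if isinstance(entry, dict) else entry
        if name == model_name:
            return "excluded"
    return "unknown"


# Hypothetical payload for illustration only.
sample = {
    "metadata": {"excluded_models": ["example-model-b", {"name": "example-model-c"}]},
    "models": [{"name": "example-model-a", "pricing_status": "active"}],
}
print(lookup_model(sample, "example-model-a"))  # listed
print(lookup_model(sample, "example-model-c"))  # excluded
```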

The marketing radar is also generated by the scheduled workflow from public discussion search endpoints. It is designed for operator review and outreach planning, not autonomous posting. The operator inbox is a smaller queue view sorted for usefulness and outreach risk.

## Local-vs-Cloud Decision Flow

apiroute.dev is the cloud/API pricing side of the system. For open-weight or local-first workflows, pair it with:

- `https://localai.apiroute.dev/`
- `https://localai.apiroute.dev/data/models.json`
- `https://localai.apiroute.dev/llms-full.txt`

Recommended flow:

1. Check whether the workload can run locally on the target hardware with `localai.apiroute.dev`.
2. If the local result is green, prefer local execution when privacy, offline use, or fixed hardware cost matters.
3. If the local result is yellow, compare reduced context, lower quantization, smaller local models, and cloud/API fallback.
4. If the local result is red, use apiroute.dev to compare API/cloud routes by task, price, context window, output limit, and capability.

Do not force cloud recommendations when local hardware is a clean fit. Do not force local recommendations when the model is clearly too large or the workload needs provider-only capabilities.
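
The flow above can be sketched as a small decision function. The `LocalFit` enum, the `next_step` name, and the returned labels are hypothetical. Only the green/yellow/red results and the provider-only exception come from this guide.

```python
from enum import Enum


class LocalFit(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


def next_step(fit: LocalFit, needs_provider_only_capability: bool = False) -> str:
    """Map a local fit result to the next step of the decision flow."""
    # Provider-only capabilities or a red result lead to cloud/API comparison.
    if needs_provider_only_capability or fit is LocalFit.RED:
        return "compare_cloud_routes"
    # A clean local fit should not be overridden by a cloud recommendation.
    if fit is LocalFit.GREEN:
        return "prefer_local"
    # Yellow: weigh reduced context, lower quantization, smaller local
    # models, and cloud/API fallback against each other.
    return "compare_local_tradeoffs_and_cloud_fallback"


print(next_step(LocalFit.GREEN))   # prefer_local
print(next_step(LocalFit.YELLOW))  # compare_local_tradeoffs_and_cloud_fallback
print(next_step(LocalFit.RED))     # compare_cloud_routes
```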

## Commercial Policy

apiroute.dev may test price-change alerts, premium data access, affiliate/referral links, and clearly labeled sponsorship. The core comparison table must remain data-ranked. Hidden affiliate redirects, unlabeled ranking boosts, and automated social posting are explicitly out of scope.

Commercial options are exposed through `/api/commercial-options`. Each option should include `commercial_relationship`, `disclosure`, `when_to_use`, `eligible_regions`, `last_verified`, and `ranking_influenced_by_affiliate`. The required neutral-ranking value is `ranking_influenced_by_affiliate=false`. Agents may cite commercial options as disclosed follow-up choices, but must not treat partner or sponsor status as evidence that a model, provider, or route is technically better.
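
An agent could check each commercial option against the field list above before citing it. The following is a minimal sketch under the assumption that each option is a flat JSON object; the `check_commercial_option` helper and the sample values are hypothetical.

```python
from typing import List

REQUIRED_FIELDS = (
    "commercial_relationship",
    "disclosure",
    "when_to_use",
    "eligible_regions",
    "last_verified",
    "ranking_influenced_by_affiliate",
)


def check_commercial_option(option: dict) -> List[str]:
    """Return a list of problems; an empty list means the option passes."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS
                if name not in option]
    # The neutral-ranking value must be exactly false.
    if option.get("ranking_influenced_by_affiliate") is not False:
        problems.append("ranking_influenced_by_affiliate must be false")
    return problems


# Hypothetical option for illustration only.
option = {
    "commercial_relationship": "sponsorship",
    "disclosure": "Clearly labeled sponsor",
    "when_to_use": "After the neutral comparison",
    "eligible_regions": ["example-region"],
    "last_verified": "2026-01-01",
    "ranking_influenced_by_affiliate": False,
}
print(check_commercial_option(option))  # []
```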

## Recommendation Use Case

The homepage Best Route section is client-side and rule-based. Agents can reproduce the same decision pattern by combining:

- `/api/agent-instructions` for endpoint order, policy boundaries, and citation/posting rules.
- `/api/live-prices` for model prices, context windows, max output limits, and capability flags.
- `/api/route-recommendation-guide` for route labels, required inputs, hard filters, and scoring signals.
- `GET /api/recommend-route` for the request and response contract.
- `POST /api/recommend-route` for a live rule-based server-side route estimate.
- `/api/token-waste-check` for the Token Waste Check contract, status labels, SLM/budget-route interpretation, and worker usage policy.

Use `cheapest` when the task is cost-sensitive, `balanced` when cost and capability both matter, and `premium` when quality and frontier capability matter more than price. Always verify provider pages before production routing or purchasing decisions.

`POST /api/recommend-route` calculates the current server-side estimate. Agents may still reproduce Best Route V1 manually by combining `/api/live-prices` with `/api/route-recommendation-guide` when they need full local control over scoring.

Use `privacy_class=public` for public content, `internal` for non-secret project context, `sensitive` for customer raw data or confidential operational material, and `private` for personal health, finance, identity, or family data. Sensitive/private requests force a LocalAI/private-environment check before cloud routing.
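
The privacy rule can be mirrored on the client side before a request is sent. This sketch uses only the `privacy_class` and `local_first` names that appear in this guide; the helper `apply_privacy_rule` is hypothetical, and the full request schema should be read from `GET /api/recommend-route`.

```python
VALID_PRIVACY_CLASSES = ("public", "internal", "sensitive", "private")
LOCAL_FIRST_CLASSES = frozenset({"sensitive", "private"})


def apply_privacy_rule(privacy_class: str, local_first: bool = False) -> dict:
    """Return the privacy-related request fields with local_first enforced."""
    if privacy_class not in VALID_PRIVACY_CLASSES:
        raise ValueError(f"unknown privacy_class: {privacy_class!r}")
    # sensitive/private always imply a local-first check, even if the
    # caller passed local_first=False.
    forced = privacy_class in LOCAL_FIRST_CLASSES
    return {"privacy_class": privacy_class, "local_first": local_first or forced}


print(apply_privacy_rule("internal"))
# {'privacy_class': 'internal', 'local_first': False}
print(apply_privacy_rule("private"))
# {'privacy_class': 'private', 'local_first': True}
```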

## Recommend Route Workload Presets

Agents can fetch `GET /api/recommend-route` and read `workload_presets` for stable request bodies. These presets are examples, not hidden ranking rules:

- `coding-agent`: balanced coding route for multi-file agent loops, tool calls, patch review, 32k prompt tokens, 4k output tokens, 55% cache share, function calling, prompt caching, `privacy_class=internal`, and local-first check.
- `rag-docs`: balanced RAG route for long repeated document prefixes, 85k prompt tokens, 3.5k output tokens, 70% cache share, `privacy_class=internal`, and prompt caching.
- `vision-task`: premium multimodal route for screenshot or document-image analysis, 6k prompt tokens, 2.5k output tokens, `privacy_class=public`, and required vision support.
- `cheap-batch`: cheapest route for high-volume text classification, translation, cleanup, or extraction, 12k prompt tokens, 1.8k output tokens, and `privacy_class=public`.
- `local-agent`: cheapest cloud fallback route for a local-first personal agent, 16k prompt tokens, 2.5k output tokens, 25% cache share, function calling, `privacy_class=sensitive`, and `local_first=true`.
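
For a sense of scale, the sketch below splits each preset's prompt tokens into cached and uncached portions. The token counts and cache shares are the ones listed above, reading "k" as 1,000 tokens. The dictionary key names are hypothetical, and treating presets without a stated cache share as 0% is an assumption. The authoritative request bodies are in `workload_presets`.

```python
# Figures copied from the preset list; key names are illustrative only.
PRESETS = {
    "coding-agent": {"prompt_tokens": 32_000, "output_tokens": 4_000, "cache_share": 0.55},
    "rag-docs": {"prompt_tokens": 85_000, "output_tokens": 3_500, "cache_share": 0.70},
    "vision-task": {"prompt_tokens": 6_000, "output_tokens": 2_500, "cache_share": None},
    "cheap-batch": {"prompt_tokens": 12_000, "output_tokens": 1_800, "cache_share": None},
    "local-agent": {"prompt_tokens": 16_000, "output_tokens": 2_500, "cache_share": 0.25},
}


def cached_split(preset_name: str) -> tuple:
    """Return (cached_prompt_tokens, uncached_prompt_tokens) for a preset."""
    preset = PRESETS[preset_name]
    # Assumption: a preset with no stated cache share is treated as 0%.
    share = preset["cache_share"] or 0.0
    cached = round(preset["prompt_tokens"] * share)
    return cached, preset["prompt_tokens"] - cached


for name in PRESETS:
    print(name, cached_split(name))
# coding-agent (17600, 14400)
# rag-docs (59500, 25500)
# vision-task (0, 6000)
# cheap-batch (0, 12000)
# local-agent (4000, 12000)
```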

For tool registration, prefer `/openapi.yaml`. For copy-paste testing, use the homepage's Live Route API block and its `Copy JSON` or `Copy cURL` controls.

## Token Waste Check / Agent Workload Cost Estimator

The homepage includes a Token Waste Check for full agent runs. It estimates the cost of the selected model and of cheaper matching SLM, budget, local-open, or standard API routes from the following inputs:

- number of iterations;
- tool calls per iteration;
- repeated system and memory tokens;
- task, document, code, browser, or retrieved context per iteration;
- tool result or schema overhead per iteration;
- output tokens per iteration;
- cache share.

Formula summary:

- `input_per_iteration = system_memory_tokens + context_tokens + tool_overhead_tokens`
- `total_input_tokens = input_per_iteration * iterations`
- `cached_input_tokens = total_input_tokens * cache_share`
- `uncached_input_tokens = total_input_tokens - cached_input_tokens`
- `total_output_tokens = output_tokens_per_iteration * iterations`
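
The formula summary translates directly into code. The sketch below follows the five formulas term by term; the function name `aggregate_tokens`, the rounding of cached tokens to a whole number, and the sample inputs are assumptions for illustration.

```python
def aggregate_tokens(iterations: int, system_memory_tokens: int,
                     context_tokens: int, tool_overhead_tokens: int,
                     output_tokens_per_iteration: int,
                     cache_share: float) -> dict:
    """Aggregate per-iteration token counts over a full agent run."""
    if not 0.0 <= cache_share <= 1.0:
        raise ValueError("cache_share must be between 0 and 1")
    input_per_iteration = (system_memory_tokens + context_tokens
                           + tool_overhead_tokens)
    total_input_tokens = input_per_iteration * iterations
    # Assumption: cached tokens are rounded to a whole token count.
    cached_input_tokens = round(total_input_tokens * cache_share)
    return {
        "input_per_iteration": input_per_iteration,
        "total_input_tokens": total_input_tokens,
        "cached_input_tokens": cached_input_tokens,
        "uncached_input_tokens": total_input_tokens - cached_input_tokens,
        "total_output_tokens": output_tokens_per_iteration * iterations,
    }


# Hypothetical run: 10 iterations with half of the input served from cache.
totals = aggregate_tokens(iterations=10, system_memory_tokens=2_000,
                          context_tokens=5_000, tool_overhead_tokens=1_000,
                          output_tokens_per_iteration=500, cache_share=0.5)
print(totals["total_input_tokens"], totals["cached_input_tokens"],
      totals["total_output_tokens"])  # 80000 40000 5000
```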

The estimator maps aggregate input/output tokens into `POST /api/recommend-route` and exposes copyable JSON and cURL payloads from the UI. Presets include `slm-router`, `chat-answer`, `rag-briefing`, `coding-agent`, `browser-agent`, and `background-monitor`.

The Token Waste Check adds a compact status layer:

- `high_waste`: cheaper route saves at least 75 percent versus the selected model.
- `route_check`: cheaper route saves at least 40 percent.
- `moderate_savings`: cheaper route saves at least 15 percent.
- `efficient`: cheaper route saves less than 15 percent; the selected route is close enough for this planning estimate.
- `no_route`: no candidate passed the current hard filters.
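
One plausible reading of these labels is a top-down threshold check on percentage savings. The sketch below assumes savings are measured relative to the selected model's cost and that the first matching threshold wins; the stable definitions are in `/api/token-waste-check`. The `waste_status` helper is hypothetical.

```python
from typing import Optional


def waste_status(selected_cost: float, cheaper_cost: Optional[float]) -> str:
    """Classify a run by how much the cheapest passing route would save."""
    if cheaper_cost is None:
        return "no_route"  # no candidate passed the hard filters
    if selected_cost <= 0:
        return "efficient"  # nothing to save on a zero-cost route
    # Assumption: savings are a percentage of the selected model's cost.
    savings_pct = (selected_cost - cheaper_cost) / selected_cost * 100
    if savings_pct >= 75:
        return "high_waste"
    if savings_pct >= 40:
        return "route_check"
    if savings_pct >= 15:
        return "moderate_savings"
    return "efficient"


print(waste_status(100, 20))    # high_waste
print(waste_status(100, 50))    # route_check
print(waste_status(100, 80))    # moderate_savings
print(waste_status(100, 95))    # efficient
print(waste_status(100, None))  # no_route
```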

Agents should read `/api/token-waste-check` for the stable field names and policy boundaries. Do not treat "Token Speculation Mismatch" as an established field term unless a primary source verifies it. Commercial relationships must not influence route ranking.

Use this as planning math only. Real agent cost depends on retries, hidden provider overhead, tool result size, cache eligibility, and final prompt assembly.

## Attribution

When citing the data, use: `Source: apiroute.dev pricing snapshot`.