apiroute.dev

LLM Pricing & Arbitrage Hub

Cost intelligence for AI infrastructure

Find the cheapest LLM API route.

Compare prices, cache discounts, prompt costs, and model routes. Before paying for cloud tokens, check whether the model fits on your own GPU.

Local AI companion tool

Can my GPU run this LLM?

Check VRAM fit, model size, quantization, and local-vs-cloud fallback.

Quick route

Find the right model route for the job.

First choose the use case. Then choose whether you want the best model, the cheapest route, best value, or your own model.

1. Use case

2. Optimize for

The result uses the live model catalog, selected token settings, and rule-based quality/use-case scoring.
Route summary fields: One run, 10/day, 300/month, Context

Why this helps

The route shows the cheapest fitting model and scales it to a workflow budget.
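The scaling to a workflow budget can be sketched as follows. This is a minimal illustration, assuming that "10/day" and "300/month" mean 10 and 300 runs and that prices are quoted per 1M tokens. The `Price` class, the function names, and all numbers are hypothetical placeholders, not real provider rates or the site's implementation.

```python
from dataclasses import dataclass


@dataclass
class Price:
    """Hypothetical per-1M-token prices in USD (placeholder values)."""
    input_per_m: float
    output_per_m: float


def run_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Cost of one run: input plus output tokens, each billed per 1M tokens."""
    return (input_tokens * price.input_per_m
            + output_tokens * price.output_per_m) / 1_000_000


def workflow_budget(cost_per_run: float) -> dict[str, float]:
    """Scale a single-run cost to the three budget views (assumed run counts)."""
    return {
        "one_run": cost_per_run,
        "10_per_day": cost_per_run * 10,
        "300_per_month": cost_per_run * 300,
    }


example = Price(input_per_m=0.50, output_per_m=1.50)  # assumed prices
budget = workflow_budget(run_cost(example, input_tokens=2_000, output_tokens=500))
for label, usd in budget.items():
    print(f"{label}: ${usd:.6f}")
```

Small per-run differences add up once they are multiplied by run count, which is why the cheapest fitting model matters more for recurring workflows than for one-off prompts.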

Top picks


Cost estimate

Current prompt cost


Token Calculator

Prompt cost engine

Token approximation: 1 word equals roughly 1.3 tokens.

Best Route

Recommended API route

One clear route first. Open the settings when the job needs different requirements.

Rule-based V1

Live Route API

Test the endpoint

Endpoint

POST /api/recommend-route
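A request to this endpoint might be assembled as in the sketch below. The URL comes from the JSON manifest on this page. The request body is an assumption: the field names `use_case`, `optimize_for`, `input_tokens`, `output_tokens`, and `cache_share` are hypothetical and only mirror the settings shown in the interface, so check the OpenAPI spec for the actual schema before relying on them.

```python
import json
import urllib.request

RECOMMEND_URL = "https://apiroute.dev/api/recommend-route"  # from the page's JSON manifest


def build_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the route endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        RECOMMEND_URL,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def recommend_route(payload: dict, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(build_request(payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical field names: the real schema is defined by the OpenAPI spec.
example_payload = {
    "use_case": "summarization",
    "optimize_for": "cheapest",
    "input_tokens": 2000,
    "output_tokens": 500,
    "cache_share": 0.0,
}
request = build_request(example_payload)
print(request.get_method(), request.full_url)
```

Building the request separately from sending it lets you inspect the body offline. Call `recommend_route(example_payload)` only once the field names have been confirmed against the spec.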

Arbitrage Matrix

Full leaderboard

Live

Model views


Model Context Input

Data source

Freshness, source and currency view

pricing API

Prices are comparison data. Verify provider pages before production routing or purchasing decisions.

Planning score is a rule-based heuristic derived from price tier, context window, capabilities, and model name signals. It is not an external benchmark, Elo rating, or LMArena score. Local open-weight models can serve as a practical cost fallback when frontier model pricing rises or availability changes; use this as a planning signal, not a performance guarantee.

Token Waste Check

Find overpaid agent routes

Compare the selected model with cheaper SLM, budget, and local-open routes before a multi-step agent spends tokens.
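The comparison described above could look roughly like this. It is a sketch under stated assumptions: every step of the agent has the same token profile, prices are per 1M tokens, and a fixed share of the input is billed at a cached rate. The `Route` class, route names, tiers, and prices are hypothetical placeholders. The check compares cost only and does not judge whether a cheaper route is good enough for the task.

```python
from dataclasses import dataclass


@dataclass
class Route:
    """Hypothetical route entry; names and prices are placeholders."""
    name: str
    tier: str             # e.g. "frontier", "budget", "slm", "local-open"
    input_per_m: float    # USD per 1M fresh input tokens
    cached_per_m: float   # USD per 1M cached input tokens
    output_per_m: float   # USD per 1M output tokens


def agent_cost(route: Route, steps: int, input_tokens: int,
               output_tokens: int, cache_share: float) -> float:
    """Total cost of a multi-step agent run with the same token profile per step."""
    cached = input_tokens * cache_share
    fresh = input_tokens - cached
    per_step = (fresh * route.input_per_m
                + cached * route.cached_per_m
                + output_tokens * route.output_per_m) / 1_000_000
    return per_step * steps


def cheaper_routes(selected: Route, candidates: list[Route],
                   **workload) -> list[tuple[str, float]]:
    """Return (name, savings in USD) for cheaper candidates, largest savings first."""
    baseline = agent_cost(selected, **workload)
    savings = [(c.name, baseline - agent_cost(c, **workload)) for c in candidates]
    return sorted([s for s in savings if s[1] > 0], key=lambda s: s[1], reverse=True)


selected = Route("frontier-x", "frontier", 3.00, 0.30, 15.00)
candidates = [
    Route("budget-y", "budget", 0.50, 0.05, 1.50),
    Route("slm-z", "slm", 0.10, 0.10, 0.40),
    # Token price of zero ignores hardware and electricity for local models.
    Route("local-open-w", "local-open", 0.0, 0.0, 0.0),
]
workload = dict(steps=20, input_tokens=4_000, output_tokens=500, cache_share=0.5)
result = cheaper_routes(selected, candidates, **workload)
for name, saved in result:
    print(f"{name}: saves ${saved:.4f} per agent run")
```

Running the check before the agent starts is the point: a 20-step loop multiplies any per-step overpayment by 20.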


Cheaper matching routes

Route Cost Context

Model Intelligence


Selected model metadata updates with the calculator.

Model details: prices, context and capabilities

Fields shown: Source, Output / 1M, Cache Read / 1M, Context Window, Max Output, Features, source_key, checked

Context Calculator

Document fit and cost matrix

Paste text or choose a preset to compare document fit, output limits, and one-time versus cached analysis costs.

Fit summary: Fits, Output capped, Too large

Context matrix: compare all models
Columns: Model, Fit, Context Used, One-time, Cached repeat
Enter text to compare all models.
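One plausible way to derive the three fit labels (Fits, Output capped, Too large) is sketched below. The rules are an assumption, not the site's documented logic: a document is "Too large" when it fills the context window, "Output capped" when the requested output exceeds either the remaining context or the model's maximum output, and "Fits" otherwise. The `ModelLimits` class and the model entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ModelLimits:
    """Hypothetical limits; real values would come from the pricing catalog."""
    name: str
    context_window: int
    max_output: int


def classify_fit(model: ModelLimits, input_tokens: int, wanted_output: int) -> str:
    """Assumed rule set for the three fit labels."""
    if input_tokens >= model.context_window:
        return "Too large"
    room = model.context_window - input_tokens
    if wanted_output > min(room, model.max_output):
        return "Output capped"
    return "Fits"


def context_used(model: ModelLimits, input_tokens: int) -> float:
    """Share of the context window taken by the input, as a percentage."""
    return 100.0 * input_tokens / model.context_window


models = [
    ModelLimits("small-8k", context_window=8_000, max_output=2_000),
    ModelLimits("medium-32k", context_window=32_000, max_output=4_000),
    ModelLimits("large-128k", context_window=128_000, max_output=16_000),
]
labels = {m.name: classify_fit(m, input_tokens=10_000, wanted_output=6_000) for m in models}
for m in models:
    print(f"{m.name}: {labels[m.name]} ({context_used(m, 10_000):.1f}% of context)")
```

Separating "Output capped" from "Too large" matters in practice: a capped model can still read the document, it just cannot return as much text as requested.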

Token approximation and cache costs

The calculator uses a simple token approximation: 1 word ≈ 1.3 tokens. It is not an exact tokenizer simulation, but it is useful for fast cost comparisons across LLM APIs and for estimating prompt length before production use.
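The approximation is simple enough to reproduce in a few lines. The sketch below counts whitespace-separated words and rounds the estimate up. The rounding choice and the function names are assumptions, and the price used in the example is a placeholder.

```python
import math

TOKENS_PER_WORD = 1.3  # heuristic from the page, not a tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate token count from whitespace-separated words, rounded up."""
    words = len(text.split())
    return math.ceil(words * TOKENS_PER_WORD)


def input_cost(tokens: int, price_per_m: float) -> float:
    """Input cost in USD for a price quoted per 1M tokens."""
    return tokens * price_per_m / 1_000_000


prompt = "Summarize the attached quarterly report in five bullet points."
tokens = estimate_tokens(prompt)
print(tokens, f"${input_cost(tokens, price_per_m=0.50):.8f}")  # 0.50 is a placeholder price
```

Real tokenizers vary by model and language, so code, non-English text, and unusual formatting can deviate noticeably from this estimate.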

Cache read costs describe lower prices for prompt segments that have already been cached. With providers such as OpenAI or Anthropic, reused system prompts, long contexts, or repeated prefixes can cost significantly less than entirely new input tokens.
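The effect of a cache share on input cost can be illustrated with a blended price. This sketch assumes a fixed share of the prompt is billed at the cache-read rate and the rest at the normal input rate. The prices are placeholders, and cache-write surcharges, which some providers bill when a prefix is first stored, are left out.

```python
def prompt_cost(input_tokens: int, cache_share: float,
                input_per_m: float, cache_read_per_m: float) -> float:
    """Input cost when a share of the prompt is billed at the cache-read price."""
    if not 0.0 <= cache_share <= 1.0:
        raise ValueError("cache_share must be between 0 and 1")
    cached = input_tokens * cache_share
    fresh = input_tokens - cached
    return (fresh * input_per_m + cached * cache_read_per_m) / 1_000_000


# Placeholder prices in USD per 1M tokens, not real provider rates.
INPUT_PER_M = 2.00
CACHE_READ_PER_M = 0.20

one_time = prompt_cost(50_000, 0.0, INPUT_PER_M, CACHE_READ_PER_M)
cached_repeat = prompt_cost(50_000, 0.9, INPUT_PER_M, CACHE_READ_PER_M)
print(f"one-time: ${one_time:.4f}, cached repeat: ${cached_repeat:.4f}")
```

The gap between the two numbers grows with prompt length and repeat count, which is why long, stable prefixes benefit most from caching.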

FAQ

LLM pricing questions

How accurate are these LLM API prices?

Prices are comparison data transformed from public pricing metadata. Verify provider pages before production routing or purchasing decisions.

What is cached input pricing?

It is a lower price for repeated prompt segments, reused system prompts, document prefixes, or stable context blocks when a provider supports prompt caching.

Which model is cheapest for long context?

It depends on prompt size, output tokens, cache share, and required capabilities. Use the context matrix and Best Route cards for the current job.

Is apiroute.dev affiliated with model providers?

No. Provider and model names are used referentially for API pricing, capability, and compatibility comparisons.

For AI Agents

Machine-readable pricing and routing layer

Use these endpoints directly in agents, crawlers, and workflow tools. No HTML table scraping required.

AI-readable MVP

Best default JSON endpoint for current model pricing, capabilities, freshness, and source metadata.

Model list only, useful when an agent already knows the metadata policy.

OpenAPI 3.1 spec for tool registration in agents and workflow systems.

/llms.txt

Compact crawler guide for discovering the pricing API and agent-readable docs.

Route recommendation guide

Static machine guide for the Best Route V1 scoring inputs and route labels.

Open JSON

Token Waste Check contract

Agent-readable guide for detecting overpaid repeated prompts and cheaper SLM/budget routes.

Open JSON
{
  "tool": "apiroute_prices",
  "openapi": "https://apiroute.dev/openapi.yaml",
  "default_endpoint": "https://apiroute.dev/api/live-prices",
  "routing_guide": "https://apiroute.dev/api/route-recommendation-guide",
  "recommend_endpoint": "https://apiroute.dev/api/recommend-route",
  "token_waste_contract": "https://apiroute.dev/api/token-waste-check"
}
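An agent could treat the manifest above as a small lookup table. The sketch below embeds the manifest verbatim and resolves endpoints by key. The helper names are hypothetical, and `fetch_json` assumes the endpoint answers a plain GET with JSON, which is not confirmed here. No assumption is made about the response structure.

```python
import json
import urllib.request

MANIFEST = json.loads("""
{
  "tool": "apiroute_prices",
  "openapi": "https://apiroute.dev/openapi.yaml",
  "default_endpoint": "https://apiroute.dev/api/live-prices",
  "routing_guide": "https://apiroute.dev/api/route-recommendation-guide",
  "recommend_endpoint": "https://apiroute.dev/api/recommend-route",
  "token_waste_contract": "https://apiroute.dev/api/token-waste-check"
}
""")


def endpoint(key: str) -> str:
    """Look up a manifest URL and make sure it stays on the expected host."""
    url = MANIFEST[key]
    if not url.startswith("https://apiroute.dev/"):
        raise ValueError(f"unexpected host for {key}: {url}")
    return url


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode JSON (network call; response shape is not assumed)."""
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


print(endpoint("default_endpoint"))
# prices = fetch_json(endpoint("default_endpoint"))  # uncomment to hit the network
```

Resolving URLs through the manifest keeps endpoint paths out of agent code, so a later path change only requires refreshing the manifest.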

Business Layer

Commercial options without paid rankings

apiroute.dev can test monetization through alerts, provider sponsorships, and premium data access while keeping the comparison table independent.

MVP validation

Lead magnet

Price change alerts

A simple waitlist for teams that want alerts when model prices, context windows, or cache discounts change.

Join alert waitlist

Partner link

AI/ML API provider option

One OpenAI-compatible API for many hosted models. Use it as a provider option when you want to test model routing without integrating every vendor separately.

Affiliate link. This does not affect model rankings, calculator results, or route recommendations.

Try AI/ML API

B2B API

Premium data access

Future paid access could add higher refresh frequency, pricing history, diff alerts, and machine-readable change logs.

Request API details

Sponsorship

Clearly labeled sponsor slots

Provider placements can be sold only if clearly labeled. Core price rankings stay sorted by data, not payment.

Discuss sponsorship

Commercial metadata is available for agents and partners as JSON. It describes what is testable now and what is intentionally not sold.

Open JSON

Market Radar

Demand signals and approval queue

Human approval required