How accurate are the LLM API prices on apiroute.dev?

apiroute.dev uses comparison data transformed from public pricing metadata. Provider prices, cache rules, context limits, availability, and billing terms can change, so provider pages should be verified before production routing or purchasing decisions.

Which model is cheapest for long-context work?

The cheapest long-context option depends on the prompt size, expected output tokens, cache share, and required capabilities. Use the leaderboard, context calculator, and Best Route section to compare models for the current job.

Is apiroute.dev affiliated with OpenAI, Anthropic, Google, or other model providers?

No. apiroute.dev is independent. Provider and model names are used referentially to identify API pricing, capabilities, and compatibility comparisons.

LLM API Pricing Comparison & Token Calculator (Live Costs)

What is cached input pricing?

Cached input pricing is a lower price some providers apply when repeated prompt segments, system prompts, document prefixes, or stable context blocks can be reused instead of billed as entirely new input tokens.

Cost intelligence for AI infrastructure

Find the cheapest LLM API route.

Compare prices, cache discounts, prompt costs, and model routes. Before paying for cloud tokens, check whether the model fits on your own GPU.

Local AI companion tool: Can my GPU run this LLM? Check VRAM fit, model size, quantization, and local-vs-cloud fallback.

Quick route

Find the right model route for the job.

First choose the use case. Then choose whether you want the best model, the cheapest route, best value, or your own model.

1. Use case

2. Optimize for

The result uses the live model catalog, selected token settings, and rule-based quality/use-case scoring.

Cheapest

The card reports the route cost for one run, 10 runs per day, and 300 runs per month, plus the context window.

Why this helps

The route shows the cheapest fitting model and scales it to a workflow budget.
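The budget scaling described above can be sketched in a few lines. This is only an illustration: it assumes the daily figure means 10 runs per day and the monthly figure means 10 runs per day over a 30-day month. The function name `scale_route_cost` and its parameters are hypothetical and are not taken from apiroute.dev.

```python
def scale_route_cost(cost_per_run_usd: float,
                     runs_per_day: int = 10,
                     days_per_month: int = 30) -> dict[str, float]:
    """Scale a single-run cost to daily and monthly budgets (hypothetical helper)."""
    per_day = cost_per_run_usd * runs_per_day
    per_month = per_day * days_per_month  # assumption: 30-day month
    return {
        "one_run": round(cost_per_run_usd, 6),
        "per_day": round(per_day, 6),
        "per_month": round(per_month, 6),
    }


# Example with a made-up per-run cost of $0.002.
budget = scale_route_cost(0.002)
for label, usd in budget.items():
    print(f"{label}: ${usd:.6f}")
```

Rounding to six decimals mirrors the precision of the cost figures on the page; the run counts are parameters so a different workflow volume can be substituted.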

Cost estimate

Current prompt cost: shown in USD for the selected model, together with word, token, and cached-token counts.

Token Calculator

Prompt cost engine

Token approximation: 1 word equals roughly 1.3 tokens.

Inputs: model, prompt text, cache share (initially 0%), and expected output tokens.

Best Route

Recommended API route

One clear route first. Open the settings when the job needs different requirements.

Rule-based V1

Route settings: adjust the job and compare alternatives by use case and priority. Optional constraints: require vision, require function calling, prefer prompt caching.

Live Route API

Test the endpoint

Endpoint

POST /api/recommend-route

Request fields: use case, priority, prompt tokens, output tokens, cache share (initially 0%), function calling, vision, prompt caching, local first, and privacy class.
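A request to the endpoint above might be assembled as in the sketch below. The URL comes from the page's own endpoint listing, but every JSON field name and example value (`use_case`, `priority`, `privacy_class`, `"coding"`, `"public"`, and so on) is an assumption derived from the form labels, not a documented schema; check /openapi.yaml for the actual contract. The sketch builds the request without sending it.

```python
import json
import urllib.request

RECOMMEND_URL = "https://apiroute.dev/api/recommend-route"


def build_recommend_request(use_case: str, priority: str,
                            prompt_tokens: int, output_tokens: int,
                            cache_share: float = 0.0) -> urllib.request.Request:
    """Build (but do not send) a POST request. All field names are assumptions."""
    payload = {
        "use_case": use_case,            # hypothetical field name
        "priority": priority,            # hypothetical field name
        "prompt_tokens": prompt_tokens,
        "output_tokens": output_tokens,
        "cache_share": cache_share,      # fraction between 0 and 1 (assumed unit)
        "function_calling": False,
        "vision": False,
        "prompt_caching": cache_share > 0,
        "local_first": False,
        "privacy_class": "public",       # hypothetical value
    }
    return urllib.request.Request(
        RECOMMEND_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_recommend_request("coding", "cheapest", 2000, 500, cache_share=0.25)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(request, timeout=10)
```

Keeping request construction separate from sending makes it easy to inspect the payload first and adjust the field names once the real schema is known.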

Arbitrage Matrix Full leaderboard Live

Model	Provider	Context	Input

Data source

Freshness, source and currency view

Source: pricing API

Currency view: Provider prices are sourced in USD where available. EUR uses a static planning rate.

Prices are comparison data. Verify provider pages before production routing or purchasing decisions.

Planning score is a rule-based heuristic derived from price tier, context window, capabilities, and model name signals. It is not an external benchmark, Elo rating, or LMArena score. Local open-weight models can serve as a practical cost fallback when frontier model pricing rises or availability changes; use this as a planning signal, not a performance guarantee.

Token Waste Check

Find overpaid agent routes

Compare the selected model with cheaper SLM, budget, and local-open routes before a multi-step agent spends tokens.

Workload inputs: workload model, use case, iterations, tool calls per iteration, and cache share (initially 50%).

Fine-tune tokens and constraints: system + memory tokens, context tokens per iteration, tool overhead per iteration, output tokens per iteration, and required capabilities (function calling, vision).
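One way to turn the workload inputs above into a cost estimate is sketched below. The formula is an assumption, not the site's documented method: it treats system + memory, context, and tool overhead as the input tokens of each iteration (with tool calls per iteration assumed to be folded into the tool overhead figure), applies the cache share to that input, and bills output separately. The class, function name, and prices are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AgentWorkload:
    iterations: int
    system_memory_tokens: int
    context_tokens_per_iteration: int
    tool_overhead_per_iteration: int
    output_tokens_per_iteration: int
    cache_share: float  # fraction of input tokens read from cache, 0..1


def estimate_agent_cost(w: AgentWorkload,
                        input_price_per_m: float,
                        cache_read_price_per_m: float,
                        output_price_per_m: float) -> dict[str, float]:
    """Estimate tokens and USD cost for a multi-step agent run (assumed formula)."""
    input_per_iter = (w.system_memory_tokens
                      + w.context_tokens_per_iteration
                      + w.tool_overhead_per_iteration)
    cached = input_per_iter * w.cache_share
    fresh = input_per_iter - cached
    cost_per_iter = (fresh * input_price_per_m
                     + cached * cache_read_price_per_m
                     + w.output_tokens_per_iteration * output_price_per_m) / 1_000_000
    return {
        "input_tokens": input_per_iter * w.iterations,
        "output_tokens": w.output_tokens_per_iteration * w.iterations,
        "cost_usd": cost_per_iter * w.iterations,
    }


# Made-up workload and prices (USD per 1M tokens), for illustration only.
workload = AgentWorkload(10, 1000, 2000, 500, 500, cache_share=0.5)
estimate = estimate_agent_cost(workload, 2.0, 0.5, 8.0)
print(f"{estimate['input_tokens']} input tokens, ${estimate['cost_usd']:.6f}")
```

Running the same workload against a cheaper model's prices and comparing the two `cost_usd` values gives the kind of comparison the Token Waste Check describes.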

Cheaper matching routes

Route	Cost	Context

Model Intelligence

Selected model metadata updates with the calculator.

Model details: output price per 1M tokens, cache read price per 1M tokens, context window, max output, features, and source metadata (source_key, checked).

Context Calculator

Document fit and cost matrix

Paste text or choose a preset to compare document fit, output limits, and one-time versus cached analysis costs.

Each model is counted as Fits, Output capped, or Too large.

Context matrix columns: Model, Fit, Context Used, One-time, Cached repeat.
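The three fit categories can be illustrated with a small classifier. This is a sketch under stated assumptions: a document is "too large" when the prompt alone exceeds the context window, and "output capped" when the expected output exceeds either the model's max output or the context left after the prompt. The function name, the thresholds, and the example models are hypothetical; the site's actual rules may differ.

```python
def classify_fit(prompt_tokens: int, expected_output_tokens: int,
                 context_window: int, max_output: int) -> str:
    """Classify a document against one model's limits (assumed rules)."""
    if prompt_tokens > context_window:
        return "too_large"
    # Output room is limited by both max output and the remaining context.
    output_room = min(max_output, context_window - prompt_tokens)
    if expected_output_tokens > output_room:
        return "output_capped"
    return "fits"


# Made-up model limits: (name, context window, max output).
models = [
    ("small-model", 8_000, 4_096),
    ("large-model", 200_000, 8_192),
]

prompt_tokens, expected_output = 7_900, 500
for name, context_window, max_output in models:
    verdict = classify_fit(prompt_tokens, expected_output, context_window, max_output)
    used = prompt_tokens / context_window
    print(f"{name}: {verdict} ({used:.1%} of context used)")
```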

Token approximation and cache costs

The calculator uses a simple token approximation: 1 word ≈ 1.3 tokens. It is not an exact tokenizer simulation, but it is useful for fast cost comparisons across LLM APIs and for estimating prompt length before production use.

Cache read costs describe lower prices for prompt segments that have already been cached. With providers such as OpenAI or Anthropic, reused system prompts, long contexts, or repeated prefixes can cost significantly less than entirely new input tokens.
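The approximation and the cache discount can be combined into a rough cost estimate, as in the sketch below. It uses the 1 word ≈ 1.3 tokens rule from the text; the function names, the split of input into fresh and cached tokens by a single cache share, and the example prices are assumptions for illustration, not provider figures.

```python
TOKENS_PER_WORD = 1.3  # approximation stated on the page, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count from whitespace-separated words."""
    return round(len(text.split()) * TOKENS_PER_WORD)


def prompt_cost_usd(prompt_tokens: int, output_tokens: int, cache_share: float,
                    input_price_per_m: float, cache_read_price_per_m: float,
                    output_price_per_m: float) -> float:
    """Cost of one call when a share of the prompt is billed at the cache read price."""
    cached = prompt_tokens * cache_share
    fresh = prompt_tokens - cached
    total = (fresh * input_price_per_m
             + cached * cache_read_price_per_m
             + output_tokens * output_price_per_m)
    return total / 1_000_000  # prices are per 1M tokens


text = "one two three four five six seven eight nine ten"
tokens = estimate_tokens(text)
# Made-up prices in USD per 1M tokens.
uncached = prompt_cost_usd(tokens, 100, 0.0, 2.0, 0.2, 8.0)
half_cached = prompt_cost_usd(tokens, 100, 0.5, 2.0, 0.2, 8.0)
print(f"{tokens} tokens, uncached ${uncached:.6f}, 50% cached ${half_cached:.6f}")
```

Because the estimate is word-based, it is best used for relative comparisons between models; exact billing depends on each provider's tokenizer and caching rules.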

FAQ

LLM pricing questions

How accurate are these LLM API prices?

Prices are comparison data transformed from public pricing metadata. Verify provider pages before production routing or purchasing decisions.

What is cached input pricing?

It is a lower price for repeated prompt segments, reused system prompts, document prefixes, or stable context blocks when a provider supports prompt caching.

Which model is cheapest for long context?

It depends on prompt size, output tokens, cache share, and required capabilities. Use the context matrix and Best Route cards for the current job.

Is apiroute.dev affiliated with model providers?

No. Provider and model names are used referentially for API pricing, capability, and compatibility comparisons.

For AI Agents

Machine-readable pricing and routing layer

Use these endpoints directly in agents, crawlers, and workflow tools. No HTML table scraping required.

AI-readable MVP

/api/live-prices

Best default JSON endpoint for current model pricing, capabilities, freshness, and source metadata.

/api/models

Model list only, useful when an agent already knows the metadata policy.

/openapi.yaml

OpenAPI 3.1 spec for tool registration in agents and workflow systems.

/llms.txt

Compact crawler guide for discovering the pricing API and agent-readable docs.

Route recommendation guide

Static machine guide for the Best Route V1 scoring inputs and route labels.

Token Waste Check contract

Agent-readable guide for detecting overpaid repeated prompts and cheaper SLM/budget routes.

{
  "tool": "apiroute_prices",
  "openapi": "https://apiroute.dev/openapi.yaml",
  "default_endpoint": "https://apiroute.dev/api/live-prices",
  "routing_guide": "https://apiroute.dev/api/route-recommendation-guide",
  "recommend_endpoint": "https://apiroute.dev/api/recommend-route",
  "token_waste_contract": "https://apiroute.dev/api/token-waste-check"
}
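An agent could consume the manifest above roughly as follows. This is a minimal sketch: the manifest text is copied from the page, while the helper names `load_endpoints` and `fetch_json` are hypothetical, and nothing is assumed about the shape of the JSON the endpoints return. The network call is left commented out.

```python
import json
import urllib.request

MANIFEST = """{
  "tool": "apiroute_prices",
  "openapi": "https://apiroute.dev/openapi.yaml",
  "default_endpoint": "https://apiroute.dev/api/live-prices",
  "routing_guide": "https://apiroute.dev/api/route-recommendation-guide",
  "recommend_endpoint": "https://apiroute.dev/api/recommend-route",
  "token_waste_contract": "https://apiroute.dev/api/token-waste-check"
}"""


def load_endpoints(manifest_text: str) -> dict[str, str]:
    """Return the manifest entries whose values are https URLs."""
    manifest = json.loads(manifest_text)
    return {key: value for key, value in manifest.items()
            if isinstance(value, str) and value.startswith("https://")}


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the body as JSON (response schema not assumed)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


endpoints = load_endpoints(MANIFEST)
for name, url in endpoints.items():
    print(f"{name}: {url}")
# prices = fetch_json(endpoints["default_endpoint"])  # requires network access
```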

/models/ /compare/ /groups/ /llms-full.txt /api/providers /api/token-waste-check /api/marketing-radar

Business Layer

Commercial options without paid rankings

apiroute.dev can test monetization through alerts, provider sponsorships, and premium data access while keeping the comparison table independent.

MVP validation

Lead magnet

Price change alerts

A simple waitlist for teams that want alerts when model prices, context windows, or cache discounts change.

Partner link

AI/ML API provider option

One OpenAI-compatible API for many hosted models. Use it as a provider option when you want to test model routing without integrating every vendor separately.

Affiliate link. This does not affect model rankings, calculator results, or route recommendations.

B2B API

Premium data access

Future paid access could add higher refresh frequency, pricing history, diff alerts, and machine-readable change logs.

Sponsorship

Clearly labeled sponsor slots

Provider placements can be sold only if clearly labeled. Core price rankings stay sorted by data, not payment.

Commercial metadata is available for agents and partners as JSON. It describes what is testable now and what is intentionally not sold.

Market Radar

Demand signals and approval queue

Human approval required