
LLM Cost Calculator

Enter a real prompt, choose your reply length, and instantly see cost per request and per 1,000 tokens.


Your recommended mix

Based on our latest pricing guide.

Best fit for this message

  • Meta Llama (Llama 3.1 70B): $0.0001 per message. Keeps costs in check for private deployments.

Also worth a look

  • Mistral (Mistral Large 2): $0.0002 per message. Reliable for multilingual or global programmes.
  • Anthropic (Claude 3.5 Haiku): $0.0002 per message. Fast summaries and triage at a low price point.
  • Google Gemini (Gemini 2.5 Ultra): $0.0005 per message. Use when accuracy and tooling matter more than cost.
  • OpenAI (GPT-5): $0.0024 per message. Best for polished executive updates and launch moments.

What is an LLM calculator?

An LLM calculator keeps the math simple. Drop in your prompt, choose how long you expect the reply to be, and the tool shows the total price. You can switch models to see how the cost changes, then copy the output into your next update or LLM cost-reduction review.

Use it before you promise a service level, ask for more budget, or send a savings note to leadership. Pair it with our latest cost guides for context your stakeholders can read quickly.
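As a rough illustration of the arithmetic behind the calculator, here is a minimal Python sketch. The function name and the rates in the example are placeholders for illustration, not live provider prices.

```python
# Rough sketch of the per-request arithmetic an LLM cost calculator performs.
# The rates in the example below are illustrative placeholders, not live prices.

def cost_per_request(input_tokens: int, output_tokens: int,
                     input_rate_per_1k: float, output_rate_per_1k: float) -> float:
    """Return the dollar cost of one request given per-1,000-token rates."""
    return (input_tokens / 1000) * input_rate_per_1k \
         + (output_tokens / 1000) * output_rate_per_1k

# Example: an 800-token prompt with a 300-token reply at hypothetical rates.
print(f"${cost_per_request(800, 300, 0.0005, 0.0015):.4f} per request")
```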

How to estimate LLM costs

  1. Gather real prompts. Collect a mix of short and long prompts from the workflows you want to support.
  2. Count tokens. Paste those prompts into the calculator or use our batching tips to plan efficient runs.
  3. Compare providers. Switch between GPT-5, Claude, Gemini, Llama, Mistral, and other enterprise LLMs to see where spend and quality meet your goals; the sketch after this list shows the comparison in code.
  4. Share the numbers. Export the results or copy-paste the summary into your finance or product review.
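A minimal sketch of step 3 in Python. The per-1,000-token rates below are assumptions for illustration; check each provider's current pricing page before quoting numbers to stakeholders.

```python
# Hypothetical (input $/1k tokens, output $/1k tokens) rates for illustration
# only. Swap in current rates from each provider's pricing page.
RATES = {
    "Llama 3.1 70B": (0.00059, 0.00079),
    "Claude 3.5 Haiku": (0.0008, 0.004),
    "GPT-5": (0.00125, 0.01),
}

def compare(input_tokens: int, output_tokens: int) -> None:
    """Print the per-request cost for each model, cheapest first."""
    costs = {
        model: (input_tokens / 1000) * in_rate + (output_tokens / 1000) * out_rate
        for model, (in_rate, out_rate) in RATES.items()
    }
    for model, cost in sorted(costs.items(), key=lambda item: item[1]):
        print(f"{model}: ${cost:.4f} per request")

compare(800, 300)
```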

Tokens vs. requests explained

  • Tokens are small chunks of text. Providers bill you for the tokens you send in and the tokens they send back; the sketch after this list shows one way to count them.
  • Requests are the API calls. Each request usually has a minimum cost plus the token charges.
  • Why it matters: Fewer tokens per request means lower spend and faster responses, but batching work into fewer requests can also cut overhead fees.
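If you want to count tokens before deciding how to batch, OpenAI's tiktoken library is one option. Other providers use their own tokenizers, so treat the result as an approximation rather than an exact invoice figure.

```python
# Count tokens locally with tiktoken (pip install tiktoken).
# cl100k_base is the encoding used by many recent OpenAI models;
# other providers tokenize differently, so this is an estimate.
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")
prompt = "Summarise this quarter's support-ticket trends in three bullet points."
print(f"{len(encoding.encode(prompt))} tokens in the prompt")
```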

Quick Q&A

What is an LLM calculator?
An LLM calculator shows the cost of each prompt and reply across models so you can plan budgets and explain the impact in plain language.
How do I estimate LLM costs?
Collect a sample of prompts, count tokens, pick a response length, and multiply by each provider's input and output rates.
What is the difference between tokens and requests?
Tokens are the chunks of text that models read and write, while requests are the API calls you send to a model. Both affect your invoice.
Can I reuse these numbers in reports?
Yes. Export the results, drop them into your finance deck, and link back here so teammates can double-check the math.

Need help turning these numbers into a plan? Contact our team and we'll send back a simple brief you can forward to stakeholders.