Prompt vs Completion Cost Calculator
Paste your prompt and the model's completion separately to see the exact token count and USD cost for each side of the API call.
Most LLM APIs charge differently for prompt tokens (what you send) versus completion tokens (what the model generates). Completions typically cost 3–5× more per token because generating text is computationally more intensive than processing it.
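The split billing described above can be sketched in a few lines. The rates below are illustrative placeholders (a 5× output/input ratio, matching the upper end of the range above), not any provider's actual pricing — always check your provider's pricing page:

```python
# Illustrative per-million-token rates; real rates vary by provider and model.
PROMPT_RATE_PER_M = 3.00       # USD per 1M prompt (input) tokens
COMPLETION_RATE_PER_M = 15.00  # USD per 1M completion (output) tokens

def call_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the total USD cost of a single API call."""
    input_cost = prompt_tokens * PROMPT_RATE_PER_M / 1_000_000
    output_cost = completion_tokens * COMPLETION_RATE_PER_M / 1_000_000
    return input_cost + output_cost

# A 2,000-token prompt with a 500-token completion:
#   input  = 2000 * 3.00  / 1e6 = $0.0060
#   output =  500 * 15.00 / 1e6 = $0.0075
# The completion dominates despite having a quarter of the tokens.
print(f"${call_cost(2000, 500):.6f}")  # → $0.013500
```

Note how, at these rates, the shorter completion still accounts for more than half the bill — this is why the tips below focus on capping output length.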
Why Output Tokens Cost More
During generation, the model must run a full forward pass for every token it produces. Processing your input prompt (prefill) is highly parallelizable and batched across the GPU, making it cheaper per token. Use this tool to understand the true cost split of any API call.
Tips to Reduce Completion Costs
- Set `max_tokens` explicitly to cap output length.
- Use `stop` sequences to end generation early when a task-complete signal appears.
- For classification tasks, instruct the model to answer with a single word or JSON value.
- Use streaming with early exit when you have enough information.
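The first three tips can be combined in one request. The sketch below builds an OpenAI-style chat-completion payload as a plain dict (no network call); the model name and prompt wording are illustrative, and field names may differ on other providers:

```python
# Sketch of a request payload applying the tips above: a tight max_tokens,
# a stop sequence, and a single-word answer format for classification.
request = {
    "model": "gpt-4o-mini",  # illustrative model name
    "messages": [
        {"role": "system",
         "content": "Reply with a single word: positive or negative."},
        {"role": "user", "content": "I loved this product!"},
    ],
    "max_tokens": 5,    # hard cap on completion length
    "stop": ["\n"],     # end generation at the first newline
}

# Completion cost is bounded by max_tokens, so the worst case is known
# up front (illustrative $15 per 1M output tokens):
worst_case_usd = request["max_tokens"] * 15.00 / 1_000_000
print(f"worst-case output cost: ${worst_case_usd:.6f}")
```

Because providers bill only for completion tokens actually generated, a tight `max_tokens` plus a `stop` sequence bounds the worst-case output cost of every call.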
Related Tools
- Token Cost Converter — enter exact token counts and get instant USD cost
- Monthly API Budget Planner — plan total monthly spend across your workload
- ChatGPT Prompt Cost Estimator — real-time cost estimation for GPT models
- Multi-Model Cost Comparison — compare costs across all major LLMs