Count tokens for 9+ LLMs, including Claude Opus 4.5, GPT-5.2, and Gemini 3. Optimize your prompts to reduce API costs, fit context windows, and streamline AI workflows.
Paste your prompt, document, or code into the text area. You can also click Add Files to import .txt, .md, or .json files.
Instantly see estimated token counts for 9+ popular LLMs including Claude Opus 4.5, Gemini 3 Flash, GPT-5.2, DeepSeek-V3.2, Llama 3.3, and more.
Monitor how much of your context window is used across different sizes (4k to 200k tokens).
Expand the Optimization Tools panel to strip whitespace, comments, markdown, emojis, and more to reduce token count and save on API costs.
Copy to clipboard or download as .txt or .md for use in your AI projects and workflows.
Privacy Note: All processing happens locally in your browser. Your text never leaves your device.
Get precise token estimates for the latest models, including Claude Opus 4.5, GPT-5.2, and Gemini 3.
Optimize your prompts by removing unnecessary whitespace and overhead. Save money on every API call.
Visual context window bars help you fit your prompt perfectly within model limits (8k to 2M+ tokens).
Perfect for corporate use. Since no data is sent to the cloud, you can safely paste sensitive internal docs.
Import files directly, clean JSON, and export your optimized prompts to .txt or .md for your codebase.
No subscriptions, no limits. Use it as much as you need for your AI development workflows.
LLMs don't read text like humans do (letter by letter). They convert text into "tokens" — chunks of characters that commonly appear together. A general rule of thumb is that 1,000 tokens is about 750 words.
Different models use different "tokenizers" (dictionaries), which is why the same sentence can have different token counts in GPT-4 vs. Claude 3.
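For a concrete sense of how this works, here is a minimal sketch of counting tokens in the browser. It assumes the open-source js-tiktoken package and the cl100k_base encoding as one illustrative choice; it is not necessarily the exact library or encoding this tool uses for every model.

```typescript
// Minimal sketch: counting tokens client-side with js-tiktoken
// (an assumed library choice for illustration, not this tool's internals).
import { getEncoding } from "js-tiktoken";

// cl100k_base is the BPE encoding used by many recent OpenAI models.
const enc = getEncoding("cl100k_base");

const prompt = "LLMs split text into tokens, not individual letters.";
const tokenCount = enc.encode(prompt).length;
const wordCount = prompt.trim().split(/\s+/).length;

console.log(`${tokenCount} tokens for ${wordCount} words`);
// On typical English prose, expect roughly 0.75 words per token.
```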
1. Save Money: APIs charge per million tokens, so trimming 10% of the fluff cuts your input cost by 10% (see the quick arithmetic sketch after this list).
2. Reduce Latency: Fewer tokens mean faster generation times.
3. Fit Context: If your prompt exceeds the model's limit, it gets truncated and the model effectively "forgets" the beginning. Optimizing helps you fit more relevant data into the context window.
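As a back-of-the-envelope illustration of point 1, the sketch below assumes a hypothetical price of $3 per million input tokens; substitute your provider's actual pricing.

```typescript
// Hypothetical pricing for illustration only -- check your provider's rates.
const USD_PER_MILLION_INPUT_TOKENS = 3.0;

function inputCostUSD(tokens: number): number {
  return (tokens / 1_000_000) * USD_PER_MILLION_INPUT_TOKENS;
}

const original = 120_000;         // tokens before optimization
const optimized = original * 0.9; // 10% of fluff removed

console.log(inputCostUSD(original));  // 0.36 USD per call
console.log(inputCostUSD(optimized)); // 0.324 USD per call -- the same 10% saved
```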
Extremely accurate. We use the official tokenizer logic for each model family (cl100k_base for OpenAI, standard BPE tokenizers for Llama/Mistral/Anthropic). There may be tiny discrepancies due to model-specific updates, but our counts are generally within 1-2 tokens of what the live APIs report.
No. The tokenization logic is bundled directly into the JavaScript running on your computer, so counts are calculated without ever making a network request. Your prompts are never sent to us, OpenAI, Google, Anthropic, or anyone else.
It depends on which options you check! You can strip the following (a rough sketch of this kind of cleanup appears after the list):
• Whitespace: extra spaces, tabs, and newlines that consume tokens but add no meaning.
• Comments: code comments (// or /* */) that the model often ignores anyway.
• Markdown: converts bold/italic/headers to plain text.
• Emojis: removes graphical characters.
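Here is a simplified sketch of what these four options might do under the hood, using plain regular expressions. The tool's real rules may be more careful (for instance, around comments inside string literals), so treat this as illustrative assumed logic only.

```typescript
// Simplified illustrations of the four cleanup options (assumed logic,
// not the tool's exact implementation).

function stripWhitespace(text: string): string {
  // Collapse runs of spaces/tabs and squeeze repeated blank lines.
  return text.replace(/[ \t]+/g, " ").replace(/\n{2,}/g, "\n").trim();
}

function stripComments(code: string): string {
  // Remove /* block */ and // line comments (naive: not string-aware).
  return code.replace(/\/\*[\s\S]*?\*\//g, "").replace(/\/\/[^\n]*/g, "");
}

function stripMarkdown(text: string): string {
  // Drop heading markers and bold/italic asterisks, keeping the words.
  return text
    .replace(/^#{1,6}\s+/gm, "")
    .replace(/\*{1,3}([^*]+)\*{1,3}/g, "$1");
}

function stripEmojis(text: string): string {
  // Remove pictographic characters such as emojis.
  return text.replace(/\p{Extended_Pictographic}/gu, "");
}
```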
Yes. Token counting and optimization run locally in your browser, and your prompts are never uploaded to our servers.