JSON Token Counter
Count tokens in your JSON for GPT-4, GPT-3.5, Claude, and other LLMs. Optimize your context window and reduce API costs.
Tokens by Model
Character Stats
Tips to Reduce Token Usage
Why Token Counting Matters
Large Language Models like GPT-4 and Claude charge by the token, not by character or word. Understanding how your JSON translates to tokens helps you:
- Stay within context limits — GPT-4 variants range from 8K to 128K tokens
- Reduce API costs — Fewer tokens = lower bills
- Optimize prompts — More room for your actual question
- Improve response quality — Less noise in the context
What is a Token?
Tokens are the basic units that LLMs process. They're not quite characters and not quite words — they're somewhere in between:
"hello"= 1 token"Hello, world!"= 4 tokens{"name": "John"}= ~7 tokens
JSON tends to use more tokens per character than plain English because of all the punctuation ({} [] : , "").
Token Limits by Model
| Model | Context Window | Output Limit |
|---|---|---|
| GPT-4 Turbo | 128,000 tokens | 4,096 tokens |
| GPT-4 | 8,192 tokens | 4,096 tokens |
| GPT-4o | 128,000 tokens | 16,384 tokens |
| GPT-3.5 Turbo | 16,384 tokens | 4,096 tokens |
| Claude 3 Opus | 200,000 tokens | 4,096 tokens |
| Claude 3 Sonnet | 200,000 tokens | 4,096 tokens |
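A sketch of checking a payload against the limits in the table above before sending a request (the model names and the reserved-output default here are assumptions; verify against your provider's current documentation):

```python
# Context-window limits mirroring the table above; adjust as providers
# update their models.
CONTEXT_LIMITS = {
    "gpt-4-turbo": 128_000,
    "gpt-4": 8_192,
    "gpt-4o": 128_000,
    "gpt-3.5-turbo": 16_384,
    "claude-3-opus": 200_000,
    "claude-3-sonnet": 200_000,
}

def fits_context(token_count: int, model: str, reserved_output: int = 4_096) -> bool:
    """Return True if the prompt still leaves reserved_output tokens of headroom."""
    return token_count + reserved_output <= CONTEXT_LIMITS[model]

print(fits_context(100_000, "gpt-4-turbo"))  # True
print(fits_context(6_000, "gpt-4"))          # False: 6,000 + 4,096 > 8,192
```

Reserving room for the model's output matters: a prompt that exactly fills the context window leaves no space for a response.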
Optimizing JSON for Tokens
1. Minify Your JSON
Removing whitespace typically saves 10-30% of tokens:
// Before: ~50 tokens
{
"user": {
"name": "Alice",
"email": "[email protected]"
}
}
// After: ~35 tokens
{"user":{"name":"Alice","email":"[email protected]"}}
Use our JSON Minify tool to compress your JSON.
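In Python, minification is one argument away: passing compact separators to json.dumps drops the spaces the default (or pretty-printed) output adds. A small sketch:

```python
import json

data = {"user": {"name": "Alice", "email": "[email protected]"}}

# Pretty-printed form, as you might store it for readability
pretty = json.dumps(data, indent=2)

# Minified form: compact separators drop the space after ':' and ','
minified = json.dumps(data, separators=(",", ":"))

print(len(pretty), len(minified))  # the minified form is always shorter
```

Both forms parse back to the same object, so minification is lossless.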
2. Shorten Key Names
Long, descriptive keys are great for readability but costly for tokens:
// Before
{"firstName": "Alice", "lastName": "Smith", "emailAddress": "[email protected]"}
// After (saves ~30% tokens)
{"fn": "Alice", "ln": "Smith", "email": "[email protected]"}
Consider using a key mapping in your prompt to maintain clarity.
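The key-mapping idea can be sketched as a simple rename pass (KEY_MAP and the field names here are illustrative, not a fixed convention):

```python
import json

# Hypothetical key map: long, readable keys on the left, short wire keys
# on the right. Include the reverse map in your prompt so the model knows
# what the abbreviations mean.
KEY_MAP = {"firstName": "fn", "lastName": "ln", "emailAddress": "email"}

def shorten_keys(obj: dict) -> dict:
    """Rename keys per KEY_MAP; keys without a mapping pass through unchanged."""
    return {KEY_MAP.get(k, k): v for k, v in obj.items()}

record = {"firstName": "Alice", "lastName": "Smith", "emailAddress": "[email protected]"}
print(json.dumps(shorten_keys(record), separators=(",", ":")))
```

Applying the same map consistently across every object in a payload keeps the savings predictable and the prompt-side legend short.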
3. Remove Null/Empty Values
Null values and empty strings still cost tokens:
// Before
{"name": "Alice", "middleName": null, "nickname": ""}
// After
{"name": "Alice"}
4. Use Arrays for Repeated Structures
When you have many similar objects, consider a more compact format:
// Before: Array of objects
[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]
// After: Separate arrays (fewer repeated keys)
{"names": ["Alice", "Bob"], "ages": [30, 25]}
5. Consider Alternative Formats
For very large datasets, consider:
- CSV format — Much more token-efficient for tabular data
- YAML — Slightly more efficient than JSON
- Custom formats — Define your own compact syntax
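Tips 3–5 can be chained into a small preprocessing pipeline. This is a sketch in plain Python; the records and field names are illustrative:

```python
import csv
import io
import json

records = [
    {"name": "Alice", "age": 30, "nickname": ""},
    {"name": "Bob", "age": 25, "nickname": None},
]

# Tip 3: drop None values and empty strings
cleaned = [{k: v for k, v in r.items() if v not in (None, "")} for r in records]

# Tip 4: pivot the array of objects into one array per field,
# so each key appears once instead of once per record
columnar = {k: [r[k] for r in cleaned] for k in cleaned[0]}
print(json.dumps(columnar, separators=(",", ":")))
# {"name":["Alice","Bob"],"age":[30,25]}

# Tip 5: CSV drops the keys entirely after the header row
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["name", "age"])
writer.writeheader()
writer.writerows(cleaned)
print(buf.getvalue())
```

The savings compound with record count: the columnar and CSV forms pay the key cost once, while the array-of-objects form pays it once per record.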
Programmatic Token Counting
JavaScript/TypeScript
// Using tiktoken (official OpenAI tokenizer)
import { encoding_for_model } from 'tiktoken';
const encoder = encoding_for_model('gpt-4');
const tokens = encoder.encode(JSON.stringify(data));
console.log('Token count:', tokens.length);
encoder.free(); // Don't forget to free memory
Python
import tiktoken
import json
encoder = tiktoken.encoding_for_model("gpt-4")
tokens = encoder.encode(json.dumps(data))
print(f"Token count: {len(tokens)}")
Cost Comparison
Here's how token counts affect your API costs (as of 2024):
| Model | Input Cost | Output Cost |
|---|---|---|
| GPT-4 Turbo | $0.01/1K tokens | $0.03/1K tokens |
| GPT-4o | $0.005/1K tokens | $0.015/1K tokens |
| GPT-3.5 Turbo | $0.0005/1K tokens | $0.0015/1K tokens |
| Claude 3 Opus | $0.015/1K tokens | $0.075/1K tokens |
| Claude 3 Sonnet | $0.003/1K tokens | $0.015/1K tokens |
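The table above translates directly into a per-request cost estimate. A sketch using those rates (check current pricing before relying on it):

```python
# (input $/1K tokens, output $/1K tokens), mirroring the table above
PRICES = {
    "gpt-4-turbo": (0.01, 0.03),
    "gpt-4o": (0.005, 0.015),
    "gpt-3.5-turbo": (0.0005, 0.0015),
    "claude-3-opus": (0.015, 0.075),
    "claude-3-sonnet": (0.003, 0.015),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request, given token counts for each direction."""
    inp, out = PRICES[model]
    return input_tokens / 1000 * inp + output_tokens / 1000 * out

# 10K tokens in, 1K out on GPT-4 Turbo:
print(f"${estimate_cost('gpt-4-turbo', 10_000, 1_000):.2f}")  # $0.13
```

Note that output tokens are priced several times higher than input tokens on every model listed, so capping response length often saves more than trimming the prompt.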
Related Tools
- JSON Minify — Compress JSON to save tokens
- JSON Repair — Fix malformed LLM JSON output
- JSON Validator — Validate JSON before sending to LLMs
- JSON Pretty Print — Format JSON for readability
Frequently Asked Questions
Why do different models have different token counts?
Each model family uses a different tokenizer. GPT-4 and GPT-3.5 use cl100k_base, while GPT-4o uses the newer o200k_base which is more efficient. Claude uses its own tokenizer with slightly different characteristics.
Is the token count exact?
This tool provides estimates based on the tokenization algorithms. For production use, consider using the official tiktoken library for exact counts. Our estimates are typically within 5-10% of the actual count.
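For a quick offline sanity check, a common rule of thumb is roughly four characters per token for English prose. A hedged sketch, not a substitute for a real tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Very rough estimate: ~4 characters per token for English text.
    JSON's heavy punctuation usually yields more tokens per character,
    so treat this as a lower bound rather than an exact count."""
    return max(1, len(text) // 4)

print(estimate_tokens('{"name": "Alice", "age": 30}'))
```

When the estimate matters (billing, hard context limits), fall back to tiktoken for the exact count.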
Do whitespace tokens cost money?
Yes! Every token costs the same, whether it's meaningful content or whitespace. That's why minifying JSON can significantly reduce costs for large payloads.
Should I always minify JSON for LLMs?
Not necessarily. For small payloads, the token savings are minimal and formatted JSON may help the model understand the structure better. For large payloads (1000+ tokens), minification is usually worth it.