JSON Token Counter

Count tokens in your JSON for GPT-4, GPT-3.5, Claude, and other LLMs. Optimize your context window and reduce API costs.

Tips to Reduce Token Usage

  • Minify: remove whitespace for 10-30% savings
  • Shorten keys: "firstName" → "fn" saves tokens
  • Remove nulls: omit null/empty fields when possible
  • Use arrays: arrays use fewer tokens than repeated objects

Why Token Counting Matters

APIs for large language models like GPT-4 and Claude bill by the token, not by character or word. Understanding how your JSON translates to tokens helps you:

  • Stay within context limits — GPT-4 has 8K-128K token limits
  • Reduce API costs — Fewer tokens = lower bills
  • Optimize prompts — More room for your actual question
  • Improve response quality — Less noise in the context

What is a Token?

Tokens are the basic units that LLMs process. They're not quite characters and not quite words — they're somewhere in between:

  • "hello" = 1 token
  • "Hello, world!" = 4 tokens
  • {"name": "John"} = ~7 tokens

JSON tends to use more tokens per character than plain English because of all the punctuation ({} [] : , "").
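As a rough illustration of that difference, you can estimate token counts from character counts. This is only a heuristic sketch (the ~4 and ~3 chars/token averages are approximations, and `estimateTokens` is an illustrative helper, not a real tokenizer):

```javascript
// Heuristic: English prose averages roughly 4 characters per token,
// while JSON's dense punctuation pushes it closer to ~3 chars/token.
function estimateTokens(text, charsPerToken = 4) {
  return Math.ceil(text.length / charsPerToken);
}

const prose = 'Alice is thirty years old and lives in Paris.';
const json = JSON.stringify({ name: 'Alice', age: 30, city: 'Paris' });

console.log(estimateTokens(prose));    // estimate at ~4 chars/token
console.log(estimateTokens(json, 3));  // estimate at ~3 chars/token
```

For anything cost-sensitive, use a real tokenizer (see the programmatic examples below); this heuristic is only good for ballpark sizing.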

Token Limits by Model

Model            Context Window   Output Limit
GPT-4 Turbo      128,000 tokens   4,096 tokens
GPT-4            8,192 tokens     4,096 tokens
GPT-4o           128,000 tokens   16,384 tokens
GPT-3.5 Turbo    16,384 tokens    4,096 tokens
Claude 3 Opus    200,000 tokens   4,096 tokens
Claude 3 Sonnet  200,000 tokens   4,096 tokens
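The limits above can be encoded as a lookup for a pre-flight check before sending a request. A minimal sketch (the model keys and the `fitsContext` helper are illustrative, not provider API identifiers, and the limits may change):

```javascript
// Context-window limits from the table above (illustrative values).
const CONTEXT_WINDOW = {
  'gpt-4-turbo': 128000,
  'gpt-4': 8192,
  'gpt-4o': 128000,
  'gpt-3.5-turbo': 16384,
  'claude-3-opus': 200000,
  'claude-3-sonnet': 200000,
};

// Does a prompt of `promptTokens` leave room for `maxOutput`
// completion tokens within the model's context window?
function fitsContext(model, promptTokens, maxOutput) {
  const limit = CONTEXT_WINDOW[model];
  if (limit === undefined) throw new Error(`Unknown model: ${model}`);
  return promptTokens + maxOutput <= limit;
}
```

Remember that the context window covers prompt *and* completion, which is why the check adds the two together.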

Optimizing JSON for Tokens

1. Minify Your JSON

Removing whitespace typically saves 10-30% of tokens:

// Before: ~50 tokens
{
  "user": {
    "name": "Alice",
    "email": "alice@example.com"
  }
}

// After: ~35 tokens
{"user":{"name":"Alice","email":"alice@example.com"}}

Use our JSON Minify tool to compress your JSON.
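In code, minifying is just re-serializing without indentation, since `JSON.stringify` only emits whitespace when you pass the `space` argument. A minimal sketch with a made-up payload:

```javascript
const data = { user: { name: 'Alice', email: 'alice@example.com' } };

const pretty = JSON.stringify(data, null, 2); // human-readable, indented
const minified = JSON.stringify(data);        // no whitespace at all

console.log(pretty.length, minified.length);  // minified is never longer
```

The character savings translate directly into token savings, since the stripped whitespace would otherwise be tokenized too.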

2. Shorten Key Names

Long, descriptive keys are great for readability but costly for tokens:

// Before
{"firstName": "Alice", "lastName": "Smith", "emailAddress": "alice.smith@example.com"}

// After (saves ~30% tokens)
{"fn": "Alice", "ln": "Smith", "email": "alice.smith@example.com"}

Consider using a key mapping in your prompt to maintain clarity.
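One way to do that mapping mechanically, so the long names stay in your codebase and only the short ones reach the model. A sketch under the assumption of flat objects (`keyMap` and `shortenKeys` are illustrative names, not a library API):

```javascript
// Legend mapping long keys to short ones; include it once in the prompt,
// e.g. "Keys: fn=firstName, ln=lastName".
const keyMap = { firstName: 'fn', lastName: 'ln', emailAddress: 'email' };

// Rename keys found in the map; pass unknown keys through unchanged.
function shortenKeys(obj, map) {
  return Object.fromEntries(
    Object.entries(obj).map(([k, v]) => [map[k] ?? k, v])
  );
}

const short = shortenKeys(
  { firstName: 'Alice', lastName: 'Smith', emailAddress: 'alice.smith@example.com' },
  keyMap
);
// → { fn: 'Alice', ln: 'Smith', email: 'alice.smith@example.com' }
```

Sending the legend once costs a few tokens but pays off when the short keys repeat across many objects.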

3. Remove Null/Empty Values

Null values and empty strings still cost tokens:

// Before
{"name": "Alice", "middleName": null, "nickname": ""}

// After
{"name": "Alice"}
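This cleanup is easy to automate before serializing. A minimal recursive sketch (`stripEmpty` is an illustrative helper, not a standard function; adjust the filter if you also want to drop empty arrays or objects):

```javascript
// Recursively drop null and empty-string values from a JSON value.
function stripEmpty(value) {
  if (Array.isArray(value)) return value.map(stripEmpty);
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value)
        .filter(([, v]) => v !== null && v !== '')
        .map(([k, v]) => [k, stripEmpty(v)])
    );
  }
  return value;
}

stripEmpty({ name: 'Alice', middleName: null, nickname: '' });
// → { name: 'Alice' }
```

Only do this when the model doesn't need to distinguish "absent" from "explicitly null"; if that distinction matters, keep the nulls.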

4. Use Arrays for Repeated Structures

When you have many similar objects, consider a more compact format:

// Before: Array of objects
[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]

// After: Separate arrays (fewer repeated keys)
{"names": ["Alice", "Bob"], "ages": [30, 25]}
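The pivot from rows to columns can be done generically. A sketch assuming uniform objects (`toColumns` is an illustrative helper; unlike the example above it keeps the original key names rather than pluralizing them):

```javascript
// Pivot an array of uniform objects into one object of parallel arrays.
function toColumns(rows) {
  const cols = {};
  for (const row of rows) {
    for (const [k, v] of Object.entries(row)) {
      (cols[k] ??= []).push(v);
    }
  }
  return cols;
}

toColumns([{ name: 'Alice', age: 30 }, { name: 'Bob', age: 25 }]);
// → { name: ['Alice', 'Bob'], age: [30, 25] }
```

Each key is now serialized once instead of once per row, which is where the savings come from as the row count grows.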

5. Consider Alternative Formats

For very large datasets, consider:

  • CSV format — Much more token-efficient for tabular data
  • YAML — Slightly more efficient than JSON
  • Custom formats — Define your own compact syntax
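For the CSV case, the conversion from uniform JSON rows is a few lines. A deliberately minimal sketch (`toCsv` is an illustrative helper with no quoting or escaping; real data with commas, quotes, or newlines needs a proper CSV library):

```javascript
// Convert an array of uniform objects to CSV: one header row,
// then one comma-joined line per object.
function toCsv(rows) {
  const headers = Object.keys(rows[0]);
  const lines = rows.map(r => headers.map(h => r[h]).join(','));
  return [headers.join(','), ...lines].join('\n');
}

toCsv([{ name: 'Alice', age: 30 }, { name: 'Bob', age: 25 }]);
// → "name,age\nAlice,30\nBob,25"
```

Like the columnar trick, this pays for the keys once in the header instead of once per row, and also drops all the braces and quotes.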

Programmatic Token Counting

JavaScript/TypeScript

// Using tiktoken (a JS/WASM port of OpenAI's tokenizer)
import { encoding_for_model } from 'tiktoken';

const data = { name: 'Alice', age: 30 }; // any JSON-serializable payload

const encoder = encoding_for_model('gpt-4');
const tokens = encoder.encode(JSON.stringify(data));
console.log('Token count:', tokens.length);
encoder.free(); // WASM-backed encoders must be freed explicitly

Python

import json

import tiktoken

data = {"name": "Alice", "age": 30}  # any JSON-serializable payload

encoder = tiktoken.encoding_for_model("gpt-4")
tokens = encoder.encode(json.dumps(data))
print(f"Token count: {len(tokens)}")

Cost Comparison

Here's how token counts affect your API costs (as of 2024):

Model            Input Cost           Output Cost
GPT-4 Turbo      $0.01 / 1K tokens    $0.03 / 1K tokens
GPT-4o           $0.005 / 1K tokens   $0.015 / 1K tokens
GPT-3.5 Turbo    $0.0005 / 1K tokens  $0.0015 / 1K tokens
Claude 3 Opus    $0.015 / 1K tokens   $0.075 / 1K tokens
Claude 3 Sonnet  $0.003 / 1K tokens   $0.015 / 1K tokens
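Turning those per-1K rates into a per-request estimate is simple arithmetic. A sketch using two of the rates from the table (the `RATES` table and `estimateCost` helper are illustrative, and published prices change over time):

```javascript
// Per-1K-token rates in USD, from the table above (illustrative).
const RATES = {
  'gpt-4o': { input: 0.005, output: 0.015 },
  'gpt-3.5-turbo': { input: 0.0005, output: 0.0015 },
};

// Cost in USD for one request: input and output are billed separately.
function estimateCost(model, inputTokens, outputTokens) {
  const r = RATES[model];
  if (!r) throw new Error(`Unknown model: ${model}`);
  return (inputTokens / 1000) * r.input + (outputTokens / 1000) * r.output;
}

estimateCost('gpt-4o', 10000, 1000);
// 10 * 0.005 + 1 * 0.015 ≈ $0.065
```

Note that output tokens cost several times more than input tokens on every model listed, so trimming verbose completions often saves more than trimming the prompt.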

Frequently Asked Questions

Why do different models have different token counts?

Each model family uses a different tokenizer. GPT-4 and GPT-3.5 use cl100k_base, while GPT-4o uses the newer o200k_base which is more efficient. Claude uses its own tokenizer with slightly different characteristics.

Is the token count exact?

This tool provides estimates based on approximations of each model's tokenizer. For production use, consider the official tiktoken library for exact counts. Our estimates are typically within 5-10% of the actual count.

Do whitespace tokens cost money?

Yes! Every token costs the same, whether it's meaningful content or whitespace. That's why minifying JSON can significantly reduce costs for large payloads.

Should I always minify JSON for LLMs?

Not necessarily. For small payloads, the token savings are minimal and formatted JSON may help the model understand the structure better. For large payloads (1000+ tokens), minification is usually worth it.