
Image Token Calculator

Estimate how many input tokens a vision model counts for an image, and what that costs to send.

Pricing table last updated: 2026-05-13

Interactive tool

Estimate image token cost


Tokens / image: 765
Total image tokens: 765
Estimated input cost: $0.0019

OpenAI tiles images into 512px blocks: 85 base tokens plus 170 per tile at high detail (low detail is a flat 85). Estimates only: image tokenization is approximate and provider-specific.

GPT-5.4: $2.50 input / $15.00 output per 1M tokens.
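
The cost line in the breakdown is plain arithmetic: total image tokens multiplied by the input price per million tokens. A minimal Python sketch, using the 765-token figure and the $2.50 rate shown above; the function name `input_cost_usd` is illustrative and not part of any provider SDK.

```python
def input_cost_usd(tokens_per_image: int, images: int, price_per_million_usd: float) -> float:
    """Estimated input cost in USD for a batch of identically sized images."""
    total_tokens = tokens_per_image * images
    return total_tokens * price_per_million_usd / 1_000_000


# One image at 765 tokens, priced at $2.50 per 1M input tokens.
cost = input_cost_usd(765, 1, 2.50)
print(f"${cost:.4f}")  # prints $0.0019
```

Raising `images` scales the total linearly, which is why per-image token counts matter most for bulk jobs.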

How this Image Token Calculator works

Why images aren't counted like text

Vision models bill images as input tokens, but the count comes from the image's pixel dimensions and detail level, not from characters. A single high-resolution screenshot can cost as much as pages of text, and the same image maps to very different token counts from one provider to the next. Enter the width, height, and how many images you send per request to size the token load before you ship a vision feature.

How each provider counts an image

OpenAI scales the image, then tiles it into 512-pixel blocks: low detail is a flat 85 tokens, and high detail is 85 plus 170 per tile. Anthropic approximates tokens as width × height (in pixels) divided by 750, resizing very large images first. Google's Gemini charges about 258 tokens per 768-pixel tile. Switch the model in the calculator above to see the same image priced three ways.
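
The three rules above can be sketched in a few lines of Python. This is a rough approximation under stated assumptions, not the calculator's actual code. The OpenAI pre-scaling step (fit inside 2048×2048, then shrink the shortest side to 768 px) follows OpenAI's published guidance for its earlier vision models and is not spelled out in the text above. Anthropic's resize of very large images is left out. Gemini's tiling is simplified to a plain grid. All function names are hypothetical.

```python
import math


def _scaled(width: int, height: int, factor: float) -> tuple[int, int]:
    """Scale both sides by the same factor, keeping each side at least 1 px."""
    return max(1, round(width * factor)), max(1, round(height * factor))


def openai_tokens(width: int, height: int, detail: str = "high") -> int:
    """85 base tokens plus 170 per 512px tile; low detail is a flat 85."""
    if detail == "low":
        return 85
    # Assumption: fit inside 2048x2048, then shrink the shortest side to 768px.
    # This sketch never upscales a smaller image.
    w, h = _scaled(width, height, min(1.0, 2048 / max(width, height)))
    w, h = _scaled(w, h, min(1.0, 768 / min(w, h)))
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return 85 + 170 * tiles


def anthropic_tokens(width: int, height: int) -> int:
    """Area-based approximation: width * height / 750.

    The resize applied to very large images is NOT modeled here,
    so large inputs are overestimated.
    """
    return round(width * height / 750)


def gemini_tokens(width: int, height: int) -> int:
    """About 258 tokens per 768px tile, assuming a plain grid of tiles."""
    return math.ceil(width / 768) * math.ceil(height / 768) * 258


for name, rule in (("OpenAI", openai_tokens),
                   ("Anthropic", anthropic_tokens),
                   ("Gemini", gemini_tokens)):
    print(name, rule(1024, 1024))
```

For a 1024×1024 image this sketch gives 765, 1398, and 1032 tokens respectively. The OpenAI figure matches the 765 shown in the calculator readout.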

What this estimate leaves out

These are the providers' documented tokenization rules, applied as an approximation and multiplied by each model's standard input-token price. The model versions and rates here are planning placeholders, so treat the result as a baseline, not a billing guarantee. The estimate also excludes any text prompt sent with the image, output tokens, detail-level edge cases, and provider discounts. Confirm against real usage.

Examples

Product screenshot

1920×1080 at high detail — one UI screenshot, sized across GPT, Claude, and Gemini.

Thumbnail batch

Many 512×512 images in one job. Small images keep per-image tokens low for bulk vision jobs.

High-res photo

2048×2048 — large inputs balloon the tile count; resizing first saves tokens.

FAQ

Are these AI costs exact?

They are estimates based on public token prices. Your bill can change with cached tokens, batch discounts, image or audio usage, taxes, provider credits, and model-specific rules.

Which currency does the calculator use?

All calculators use USD by default because major AI providers publish API pricing in USD.

How many tokens is an image?

It depends on the image's dimensions, the detail level, and the provider's counting rule. Under the rules described on this page, a 1024×1024 image ranges from 85 tokens (OpenAI low detail) to roughly 1,400 tokens (Anthropic's area formula). Use the calculator above for a per-provider estimate.

Why do providers give different token counts for the same image?

Each provider tiles or scales images differently. OpenAI counts 512px tiles, Anthropic uses an area-based formula, and Gemini counts 768px tiles, so one image maps to different totals.

How do I lower image token cost?

Downscale images before sending, use a lower detail level where supported, and avoid unnecessarily large resolutions. Fewer pixels mean fewer tiles and fewer tokens.
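
To see how much downscaling can save, the sketch below applies the 768-pixel, 258-token tile rule mentioned above to a 2048×2048 image before and after shrinking it. Treating the tiling as a plain grid is a simplifying assumption, and `fit_within` and `tile_tokens` are hypothetical helper names. The actual resize would be done with an image library before upload.

```python
import math


def fit_within(width: int, height: int, max_side: int) -> tuple[int, int]:
    """Shrink to fit max_side on the longest edge, preserving aspect ratio."""
    factor = min(1.0, max_side / max(width, height))  # never upscale
    return max(1, round(width * factor)), max(1, round(height * factor))


def tile_tokens(width: int, height: int, tile: int = 768, per_tile: int = 258) -> int:
    """Token estimate for a plain grid of square tiles (simplified rule)."""
    return math.ceil(width / tile) * math.ceil(height / tile) * per_tile


original = (2048, 2048)
print(tile_tokens(*original))                    # 9 tiles -> 2322
print(tile_tokens(*fit_within(*original, 1024)))  # 4 tiles -> 1032
print(tile_tokens(*fit_within(*original, 768)))   # 1 tile  -> 258
```

Savings depend on the provider's rule. Under a scheme that already rescales large images before tiling, shrinking the image yourself may change the count very little.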