metadata
title: Token Counter
emoji: 👀
colorFrom: pink
colorTo: purple
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
pinned: false
license: mit

Token Counter

A Gradio-based web application for counting tokens in text and web content using OpenAI's tiktoken library.

Features

Text Input Mode

  • Paste or type text directly into the interface
  • Real-time token counting as you type
  • Support for multiple OpenAI model encodings (gpt-4.1, gpt-5, o1, o3, o4-mini, embeddings, and more)
  • Displays token count and character count
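
The text-mode count can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: `count_text` and `count_with_tiktoken` are hypothetical helper names, and the choice of `o200k_base` as the default encoding is an assumption.

```python
def count_text(text: str, encoder) -> dict:
    """Return token and character counts for `text`.

    `encoder` is assumed to expose a tiktoken-style `encode` method.
    """
    # disallowed_special=() makes tiktoken treat strings such as
    # "<|endoftext|>" as ordinary text instead of raising ValueError.
    tokens = encoder.encode(text, disallowed_special=())
    return {"tokens": len(tokens), "characters": len(text)}


def count_with_tiktoken(text: str, encoding_name: str = "o200k_base") -> dict:
    """Convenience wrapper (hypothetical); requires `pip install tiktoken`."""
    import tiktoken  # imported lazily so count_text has no hard dependency

    return count_text(text, tiktoken.get_encoding(encoding_name))
```

Keeping the encoder as a parameter means the counting logic can be exercised with a stand-in object, without downloading any encoding files.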

URL Input Mode

  • Fetch and analyze content from any URL
  • Counts tokens for both HTML and Markdown representations
  • Shows token counts and character counts for both formats
  • One-click example URL for testing
  • 15-minute response caching to avoid sending repeated requests to the same target server
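
A rough sketch of the URL-mode flow, under stated assumptions: the README does not say how the app fetches pages or converts HTML to Markdown, so `fetch_html`, `analyze_formats`, the User-Agent string, and the `to_markdown` parameter are all hypothetical. Any converter that maps an HTML string to a Markdown string (for example one from a library such as markdownify or html2text) could be passed in.

```python
from urllib.request import Request, urlopen


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page and decode it using the charset the server reports."""
    request = Request(url, headers={"User-Agent": "token-counter-sketch/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def analyze_formats(html: str, to_markdown, encoder) -> dict:
    """Count tokens and characters for the HTML and its Markdown rendering.

    `to_markdown` is any callable str -> str; `encoder` is assumed to
    expose a tiktoken-style `encode` method.
    """
    report = {}
    for name, content in (("html", html), ("markdown", to_markdown(html))):
        tokens = encoder.encode(content, disallowed_special=())
        report[name] = {"tokens": len(tokens), "characters": len(content)}
    return report
```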

Supported Models

The tool supports token counting for various OpenAI model families:

  • Reasoning models: o1, o3, o4-mini
  • Chat models: gpt-5, gpt-4.1, gpt-4o, gpt-4, gpt-3.5-turbo
  • Embedding models: text-embedding-ada-002, text-embedding-3-small, text-embedding-3-large
  • Legacy models: davinci-002, babbage-002
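
One way to organize the model list and resolve an encoding for each entry is sketched below. `MODEL_FAMILIES`, `all_models`, and `encoder_for` are hypothetical names, and the fallback to `o200k_base` for model names that the installed tiktoken version does not recognize is an assumption, not documented app behavior.

```python
MODEL_FAMILIES = {
    "Reasoning": ["o1", "o3", "o4-mini"],
    "Chat": ["gpt-5", "gpt-4.1", "gpt-4o", "gpt-4", "gpt-3.5-turbo"],
    "Embedding": [
        "text-embedding-ada-002",
        "text-embedding-3-small",
        "text-embedding-3-large",
    ],
    "Legacy": ["davinci-002", "babbage-002"],
}


def all_models() -> list[str]:
    """Flatten the families into one list, e.g. for a dropdown."""
    return [model for models in MODEL_FAMILIES.values() for model in models]


def encoder_for(model: str, fallback: str = "o200k_base"):
    """Look up the tiktoken encoding for `model`, with an assumed fallback."""
    import tiktoken  # requires `pip install tiktoken`

    try:
        return tiktoken.encoding_for_model(model)
    except KeyError:
        # Older tiktoken releases may not know newer model names.
        return tiktoken.get_encoding(fallback)
```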

How It Works

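Token counting uses the tiktoken library to count the tokens a given text produces under each OpenAI model's encoding. The result is an estimate of real API usage, since chat requests add a few formatting tokens per message. This is useful for: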

  • Estimating API costs
  • Staying within model token limits
  • Optimizing prompts and content
  • Comparing token efficiency between HTML and Markdown formats
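
The uses listed above come down to simple arithmetic on the counts. The helpers below are an illustrative sketch with hypothetical names; prices and context limits are parameters because they vary by model and change over time, so no real values are assumed here.

```python
def estimate_cost(token_count: int, usd_per_million_tokens: float) -> float:
    """Estimated input cost in USD for a given per-million-token price."""
    return token_count / 1_000_000 * usd_per_million_tokens


def fits_limit(token_count: int, context_limit: int, reserved_for_output: int = 0) -> bool:
    """True if the prompt plus the reserved output budget fits the limit."""
    return token_count + reserved_for_output <= context_limit


def markdown_savings(html_tokens: int, markdown_tokens: int) -> float:
    """Fraction of tokens saved by using Markdown instead of HTML."""
    if html_tokens == 0:
        return 0.0
    return 1 - markdown_tokens / html_tokens
```

For instance, a page that needs 1000 tokens as HTML and 250 as Markdown gives `markdown_savings(1000, 250) == 0.75`.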

Caching

URL responses are cached for 15 minutes to reduce load on target servers. The status message indicates when cached content is being used and how old the cache is.
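
A minimal sketch of a 15-minute cache that also reports the age of a hit. It is an in-memory illustration only: `UrlCache`, `fetch_cached`, and the status message wording are hypothetical, and the app's real cache may be implemented differently.

```python
import time

CACHE_TTL_SECONDS = 15 * 60  # 15 minutes, as described above


class UrlCache:
    """In-memory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl: float = CACHE_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so tests can control time
        self._entries = {}   # url -> (stored_at, content)

    def get(self, url: str):
        """Return (content, age_seconds), or None if missing or expired."""
        entry = self._entries.get(url)
        if entry is None:
            return None
        stored_at, content = entry
        age = self._clock() - stored_at
        if age >= self._ttl:
            del self._entries[url]  # drop the stale entry
            return None
        return content, age

    def put(self, url: str, content: str) -> None:
        self._entries[url] = (self._clock(), content)


def fetch_cached(url: str, cache: UrlCache, fetch):
    """Return (content, status), calling `fetch(url)` only on a cache miss."""
    hit = cache.get(url)
    if hit is not None:
        content, age = hit
        minutes, seconds = int(age // 60), int(age % 60)
        return content, f"Using cached content ({minutes} min {seconds} s old)"
    content = fetch(url)
    cache.put(url, content)
    return content, "Fetched fresh content"
```

Using a monotonic clock avoids surprises when the system time is adjusted while entries are cached.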