alessio-vertemati

Doc

850aa4f 18 days ago

metadata

title: Token Counter
emoji: 👀
colorFrom: pink
colorTo: purple
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
pinned: false
license: mit

Token Counter

A Gradio-based web application for counting tokens in text and web content using OpenAI's tiktoken library.

Features

Text Input Mode

Paste or type text directly into the interface
Real-time token counting as you type
Support for multiple OpenAI model encodings (gpt-5, gpt-4.1, o1, o3, o4-mini, embedding models, and more)
Displays token count and character count
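
As a rough illustration of the text mode, the sketch below returns the token count and character count for a string. It is not the code in app.py. The helper names `make_tiktoken_encoder` and `count_text`, and the fallback to the `o200k_base` encoding for unrecognized model names, are assumptions made for this example.

```python
from typing import Callable, Sequence


def make_tiktoken_encoder(model: str) -> Callable[[str], Sequence[int]]:
    """Return an encode function for `model` (hypothetical helper)."""
    import tiktoken  # imported lazily so count_text works without it

    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        # Assumption: unknown model names fall back to a recent encoding.
        encoding = tiktoken.get_encoding("o200k_base")
    # disallowed_special=() makes strings such as "<|endoftext|>" in user
    # input count as ordinary text instead of raising an error.
    return lambda text: encoding.encode(text, disallowed_special=())


def count_text(text: str, encode: Callable[[str], Sequence[int]]) -> dict:
    """Return the token count and character count for `text`."""
    return {"tokens": len(encode(text)), "characters": len(text)}
```

With tiktoken installed, a call might look like `count_text("Hello, world!", make_tiktoken_encoder("gpt-4o"))`. Passing the encoder in as a parameter keeps the counting logic independent of any one model.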

URL Input Mode

Fetch and analyze content from any URL
Counts tokens for both HTML and Markdown representations
Shows token counts and character counts for both formats
One-click example URL for testing
15-minute response caching to prevent flooding target URLs with repeated requests

Supported Models

The tool supports token counting for various OpenAI model families:

Reasoning models: o1, o3, o4-mini
Chat models: gpt-5, gpt-4.1, gpt-4o, gpt-4, gpt-3.5-turbo
Embedding models: text-embedding-ada-002, text-embedding-3-small, text-embedding-3-large
Legacy models: davinci-002, babbage-002

How It Works

Token counting uses the tiktoken library to estimate the number of tokens that would be consumed by different OpenAI models. This is useful for:

Estimating API costs
Staying within model token limits
Optimizing prompts and content
Comparing token efficiency between HTML and Markdown formats
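
Once the counts are available, the use cases above reduce to simple arithmetic. The following sketch assumes nothing about the app itself: the function names are hypothetical, and the price and context limit are values the caller must supply from current OpenAI documentation.

```python
def token_savings(html_tokens: int, markdown_tokens: int) -> float:
    """Percentage of tokens saved by using Markdown instead of HTML."""
    if html_tokens <= 0:
        return 0.0
    return 100.0 * (html_tokens - markdown_tokens) / html_tokens


def fits_limit(token_count: int, context_limit: int,
               reserved_for_output: int = 0) -> bool:
    """True if the input plus the reserved output budget fits the limit."""
    return token_count + reserved_for_output <= context_limit


def estimate_cost(token_count: int, usd_per_million_tokens: float) -> float:
    """Input cost in USD; the price is supplied by the caller, not built in."""
    return token_count * usd_per_million_tokens / 1_000_000
```

For example, a page that takes 1000 tokens as HTML and 250 as Markdown gives `token_savings(1000, 250)` of 75.0, meaning the Markdown form uses a quarter of the tokens.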

Caching

URL responses are cached for 15 minutes to reduce load on target servers. The status message indicates when cached content is being used and how old the cache is.
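
One way such a cache could work is sketched below. This is an illustrative in-memory version, not the Space's actual implementation: `TTLCache`, `fetch_cached`, and the status message wording are all assumptions, and the fetch function is passed in so the example does not depend on a particular HTTP library.

```python
import time
from typing import Callable, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 15 * 60  # 15 minutes, as described above


class TTLCache:
    """In-memory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl: float = CACHE_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> Optional[Tuple[str, float]]:
        """Return (body, age_in_seconds), or None if missing or expired."""
        entry = self._entries.get(url)
        if entry is None:
            return None
        stored_at, body = entry
        age = self._clock() - stored_at
        if age >= self._ttl:
            del self._entries[url]  # drop the stale entry
            return None
        return body, age

    def put(self, url: str, body: str) -> None:
        self._entries[url] = (self._clock(), body)


def fetch_cached(url: str, cache: TTLCache,
                 fetch: Callable[[str], str]) -> Tuple[str, str]:
    """Return (body, status message), calling `fetch` only on a cache miss."""
    hit = cache.get(url)
    if hit is not None:
        body, age = hit
        minutes, seconds = int(age // 60), int(age % 60)
        return body, f"Using cached content ({minutes} min {seconds} s old)"
    body = fetch(url)
    cache.put(url, body)
    return body, "Fetched fresh content"
```

Injecting the clock makes the expiry behavior easy to check without waiting 15 minutes, and `time.monotonic` avoids surprises if the system clock is adjusted.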