# GPT-4o Output Token Predictor
Predicts the number of output tokens GPT-4o will generate for a given prompt, enabling accurate cost estimation before API calls.
## Model Details
- Architecture: DistilBERT encoder + 3-layer MLP prediction head
- Training Data: 30,000 ShareGPT-X conversations
- Performance: mean absolute error (MAE) 268 tokens | mean absolute percentage error (MAPE) 15.2%
- Inference: ~5ms on CPU
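The architecture described above pairs a DistilBERT encoder with a 3-layer MLP regression head. A minimal sketch of such a head is below; the hidden sizes (256, 64) are illustrative assumptions, not the checkpoint's actual dimensions, and only the encoder's output width (768, DistilBERT's hidden size) is taken from the model description:

```python
import torch
import torch.nn as nn

class TokenCountHead(nn.Module):
    """3-layer MLP that regresses an output-token count from a prompt embedding.

    Hidden sizes are illustrative guesses; see the GitHub repo for the
    actual architecture.
    """
    def __init__(self, encoder_dim: int = 768):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(encoder_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 64),
            nn.ReLU(),
            nn.Linear(64, 1),  # single scalar: predicted token count
        )

    def forward(self, cls_embedding: torch.Tensor) -> torch.Tensor:
        # cls_embedding: (batch, encoder_dim) pooled DistilBERT output
        return self.mlp(cls_embedding).squeeze(-1)

head = TokenCountHead()
fake_cls = torch.randn(2, 768)  # stand-in for DistilBERT [CLS] embeddings
pred = head(fake_cls)
print(pred.shape)  # one predicted count per prompt in the batch
```

In the full model, `cls_embedding` would come from the encoder's pooled output rather than random tensors.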
## Usage
```python
from huggingface_hub import hf_hub_download
import torch

# Download the model weights from the Hub
model_path = hf_hub_download(
    repo_id="gurpreets64/gpt4o-output-token-predictor",
    filename="best_model.pt"
)

# Load the checkpoint onto CPU
checkpoint = torch.load(model_path, map_location="cpu")

# See full code at: https://github.com/gurpreeet-singh/llm-output-token-prediction
```
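Once you have a predicted output-token count, turning it into a cost estimate is simple arithmetic. The sketch below assumes illustrative per-million-token prices; check OpenAI's current GPT-4o pricing before relying on the defaults:

```python
def estimate_cost(prompt_tokens: int, predicted_output_tokens: int,
                  input_price_per_1m: float = 2.50,
                  output_price_per_1m: float = 10.00) -> float:
    """Estimate a GPT-4o call's cost in USD.

    The per-million-token prices are placeholder assumptions, not
    authoritative pricing.
    """
    return (prompt_tokens * input_price_per_1m
            + predicted_output_tokens * output_price_per_1m) / 1_000_000

# e.g. a 1,200-token prompt predicted to produce ~800 output tokens
print(f"${estimate_cost(1200, 800):.4f}")  # → $0.0110
```

Because output tokens are typically billed at a higher rate than input tokens, the predicted output count usually dominates the estimate.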
## Links
- GitHub: https://github.com/gurpreeet-singh/llm-output-token-prediction
- Documentation: See GitHub repo for full training and inference code
## Citation

```bibtex
@software{gpt4o_token_predictor,
  author = {Gurpreet Singh},
  title  = {GPT-4o Output Token Predictor},
  year   = {2025},
  url    = {https://github.com/gurpreeet-singh/llm-output-token-prediction}
}
```