---
license: mit
tags:
  - token-prediction
  - gpt-4o
  - cost-estimation
  - distilbert
datasets:
  - ShareGPT-X
language:
  - en
metrics:
  - mae
pipeline_tag: text-classification
---
# GPT-4o Output Token Predictor

Predicts the number of output tokens GPT-4o will generate for a given prompt, enabling cost estimation before the API call is made.
## Model Details

- **Architecture:** DistilBERT encoder + 3-layer MLP prediction head
- **Training Data:** 30,000 ShareGPT-X conversations
- **Performance:** MAE 268 tokens | MAPE 15.2%
- **Inference:** ~5 ms per prompt on CPU
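The encoder-plus-head design above can be sketched in PyTorch. This is a minimal illustration, not the released implementation: the 768-dim embedding size matches DistilBERT's pooled output, but the hidden size (256) and the `TokenCountHead` name are assumptions — see the repo for the actual model class.

```python
import torch
import torch.nn as nn

class TokenCountHead(nn.Module):
    """3-layer MLP mapping a DistilBERT [CLS] embedding to a token count.

    Hidden sizes here are illustrative; the released checkpoint may differ.
    """
    def __init__(self, embed_dim=768, hidden_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),  # scalar token-count prediction
        )

    def forward(self, cls_embedding):
        # Squeeze the trailing dim so each prompt yields one scalar
        return self.mlp(cls_embedding).squeeze(-1)

# Stand-in for DistilBERT pooled output: batch of 4 prompts, 768-dim each
cls_embeddings = torch.randn(4, 768)
head = TokenCountHead()
print(head(cls_embeddings).shape)  # torch.Size([4])
```

In practice the head would be trained jointly with (or on top of frozen) DistilBERT embeddings with an L1 or MSE loss against observed GPT-4o output lengths.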
## Usage

```python
from huggingface_hub import hf_hub_download
import torch

# Download the checkpoint from the Hub
model_path = hf_hub_download(
    repo_id="gurpreets64/gpt4o-output-token-predictor",
    filename="best_model.pt"
)

# Load the checkpoint; see the repo below for the model class and inference code
checkpoint = torch.load(model_path, map_location="cpu")

# Full code: https://github.com/gurpreeet-singh/llm-output-token-prediction
```
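Once the predictor returns an expected output-token count, the total request cost is simple arithmetic. A minimal sketch — the `estimate_cost` helper is hypothetical, and the per-million-token rates are illustrative placeholders, not current OpenAI pricing:

```python
def estimate_cost(input_tokens, predicted_output_tokens,
                  input_rate=2.50, output_rate=10.00):
    """Estimate a GPT-4o call's cost in USD before making it.

    Rates are USD per 1M tokens and are illustrative only; check
    current OpenAI pricing before relying on the numbers.
    """
    return (input_tokens * input_rate
            + predicted_output_tokens * output_rate) / 1_000_000

# e.g. a 1,200-token prompt for which the predictor expects ~800 output tokens
cost = estimate_cost(1200, 800)
print(f"${cost:.4f}")  # $0.0110
```

Since the model's MAE is 268 tokens, the output portion of the estimate carries roughly ±268 × output_rate / 1M of uncertainty per call.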
## Links

- **GitHub:** [gurpreeet-singh/llm-output-token-prediction](https://github.com/gurpreeet-singh/llm-output-token-prediction)
- **Documentation:** full training and inference code in the GitHub repo
## Citation

```bibtex
@software{gpt4o_token_predictor,
  author = {Gurpreet Singh},
  title  = {GPT-4o Output Token Predictor},
  year   = {2025},
  url    = {https://github.com/gurpreeet-singh/llm-output-token-prediction}
}
```