# NFLWRBOT25-1.7b
NFLWRBOT25-1.7b is a Qwen3 1.7B causal language model fine-tuned to answer questions about 2025 NFL wide receiver statistics. It is intended for conversational lookup, explanation, comparison, and lightweight analysis of receiver production, usage, efficiency, quarter splits, and related context from the cleaned 2025 wide receiver dataset.
This checkpoint is a merged full model. It was trained from Qwen/Qwen3-1.7B with a QLoRA adapter and then merged back into the base model weights for easier local loading.
## Model Details
- Base model: Qwen/Qwen3-1.7B
- Fine-tuning method: QLoRA
- Quantization during training: 4-bit NF4
- LoRA rank: 16
- LoRA alpha: 32
- Sequence length: 2048
- Epochs: 1
- Training examples: 9,350
- Validation examples: 813
- Source dataset: SebastianAndreu/24679_NFL_WR_Dataset_2025
- Cleaned ChatML dataset: clarkkitchen22/NFLWR2025CLEANED
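The hyperparameters above correspond roughly to a `peft`/`bitsandbytes` configuration like the sketch below. This is a reconstruction for orientation only: the `target_modules` list and dropout value are assumptions, not details published with this card.

```python
# Sketch only: reconstructs the listed hyperparameters as a QLoRA config.
# target_modules and lora_dropout are assumptions; the actual training
# script is not part of this model card.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # 4-bit NF4, as listed above
    bnb_4bit_compute_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=16,                                    # LoRA rank
    lora_alpha=32,                           # LoRA alpha
    lora_dropout=0.05,                       # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
```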
## Intended Use
This model is designed for:
- Answering 2025 NFL wide receiver stat questions.
- Explaining receiver metrics such as targets, receptions, receiving yards, air yards, yards after catch, touchdowns, EPA, WPA, catch rate, target share, and air-yard share.
- Comparing receiver usage and efficiency profiles.
- Summarizing single-game and player-level receiving production.
- Helping users reason about wide receiver performance using the provided dataset.
It is not intended for betting advice, official league reporting, injury reporting, live sports updates, or decisions that require verified real-time information.
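The rate metrics named above follow their standard definitions. A minimal sketch with illustrative numbers (not taken from the dataset):

```python
# Illustrative numbers only; formulas are the standard definitions of the
# rate metrics named above.
receptions = 80
targets = 110
team_targets = 520
player_air_yards = 1250
team_air_yards = 4100

catch_rate = receptions / targets                    # receptions per target
target_share = targets / team_targets                # share of team targets
air_yard_share = player_air_yards / team_air_yards   # share of team air yards

print(f"catch rate: {catch_rate:.3f}")
print(f"target share: {target_share:.3f}")
print(f"air-yard share: {air_yard_share:.3f}")
```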
## Training Data
The training data was converted from the public Hugging Face dataset SebastianAndreu/24679_NFL_WR_Dataset_2025 into ChatML instruction examples. The cleaned dataset contains 10,163 total examples with train and validation splits.
The examples cover:
- Single-game lookup
- Quarter splits
- Usage and efficiency
- Scouting-style notes
- Player efficiency summaries
- Leverage target discussion
- Player totals
- Player comparisons
- Leaderboards
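A conversion along these lines turns each stats record into a ChatML-style message list. The field names below are hypothetical stand-ins; the actual schema of the cleaned dataset may differ.

```python
# Hypothetical illustration of converting one stats record into a
# ChatML-style training example; field names are assumptions, not the
# actual schema of the cleaned dataset.
row = {"player": "Example Receiver", "week": 3, "receptions": 7, "targets": 9, "rec_yards": 93}

example = [
    {"role": "system", "content": "You are an expert in 2025 NFL wide receiver stats."},
    {"role": "user", "content": f"How did {row['player']} perform in week {row['week']}?"},
    {
        "role": "assistant",
        "content": (
            f"{row['player']} caught {row['receptions']} of {row['targets']} "
            f"targets for {row['rec_yards']} yards in week {row['week']}."
        ),
    },
]

print(example[2]["content"])
```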
## Training Results
The final full training run completed 585 steps.
| Metric | Value |
|---|---|
| Epoch | 1.0 |
| Train runtime | 4,436 seconds |
| Final training loss | 0.328 |
| Final eval loss | 0.1584 |
| Final eval mean token accuracy | 0.9434 |
These metrics measure performance on the generated validation split. They should not be treated as a complete benchmark of sports reasoning, factual accuracy, or general language ability.
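For context, the final eval loss corresponds to a token-level perplexity of roughly exp(0.1584) ≈ 1.17 on that validation split (using the standard identity perplexity = exp(cross-entropy loss)):

```python
import math

eval_loss = 0.15836799144744873   # final eval loss reported in the table above
perplexity = math.exp(eval_loss)  # perplexity = exp(cross-entropy loss)
print(round(perplexity, 4))       # ~1.1716
```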
## Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "clarkkitchen22/NFLWRBOT25-1.7b"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {
        "role": "system",
        "content": "You are an expert in 2025 NFL wide receiver stats. Answer concisely and cite the numbers you use.",
    },
    {
        "role": "user",
        "content": "What should I look at to evaluate a 2025 wide receiver besides receptions and yards?",
    },
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
## Limitations
- The model only knows what was represented in the training data and the base model pretraining.
- It may hallucinate numbers if asked for data outside the cleaned dataset.
- It should not be used as an official source for NFL statistics.
- It does not provide live sports updates.
- It may need retrieval or direct dataset access for exact audit-grade answers.
- The validation split comes from the same cleaned conversion process as the training split, so the reported metrics do not prove broad generalization.
## Responsible Use
For serious sports analytics, use this model as a conversational layer over verified data rather than as the sole source of truth. When exact statistics matter, cross-check against the original dataset or an authoritative statistics provider.
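The cross-check suggested above can be as simple as comparing a model-claimed figure against the source data before trusting it. A minimal sketch; the DataFrame contents and column names here are hypothetical stand-ins for the real dataset:

```python
# Sketch of the cross-check suggested above: verify a model-claimed stat
# against the source data. Column names and values are hypothetical.
import pandas as pd

source = pd.DataFrame(
    {
        "player": ["Example Receiver", "Another Receiver"],
        "rec_yards": [1125, 980],
    }
)

claimed = {"player": "Example Receiver", "rec_yards": 1120}  # model's answer

actual = source.loc[source["player"] == claimed["player"], "rec_yards"].iloc[0]
if actual != claimed["rec_yards"]:
    print(f"Mismatch: model said {claimed['rec_yards']}, source says {actual}")
```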
## Attribution
- Base model: Qwen/Qwen3-1.7B
- Source dataset: SebastianAndreu/24679_NFL_WR_Dataset_2025
- Cleaned ChatML dataset: clarkkitchen22/NFLWR2025CLEANED