# Claritas-GPT2: Trajectory-to-Trait Optimized Reasoning Model
[Paper]
Claritas (Latin for "clarity") is a training dynamics framework that optimizes Large Language Models by analyzing gradient trajectories. This repository contains the model weights for the lightweight implementation described in the paper "Claritas: Trajectory-to-Trait Framework for Complex Reasoning Training Optimization in Large Language Models".
## Model Overview

This model serves as a reproducible validation of the Claritas Dynamic Data Selection Algorithm. Unlike standard Supervised Fine-Tuning (SFT), which treats all data equally, this model was trained by dynamically selecting high-value, conflict-free samples.

**Key Implementation Details:**
- Gradient Spectral Fingerprint (GSF): High-dimensional gradients were compressed into 128-dimensional signatures to track sample influence.
- Counterfactual Trajectory Contrast (CTC): Samples were scored based on their contribution to mathematical reasoning capabilities.
- Dynamic Conflict Graph: The training process identified and excluded samples with opposing gradient directions to maximize learning efficiency.
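The paper does not spell out the exact GSF construction in this card, but the idea of compressing a high-dimensional per-sample gradient into a fixed 128-dimensional signature can be sketched with a seeded random projection (a standard Johnson–Lindenstrauss-style compression). The function name and projection choice below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def gradient_spectral_fingerprint(grad, dim=128, seed=0):
    """Compress a flattened per-sample gradient into a fixed-size signature.

    Illustrative sketch only: uses a seeded random projection as a
    stand-in for the paper's GSF construction. The seed fixes the
    projection so signatures from different samples are comparable.
    """
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((dim, grad.size)) / np.sqrt(dim)
    sig = proj @ grad
    return sig / (np.linalg.norm(sig) + 1e-12)  # unit-norm signature

# Two samples' gradients compressed to comparable 128-d signatures
g1 = np.random.default_rng(1).standard_normal(10_000)
g2 = np.random.default_rng(2).standard_normal(10_000)
f1 = gradient_spectral_fingerprint(g1)
f2 = gradient_spectral_fingerprint(g2)
similarity = float(f1 @ f2)  # cosine similarity between fingerprints
```

Because the signatures are unit-normalized, a dot product between two fingerprints directly gives a cosine similarity, which is the quantity the conflict graph below thresholds.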
## Performance Highlights
While this specific release uses a GPT-2 backbone for accessibility, the underlying methodology has been validated on LLaMA-2-7B/70B in our paper, demonstrating significant improvements:
| Method | Data Usage | GSM8K | MATH | BBH |
|---|---|---|---|---|
| Standard SFT | 100% | 52.4 | 12.1 | 38.5 |
| Random Selection | 60% | 48.1 | 10.3 | 35.2 |
| Claritas (Ours) | 60% | 55.8 | 14.5 | 41.2 |
**Key Result:** The Claritas framework cuts training tokens by 40% while boosting Pass@1 accuracy by +2.5 points on average over full-data SFT.
## How to Use

You can load this model for inference with the Hugging Face `transformers` library.
### Installation

```shell
pip install transformers torch
```
### Inference Example

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the Claritas-optimized model
model_name = "Muki182/claritas-gpt2"  # Replace with your repo ID
model = GPT2LMHeadModel.from_pretrained(model_name)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)

# GPT-2 has no pad token by default; reuse the EOS token
tokenizer.pad_token = tokenizer.eos_token

# Prepare input
prompt = "Question: If I have 3 apples and eat 1, how many are left?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate deterministically (greedy decoding)
with torch.no_grad():
    outputs = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=50,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Training Specifications

This model was trained using the Claritas algorithm with the following configuration:

- Base Model: `gpt2`
- Training Data: GSM8K (Mathematical Reasoning)
- Algorithm: Claritas (GSF + CTC + Dynamic Conflict Graph)
- Spectral Dimension: 128
- Conflict Threshold: -0.5
- CTC Threshold: 0.2
- Optimizer: AdamW
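How these thresholds interact can be sketched with a toy greedy selector: samples are visited in descending CTC score, anything below the CTC threshold (0.2) is dropped as low-value, and a candidate is also dropped if its fingerprint's cosine similarity with any already-kept sample falls at or below the conflict threshold (-0.5), i.e. the gradients strongly oppose each other. The `select_samples` helper and the greedy order are assumptions for illustration, not the paper's exact algorithm:

```python
import numpy as np

def select_samples(fingerprints, ctc_scores,
                   conflict_threshold=-0.5, ctc_threshold=0.2):
    """Greedy sketch of Claritas-style data selection (hypothetical helper).

    Keeps samples whose CTC score clears `ctc_threshold` and whose
    unit-norm fingerprint stays above `conflict_threshold` cosine
    similarity with every already-kept sample.
    """
    kept = []
    # Visit high-value samples first
    for i in np.argsort(ctc_scores)[::-1]:
        if ctc_scores[i] < ctc_threshold:
            break  # all remaining samples fall below the value cutoff
        f = fingerprints[i]
        if all(float(f @ fingerprints[j]) > conflict_threshold for j in kept):
            kept.append(int(i))
    return sorted(kept)

# Toy example: sample 2's gradient opposes sample 0's; sample 3 is low-value
fps = np.array([[1.0, 0.0],
                [0.6, 0.8],
                [-1.0, 0.0],
                [0.0, 1.0]])
fps /= np.linalg.norm(fps, axis=1, keepdims=True)
scores = np.array([0.9, 0.5, 0.4, 0.1])
selected = select_samples(fps, scores)  # samples 2 and 3 are excluded
```

In this toy run, sample 2 is excluded because its similarity to kept sample 0 is -1.0 (at most the -0.5 conflict threshold), and sample 3 is excluded by the 0.2 CTC cutoff, leaving samples 0 and 1.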
## Citation

If you use this model or reference the Claritas framework in your research, please cite our paper:

```bibtex
@article{feng2026claritas,
  title={Claritas: Trajectory-to-Trait Framework for Complex Reasoning Training Optimization in Large Language Models},
  author={Feng, Junjie},
  journal={arXiv preprint arXiv:your-arxiv-id},
  year={2026}
}
```
## License
This model is licensed under the Apache License 2.0.