
Claritas-GPT2: Trajectory-to-Trait Optimized Reasoning Model

[Paper]

Claritas (Latin for "clarity") is a training dynamics framework that optimizes Large Language Models by analyzing gradient trajectories. This repository contains the model weights for the lightweight implementation described in the paper "Claritas: Trajectory-to-Trait Framework for Complex Reasoning Training Optimization in Large Language Models".

πŸ“ Model Overview

This model serves as a reproducible validation of the Claritas Dynamic Data Selection Algorithm. Unlike standard Supervised Fine-Tuning (SFT), which treats all data equally, this model was trained by dynamically selecting high-value, conflict-free samples.

Key Implementation Details:

  1. Gradient Spectral Fingerprint (GSF): High-dimensional gradients were compressed into 128-dimensional signatures to track sample influence.
  2. Counterfactual Trajectory Contrast (CTC): Samples were scored based on their contribution to mathematical reasoning capabilities.
  3. Dynamic Conflict Graph: The training process identified and excluded samples with opposing gradient directions to maximize learning efficiency.
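The first step above, compressing a full gradient into a fixed-size signature, can be sketched with a random projection. This is an illustrative sketch under assumptions, not the paper's implementation: the function name `gradient_spectral_fingerprint`, the use of a seeded Gaussian random projection, and the normalization are all choices made here for clarity.

```python
import numpy as np

def gradient_spectral_fingerprint(grad, dim=128, seed=0):
    """Compress a high-dimensional gradient into a `dim`-dimensional
    signature via a fixed random projection, so fingerprints from
    different samples remain comparable.

    Illustrative sketch only; the paper's exact compression may differ.
    """
    rng = np.random.default_rng(seed)
    flat = np.asarray(grad, dtype=np.float64).ravel()
    # The same seed yields the same projection matrix for every sample,
    # which is what makes cross-sample similarity meaningful.
    proj = rng.standard_normal((dim, flat.size)) / np.sqrt(dim)
    return proj @ flat
```

Because the projection is shared, the cosine similarity between two fingerprints approximates the similarity between the original gradients, which is what the conflict graph in step 3 consumes.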

📊 Performance Highlights

While this specific release uses a GPT-2 backbone for accessibility, the underlying methodology has been validated on LLaMA-2-7B/70B in our paper, demonstrating significant improvements:

| Method           | Data Usage | GSM8K | MATH | BBH  |
|------------------|------------|-------|------|------|
| Standard SFT     | 100%       | 52.4  | 12.1 | 38.5 |
| Random Selection | 60%        | 48.1  | 10.3 | 35.2 |
| Claritas (Ours)  | 60%        | 55.8  | 14.5 | 41.2 |

Key Result: The Claritas framework cuts training data usage by 40% while improving accuracy over full-data SFT by roughly +2.8 points on average across GSM8K, MATH, and BBH (+3.4, +2.4, and +2.7 respectively).

🚀 How to Use

You can load this model for inference with the Hugging Face transformers library.

Installation

pip install transformers torch

Inference Example

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer
# Load the Claritas-optimized model
model_name = "Muki182/claritas-gpt2" # Replace with your repo ID
model = GPT2LMHeadModel.from_pretrained(model_name)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
# Set pad token if not set
tokenizer.pad_token = tokenizer.eos_token
# Prepare input
prompt = "Question: If I have 3 apples and eat 1, how many are left?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
# Generate
with torch.no_grad():
    outputs = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=50,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

βš™οΈ Training Specifications

This model was trained using the Claritas algorithm with the following configurations:

  • Base Model: gpt2
  • Training Data: GSM8K (Mathematical Reasoning)
  • Algorithm: Claritas (GSF + CTC + Dynamic Conflict Graph)
  • Spectral Dimension: 128
  • Conflict Threshold: -0.5
  • CTC Threshold: 0.2
  • Optimizer: AdamW

📖 Citation

If you use this model or reference the Claritas framework in your research, please cite our paper:

@article{feng2026claritas,
  title={Claritas: Trajectory-to-Trait Framework for Complex Reasoning Training Optimization in Large Language Models},
  author={Feng, Junjie},
  journal={arXiv preprint arXiv:your-arxiv-id},
  year={2026}
}

License

This model is licensed under the Apache License 2.0.
