TRM-textv3.5

TRM-textv3.5 is an experimental recursive Transformer language model derived from summerMC/TRM-textv3.

This checkpoint was produced by staged rehabilitation followed by curriculum supervised fine-tuning (SFT). Both stages aim to reduce repetition collapse and improve short instruction-following behavior.

Model

  • Base: summerMC/TRM-textv3
  • Architecture: TRM / recursive Transformer style causal language model
  • Task: causal language modeling and instruction-style text generation
  • Context length: 512 tokens
  • Precision used during training: bfloat16
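Because the context length is 512 tokens, long prompts may need trimming before generation. The following is a minimal sketch, assuming the 512-token limit covers the prompt and the newly generated tokens together. The helper name fit_prompt and the choice to keep the most recent tokens are illustrative and do not come from the model repository.

```python
CONTEXT_LENGTH = 512  # context length stated in this model card


def fit_prompt(prompt_ids, max_new_tokens=128, context_length=CONTEXT_LENGTH):
    """Trim prompt token ids so prompt + generated tokens fit in the context.

    Hypothetical helper: it keeps the most recent tokens and drops the oldest.
    """
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    return list(prompt_ids)[-budget:]
```

With max_new_tokens=128, at most 384 prompt tokens are kept. Shorter prompts pass through unchanged.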

Training Pipeline

The model was improved using the following process:

TRM-textv3-grpo
→ hard rescue SFT
→ small curriculum SFT
→ TRM-textv3.5

The curriculum stage used response-only supervised fine-tuning with short QA, explanation, translation, and code-generation style examples.
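Response-only supervision is commonly implemented by masking the prompt positions in the label sequence so that only response tokens contribute to the loss. The sketch below shows that general technique, not the training code used for this checkpoint. The function name is hypothetical, and -100 is assumed as the ignore value because it is the default ignore_index of PyTorch cross-entropy.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_response_only_example(prompt_ids, response_ids, max_length=512):
    """Concatenate prompt and response ids and mask the prompt in the labels.

    Hypothetical helper: prompt positions receive IGNORE_INDEX, so a
    cross-entropy loss only scores the response tokens.
    """
    prompt_ids = list(prompt_ids)
    response_ids = list(response_ids)
    input_ids = (prompt_ids + response_ids)[:max_length]
    labels = ([IGNORE_INDEX] * len(prompt_ids) + response_ids)[:max_length]
    return {"input_ids": input_ids, "labels": labels}
```

For example, a three-token prompt followed by a two-token response yields labels with three masked positions followed by the two response ids.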

A repetition unlikelihood term was used during rescue/curriculum training to reduce repeated-token degeneration such as:

common common common
learning learning learning
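The exact form of the repetition unlikelihood term is not given here. As an illustration only, the sketch below implements a common token-level formulation: at each position, tokens that already appeared earlier in the target sequence (other than the current gold token) are treated as negatives, and the loss penalizes the probability assigned to them. The function name and the averaging scheme are assumptions, not the author's implementation.

```python
import torch
import torch.nn.functional as F


def repetition_unlikelihood_loss(logits, targets, ignore_index=-100):
    """Token-level unlikelihood penalty on previously seen target tokens.

    logits:  (batch, seq, vocab) scores, already aligned with targets
    targets: (batch, seq) gold token ids, ignore_index where unsupervised
    """
    vocab = logits.size(-1)
    probs = F.softmax(logits.float(), dim=-1)
    valid = targets != ignore_index
    safe_targets = targets.masked_fill(~valid, 0)

    # One-hot gold tokens, zeroed at ignored positions.
    one_hot = F.one_hot(safe_targets, vocab) * valid.unsqueeze(-1)
    # seen[b, t, v] is True if token v occurred in targets before position t.
    seen = (one_hot.cumsum(dim=1) - one_hot) > 0
    # Negatives: previously seen tokens, excluding the current gold token.
    negatives = seen & ~one_hot.bool() & valid.unsqueeze(-1)

    # -log(1 - p) grows as probability mass moves onto a negative token.
    penalty = -torch.log((1.0 - probs).clamp_min(1e-6))
    per_position = (penalty * negatives).sum(dim=-1)
    return per_position.sum() / valid.sum().clamp_min(1)
```

In training, a term like this is typically added to the usual cross-entropy loss with a small weight. That weight is a hyperparameter whose value is not stated in this card.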

Intended Use

This model is intended for research experiments on:

  • recursive Transformer language modeling
  • small language model rehabilitation
  • instruction tuning
  • output-head collapse recovery
  • repetition-collapse mitigation
  • experimental LLM miniaturization

Example

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "summerMC/TRM-textv3.5"

# trust_remote_code=True allows the custom model code in the repository to run.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Plain-text chat format with a "User:" turn and an open "Assistant:" turn.
prompt = "User: What is Python?\nAssistant:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding with repetition controls; no gradients are needed.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=False,
        repetition_penalty=1.25,
        no_repeat_ngram_size=4,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

# The decoded text contains the prompt followed by the generated completion.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Recommended Generation Settings

do_sample = False
repetition_penalty = 1.2
no_repeat_ngram_size = 4
max_new_tokens = 64 to 128

For sampling:

do_sample = True
temperature = 0.6
top_p = 0.9
repetition_penalty = 1.25
no_repeat_ngram_size = 4
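The two sets of settings above can be kept as reusable presets and passed to model.generate as keyword arguments. This is a small sketch using plain dictionaries. The helper name generation_kwargs is hypothetical, and applying the 64 to 128 token budget to sampling as well is an assumption.

```python
# Values copied from the recommended settings in this model card.
GREEDY = {
    "do_sample": False,
    "repetition_penalty": 1.2,
    "no_repeat_ngram_size": 4,
}
SAMPLING = {
    "do_sample": True,
    "temperature": 0.6,
    "top_p": 0.9,
    "repetition_penalty": 1.25,
    "no_repeat_ngram_size": 4,
}


def generation_kwargs(mode="greedy", max_new_tokens=128):
    """Return keyword arguments for model.generate (hypothetical helper)."""
    presets = {"greedy": GREEDY, "sampling": SAMPLING}
    if mode not in presets:
        raise ValueError(f"unknown mode: {mode!r}")
    if max_new_tokens < 1:
        raise ValueError("max_new_tokens must be positive")
    # Copy the preset so callers cannot mutate the shared dictionaries.
    return {**presets[mode], "max_new_tokens": max_new_tokens}
```

A call would then look like model.generate(**inputs, **generation_kwargs("sampling", 64)).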

Limitations

This is an experimental research checkpoint.

Known limitations:

  • weak factual knowledge
  • limited reasoning ability
  • unstable long-form generation
  • possible repetition under poor decoding settings
  • may produce incorrect code
  • not suitable for production use
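Since repetition can reappear under poor decoding settings, it can help to screen generated text automatically. The sketch below is a simple heuristic, not part of the model's tooling: it reports the longest run of identical consecutive whitespace-separated words. The function name is hypothetical.

```python
def longest_repeat_run(text):
    """Return the length of the longest run of identical consecutive words."""
    longest = 0
    current = 0
    previous = None
    for word in text.split():
        # Extend the run if the word repeats, otherwise start a new run.
        current = current + 1 if word == previous else 1
        longest = max(longest, current)
        previous = word
    return longest
```

An output such as "common common common" gives a run length of 3, so a small threshold on this value can flag degenerate generations when comparing decoding settings.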

Notes

This model is part of the summerAI TRM research line investigating whether recursive/shared-block Transformer models can be made more language-model-like through staged rehabilitation, curriculum SFT, and output-distribution repair.

Model size: 84.3M params (Safetensors; tensor types F32 and BF16)