TRM-text-v4

Experimental Japanese Reasoning Language Model

Overview

TRM-text-v4 is an experimental Japanese language model developed to explore improved reasoning capabilities and extended context handling.

The model is trained to emit an intermediate reasoning trace inside <think> tags before producing its final answer. It combines parameter-efficient fine-tuning (LoRA on a 4-bit NF4-quantized model) with experimental RoPE scaling for context extension.

Status: Research Preview / Experimental Release


Features

Japanese Chain-of-Thought Reasoning

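TRM-text-v4 is fine-tuned to write its reasoning steps in Japanese inside <think> tags before giving the final answer.

Example (the user asks for the elevation of Mount Fuji; the model reasons in Japanese and answers that it is 3,776 meters):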

User: 富士山の標高は?
Assistant: <think>
質問は富士山の標高についてである。
一般的に知られている標高を確認する。
富士山の標高は3776メートルである。
</think>
富士山の標高は3,776メートルです。

Extended Context Experiments

The model uses experimental RoPE scaling (dynamic scaling with a factor of 2.0) to investigate context lengths beyond the original configuration.

The current implementation focuses on practical extensions while evaluating numerical stability and hardware constraints.


Efficient Fine-Tuning

TRM-text-v4 was trained using parameter-efficient fine-tuning methods:

  • 4-bit NF4 quantization via BitsAndBytes
  • LoRA adaptation
  • Memory-efficient optimization for consumer GPUs
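
To give a rough sense of why 4-bit quantization helps on consumer GPUs, the sketch below estimates weight storage at different precisions. It assumes the 0.2B parameter count reported on the model page and counts weights only; quantization constants, activations, LoRA adapters, and optimizer state are ignored, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in decimal gigabytes (weights only)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: 0.2B parameters, taken from the model page listing.
NUM_PARAMS = 0.2e9

for label, bits in [("FP32", 32), ("FP16/BF16", 16), ("NF4 (4-bit)", 4)]:
    print(f"{label:>12}: {weight_memory_gb(NUM_PARAMS, bits):.2f} GB")
```

The ratio between rows is the useful part: 4-bit storage needs a quarter of the FP16 weight memory, independent of the exact parameter count.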

Model Details

| Item | Value |
| --- | --- |
| Model Name | TRM-text-v4 |
| Architecture | MoE-based Causal Language Model |
| Language | Japanese |
| License | Apache-2.0 (or specify actual license) |
| Intended Use | Research and experimentation |
| Release Status | Experimental |

Training Configuration

Fine-Tuning

  • LoRA Rank (r): 16
  • LoRA Alpha: 32
  • Target Modules: All Linear Layers
  • Learning Rate: 3e-4
  • Precision: FP16
  • Quantization: 4-bit NF4
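
The hyperparameters above could be expressed with PEFT and Transformers roughly as follows. This is a sketch, not the author's training script: mapping "All Linear Layers" to PEFT's `"all-linear"` shorthand, using FP16 as the 4-bit compute dtype, and the `CAUSAL_LM` task type are assumptions, and `build_configs` is a hypothetical helper name.

```python
# Values copied from the Fine-Tuning list; entries marked "assumption"
# are not stated in the model card.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "target_modules": "all-linear",  # assumption: PEFT shorthand for all linear layers
    "task_type": "CAUSAL_LM",        # assumption
}

BNB_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_compute_dtype": "float16",  # assumption: matches "Precision: FP16"
}


def build_configs():
    """Build the library config objects (hypothetical helper).

    Imports are deferred so the dictionaries above can be inspected
    without peft or transformers installed.
    """
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    return LoraConfig(**LORA_KWARGS), BitsAndBytesConfig(**BNB_KWARGS)


# With standard LoRA, the low-rank update is scaled by lora_alpha / r.
lora_scaling = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]
```

With rank 16 and alpha 32, the adapter update is scaled by a factor of 2 before being added to the frozen weights.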

Context Configuration

  • RoPE Scaling: Dynamic
  • RoPE Scaling Factor: 2.0
  • Maximum Position Embeddings: 2048

Long-context support beyond this configuration remains under investigation and should be considered experimental.
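
For readers unfamiliar with dynamic RoPE scaling, the sketch below shows one common formulation (the dynamic NTK-aware variant used for Llama-style models in Hugging Face Transformers). Whether TRM-text-v4's remote code uses this exact formula is an assumption, and the rotary base of 10000 and head dimension of 64 are illustrative values not taken from the model card.

```python
def dynamic_ntk_base(
    seq_len: int,
    max_position_embeddings: int = 2048,  # from the Context Configuration list
    factor: float = 2.0,                  # from the Context Configuration list
    base: float = 10000.0,                # assumption: common RoPE default
    head_dim: int = 64,                   # assumption: illustrative value
) -> float:
    """Return the rotary base, enlarged only when seq_len exceeds the trained range."""
    if seq_len <= max_position_embeddings:
        return base
    scale = factor * seq_len / max_position_embeddings - (factor - 1.0)
    return base * scale ** (head_dim / (head_dim - 2))


def inverse_frequencies(base: float, head_dim: int = 64) -> list:
    """Per-pair rotation frequencies 1 / base^(2i / head_dim)."""
    return [1.0 / base ** (2 * i / head_dim) for i in range(head_dim // 2)]


for length in (1024, 2048, 4096):
    print(length, round(dynamic_ntk_base(length), 1))
```

Within the trained range the base is unchanged, so short inputs behave as in the original configuration; past it, the base grows with sequence length, which lowers the rotation frequencies and stretches the positional encoding.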


Datasets

Instruction Tuning

  • Databricks Dolly-15k-ja

    • Used to improve Japanese instruction-following ability.

Reasoning Data

  • Reasoning templates inspired by datasets such as:

    • Magpie
    • Sakura Reasoning

Portions of the reasoning pipeline are experimental and may evolve in future releases.


Usage

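from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "summerMC/TRM-text-v4"

# trust_remote_code=True executes Python code shipped in the model repository;
# review that code before enabling it.
tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
)

# The question means "What is artificial intelligence?".
# Ending the prompt with an opening <think> tag makes the model start with its reasoning trace.
prompt = """User: 人工知能とは?
Assistant: <think>
"""

# Tokenize the prompt and place the tensors on the same device as the model.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample up to 256 new tokens with nucleus sampling.
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode the full sequence (prompt plus generated continuation).
print(tokenizer.decode(outputs[0], skip_special_tokens=True))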

Prompt Format

Recommended prompt template:

User: [Question]
Assistant: <think>
[Reasoning Process]
</think>
[Final Answer]
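
The template above can be built and parsed with a few lines of Python. This is a minimal sketch; `build_prompt` and `split_response` are hypothetical helper names, and the parser assumes the `<think>` tags survive decoding as literal text.

```python
import re
from typing import Tuple

# Capture the reasoning between the tags and everything after the closing tag.
THINK_RE = re.compile(r"<think>(.*?)</think>(.*)", re.DOTALL)


def build_prompt(question: str) -> str:
    """Format a question so that generation starts inside the <think> block."""
    return f"User: {question}\nAssistant: <think>\n"


def split_response(text: str) -> Tuple[str, str]:
    """Split decoded text into (reasoning, final_answer)."""
    match = THINK_RE.search(text)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    _, tag, tail = text.partition("<think>")
    if tag:
        # The closing tag is missing, e.g. generation hit max_new_tokens.
        return tail.strip(), ""
    # No reasoning block at all: treat the whole text as the answer.
    return "", text.strip()


sample = build_prompt("富士山の標高は?") + (
    "富士山の標高は3776メートルである。\n</think>\n富士山の標高は3,776メートルです。"
)
reasoning, answer = split_response(sample)
print(answer)
```

Handling the unterminated case separately matters in practice: with a small `max_new_tokens`, the reasoning trace can consume the whole budget before the final answer is written.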

Limitations

  • Reasoning traces may occasionally mix Japanese and English terminology.
  • Generated reasoning processes are not guaranteed to reflect the model's actual internal computation.
  • Long-context behavior beyond validated ranges may exhibit instability.
  • The model may generate incorrect or fabricated information and should not be used in high-risk domains without human verification.

Intended Use

Recommended uses:

  • Research on Japanese reasoning language models
  • Prompt engineering experiments
  • Educational demonstrations
  • Evaluation of Chain-of-Thought prompting strategies

Not recommended for:

  • Medical decision-making
  • Legal advice
  • Financial advice
  • Safety-critical applications

Citation

@misc{summermc_trm_text_v4,
  title        = {TRM-text-v4: Experimental Japanese Reasoning Language Model},
  author       = {summerMC},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/summerMC/TRM-text-v4}}
}

Acknowledgements

TRM-text-v4 builds upon the open-source ecosystem, including:

  • Hugging Face Transformers
  • PEFT
  • BitsAndBytes
  • Databricks Dolly
  • The broader open-source LLM community

Special thanks to all researchers and contributors advancing efficient Japanese language modeling.

Model size: 0.2B params (Safetensors; tensor types F32 and BF16)