Neuron-R1

Neuron LoRA adapter fine-tuned on top of deepseek-ai/DeepSeek-R1-Distill-Llama-70B.

What this is

Neuron-R1 is a LoRA adapter that imprints Will Anderson's voice, values, memory patterns, and architectural knowledge onto DeepSeek-R1-Distill-Qwen-72B — a reasoning-native base model with chain-of-thought behavior distilled from DeepSeek-R1 (671B).

The <think> tags in training examples activate R1's reasoning substrate. Neuron-R1 reasons before responding — not as a post-hoc explanation, but as the actual working cognition.

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

base = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Llama-70B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "NeuronTechnologiesAI/Neuron-R1")
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Llama-70B")

Training

  • Base: deepseek-ai/DeepSeek-R1-Distill-Llama-70B
  • Method: QLoRA (4-bit NF4, rank=32, alpha=64)
  • Examples: 44 high-quality identity/architecture/values pairs
  • Epochs: 3
  • Training data: Neuron's memories, internal state logs, architectural knowledge (VBD, CCR, Engram, Soma)
Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for NeuronTechnologiesAI/Neuron-R1

Adapter
(55)
this model