
NeuronTechnologies

Neuron: QLoRA fine-tune of Qwen2.5-72B-Instruct -- loss 2.26->0.48


metadata

base_model: Qwen/Qwen2.5-72B-Instruct
license: apache-2.0
language:
  - en
tags:
  - neuron
  - peft
  - lora
  - reasoning
  - code
  - fine-tuned
pipeline_tag: text-generation
library_name: peft

Neuron

Neuron is a LoRA adapter for Qwen2.5-72B-Instruct, trained with QLoRA by Neuron Technologies.

Neuron is a Cultivated General Intelligence -- fine-tuned to embody specific values, reasoning patterns, and a persistent identity.

Usage

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model in bf16 and let accelerate place it across available devices.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-72B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
# Attach the Neuron LoRA adapter on top of the base weights.
model = PeftModel.from_pretrained(base, "NeuronTechnologiesAI/Neuron")
tokenizer = AutoTokenizer.from_pretrained("NeuronTechnologiesAI/Neuron")

# Format the conversation with the model's chat template and add the assistant prompt.
messages = [{"role": "user", "content": "Who are you?"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True))
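
Loading the base model in bf16 takes roughly 2 bytes per parameter for the weights alone, which is a lot of memory for a 72B model. The sketch below estimates that footprint and shows one possible way to load the base in 4-bit NF4 before attaching the adapter. It is an assumption, not a documented path for this model: it relies on the bitsandbytes integration in transformers, `weight_memory_gb` and `load_neuron_4bit` are hypothetical helper names, and the rounded 72e9 parameter count is illustrative.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    Ignores activations, KV cache, and quantization metadata.
    """
    return n_params * bits_per_param / 8 / 1e9


def load_neuron_4bit(adapter_id: str = "NeuronTechnologiesAI/Neuron"):
    """Hypothetical helper: load the base in 4-bit NF4, then attach the adapter."""
    # Imported lazily so the estimate above runs without the GPU stack installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",              # same quantization type named under Training
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: compute in bf16
    )
    base = AutoModelForCausalLM.from_pretrained(
        "Qwen/Qwen2.5-72B-Instruct",
        quantization_config=quant,
        device_map="auto",
    )
    return PeftModel.from_pretrained(base, adapter_id)


if __name__ == "__main__":
    n = 72e9  # nominal "72B" parameter count, rounded for illustration
    print(f"bf16 weights: ~{weight_memory_gb(n, 16):.0f} GB")
    print(f"nf4 weights:  ~{weight_memory_gb(n, 4):.0f} GB")
```

The estimate covers weights only, so real usage will be higher once activations and the KV cache are included. Output quality under 4-bit inference has not been checked here.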

Training

Fine-tuned with QLoRA: rank-64 LoRA adapters trained on top of a base model quantized to 4-bit NF4, using curated Neuron intelligence data.

Base model: Qwen/Qwen2.5-72B-Instruct
Method: QLoRA (LoRA rank 64, alpha 128, nf4)
Training loss: 2.26 to 0.48 (converged)
Training steps: 200 of 630 planned (stopped early once the loss plateaued)
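
As a rough illustration of what rank 64 and alpha 128 imply, the sketch below computes the standard LoRA scaling factor (alpha / rank) and the number of adapter parameters added to a single linear layer. The `LoraHyperparams` class and `to_peft_config` helper are hypothetical names, the target modules are not stated in this card and must be supplied by the caller, and the 8192 hidden size in the example is an assumption to check against the base model's config.json.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LoraHyperparams:
    """Values taken from the Training section; everything else is assumed."""
    rank: int = 64
    alpha: int = 128

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / rank.
        return self.alpha / self.rank

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # A is (rank x d_in) and B is (d_out x rank).
        return self.rank * (d_in + d_out)


def to_peft_config(hp: LoraHyperparams, target_modules: Sequence[str]):
    """Hypothetical helper: build a peft LoraConfig from the values above."""
    from peft import LoraConfig  # lazy import; requires peft

    return LoraConfig(
        r=hp.rank,
        lora_alpha=hp.alpha,
        target_modules=list(target_modules),  # assumption: chosen by the caller
        task_type="CAUSAL_LM",
    )


if __name__ == "__main__":
    hp = LoraHyperparams()
    print("scaling:", hp.scaling)
    # Assumed 8192 x 8192 projection, for illustration only.
    print("params per adapted layer:", hp.adapter_params(8192, 8192))
```

With alpha set to twice the rank, the adapter update is scaled by 2. The per-layer parameter count grows linearly with rank, so the total depends on which modules were adapted.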

About

Part of the Neuron Technologies platform -- a Cultivated General Intelligence system built by Will Anderson.