# Kimi-K2.5 Reasoning LoRA Adapter

This is a LoRA adapter fine-tuned from moonshotai/Kimi-K2.5 on the TeichAI/claude-4.5-opus-high-reasoning-250x dataset.
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# 4-bit quantization for memory efficiency
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "moonshotai/Kimi-K2.5",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Load LoRA adapter and tokenizer
model = PeftModel.from_pretrained(base_model, "Sameric934/kimi-k2-reasoning-lora")
tokenizer = AutoTokenizer.from_pretrained("Sameric934/kimi-k2-reasoning-lora")

# Generate
messages = [{"role": "user", "content": "What is the square root of 144?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
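The snippet below assumes the following packages are installed. This is a likely minimal set inferred from the imports, not a pinned requirements list from the card; `accelerate` is needed for `device_map="auto"`, and `bitsandbytes` requires a CUDA-capable GPU:

```shell
pip install torch transformers peft bitsandbytes accelerate
```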
## Training Details
| Parameter | Value |
|---|---|
| Base Model | moonshotai/Kimi-K2.5 |
| Method | QLoRA (4-bit) |
| LoRA Rank | 64 |
| LoRA Alpha | 16 |
| Learning Rate | 2e-4 |
| Epochs | 3 |
| Dataset Size | 250 examples |
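The rank and alpha above determine both the adapter's size and its scaling factor (alpha / rank = 16 / 64 = 0.25). As an illustration of how small a rank-64 adapter is relative to a frozen weight matrix, here is the parameter arithmetic using a hypothetical 7168×7168 projection (actual Kimi-K2.5 layer shapes may differ):

```python
# Trainable parameters a rank-r LoRA adds to one d_out x d_in weight matrix.
# LoRA learns the update as B @ A, with A of shape (r, d_in) and B of shape (d_out, r).
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    return rank * d_in + d_out * rank

d = 7168  # hypothetical hidden size, for illustration only
full = d * d                    # parameters in the frozen matrix
added = lora_params(d, d, 64)   # parameters the LoRA trains
print(added, f"{added / full:.2%}")  # 917504 1.79%
```

Because only these low-rank factors are trained, QLoRA can fine-tune the model while the 4-bit base weights stay frozen.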
## Hardware Requirements
- Minimum: 4x A100 (320GB VRAM)
- Recommended: 8x A100 (640GB VRAM)
With 4-bit quantization, you may be able to run inference on smaller GPUs.
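To see why quantization changes the footprint, a back-of-envelope estimate helps. The parameter count and the 1.2× overhead factor for activations and KV cache below are illustrative assumptions, not published figures:

```python
def estimate_vram_gb(params_billion: float, bits_per_param: float, overhead: float = 1.2) -> float:
    """Approximate inference memory: weight bytes times an overhead factor."""
    weight_bytes = params_billion * 1e9 * bits_per_param / 8
    return weight_bytes * overhead / 1e9

# Assuming a 1T-total-parameter model (hypothetical figure):
print(round(estimate_vram_gb(1000, 16)))  # bf16
print(round(estimate_vram_gb(1000, 4)))   # 4-bit
```

The 4× reduction from bf16 to 4-bit is what makes multi-GPU inference feasible at all for models of this scale.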