open-llama-7b-openthought-mid-lora

LoRA adapter for OpenLLaMA 7B v2, fine-tuned on OpenThoughts-114k (DeepSeek-R1 reasoning traces).

Training

  • Base: open_llama_7b_v2 (4-bit NF4)
  • Method: QLoRA (r=32, alpha=32; see the configuration sketch after this list)
  • Data: open-thoughts/OpenThoughts-114k (filtered to <= 2024 tokens; 10,582 samples)
  • Epochs: 3
  • Batch size: 64
  • LR: 2e-4 (cosine, warmup 24 steps)
  • Loss: computed over the full sequence (padding tokens masked)
  • Packing: greedy bin packing with samples sorted by length (see the sketch after this list)
  • Context: 2048 (native, no RoPE scaling)
  • Special tokens: Qwen3 chat template
  • modules_to_save: embed_tokens, lm_head
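
For concreteness, here is a minimal sketch of this setup using transformers, peft, and bitsandbytes. Hyperparameters are taken from the list above; target_modules, the compute dtype, and the per-device batch split are assumptions, since the card does not state them.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization of the base model (QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype is an assumption
)
model = AutoModelForCausalLM.from_pretrained(
    "openlm-research/open_llama_7b_v2",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumed targets
    modules_to_save=["embed_tokens", "lm_head"],  # train embeddings for the new chat-template tokens
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="open-llama-7b-openthought-mid-lora",
    num_train_epochs=3,
    per_device_train_batch_size=8,   # assumption; effective batch size of 64
    gradient_accumulation_steps=8,   # 8 x 8 = 64
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=24,
    bf16=True,                       # assumption
)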
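
The packing step could look like the following first-fit-decreasing sketch. The card only names "greedy bin packing sorted by length", so this is one plausible reading, not the confirmed implementation; the capacity is the 2048-token context window.

def pack_greedy(lengths, capacity=2048):
    """First-fit-decreasing bin packing over tokenized sample lengths.

    Sorts samples longest-first, then places each into the first bin
    (packed training sequence) that still has room within the context window.
    """
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    bins = []  # each entry: [remaining_capacity, [sample indices]]
    for i in order:
        for b in bins:
            if b[0] >= lengths[i]:
                b[0] -= lengths[i]
                b[1].append(i)
                break
        else:
            bins.append([capacity - lengths[i], [i]])
    return [indices for _, indices in bins]

# Example: samples of 1500, 400, and 500 tokens pack into two sequences.
print(pack_greedy([1500, 400, 500]))  # [[0, 2], [1]]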

Format

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
{question}<|im_end|>
<|im_start|>assistant
<think>
{reasoning}
</think>

{answer}<|im_end|>
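
As an illustration, the adapter could be prompted in this format roughly as follows. Repo IDs are from this card; the question and generation settings are placeholders, and the sketch assumes the adapter repo ships the extended tokenizer.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "openlm-research/open_llama_7b_v2", torch_dtype=torch.float16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("ping98k/open-llama-7b-openthought-mid-lora")
model = PeftModel.from_pretrained(base, "ping98k/open-llama-7b-openthought-mid-lora")

prompt = (
    "<|im_start|>system\n"
    "You are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "What is 17 * 24?<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=1024)
# The completion should contain a <think>...</think> block followed by the answer.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:]))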

Merged model

See ping98k/open-llama-7b-openthought-mid-4bit for this adapter merged into the base model.
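
The merged checkpoint was presumably produced along these lines; this is a sketch assuming a standard peft merge in full precision followed by separate 4-bit re-quantization, since the exact recipe is not stated here.

from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "openlm-research/open_llama_7b_v2", torch_dtype="auto"
)
merged = PeftModel.from_pretrained(
    base, "ping98k/open-llama-7b-openthought-mid-lora"
).merge_and_unload()  # fold the LoRA deltas into the base weights
merged.save_pretrained("open-llama-7b-openthought-mid-merged")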
