
Mostly untested!

RoPE Scaled QLoRA Long Context Extension of Llama-33b (LoRA)

Overview

This is base Llama-33b with minimal additional training to extend the useful context window.

  • Context length extended to 16384 via RoPE-scaled embeddings (Position Interpolation).
  • Pretrained for an additional 100 steps on 8192-token sequences from the Pile dataset.
  • The merged model is used as the starting point for training bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-LoRA
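The idea behind Position Interpolation can be sketched as follows: instead of extrapolating RoPE to positions beyond the trained range, positions are linearly rescaled so the extended context maps back into the range seen during pretraining. This is a minimal illustrative sketch, assuming Llama's original 2048-token context and a linear scale factor of trained_len / extended_len; `rope_angles` is a hypothetical helper, not code from this repo.

```python
import math

def rope_angles(position, dim=8, base=10000.0, scale=1.0):
    # Compute the RoPE rotation angles for a single token position.
    # Position Interpolation multiplies the position by `scale`
    # (scale = trained_len / extended_len) so that positions up to
    # the extended context fall inside the originally trained range.
    pos = position * scale
    return [pos / (base ** (2 * i / dim)) for i in range(dim // 2)]

# Extending 2048 -> 16384 gives scale = 2048 / 16384 = 0.125,
# so position 8000 in the extended model is rotated like
# position 1000 was during original pretraining.
standard = rope_angles(1000)
interpolated = rope_angles(8000, scale=2048 / 16384)
```

With this scaling, the model never sees rotation angles larger than those encountered during pretraining, which is why a brief fine-tune (here, 100 steps) suffices to adapt.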

This is a QLoRA fine-tune.

Pretraining took 10 hours on 1x RTX 6000 Ada.
