# Project Uroboros: RoSTE Stage 2 Adapter

Stage 2 of 2 in the Uroboros continual-learning pipeline.

## Training details

| Setting | Value |
|---|---|
| Base model | ByteDance/Ouro-2.6B-Thinking |
| Datasets | UltraChat-200k → NVARC-augmented-puzzles |
| Quantization | 4-bit NF4 double-quant (BitsAndBytes) |
| LoRA r / alpha | 16 / 32 |
| Bits (w / a / kv) | 4 / 4 / 4 |
| Rotations used | R3 (Q/K head_dim), R4 (down_proj input, block-Hadamard) |
| Max seq length | 1024 |
| Transformers | 4.54.1 |
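
For context on the rotations row: R3 and R4 are Hadamard-based rotations of the kind used in RoSTE/QuaRot-style quantization, applied per attention head and block-diagonally on the `down_proj` input, respectively. The sketch below shows the underlying Sylvester Hadamard construction and the block-diagonal application pattern in plain Python; the helper names are hypothetical, and the actual rotation code lives in the project notebook.

```python
def hadamard(n):
    # Sylvester construction; valid only when n is a power of two.
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    H = [[1]]
    while len(H) < n:
        # H_{2m} = [[H, H], [H, -H]]
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H

def block_hadamard(x, block):
    # Apply the orthonormal rotation H / sqrt(block) independently to each
    # contiguous block of x -- the "block-Hadamard" pattern named for R4.
    H = hadamard(block)
    scale = block ** 0.5
    out = []
    for s in range(0, len(x), block):
        seg = x[s:s + block]
        out.extend(
            sum(H[i][k] * seg[k] for k in range(block)) / scale
            for i in range(block)
        )
    return out
```

Because the Sylvester Hadamard matrix is symmetric and orthogonal (after scaling), applying `block_hadamard` twice recovers the original vector, which is why such rotations can be folded into adjacent weights without changing the network's function.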

## Files

| File | Description |
|---|---|
| `roste_adapter.pt` | Trained `lora_A` + `lora_B` tensors from `apply_roste` |
| `roste_config.json` | Hyperparameters needed to reconstruct the model |
| `tokenizer.*` | Tokenizer saved from the base model |

## How to load

See the project notebook for the full `apply_roste` implementation. The weight-injection pattern is:

```python
import torch

# Load the trained adapter tensors and copy them into the live model,
# matching entries by parameter name and shape.
saved   = torch.load("roste_adapter.pt", map_location="cpu")
current = dict(model.named_parameters())
with torch.no_grad():
    for name, tensor in saved.items():
        if name in current and current[name].shape == tensor.shape:
            current[name].copy_(tensor.to(current[name].device, current[name].dtype))
```
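
The `model` above is the quantized base model. A minimal loading sketch, assuming the standard `transformers` + `bitsandbytes` path that matches the quantization row of the table (the compute dtype and `trust_remote_code` flag are assumptions; check `roste_config.json` for the exact values used in training):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 with double quantization, as listed in the training table.
# bnb_4bit_compute_dtype is an assumption, not confirmed by the card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "ByteDance/Ouro-2.6B-Thinking",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,  # assumption: the base repo ships custom modeling code
)
tokenizer = AutoTokenizer.from_pretrained("ByteDance/Ouro-2.6B-Thinking")
```

After this, the guarded-copy snippet above injects the trained `lora_A` / `lora_B` tensors into the reconstructed model.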