# Project Uroboros: RoSTE Stage 2 Adapter
Stage 2 of 2 in the Uroboros continual-learning pipeline.
## Training details
| Setting | Value |
|---|---|
| Base model | ByteDance/Ouro-2.6B-Thinking |
| Datasets | UltraChat-200k → NVARC-augmented-puzzles |
| Quantization | 4-bit NF4 double-quant (BitsAndBytes) |
| LoRA r / alpha | 16 / 32 |
| Bits (weights / activations / KV cache) | 4 / 4 / 4 |
| Rotations used | R3 (Q/K head_dim), R4 (down_proj input, block-Hadamard) |
| Max seq length | 1024 |
| Transformers | 4.54.1 |
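The R3/R4 naming follows the SpinQuant convention for Hadamard rotations. Purely as an illustration (the actual rotations are constructed inside `apply_roste`), a block-Hadamard transform of the kind applied to the `down_proj` input looks roughly like the sketch below; the block size of 128 is a hypothetical choice, not taken from this repo:

```python
import torch

def sylvester_hadamard(n: int) -> torch.Tensor:
    # Build an n x n Hadamard matrix via the Sylvester construction
    # (n must be a power of two).
    h = torch.ones(1, 1)
    while h.shape[0] < n:
        h = torch.cat([torch.cat([h, h], dim=-1),
                       torch.cat([h, -h], dim=-1)], dim=-2)
    return h

def block_hadamard(x: torch.Tensor, block: int = 128) -> torch.Tensor:
    # Rotate each `block`-sized slice of the last dimension by a
    # normalized (orthonormal) Hadamard matrix. This spreads out
    # activation outliers before 4-bit quantization; the inverse
    # rotation is folded into the following weight matrix, so the
    # layer's output is unchanged.
    d = x.size(-1)
    assert d % block == 0
    h = sylvester_hadamard(block).to(x.dtype) / block ** 0.5
    return (x.reshape(*x.shape[:-1], d // block, block) @ h).reshape(x.shape)
```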
## Files
| File | Description |
|---|---|
| `roste_adapter.pt` | Trained `lora_A` + `lora_B` tensors from `apply_roste` |
| `roste_config.json` | Hyperparameters needed to reconstruct the model |
| `tokenizer.*` | Tokenizer files saved from the base model |
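A minimal sketch of fetching these artifacts from the Hub, using the repo id shown on this page (the `huggingface_hub` usage here is illustrative, not part of the project code):

```python
import json
from huggingface_hub import hf_hub_download

repo_id = "Shrikanth19/roste-stage2-adapter"
adapter_path = hf_hub_download(repo_id, "roste_adapter.pt")
config_path = hf_hub_download(repo_id, "roste_config.json")

with open(config_path) as f:
    roste_cfg = json.load(f)  # hyperparameters needed to rebuild the model
```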
## How to load
See the project notebook for the full `apply_roste` implementation. The weight-injection pattern is:
```python
import torch

# Assumes `model` already carries the RoSTE LoRA parameters,
# i.e. apply_roste has been run on it first.
saved = torch.load("roste_adapter.pt", map_location="cpu")
current = dict(model.named_parameters())
with torch.no_grad():
    for name, tensor in saved.items():
        # The shape check guards against a mismatched LoRA config.
        if name in current and current[name].shape == tensor.shape:
            current[name].copy_(tensor.to(current[name].device, current[name].dtype))
```
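Putting it together: a minimal end-to-end sketch, assuming the base model is reloaded with the same NF4 settings as in the training table and that `apply_roste` (from the project notebook) is run to create the `lora_A`/`lora_B` parameters before the injection. The `compute_dtype`, `device_map`, and `trust_remote_code` arguments are assumptions, not confirmed by this repo:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Quantization settings mirroring the training table:
# 4-bit NF4 with double quantization (compute dtype is an assumption).
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "ByteDance/Ouro-2.6B-Thinking",
    quantization_config=bnb,
    device_map="auto",
    trust_remote_code=True,  # assumption: the Ouro architecture may be custom
)
tokenizer = AutoTokenizer.from_pretrained("ByteDance/Ouro-2.6B-Thinking")

# apply_roste(model, ...)  # from the project notebook: wraps layers with the
#                          # R3/R4 rotations and creates the lora_A / lora_B
#                          # parameters, after which the injection snippet
#                          # above fills them with the trained values.
```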