CPT adapters (run_id 2)

This repo bundles 25 LoRA adapters produced by the CPT pipeline. Each adapter lives in its own subfolder; to load one, pass that path as the subfolder= argument of PeftModel.from_pretrained.

Layout

<base_model>/<variant>/top<K>/
    adapter_config.json
    adapter_model.safetensors
    metadata.json     # per-adapter hyperparameters + losses
    tokenizer.json    # (optional, may be omitted to save space)
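The path scheme above can be handled programmatically. The following is a minimal sketch, assuming every adapter path has exactly the three components shown in the layout; the names AdapterPath, parse_subfolder and build_subfolder are hypothetical helpers, not part of this repo or of PEFT.

```python
from typing import NamedTuple


class AdapterPath(NamedTuple):
    base_model: str  # first path component, e.g. "Llama-3.1-8B"
    variant: str     # second path component, e.g. "FT-KY"
    k: int           # the K in "top<K>"


def parse_subfolder(subfolder: str) -> AdapterPath:
    """Split '<base_model>/<variant>/top<K>' into its three parts."""
    parts = subfolder.strip("/").split("/")
    if len(parts) != 3 or not parts[2].startswith("top"):
        raise ValueError(f"unexpected adapter path: {subfolder!r}")
    # int() raises ValueError if the text after "top" is not a number.
    return AdapterPath(parts[0], parts[1], int(parts[2][len("top"):]))


def build_subfolder(base_model: str, variant: str, k: int) -> str:
    """Inverse of parse_subfolder: rebuild the path string."""
    return f"{base_model}/{variant}/top{k}"
```

The string returned by build_subfolder is what the subfolder= argument expects when loading an adapter.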

Loading an adapter

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id   = "meta-llama/Llama-3.1-8B"     # see metadata.json for the right base
subfolder = "Llama-3.1-8B/FT-KY/top1"

base  = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="bfloat16", device_map="auto")
model = PeftModel.from_pretrained(base, "the-cramer-project/cpt-adapters-t2", subfolder=subfolder)
tok   = AutoTokenizer.from_pretrained(base_id)

Adapter index

Subfolder                 Base model               Language  LoRA r  LR      Final eval loss (nats)
Llama-3.1-8B/FT-KY/top1   meta-llama/Llama-3.1-8B  Kyrgyz    256     5e-05   1.0290
Llama-3.1-8B/FT-KY/top2   meta-llama/Llama-3.1-8B  Kyrgyz    128     0.0001  1.0118
Llama-3.1-8B/FT-KY/top3   meta-llama/Llama-3.1-8B  Kyrgyz    128     5e-05   1.0090
Llama-3.1-8B/FT-KZ/top1   meta-llama/Llama-3.1-8B  Kazakh    256     5e-05   1.0093
Llama-3.1-8B/FT-KZ/top2   meta-llama/Llama-3.1-8B  Kazakh    128     0.0001  0.9951
Llama-3.1-8B/FT-KZ/top3   meta-llama/Llama-3.1-8B  Kazakh    128     5e-05   0.9937
Llama-3.1-8B/FT-PL/top1   meta-llama/Llama-3.1-8B  Polish    64      5e-05   1.6365
Llama-3.1-8B/FT-PL/top2   meta-llama/Llama-3.1-8B  Polish    32      5e-05   1.6234
Llama-3.1-8B/FT-PL/top3   meta-llama/Llama-3.1-8B  Polish    32      0.0001  1.6322
Qwen3-8B-Base/FT-KY/top1  Qwen/Qwen3-8B-Base       Kyrgyz    256     0.0001  1.0866
Qwen3-8B-Base/FT-KY/top3  Qwen/Qwen3-8B-Base       Kyrgyz    256     5e-05   1.0851
Qwen3-8B-Base/FT-KZ/top1  Qwen/Qwen3-8B-Base       Kazakh    256     0.0001  1.0292
Qwen3-8B-Base/FT-KZ/top2  Qwen/Qwen3-8B-Base       Kazakh    256     5e-05   1.0318
Qwen3-8B-Base/FT-PL/top1  Qwen/Qwen3-8B-Base       Polish    256     5e-05   1.7490
Qwen3-8B-Base/FT-PL/top2  Qwen/Qwen3-8B-Base       Polish    128     0.0001  1.7275
Qwen3-8B-Base/FT-PL/top3  Qwen/Qwen3-8B-Base       Polish    64      0.0002  1.7133
gemma-2-9b/FT-KY/top1     google/gemma-2-9b        Kyrgyz    128     5e-05   1.2077
gemma-2-9b/FT-KY/top2     google/gemma-2-9b        Kyrgyz    64      0.0001  1.1936
gemma-2-9b/FT-KY/top3     google/gemma-2-9b        Kyrgyz    64      5e-05   1.1952
gemma-2-9b/FT-KZ/top1     google/gemma-2-9b        Kazakh    128     5e-05   1.3012
gemma-2-9b/FT-KZ/top2     google/gemma-2-9b        Kazakh    64      0.0001  1.2847
gemma-2-9b/FT-KZ/top3     google/gemma-2-9b        Kazakh    64      5e-05   1.2867
gemma-2-9b/FT-PL/top1     google/gemma-2-9b        Polish    32      5e-05   1.9314
gemma-2-9b/FT-PL/top2     google/gemma-2-9b        Polish    32      0.0001  1.9516
gemma-2-9b/FT-PL/top3     google/gemma-2-9b        Polish    64      5e-05   1.9781
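
The losses in the index are in nats. If they are mean per-token cross-entropy values (an assumption; the index does not say how the loss is averaged), they convert to perplexity and bits per token as sketched below. The function names are illustrative only. Per-token figures depend on the tokenizer, so comparisons are most meaningful between adapters that share a base model.

```python
import math


def perplexity(loss_nats: float) -> float:
    """Perplexity = exp(loss), for a cross-entropy loss measured in nats."""
    return math.exp(loss_nats)


def bits_per_token(loss_nats: float) -> float:
    """Convert nats to bits by dividing by ln(2)."""
    return loss_nats / math.log(2)


# Three rows copied from the adapter index (same base model, same language).
rows = [
    ("Llama-3.1-8B/FT-KY/top1", 1.0290),
    ("Llama-3.1-8B/FT-KY/top2", 1.0118),
    ("Llama-3.1-8B/FT-KY/top3", 1.0090),
]

for subfolder, loss in rows:
    print(f"{subfolder}: ppl={perplexity(loss):.3f}, bits/token={bits_per_token(loss):.3f}")
```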

For machine-readable details see manifest.json.
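
Both manifest.json and the per-adapter metadata.json can be fetched without downloading any weights. This is a rough sketch that assumes manifest.json sits at the repo root and that huggingface_hub is installed; the schema of both files is not documented here, so the code only summarizes the top-level structure instead of reading specific keys. The helpers load_json, fetch_repo_json and describe are hypothetical names.

```python
import json
from pathlib import Path

REPO_ID = "the-cramer-project/cpt-adapters-t2"


def load_json(path):
    """Read a JSON file from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def fetch_repo_json(filename, subfolder=None, repo_id=REPO_ID):
    """Download one JSON file from the Hub (cached locally) and parse it."""
    # Imported lazily so the other helpers work without huggingface_hub.
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(repo_id=repo_id, filename=filename, subfolder=subfolder)
    return load_json(local_path)


def describe(obj):
    """Summarize the top-level shape of parsed JSON without assuming a schema."""
    if isinstance(obj, dict):
        return f"dict with keys: {sorted(obj)}"
    if isinstance(obj, list):
        return f"list of {len(obj)} items"
    return type(obj).__name__


# Example calls (require network access):
# print(describe(fetch_repo_json("manifest.json")))
# print(describe(fetch_repo_json("metadata.json", subfolder="Llama-3.1-8B/FT-KY/top1")))
```

Inspecting the printed keys first is a safe way to find where the base model ID and losses are stored before relying on them in scripts.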
