CPT merged full models, run t3pilot (t3 cross-lingual experiment)
Each subfolder holds a standalone full model: the base model
(meta-llama/Llama-3.1-8B) with its trained LoRA adapter already merged in.
Training settings: LoRA r=64, lr=5e-5, a training stream mixed with 30% English,
2 epochs, embeddings and lm_head frozen. Load the models directly with
AutoModelForCausalLM.from_pretrained; PEFT is not required. Per-language eval
losses are in manifest.json.
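To illustrate why a merged checkpoint needs no PEFT at load time, here is a minimal sketch of the standard LoRA merge, W' = W + (alpha / r) * B A, on a toy matrix. The shapes, values, and the `alpha` scaling factor are assumptions made for illustration (this card states r=64 but not alpha), and the helpers `matmul` and `merge_lora` are hypothetical names, not part of any library.

```python
# Toy illustration of folding a LoRA adapter into a base weight matrix.
# All shapes and values are made up; ALPHA is an assumed hyperparameter.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(w, lora_b, lora_a, alpha, r):
    """Return W + (alpha / r) * B @ A as a new matrix."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Base weight W (2x3) and adapter factors B (2x1), A (1x3), so rank r = 1.
W = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]
B = [[0.5], [-1.0]]
A = [[2.0, 0.0, 4.0]]
ALPHA, R = 2.0, 1

W_merged = merge_lora(W, B, A, ALPHA, R)

# The merged weight reproduces "base path + scaled adapter path" exactly.
x = [[1.0], [2.0], [3.0]]  # input column vector
y_merged = matmul(W_merged, x)
y_base = matmul(W, x)
y_adapter = matmul(B, matmul(A, x))
y_separate = [[b[0] + (ALPHA / R) * a[0]] for b, a in zip(y_base, y_adapter)]
```

Because the adapter's contribution is already part of `W_merged`, inference only needs the single merged matrix, which is why the checkpoints here load as ordinary causal LM weights.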
Load
from transformers import AutoModelForCausalLM, AutoTokenizer
mid = "the-cramer-project/cpt-models-t3"
sub = "Llama-3.1-8B/FT-KY"
model = AutoModelForCausalLM.from_pretrained(mid, subfolder=sub, torch_dtype="bfloat16")
tok = AutoTokenizer.from_pretrained(mid, subfolder=sub)
Models
| Subfolder | Base | Language | LoRA r | LR | Target eval loss |
|---|---|---|---|---|---|
| Llama-3.1-8B/FT-KY | meta-llama/Llama-3.1-8B | Kyrgyz | 64 | 5e-05 | 1.021923542022705 |
| Llama-3.1-8B/FT-KZ | meta-llama/Llama-3.1-8B | Kazakh | 64 | 5e-05 | 1.0028022527694702 |
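The eval losses in the table can be read as perplexities if one assumes they are mean per-token cross-entropy values in nats; the card does not state this explicitly, so treat the conversion below as a sketch under that assumption.

```python
import math

# Target eval losses copied from the table above.
eval_loss = {
    "Llama-3.1-8B/FT-KY": 1.021923542022705,
    "Llama-3.1-8B/FT-KZ": 1.0028022527694702,
}

# Assumption: each loss is a mean per-token cross-entropy in nats,
# so perplexity = exp(loss).
perplexity = {name: math.exp(loss) for name, loss in eval_loss.items()}

for name, ppl in perplexity.items():
    print(f"{name}: perplexity = {ppl:.3f}")
```

Since both models share the Llama-3.1 tokenizer but tokenize Kyrgyz and Kazakh text differently, these per-token figures are best compared within a language rather than across languages.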