Language Decoded LoRA - Condition 2: Urdu Keyword-Swapped Code
LoRA adapter trained on Urdu keyword-swapped Python produced with Legesher v0.7.3 (5k subset). This condition tests whether the language of code keywords matters for the reasoning benefits of code.
Part of the Language Decoded project (Cohere's Tiny Aya Expedition).
For full experiment details, see the Language Decoded LoRA hub.
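As a rough illustration of what "keyword-swapped" means, the sketch below replaces Python keywords with Urdu words while leaving strings, comments, and identifiers untouched. It is not Legesher's implementation: the `swap_keywords` helper and the `EXAMPLE_KEYWORDS` mapping are hypothetical, and the Urdu words were chosen for illustration only and may differ from the vocabulary Legesher v0.7.3 actually uses.

```python
import io
import keyword
import tokenize

# Hypothetical mapping for illustration only; NOT Legesher's actual Urdu vocabulary.
EXAMPLE_KEYWORDS = {
    "def": "تعریف",
    "if": "اگر",
    "else": "ورنہ",
    "return": "واپس",
}


def swap_keywords(source: str, mapping: dict[str, str]) -> str:
    """Replace Python keyword tokens using `mapping`, leaving all other text as-is."""
    lines = source.splitlines(keepends=True)
    spans = []
    # Tokenize so that keywords inside strings or comments are never touched.
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if (
            tok.type == tokenize.NAME
            and keyword.iskeyword(tok.string)
            and tok.string in mapping
        ):
            row, col = tok.start
            spans.append((row - 1, col, tok.end[1], mapping[tok.string]))
    # Apply replacements right to left so earlier column offsets stay valid.
    for row, start, end, replacement in sorted(spans, reverse=True):
        line = lines[row]
        lines[row] = line[:start] + replacement + line[end:]
    return "".join(lines)


example = (
    "def sign(x):\n"
    "    if x < 0:\n"
    '        return "if negative"\n'
    "    else:\n"
    '        return "ok"\n'
)
swapped = swap_keywords(example, EXAMPLE_KEYWORDS)
print(swapped)
```

The string literal `"if negative"` survives unchanged because the swap operates on tokens rather than on raw text.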
Training Data
legesher/language-decoded-data / condition-2-ur-5k
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and its tokenizer.
base_model = AutoModelForCausalLM.from_pretrained("CohereLabs/tiny-aya-base")
tokenizer = AutoTokenizer.from_pretrained("CohereLabs/tiny-aya-base")

# Attach the LoRA adapter weights for this condition on top of the base model.
model = PeftModel.from_pretrained(base_model, "legesher/language-decoded-lora-condition-2-ur-5k")
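Once the adapter is loaded, text can be generated in the usual Transformers way. The following is a minimal sketch, not an official recipe: the helper names `load_adapter`, `complete`, and `main`, the prompt, and the `max_new_tokens` value are assumptions made for illustration, and default decoding settings are used.

```python
def load_adapter(base_id, adapter_id):
    """Load the base model, its tokenizer, and the LoRA adapter on top."""
    # Imported lazily so `complete` can be used without these packages loaded.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return model, tokenizer


def complete(model, tokenizer, prompt, max_new_tokens=64):
    """Generate a continuation of `prompt` and return the decoded text."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # The first sequence holds the prompt followed by the newly generated tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def main():
    # Downloads the weights on first use; the prompt is an arbitrary example.
    model, tokenizer = load_adapter(
        "CohereLabs/tiny-aya-base",
        "legesher/language-decoded-lora-condition-2-ur-5k",
    )
    print(complete(model, tokenizer, "def add(a, b):", max_new_tokens=32))
```

Calling `main()` runs the full load-and-generate path. Because this is a base model with an adapter rather than a chat model, plain text prompts are assumed here instead of a chat template.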
Citation
@misc{language-decoded-2026,
title={Language Decoded: Investigating Language-Dependent vs. Structure-Dependent Reasoning Benefits of Code},
author={Madison Edgar and Saad Ahmed Bazaz and Tom Sherborne and Rashik Shahjahan and Khojasteh Mirza and Sarah Jawaid and Rafay Mustafa and Sohaib Ahmed Bazaz},
year={2026},
publisher={Hugging Face},
url={https://huggingface.co/legesher/language-decoded-lora}
}
License
Apache 2.0