docs: fix license to Apache 2.0, add hyperparameters + limitations, update citations (#3)
Commit 52c78860fd325c1f14657b96be34609c1d835ea4
README.md (changed)
---
license: apache-2.0
language:
- en
- zh
- es
- ur
tags:
- lora
- aya
- tiny-aya
- multilingual
- code
- legesher
- tiny-aya-expedition
- language-decoded
- unsloth
library_name: transformers
base_model:
- CohereLabs/tiny-aya-base
pipeline_tag: text-generation
---

## Model Structure

This repo is the canonical hub for all Language Decoded LoRA adapters, organized by experimental condition:

| Subdirectory | Condition | Training Data |
| -------------------- | ----------- | ---------------------------------------------------- |
| `condition-1-en-5k/` | Condition 1 | English Python from The Stack Dedup (5k subset) |
| `condition-2-zh-5k/` | Condition 2 | Chinese keyword-swapped Python (Legesher-transpiled) |
| `condition-2-es-5k/` | Condition 2 | Spanish keyword-swapped Python (Legesher-transpiled) |
| `condition-2-ur-5k/` | Condition 2 | Urdu keyword-swapped Python (Legesher-transpiled) |
| `condition-3-zh-5k/` | Condition 3 | Transpiled + native Chinese code (blended) |

### The Experimental Ladder

- **Baseline --> 1**: Does code help at all?
- **1 --> 2**: Does the language of keywords matter?
- **2 --> 3**: Does diversity of native-language sources add value beyond keyword swap?
- **3 --> 4**: Does code written in the cultural context of a language carry unique signal?
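For intuition, "keyword-swapped" (Condition 2) means the Python keywords are rewritten into the target language while everything else stays Python. The sketch below is only illustrative: the Spanish keyword mapping is a made-up example, and the naive regex pass is a stand-in for the actual Legesher transpiler, which operates on the token stream so identifiers and string literals are never rewritten.

```python
import re

# Illustrative keyword mapping -- NOT the actual Legesher mapping.
ES_KEYWORDS = {"def": "definir", "return": "retornar", "if": "si", "else": "sino"}

def swap_keywords(src: str, mapping: dict) -> str:
    # Naive word-boundary substitution; the real transpiler is token-aware,
    # so identifiers and strings that merely contain a keyword are untouched.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], src)

print(swap_keywords("def add(a, b):\n    return a + b", ES_KEYWORDS))
# definir add(a, b):
#     retornar a + b
```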

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained("CohereLabs/tiny-aya-base")
tokenizer = AutoTokenizer.from_pretrained("CohereLabs/tiny-aya-base")

# Load a LoRA adapter (e.g., Condition 1 — English code)
model = PeftModel.from_pretrained(base_model, "legesher/language-decoded-lora", subfolder="condition-1-en-5k")

# Load a language-specific adapter (e.g., Condition 2 — Chinese keyword-swapped)
model = PeftModel.from_pretrained(base_model, "legesher/language-decoded-lora", subfolder="condition-2-zh-5k")
```
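The subfolder names follow a regular scheme, so a small helper can pick one programmatically. This helper is a hypothetical convenience, not part of the repo's code; it just mirrors the naming in the Model Structure table.

```python
# Hypothetical helper: build an adapter subfolder name from the
# condition number, language code, and subset size used in this repo.
def adapter_subfolder(condition: int, lang: str = "en", size: str = "5k") -> str:
    return f"condition-{condition}-{lang}-{size}"

# e.g. PeftModel.from_pretrained(..., subfolder=adapter_subfolder(2, "zh"))
print(adapter_subfolder(2, "zh"))  # condition-2-zh-5k
```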

## Training Details

| Parameter | Value |
| ------------------ | ------------------------------------------------------------------------------------------------ |
| Base model | [CohereLabs/tiny-aya-base](https://huggingface.co/CohereLabs/tiny-aya-base) (3.35B params) |
| Method | QLoRA 4-bit (NF4), ~5.4GB VRAM |
| Hardware | Kaggle T4 (16GB) |
| Tokenizer | CohereLabs/tiny-aya-base |
| Transpilation tool | [Legesher](https://github.com/legesher/legesher) v0.7.3 |
| Training data | [legesher/language-decoded-data](https://huggingface.co/datasets/legesher/language-decoded-data) |

### QLoRA Hyperparameters

| Parameter | Value |
| --------------- | ------------------------------------------------------------- |
| LoRA rank (`r`) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.0 |
| Target modules | q_proj, k_proj, v_proj, o_proj, up_proj, down_proj, gate_proj |
| Bias | none |
| Task type | CAUSAL_LM |
| PEFT version | 0.18.1 |
| Quantization | NF4 (4-bit) via Unsloth |
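For a rough sense of why rank-16 LoRA over these target modules fits on a T4, here is a back-of-the-envelope trainable-parameter count per transformer block. The hidden and FFN widths below are assumed round numbers for illustration, not the actual tiny-aya-base dimensions (and k_proj/v_proj may be narrower under grouped-query attention).

```python
# A LoRA adapter on a (d_in x d_out) linear layer trains two factors:
# A (r x d_in) and B (d_out x r), i.e. r * (d_in + d_out) parameters.
def lora_params(d_in: int, d_out: int, r: int = 16) -> int:
    return r * (d_in + d_out)

# Assumed illustrative widths -- NOT the real tiny-aya-base shapes.
hidden, ffn = 2048, 8192

per_block = (
    4 * lora_params(hidden, hidden)  # q_proj, k_proj, v_proj, o_proj
    + 2 * lora_params(hidden, ffn)   # up_proj, gate_proj
    + 1 * lora_params(ffn, hidden)   # down_proj
)
print(per_block)  # 753664 trainable params per block at r=16
```

A few hundred thousand trainable parameters per block is orders of magnitude below the frozen base weights, which (together with NF4 quantization of the base) is what keeps peak VRAM in the single-digit gigabytes.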

## Evaluation

Models are evaluated on multilingual reasoning benchmarks with dual prompts (English + language-specific):

| Benchmark | What it measures | Examples per language |
| --------- | -------------------------- | --------------------- |
| MGSM | Math reasoning | 250 (full set) |
| X-CSQA | Commonsense reasoning | ~1,000 (full set) |
| XNLI | Natural language inference | ~5,000 (full set) |

_Results will be added as evaluation completes._
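The dual-prompt setup can be sketched as below: each benchmark item is scored twice, once under an English instruction and once under a target-language instruction. The templates here are hypothetical placeholders, not the actual evaluation prompts.

```python
# Illustrative dual-prompt construction; templates are placeholders,
# not the real evaluation prompts used for MGSM / X-CSQA / XNLI.
TEMPLATES = {
    "en": "Solve the following problem. Answer with a number.\n{q}",
    "zh": "请解决以下问题，并用数字作答。\n{q}",
}

def dual_prompts(question: str) -> dict:
    # One prompt per instruction language, sharing the same question text.
    return {lang: tpl.format(q=question) for lang, tpl in TEMPLATES.items()}

prompts = dual_prompts("3 + 4 = ?")
print(sorted(prompts))  # ['en', 'zh']
```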

## Limitations

- **Single base model**: All adapters are trained on CohereLabs/tiny-aya-base (3.35B params). Results may not generalize to larger or architecturally different models.
- **Limited training data**: Each condition uses a 5k-file subset for QLoRA fine-tuning, constrained by Kaggle T4 hardware limits.
- **Evaluation scope**: Currently evaluated on 3 benchmarks (MGSM, X-CSQA, XNLI). Other reasoning tasks may show different patterns.
- **Consumer hardware**: Training on Kaggle T4 (16GB) with 4-bit quantization introduces approximation that may affect adapter quality compared to full-precision training.

## Related Resources

## Citation

```bibtex
@misc{language-decoded-2026,
  title={Language Decoded: Investigating Language-Dependent vs. Structure-Dependent Reasoning Benefits of Code},
  author={Madison Edgar and Saad Ahmed Bazaz and Tom Sherborne and Rashik Shahjahan and Khojasteh Mirza and Sarah Jawaid and Rafay Mustafa and Sohaib Ahmed Bazaz},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/legesher/language-decoded-lora}
}
```

## License

Apache 2.0