Update README.md
README.md
CHANGED
@@ -50,8 +50,8 @@ The model was trained on ~58,000 samples from a mixed dataset:
 
 | Source | Samples | Description |
 |---|---|---|
-| `bigcode/the-stack
-| `bigcode/the-stack
+| `bigcode/the-stack` (Python) | 20,000 | Real-world Python code from GitHub |
+| `bigcode/the-stack` (JavaScript) | 20,000 | Real-world JavaScript code from GitHub |
 | `iamtarun/python_code_instructions_18k_alpaca` | 18,000 | Python instruction-response pairs |
 
 ---
@@ -140,17 +140,6 @@ def is_prime(num):
 
 ---
 
-## 🗺️ Roadmap
-
-This is the first release of the kiro model series. Upcoming versions:
-
-- **kiro-1.5-7B-XCode** — larger dataset (500k+ samples), improved benchmarks
-- **kiro-2.0-7B-XCode** — instruction tuning + DPO alignment
-- **kiro-3.0-14B-XCode** — larger base model
-- **ZuKU** — custom architecture trained from scratch (100–200M parameters)
-
----
-
 ## ⚠️ Limitations
 
 - Trained for 1 epoch — may produce repetitions in long outputs (use `repetition_penalty=1.3`)
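
Note on the `repetition_penalty=1.3` hint in the Limitations bullet: the sketch below shows, in plain Python, how a repetition penalty of this kind is commonly applied to next-token logits. Logits of already-generated tokens are divided by the penalty when positive and multiplied by it when negative, so those tokens become less likely either way. The helper name `apply_repetition_penalty` and the toy values are assumptions made for illustration; nothing here is taken from the README or from the model's own code.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.3):
    """Return a copy of `logits` with already-generated token ids made less likely.

    Hypothetical helper for illustration only. Positive logits are divided
    by the penalty and negative logits are multiplied by it.
    """
    if penalty <= 0:
        raise ValueError("penalty must be positive")
    penalized = list(logits)
    # Each token id is penalized once, however often it was generated.
    for token_id in set(generated_ids):
        score = penalized[token_id]
        penalized[token_id] = score * penalty if score < 0 else score / penalty
    return penalized


# Toy vocabulary of four tokens; ids 0 and 2 have already been generated.
logits = [2.6, 1.0, -1.0, 0.5]
print(apply_repetition_penalty(logits, generated_ids=[0, 2, 0]))
```

With Hugging Face `transformers`, the same setting is normally passed as `repetition_penalty=1.3` to `generate()`, so a helper like this should not be needed in practice.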