Upload README.md
README.md
CHANGED
@@ -286,18 +286,6 @@ Geilim is evaluated on:
 - **Response conciseness** (< 150 chars = concise)
 - **Reasoning traces** (should be absent from output, present in hidden states)
 
-### Test Script
-```bash
-python test_geilim.py
-```
-Compares Geilim vs Llama-3.2-1B-Instruct baseline on 8 reasoning tasks.
-
-### Run Benchmarks
-```bash
-python run_lmeval.py
-```
-Evaluates on: WinoGrande, ARC (easy/challenge), HellaSwag, PIQA.
-
 ---
 
 ## 🎯 Use Cases
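The two evaluation criteria in the context lines above (conciseness under 150 characters; no reasoning trace leaking into the visible output) can be sketched as simple checks. This is an illustrative sketch only: the names `is_concise`, `has_reasoning_trace`, `CONCISE_LIMIT`, and the trace markers are assumptions, not taken from `test_geilim.py`.

```python
# Illustrative sketch of the two evaluation criteria from the README.
# All names and the trace markers below are hypothetical, not from test_geilim.py.

CONCISE_LIMIT = 150  # per the README: < 150 chars counts as concise


def is_concise(response: str) -> bool:
    """A response is concise when it is shorter than CONCISE_LIMIT characters."""
    return len(response) < CONCISE_LIMIT


def has_reasoning_trace(response: str,
                        markers=("<think>", "Step 1:", "Let's think")) -> bool:
    """Crude surface check for a leaked reasoning trace in the visible output.

    Per the README, reasoning should live in hidden states, so none of these
    (assumed) trace markers should appear in the generated text.
    """
    return any(m in response for m in markers)


if __name__ == "__main__":
    out = "Paris."
    print(is_concise(out), has_reasoning_trace(out))  # → True False
```

A real harness would apply these checks per-task and aggregate rates across the 8 reasoning tasks; the threshold and marker list are the only knobs here.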