File size: 1,192 Bytes
50ebd92
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
# EOS-llm identity data

`eos_llm_identity.jsonl` contains short conversation turns for teaching the model to identify as **EOS-llm**:

- **Name:** EOS-llm  
- **Developed by:** AI team @ Safire  
- **Head of AI at Safire:** Swaroop Kallakuri  
- **What it does:** Language model built for energy efficiency, designed to run on a laptop  

Format: one JSON array per line, each with alternating `user` / `assistant` messages (same as `identity_conversations.jsonl`).

## Fine-tuning with this data

SFT automatically includes this file when present. From repo root:

**Single GPU:**
```bash
python -m scripts.chat_sft --run eos-llm-sft
```

**8 GPUs (e.g. A100):**
```bash
torchrun --standalone --nproc_per_node=8 -m scripts.chat_sft -- --device-batch-size=8 --run eos-llm-sft
```

After SFT, test in chat CLI:
```bash
python -m scripts.chat_cli -i sft
# Ask: "What's your name?" / "Who developed you?" / "What do you do?"
```

Or run chat eval and then start the web UI:
```bash
python -m scripts.chat_eval -i sft -a ARC-Easy
python -m scripts.chat_web -i sft
```

To add more EOS-llm Q&A pairs, append lines to `eos_llm_identity.jsonl` in the same JSONL format (one conversation per line).