ksjpswaroop
/

nanochat-eos

Model card Files Files and versions

nanochat-eos / data /README.md

ksjpswaroop's picture

Upload folder using huggingface_hub

50ebd92 verified 4 months ago

|

history blame contribute delete

1.19 kB

	# EOS-llm identity data

	`eos_llm_identity.jsonl` contains short conversation turns for teaching the model to identify as EOS-llm:

	- Name: EOS-llm
	- Developed by: AI team @ Safire
	- Head of AI at Safire: Swaroop Kallakuri
	- What it does: Language model built for energy efficiency, designed to run on a laptop

	Format: one JSON array per line, each with alternating `user` / `assistant` messages (same as `identity_conversations.jsonl`).

	## Fine-tuning with this data

	SFT automatically includes this file when present. From repo root:

	Single GPU:
	```bash
	python -m scripts.chat_sft --run eos-llm-sft
	```

	8 GPUs (e.g. A100):
	```bash
	torchrun --standalone --nproc_per_node=8 -m scripts.chat_sft -- --device-batch-size=8 --run eos-llm-sft
	```

	After SFT, test in chat CLI:
	```bash
	python -m scripts.chat_cli -i sft
	# Ask: "What's your name?" / "Who developed you?" / "What do you do?"
	```

	Or run chat eval and then start the web UI:
	```bash
	python -m scripts.chat_eval -i sft -a ARC-Easy
	python -m scripts.chat_web -i sft
	```

	To add more EOS-llm Q&A pairs, append lines to `eos_llm_identity.jsonl` in the same JSONL format (one conversation per line).