Havoc999
/

tiny-openhermes

Text Generation

instruction-tuning

Model card Files Files and versions

tiny-openhermes / README.md

Havoc999's picture

Update README.md

de33e23 verified 20 days ago

|

History Blame Contribute Delete

2.18 kB

	---
	language:
	- en
	license: apache-2.0
	datasets:
	- teknium/OpenHermes-2.5
	tags:
	- instruction-tuning
	- chatbot
	- trl
	- openhermes
	pipeline_tag: text-generation
	---

	# 🤖 Tiny OpenHermes — LoRA Fine-Tuned on OpenHermes-2.5

	Fine-tuned [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) on the
	[teknium/OpenHermes-2.5](https://huggingface.co/datasets/teknium/OpenHermes-2.5) dataset
	using LoRA (rank 32) via TRL's SFTTrainer on Kaggle Dual T4 GPU.


	LoRA rank 32
	LoRA alpha 64
	Epochs 1
	Peak LR 2e-4
	Effective batch 64 (4/GPU × 2 GPUs × 8 accum)
	Precision float16
	Hardware Kaggle Dual T4 (2 × 16 GiB)

	⚠️ Limitations
	English-primary (OpenHermes-2.5 is predominantly English)

	May hallucinate facts — verify important claims

	1.1 B parameter model: complex multi-step reasoning can fail

	Not RLHF-aligned for safety beyond TinyLlama's base alignment



	## Benchmark Results

	The model was evaluated using standard NLP benchmarks via the Language Model Evaluation Harness. It demonstrates moderate baseline capabilities in everyday physical reasoning but requires improvement in complex scientific knowledge and multi-step reasoning.

	\| Benchmark \| Tasks (Samples) \| Metric \| Raw Score (acc) \| Normalized Score (acc_norm) \|
	\| :--- \| :---: \| :---: \| :---: \| :---: \|
	\| PIQA (Physical Commonsense) \| 1,838 \| Accuracy \| 72.58% \| 72.03% \|
	\| HellaSwag (Commonsense Reasoning) \| 10,042 \| Accuracy \| 44.69% \| 59.20% \|
	\| ARC-Challenge (Advanced Science) \| 1,172 \| Accuracy \| 25.43% \| 29.69% \|
	\| MMLU (mathemaatics) \| 1531 \| Accuracy \| 26.13% \| 26.13% \|


	## 🚀 Quick Start

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	import torch

	model = AutoModelForCausalLM.from_pretrained(
	"Havoc999/tiny-openhermes", torch_dtype=torch.float16
	).cuda()
	tok = AutoTokenizer.from_pretrained("Havoc999/tiny-openhermes")

	prompt = "<\|user\|>\nExplain gravity simply.</s>\n<\|assistant\|>\n"
	ids = tok(prompt, return_tensors="pt").input_ids.cuda()
	out = model.generate(ids, max_new_tokens=200, temperature=0.7, do_sample=True)
	print(tok.decode(out[0, ids.shape:], skip_special_tokens=True))