---
language:
- en
license: apache-2.0
datasets:
- teknium/OpenHermes-2.5
tags:
- instruction-tuning
- chatbot
- trl
- openhermes
pipeline_tag: text-generation
---
# 🤖 Tiny OpenHermes: LoRA Fine-Tuned on OpenHermes-2.5
A fine-tune of **[TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0)** on the
[teknium/OpenHermes-2.5](https://huggingface.co/datasets/teknium/OpenHermes-2.5) dataset,
trained with LoRA (rank 32) using TRL's `SFTTrainer` on a Kaggle dual T4 GPU instance.

| Hyperparameter | Value |
| :--- | :--- |
| LoRA rank | 32 |
| LoRA alpha | 64 |
| Epochs | 1 |
| Peak LR | 2e-4 |
| Effective batch | 64 (4/GPU × 2 GPUs × 8 accum) |
| Precision | float16 |
| Hardware | Kaggle Dual T4 (2 × 16 GiB) |
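
The effective batch size and the LoRA scaling factor follow directly from the settings above. Here is a minimal sketch of the arithmetic. It assumes the standard LoRA convention of scaling the adapter update by alpha / rank, and the variable names are illustrative rather than taken from the training script:

```python
# Values copied from the hyperparameter list above; names are illustrative.
per_gpu_batch = 4
num_gpus = 2
grad_accum_steps = 8
lora_rank = 32
lora_alpha = 64

# Number of samples that contribute to each optimizer step.
effective_batch = per_gpu_batch * num_gpus * grad_accum_steps

# Standard LoRA scales the low-rank update by alpha / rank (assumed here).
lora_scale = lora_alpha / lora_rank

print(f"effective batch: {effective_batch}")
print(f"LoRA scale: {lora_scale}")
```
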
โš ๏ธ Limitations
English-primary (OpenHermes-2.5 is predominantly English)
May hallucinate facts โ€” verify important claims
1.1 B parameter model: complex multi-step reasoning can fail
Not RLHF-aligned for safety beyond TinyLlama's base alignment
## Benchmark Results
The model was evaluated on standard NLP benchmarks with the Language Model Evaluation Harness. It scores well above chance on everyday physical and commonsense reasoning (PIQA, HellaSwag), but stays close to the 25% random-guess level for four-option questions on scientific knowledge and multi-step reasoning (ARC-Challenge, MMLU).

| Benchmark | Tasks (Samples) | Metric | Raw Score (acc) | Normalized Score (acc_norm) |
| :--- | :---: | :---: | :---: | :---: |
| **PIQA** (Physical Commonsense) | 1,838 | Accuracy | 72.58% | **72.03%** |
| **HellaSwag** (Commonsense Reasoning) | 10,042 | Accuracy | 44.69% | **59.20%** |
| **ARC-Challenge** (Advanced Science) | 1,172 | Accuracy | 25.43% | **29.69%** |
| **MMLU** (Mathematics) | 1,531 | Accuracy | 26.13% | **26.13%** |
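
To put the raw scores in context, it can help to compare them with random guessing. The sketch below is illustrative and assumes the usual multiple-choice formats: two options for PIQA and four for HellaSwag, ARC-Challenge, and MMLU. A small number of ARC items deviate from four options, so its baseline is approximate. The helper `margin_over_chance` is a hypothetical name, not part of the evaluation harness.

```python
# Raw accuracy (acc, %) copied from the table above.
scores = {"PIQA": 72.58, "HellaSwag": 44.69, "ARC-Challenge": 25.43, "MMLU": 26.13}

# Assumed number of answer options per question for each benchmark.
num_choices = {"PIQA": 2, "HellaSwag": 4, "ARC-Challenge": 4, "MMLU": 4}


def margin_over_chance(task):
    """Return the score minus the uniform random-guess accuracy, in points."""
    chance = 100.0 / num_choices[task]
    return scores[task] - chance


margins = {task: round(margin_over_chance(task), 2) for task in scores}
for task, margin in margins.items():
    print(f"{task}: {margin:+.2f} points over random guessing")
```

Under these assumptions, PIQA and HellaSwag clear the baseline by a wide margin, while ARC-Challenge and MMLU are within about a point of it.
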
## 🚀 Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the fine-tuned weights in half precision and move them to the GPU.
model = AutoModelForCausalLM.from_pretrained(
    "Havoc999/tiny-openhermes", torch_dtype=torch.float16
).cuda()
tok = AutoTokenizer.from_pretrained("Havoc999/tiny-openhermes")

# Chat-style prompt: user turn, end-of-sequence token, then the assistant tag.
prompt = "<|user|>\nExplain gravity simply.</s>\n<|assistant|>\n"
ids = tok(prompt, return_tensors="pt").input_ids.cuda()
out = model.generate(ids, max_new_tokens=200, temperature=0.7, do_sample=True)

# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True))
```