---
language:
- vi
tags:
- history
- vietnamese
- ppo
- rlhf
license: apache-2.0
---
# HistoryGPT
A Vietnamese-history AI assistant fine-tuned with RLHF (PPO).
## Training Details
- **Base Model**: khanhrill/HistoryGPT
- **Fine-tuning**: PPO with human feedback from OpenWebUI
- **Last Updated**: 2025-12-12
- **Version**: 20251212_0806
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("khanhrill/HistoryGPT")
tokenizer = AutoTokenizer.from_pretrained("khanhrill/HistoryGPT")

# "Hãy kể về lịch sử Việt Nam" = "Tell me about the history of Vietnam"
prompt = "Hãy kể về lịch sử Việt Nam"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Training Pipeline
This model was trained using an automated RLHF pipeline:
1. Collect user feedback from OpenWebUI
2. Train reward model from preference pairs
3. Fine-tune with PPO using the reward model
4. Deploy to HuggingFace Hub
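Step 2 of the pipeline trains a reward model on preference pairs. The model card does not include that code, but the standard objective for this step is the pairwise Bradley-Terry loss, which pushes the score of the preferred response above the score of the rejected one. A minimal sketch in plain PyTorch (the reward values below are hypothetical, not from this model's training run):

```python
import torch
import torch.nn.functional as F

def pairwise_reward_loss(chosen_rewards: torch.Tensor,
                         rejected_rewards: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry pairwise loss: -log sigmoid(r_chosen - r_rejected),
    averaged over the batch of preference pairs."""
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy scalar rewards for three preference pairs (illustrative values only).
chosen = torch.tensor([1.2, 0.5, 2.0])
rejected = torch.tensor([0.3, 0.7, 1.1])
loss = pairwise_reward_loss(chosen, rejected)
print(float(loss))
```

The loss shrinks as the margin between chosen and rejected scores grows; PPO in step 3 then optimizes the policy against the trained reward model.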