---
library_name: transformers
tags: ["gpt2", "causal-lm", "fine-tuned", "chatbot"]
---
# Model Card for GPT2-Chat (Fine-tuned)
This is a fine-tuned version of **GPT-2** adapted for **chat-style generation**.
It was trained on conversational data to make GPT-2 behave more like ChatGPT, giving more interactive, coherent, and context-aware responses.
---
## Model Details
### Model Description
- **Developed by:** Faijan Khan
- **Shared by:** [faizack](https://huggingface.co/faizack)
- **Model type:** Causal Language Model (decoder-only transformer)
- **Language(s):** English
- **License:** MIT (same as the base GPT-2 model)
- **Finetuned from:** [gpt2](https://huggingface.co/gpt2)
### Model Sources
- **Repository:** [https://huggingface.co/faizack/gpt2-chat-ft](https://huggingface.co/faizack/gpt2-chat-ft)
- **Paper (original GPT-2):** [Language Models are Unsupervised Multitask Learners](https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)
---
## Uses
### Direct Use
- Conversational AI experiments
- Chatbot prototyping
- Educational or research purposes
### Downstream Use
- Further fine-tuning for domain-specific dialogue (e.g., customer support, tutoring, storytelling).
### Out-of-Scope Use
- Not intended for production use without additional safety layers.
- Not suitable for sensitive domains like medical, legal, or financial advice.
---
## Bias, Risks, and Limitations
- May generate biased, offensive, or factually incorrect responses (inherited from GPT-2).
- Not aligned with RLHF like ChatGPT, so safety guardrails are minimal.
### Recommendations
- Use with human oversight.
- Add filtering, moderation, or reinforcement learning with human feedback (RLHF) if deploying in production.
---
## How to Get Started with the Model
```python
from transformers import pipeline

# Load the fine-tuned model as a text-generation pipeline
chatbot = pipeline("text-generation", model="faizack/gpt2-chat-ft")

prompt = "Hello, how are you?"
# Sample up to 100 new tokens; temperature 0.7 keeps responses focused but varied
response = chatbot(prompt, max_new_tokens=100, do_sample=True, temperature=0.7)
print(response[0]["generated_text"])
```
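For multi-turn conversations, prior turns can be flattened into a single prompt string before calling the pipeline. The `User:`/`Bot:` turn markers below are an assumption for illustration; check the format used in the actual training data before relying on them.

```python
# Sketch of multi-turn prompt construction. The "User:"/"Bot:" format is
# hypothetical and should match whatever format the model was fine-tuned on.
def build_prompt(history, user_message):
    """Flatten (user, bot) turn pairs plus the new message into one prompt string."""
    lines = []
    for user_turn, bot_turn in history:
        lines.append(f"User: {user_turn}")
        lines.append(f"Bot: {bot_turn}")
    lines.append(f"User: {user_message}")
    lines.append("Bot:")  # leave the bot turn open for the model to complete
    return "\n".join(lines)

history = [("Hello, how are you?", "I'm doing well, thanks! How can I help?")]
prompt = build_prompt(history, "Tell me a joke.")
print(prompt)
```

The resulting string can be passed to the pipeline above in place of a single-turn prompt.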
---
## Training Details
### Training Data
* Fine-tuned on conversational datasets (prompt → response pairs).
### Training Procedure
* Base model: `gpt2`
* Objective: Causal LM (next token prediction).
* Mixed precision: fp16 training.
* Optimizer: AdamW.
#### Training Hyperparameters
* Learning rate: 5e-5
* Batch size: 4
* Epochs: 3
* Warmup steps: 500
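The hyperparameters above map directly onto a Hugging Face `TrainingArguments` config. This is a minimal sketch, not the exact configuration used; the output directory is a hypothetical placeholder.

```python
from transformers import TrainingArguments

# Sketch of a TrainingArguments config matching the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="gpt2-chat-ft",       # hypothetical path, not from the original run
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    num_train_epochs=3,
    warmup_steps=500,
    fp16=True,                       # mixed-precision training
    optim="adamw_torch",             # AdamW optimizer
)
```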
---
## Evaluation
### Metrics
* **Perplexity (PPL)** for fluency.
* Manual qualitative evaluation for coherence and context retention.
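Perplexity is the exponential of the mean per-token cross-entropy (negative log-likelihood). A small worked sketch of the arithmetic, using made-up loss values purely for illustration:

```python
import math

# Per-token negative log-likelihoods in nats; these values are invented
# solely to demonstrate the perplexity calculation.
token_nlls = [2.0, 2.5, 3.0]
mean_nll = sum(token_nlls) / len(token_nlls)
perplexity = math.exp(mean_nll)
print(f"mean NLL = {mean_nll:.2f}, PPL = {perplexity:.2f}")  # mean NLL = 2.50, PPL = 12.18
```

In practice, the same quantity is computed from the model's cross-entropy loss over a held-out set of conversational prompts.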
### Results
* Lower perplexity on conversational prompts compared to base GPT-2.
* Produces more context-aware and fluent chat responses.
---
## Environmental Impact
* **Hardware Type:** NVIDIA A100 (40GB)
* **Training time:** ~2 hours
* **Cloud Provider:** Vast.ai (example)
* **Carbon Emitted:** Estimated <10 kg CO2eq
---
## Technical Specifications
### Model Architecture
* Transformer decoder-only (117M parameters).
* Context length: 1024 tokens.
### Compute Infrastructure
* **Hardware:** 1x NVIDIA A100
* **Software:** PyTorch, Hugging Face Transformers, Accelerate.
---
## Citation
If you use this model, please cite GPT-2 and this fine-tuned version:
**BibTeX:**
```bibtex
@misc{faizack2025gpt2chat,
author = {Faijan Khan},
title = {GPT2-Chat Fine-tuned Model},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/faizack/gpt2-chat-ft}}
}
```