|
|
--- |
|
|
base_model: Qwen/Qwen2.5-3B-Instruct |
|
|
library_name: peft |
|
|
pipeline_tag: text-generation |
|
|
tags: |
|
|
- base_model:adapter:Qwen/Qwen2.5-3B-Instruct |
|
|
- lora |
|
|
- transformers |
|
|
- custom-llm |
|
|
- knowledge-llm |
|
|
- tony-stark |
|
|
- fine-tuning |
|
|
license: mit |
|
|
language: |
|
|
- en |
|
|
--- |
|
|
|
|
|
|
|
|
# Custom Knowledge LLM: Tony Stark Edition
|
|
 |
|
|
|
|
|
This is a fine-tuned version of the [Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) model, adapted to answer domain-specific questions related to **Tony Stark**, using the LoRA (Low-Rank Adaptation) method for parameter-efficient fine-tuning. |
|
|
|
|
|
--- |
|
|
|
|
|
## Model Details
|
|
|
|
|
### Model Description |
|
|
|
|
|
This project is a fun + educational experiment that fine-tunes a base LLM using a fictional dataset based on Tony Stark from the Marvel universe. |
|
|
|
|
|
- **Developed by:** [Aviral Srivastava](https://www.linkedin.com/in/aviral-srivastava26/) |
|
|
- **Model type:** Causal Language Model (Instruction-tuned) |
|
|
- **Language:** English |
|
|
- **License:** MIT |
|
|
- **Finetuned from model:** [`Qwen/Qwen2.5-3B-Instruct`](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) |
|
|
|
|
|
--- |
|
|
|
|
|
## Uses
|
|
|
|
|
### Direct Use |
|
|
|
|
|
This model is fine-tuned to answer Tony Stark–related prompts such as:
|
|
|
|
|
- "Who is Tony Stark?" |
|
|
- "What suits did Iron Man build?" |
|
|
- "What are Tony Stark's leadership traits?"
|
|
|
|
|
### Downstream Use |
|
|
|
|
|
The methodology can be directly reused for: |
|
|
- Corporate knowledge assistants |
|
|
- Domain-specific customer support |
|
|
- Educational tutors trained on custom material |
|
|
- Healthcare, law, and e-commerce Q&A bots |
|
|
|
|
|
### Out-of-Scope Use |
|
|
|
|
|
This model is not designed for: |
|
|
- Real-world advice in medical, legal, or financial domains |
|
|
- Factual accuracy outside of Tony Stark lore |
|
|
- Handling unrelated general-purpose queries |
|
|
|
|
|
--- |
|
|
|
|
|
## Bias, Risks, and Limitations
|
|
|
|
|
- This model is trained on fictional data and is not meant for serious tasks. |
|
|
- It reflects only the content provided in the custom dataset. |
|
|
- It may "hallucinate" facts if asked general questions. |
|
|
|
|
|
### Recommendations |
|
|
|
|
|
Please do not use this for any commercial or factual purpose without re-training on a verified dataset. |
|
|
|
|
|
--- |
|
|
|
|
|
## How to Use
|
|
|
|
|
```python
from transformers import pipeline

qa = pipeline(
    "text-generation",
    model="Avirallm/Custom-Knowledge-LLM-Tony-Stark-Edition",
    tokenizer="Avirallm/Custom-Knowledge-LLM-Tony-Stark-Edition",
    device="cuda",  # or "cpu" if no GPU is available
)

print(qa("List all Iron Man suits and their features.")[0]["generated_text"])
```
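Because this repository ships only LoRA adapter weights, the `pipeline` call above relies on a recent `transformers` release resolving the adapter repo automatically. If that fails, the adapter can be attached to the base model explicitly with `peft`. A minimal sketch (the helper name is illustrative; imports are lazy so defining it is cheap):

```python
BASE = "Qwen/Qwen2.5-3B-Instruct"
ADAPTER = "Avirallm/Custom-Knowledge-LLM-Tony-Stark-Edition"

def load_adapter_model():
    """Load the base model and attach the LoRA adapter weights on top."""
    # Lazy imports: requires the `transformers` and `peft` packages.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(ADAPTER)
    base = AutoModelForCausalLM.from_pretrained(BASE, device_map="auto")
    model = PeftModel.from_pretrained(base, ADAPTER)
    return tokenizer, model

# Usage (downloads the ~3B base model weights):
# tokenizer, model = load_adapter_model()
# inputs = tokenizer("Who is Tony Stark?", return_tensors="pt").to(model.device)
# print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0],
#                        skip_special_tokens=True))
```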
|
|
## Training Details
|
|
|
|
|
### Training Data
|
|
A custom JSON dataset of prompt-completion pairs related to Tony Stark. Example entry: |
|
|
|
|
|
```json
{
  "prompt": "Who is Tony Stark?",
  "completion": "Tony Stark is a fictional billionaire inventor from Marvel..."
}
```
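For illustration, a stdlib-only sketch of turning such prompt-completion pairs into single supervised training strings. The `### Question:`/`### Answer:` template is an assumption for demonstration; the card does not state the exact formatting used in training.

```python
import json

def format_example(pair: dict) -> str:
    """Join one prompt-completion pair into a single training string."""
    return f"### Question:\n{pair['prompt']}\n### Answer:\n{pair['completion']}"

# A tiny in-memory stand-in for the dataset file described above.
raw = json.loads('''[
  {"prompt": "Who is Tony Stark?",
   "completion": "Tony Stark is a fictional billionaire inventor from Marvel..."}
]''')

texts = [format_example(pair) for pair in raw]
print(texts[0])
```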
|
|
|
|
|
### Training Hyperparameters
|
|
- **Epochs:** 10 |
|
|
- **Batch Size:** 1 |
|
|
- **Optimizer:** AdamW |
|
|
- **Learning Rate:** 0.001 |
|
|
- **Mixed Precision:** FP16 |
|
|
- **Framework:** Hugging Face `Trainer` + PEFT LoRA |
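The list above can be expressed as a configuration sketch. The LoRA rank, alpha, and target modules below are illustrative assumptions (the card does not state them); only the epochs, batch size, optimizer, learning rate, and FP16 flag come from the list, and `output_dir` is a placeholder.

```python
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=8,                                   # assumed rank
    lora_alpha=16,                         # assumed scaling factor
    target_modules=["q_proj", "v_proj"],   # assumed attention projections
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="./tony-stark-lora",        # placeholder path
    num_train_epochs=10,                   # from the card
    per_device_train_batch_size=1,         # from the card
    optim="adamw_torch",                   # AdamW, from the card
    learning_rate=1e-3,                    # from the card
    fp16=True,                             # from the card
)
```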
|
|
|
|
|
### Training Setup
|
|
- Trained fully on **Google Colab Free Tier** |
|
|
- Using **Qwen/Qwen2.5-3B-Instruct** with LoRA adapters |
|
|
- Fine-tuned only **adapter layers** (not full model) |
|
|
|
|
|
--- |
|
|
|
|
|
## Evaluation
|
|
|
|
|
This project is **primarily exploratory** and not evaluated on public benchmarks. |
|
|
|
|
|
--- |
|
|
|
|
|
## Environmental Impact
|
|
|
|
|
- **Hardware:** Google Colab Free GPU (Tesla T4) |
|
|
- **Training Time:** ~380 seconds (10 epochs, 1580 steps) |
|
|
- **Carbon Emission:** Negligible (low-compute, single GPU) |
|
|
|
|
|
--- |
|
|
|
|
|
## Architecture
|
|
|
|
|
- **Base Model:** Qwen2.5-3B-Instruct (Alibaba Cloud) |
|
|
- **Fine-Tuning:** LoRA adapters on top of base weights |
|
|
- **Task Type:** Text generation, instruction following |
|
|
- **Token Limit:** 128 tokens (during training) |
|
|
|
|
|
--- |
|
|
|
|
|
## Example Applications
|
|
|
|
|
- Fan-based AI chatbot (Iron Man Assistant) |
|
|
- Fictional universe assistants for games and comics |
|
|
- Domain-specific tutors for educational platforms |
|
|
- Startup knowledge bots (replace "Tony Stark" with your brand) |
|
|
|
|
|
--- |
|
|
|
|
|
## Repository Structure
|
|
|
|
|
- `adapter_model.safetensors` – LoRA adapter weights
|
|
- `tokenizer_config.json`, `tokenizer.json`, `vocab.json` – Tokenizer files
|
|
- `README.md` – Project overview
|
|
- `training_args.bin` – Training arguments
|
|
- `tonyst.json` (optional) – Custom dataset (if shared)
|
|
|
|
|
--- |
|
|
|
|
|
## Get in Touch
|
|
|
|
|
Have a use case in mind? Want your own custom-trained LLM? |
|
|
**Email:** [sriaviralnarain@gmail.com](mailto:sriaviralnarain@gmail.com)

**LinkedIn:** [Aviral Srivastava](https://www.linkedin.com/in/aviral-srivastava26/)

**GitHub:** [aviral-sri](https://github.com/aviral-sri)
|
|
|
|
|
--- |
|
|
|
|
|
## Credits
|
|
|
|
|
- **Base Model:** [Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) |
|
|
- **Fine-Tuning:** PEFT + LoRA |
|
|
- **Tools Used:** |
|
|
- Hugging Face Transformers |
|
|
- Hugging Face Datasets |
|
|
- Google Colab |
|
|
- W&B for tracking |
|
|
|
|
|
**Inspired by:** Marvel's Tony Stark (for learning only, non-commercial) |
|
|
|
|
|
--- |
|
|
|
|
|
## License
|
|
|
|
|
This project is licensed under the MIT License. |
|
|
Feel free to modify, share, and build upon it. |
|
|
|
|
|
|