---
base_model: Qwen/Qwen2.5-3B-Instruct
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:Qwen/Qwen2.5-3B-Instruct
- lora
- transformers
- custom-llm
- knowledge-llm
- tony-stark
- fine-tuning
license: mit
language:
- en
---

# 🧠 Custom Knowledge LLM: Tony Stark Edition

![Banner](./banner.png)

This is a fine-tuned version of the [Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) model, adapted to answer domain-specific questions related to **Tony Stark**, using the LoRA (Low-Rank Adaptation) method for parameter-efficient fine-tuning.

---

## 📌 Model Details

### Model Description

This project is a fun and educational experiment that fine-tunes a base LLM on a fictional dataset about Tony Stark from the Marvel universe.

- **Developed by:** [Aviral Srivastava](https://www.linkedin.com/in/aviral-srivastava26/)
- **Model type:** Causal Language Model (Instruction-tuned)
- **Language:** English
- **License:** MIT
- **Finetuned from model:** [`Qwen/Qwen2.5-3B-Instruct`](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct)

---

## 🧑‍💻 Uses

### Direct Use

This model is fine-tuned to answer Tony Stark–related prompts such as:

- "Who is Tony Stark?"
- "What suits did Iron Man build?"
- "What are leadership traits of Stark?"

### Downstream Use

The methodology can be directly reused for:

- Corporate knowledge assistants
- Domain-specific customer support
- Educational tutors trained on custom material
- Healthcare, law, and e-commerce Q&A bots

### Out-of-Scope Use

This model is not designed for:

- Real-world advice in medical, legal, or financial domains
- Factual accuracy outside of Tony Stark lore
- Handling unrelated general-purpose queries

---

## ⚠️ Bias, Risks, and Limitations

- This model is trained on fictional data and is not meant for serious tasks.
- It reflects only the content provided in the custom dataset.
- It may "hallucinate" facts when asked general questions.
### Recommendations

Please do not use this model for any commercial or factual purpose without re-training it on a verified dataset.

---

## 🚀 How to Use

```python
from transformers import pipeline

qa = pipeline(
    "text-generation",
    model="Avirallm/Custom-Knowledge-LLM-Tony-Stark-Edition",
    tokenizer="Avirallm/Custom-Knowledge-LLM-Tony-Stark-Edition",
    device="cuda",  # or "cpu" if not using a GPU
)

qa("List all Iron Man suits and their features.")
```

## 🏋️‍♂️ Training Details

### 📦 Training Data

A custom JSON dataset of prompt-completion pairs related to Tony Stark. Example entry:

```json
{
  "prompt": "Who is Tony Stark?",
  "completion": "Tony Stark is a fictional billionaire inventor from Marvel..."
}
```

### 🔧 Training Hyperparameters

- **Epochs:** 10
- **Batch Size:** 1
- **Optimizer:** AdamW
- **Learning Rate:** 0.001
- **Mixed Precision:** FP16
- **Framework:** Hugging Face `Trainer` + PEFT LoRA

### 🖥️ Training Setup

- Trained entirely on the **Google Colab Free Tier**
- Using **Qwen/Qwen2.5-3B-Instruct** with LoRA adapters
- Fine-tuned only the **adapter layers** (not the full model)

---

## 📊 Evaluation

This project is **primarily exploratory** and has not been evaluated on public benchmarks.
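Before tokenization, each prompt-completion pair is typically joined into a single training string. The sketch below illustrates that step; the newline separator and `<|endoftext|>` EOS token are illustrative assumptions, not the exact preprocessing used for this model:

```python
import json

def to_training_text(example, eos_token="<|endoftext|>"):
    """Join one prompt-completion pair into a single training string.

    Assumption: a newline separator and a generic EOS token. Match these
    to the tokenizer's real chat template / special tokens before training.
    """
    return f"{example['prompt']}\n{example['completion']}{eos_token}"

# Dataset entries follow the JSON shape shown above.
record = json.loads(
    '{"prompt": "Who is Tony Stark?", '
    '"completion": "Tony Stark is a fictional billionaire inventor from Marvel..."}'
)
print(to_training_text(record))
```

Strings built this way can then be tokenized (e.g. truncated to the 128-token limit noted below) and fed to the `Trainer`.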
---

## 🌱 Environmental Impact

- **Hardware:** Google Colab Free GPU (Tesla T4)
- **Training Time:** ~380 seconds (10 epochs, 1580 steps)
- **Carbon Emission:** Negligible (low-compute, single GPU)

---

## 🧠 Architecture

- **Base Model:** Qwen2.5-3B-Instruct (Alibaba Cloud)
- **Fine-Tuning:** LoRA adapters on top of base weights
- **Task Type:** Text generation, instruction following
- **Token Limit:** 128 tokens (during training)

---

## ✨ Example Applications

- Fan-based AI chatbot (Iron Man Assistant)
- Fictional-universe assistants for games and comics
- Domain-specific tutors for educational platforms
- Startup knowledge bots (replace "Tony Stark" with your brand)

---

## 📁 Repository Structure

- `adapter_model.safetensors` – LoRA adapter weights
- `tokenizer_config.json`, `tokenizer.json`, `vocab.json` – Tokenizer files
- `README.md` – Project overview
- `training_args.bin` – Training arguments
- `tonyst.json` (optional) – Custom dataset (if shared)

---

## 📬 Get in Touch

Have a use case in mind? Want your own custom-trained LLM?

📧 **Email:** [sriaviralnarain@gmail.com](mailto:sriaviralnarain@gmail.com)
🔗 **LinkedIn:** [Aviral Srivastava](https://www.linkedin.com/in/aviral-srivastava26/)
💻 **GitHub:** [aviral-sri](https://github.com/aviral-sri)

---

## 🙏 Credits

- **Base Model:** [Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct)
- **Fine-Tuning:** PEFT + LoRA
- **Tools Used:**
  - Hugging Face Transformers
  - Hugging Face Datasets
  - Google Colab
  - W&B for tracking

**Inspired by:** Marvel's Tony Stark (for learning only, non-commercial)

---

## 🪪 License

This project is licensed under the MIT License. Feel free to modify, share, and build upon it.