MightyDragon-Dev/language-dragon-lora

@@ -1,82 +1,61 @@
----
-library_name: peft
-base_model: gpt2
-pipeline_tag: text-generation
-tags:
-- conditional-text-generation
-- lora
-- slm
----
-language:
-- zh
-- en
-license: bigscience-openrail-m
-tags:
-- gpt2
-- lora
-- aviation
-- slm
-- mobile-ai
-model_name: Language Dragon LoRA v1.1
-base_model: gpt2
----
 # 🐉 Language Dragon LoRA (v1.1)
-> ### **"Powerful enough to lead. Small enough to hide."**
-**Language Dragon** is a high-precision Small Language Model (SLM) specialized for the aerospace industry and bilingual tasks. Built on a Microsoft Surface Pro (i5-10210U), it is optimized for "Edge AI" where memory is at a premium.
----
-# 🐉 Language Dragon LoRA (EN + ZH)
-**MightyDragon-Dev**
-## **🚀 Milestone Unlocked: The First 42 Pilots!**
-In just **3 days**, 42 independent users have downloaded the Language Dragon LoRA. This model is currently the most popular independent **TinyStories-ZH + Aviation** fine-tune in the community. Thank you for believing in the Dragon!
 ---
-## **Roadmap to the $5,000 Powerhouse (RTX 5090)**
-Our goal is **1,000 Annual Supporters ($5/year)**. Every download of this LoRA is a step toward building the **Blackwell Station**, which will train the ultimate "Full Aviation Expert Mix."
 | Goal | Reward Unlocked | Current Status |
 | :--- | :--- | :--- |
-| **50 Pilots** | Post a detailed [J-20 vs. F-22] story sample. | **84%** (42/50) |
-| **500 Pilots** | Release the "Language Dragon 7B" (Llama 3 base). | Planned |
-| **1,000 Pilots** | Pre-orders open for the "Pro" 5090 Weights. | Future |
 ---
-## **Example: Aviation Poetry**
-Generate unique stories on your laptop CPU using our prompt guide. Here is an example:
-> **Prompt:** Once upon a time, a small J-20 jet wanted to find a secret cloud.
-> **Output (EN/ZH Mix):** "...Tying the blue after-stream, continuing in the old night sky..." (系的青撃後流, 旧续夜空込).
-## ⚠️ CRITICAL: Inference Settings
-Because this is a 124M parameter model, it requires specific **penalty guardrails** to prevent the "Repetition Loop" bug. If you get repeating text, update your generation parameters:
 ```python
-# Test for Dragon:
 from transformers import AutoModelForCausalLM, AutoTokenizer
 from peft import PeftModel
-# 1. Load the base engine (GPT-2)
-base_model_id = "gpt2"
-model = AutoModelForCausalLM.from_pretrained(base_model_id)
-tokenizer = AutoTokenizer.from_pretrained(base_model_id)
-# 2. Snap on the Dragon Wings (Your LoRA)
 model = PeftModel.from_pretrained(model, "MightyDragon-Dev/language-dragon-lora")
 # 3. Ready for Takeoff
 prompt = "歼-20 (Mighty Dragon) 在广东领空开启了加力燃烧室 (Afterburners)。由于 DSI 进气道的设计，它在超音速巡航时保持了极低的雷达散射截面 (RCS)。突然，预警机发出了警报"
 inputs = tokenizer(prompt, return_tensors="pt")
-# Use your custom guardrails to prevent loops!
-outputs = model.generate(
-    **inputs,
-    max_new_tokens=200,
-    max_length=200,
-    repetition_penalty=1.5,
-    temperature=0.4
-)
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))

 # 🐉 Language Dragon LoRA (v1.1)
+> **"Powerful enough to lead. Small enough to hide."**
+Language Dragon is a high-precision **Small Language Model (SLM)** specialized for the aerospace industry and bilingual tasks (English & Chinese). While most models are "oceans," the Dragon is the **Changjiang**—deep, specialized, and essential for its niche.
+Built on a **Microsoft Surface Pro (i5-10210U)**, it is optimized for "Edge AI" and local development where memory is at a premium.
 ---
+## 🚀 The Roadmap to Blackwell Station
+We are currently destroying the "concrete wall" of hardware limitations. Every download and supporter brings us closer to the ultimate aviation training rig.
 | Goal | Reward Unlocked | Current Status |
 | :--- | :--- | :--- |
+| **50 Pilots** | Release detailed [J-20 vs. F-22] combat story sample. | **84% (42/50)** |
+| **500 Pilots** | Release "Language Dragon 7B" (Llama 3 base). | *Planned* |
+| **1,000 Pilots** | Fund the **RTX 5090**; pre-orders open for "Pro" weights. | *The Target* |
 ---
+## 🛠️ Technical Specifications
+* **Base Model:** GPT-2 (124M)
+* **Adapter Type:** LoRA (Rank 16)
+* **Dataset:** TinyStories-ZH + Aviation-Expert Mix (Bilingual)
+* **Hardware Target:** CPU inference and GPUs with 4-8 GB of VRAM.
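As a rough sanity check on the Rank 16 figure above, the sketch below counts the adapter's trainable parameters. It assumes the standard GPT-2 (124M) dimensions (12 layers, hidden size 768) and assumes the adapter targets only the fused attention projection `c_attn`, which the card does not state. The helper name `lora_param_count` is illustrative, not part of any library.

```python
# Back-of-the-envelope LoRA size estimate (a sketch, not the author's config).
HIDDEN_SIZE = 768   # GPT-2 (124M) hidden size
NUM_LAYERS = 12     # GPT-2 (124M) transformer blocks
RANK = 16           # LoRA rank from the spec list


def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


# ASSUMPTION: only c_attn is adapted; it maps 768 -> 3 * 768 (fused Q, K, V).
per_layer = lora_param_count(HIDDEN_SIZE, 3 * HIDDEN_SIZE, RANK)
total = per_layer * NUM_LAYERS

print(f"per layer: {per_layer:,}")   # per layer: 49,152
print(f"total:     {total:,}")       # total:     589,824
```

Under those assumptions the adapter adds 589,824 trainable parameters, well under 1% of the base model, which fits the CPU-first hardware target. A different set of target modules would change the count.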
+---
+## ⚠️ Critical Inference Settings
+Because this is a 124M-parameter model, it tends to fall into repetition loops unless generation uses specific **penalty guardrails**. Use the following parameters for the best flight experience:
 ```python
+# Recommended settings for Language Dragon.
+# Assumes `model` and `inputs` are already defined (see the Quick Start).
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=200,
+    do_sample=True,           # Needed for temperature/top_p to take effect
+    repetition_penalty=1.5,   # Penalizes tokens that already appeared
+    no_repeat_ngram_size=3,   # Blocks any repeated 3-token sequence
+    temperature=0.4,          # Lower = more deterministic
+    top_p=0.9,
+)
+```
+## 🧪 Quick Start (Test Flight)
+Copy and paste this into your local environment to run the Dragon on your CPU:
+```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 from peft import PeftModel
+# 1. Load the base engine
+model = AutoModelForCausalLM.from_pretrained("gpt2")
+tokenizer = AutoTokenizer.from_pretrained("gpt2")
+# 2. Snap on the Dragon Wings
 model = PeftModel.from_pretrained(model, "MightyDragon-Dev/language-dragon-lora")
 # 3. Ready for Takeoff
 prompt = "歼-20 (Mighty Dragon) 在广东领空开启了加力燃烧室 (Afterburners)。由于 DSI 进气道的设计，它在超音速巡航时保持了极低的雷达散射截面 (RCS)。突然，预警机发出了警报"
 inputs = tokenizer(prompt, return_tensors="pt")
+outputs = model.generate(**inputs, max_new_tokens=100, repetition_penalty=1.5)
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```