---
language:
- zh
- en
license: bigscience-openrail-m
tags:
- gpt2
- lora
- aviation
- slm
- mobile-ai
- peft
model_name: Language Dragon LoRA v1.1
base_model: openai-community/gpt2
pipeline_tag: text-generation
library_name: peft
---

# 🐉 Language Dragon LoRA (v1.1)

"Powerful enough to lead. Small enough to hide."

Language Dragon is a LoRA adapter that turns GPT-2 into a high-precision Small Language Model (SLM) specialized for the aerospace industry and bilingual (Chinese/English) tasks. It is optimized for "Edge AI" on CPU-only devices like the **Surface Pro (i5-10210U)**.
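
For a rough sense of why a LoRA adapter stays small enough for edge devices, here is a back-of-the-envelope sketch. It assumes a rank of 8 applied to GPT-2's fused attention projection (`c_attn`) in all 12 layers. Neither the rank nor the target modules are stated in this card, so treat both as hypothetical.

```python
# Hypothetical adapter settings: this card does not state the LoRA rank or
# target modules, so RANK = 8 on GPT-2's c_attn projection is an assumption.
HIDDEN = 768           # GPT-2 (small) hidden size
C_ATTN_OUT = 3 * 768   # fused query/key/value projection width
N_LAYERS = 12          # GPT-2 (small) transformer blocks
RANK = 8               # assumed LoRA rank (not from the model card)


def lora_params(d_in, d_out, rank):
    """Weights in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def dense_params(d_in, d_out):
    """Weights in the frozen dense projection that the adapter sits beside."""
    return d_in * d_out


adapter_total = N_LAYERS * lora_params(HIDDEN, C_ATTN_OUT, RANK)
dense_total = N_LAYERS * dense_params(HIDDEN, C_ATTN_OUT)

print(f"adapter weights:       {adapter_total:,}")
print(f"frozen c_attn weights: {dense_total:,}")
print(f"ratio:                 {adapter_total / dense_total:.2%}")
```

Under these assumptions the adapter carries 72 times fewer weights than the projections it modifies, so only a small file needs to ship alongside the unchanged GPT-2 base.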

---

## 🚀 Roadmap to the $5,000 Powerhouse (RTX 5090)
| Goal | Reward Unlocked | Current Status |
| :--- | :--- | :--- |
| **50 Pilots** | Post detailed [J-20 vs. F-22] story sample. | **84% (42/50)** |
| **500 Pilots** | Release the "Language Dragon 7B" (Llama 3 base). | *Planned* |
| **1,000 Pilots** | Pre-orders open for the "Pro" 5090 Weights. | *Future* |

---

## 🧪 Test Flight (Python Sample)
Run this directly on your CPU to see the Dragon in action:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = PeftModel.from_pretrained(model, "MightyDragon-Dev/language-dragon-lora")

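# The Combat Alert Test (Chinese prompt with English terms). English gloss:
# "The J-20 (Mighty Dragon) lit its afterburners over Guangdong airspace.
# Thanks to its DSI inlet design, it kept a very low radar cross-section (RCS)
# during supersonic cruise. Suddenly, the early-warning aircraft raised an alarm"
prompt = "歼-20 (Mighty Dragon) 在广东领空开启了加力燃烧室 (Afterburners)。由于 DSI 进气道的设计,它在超音速巡航时保持了极低的雷达散射截面 (RCS)。突然,预警机发出了警报"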
inputs = tokenizer(prompt, return_tensors="pt")

# 🐉 Stabilized Flight Controls
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,           # CRITICAL: Lower temperature stops the gibberish
    top_k=40,                  # Limits the "random" word pool
    repetition_penalty=1.3,    # High enough to stop loops, low enough to keep flow
    no_repeat_ngram_size=2,    # Standard safety rail
    pad_token_id=tokenizer.eos_token_id
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```