alexchen4ai committed · Commit 5809c28 · verified · 1 Parent(s): 9010d9c

Add comprehensive README with usage instructions

---
license: apache-2.0
base_model: Qwen/Qwen3-VL-8B-Instruct
tags:
- qwen3
- text-generation
- llm
- extracted
language:
- en
- zh
pipeline_tag: text-generation
---

# Qwen3-8B-Instruct

This model is the **language model component** extracted from [Qwen/Qwen3-VL-8B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-8B-Instruct), a vision-language model.

The vision components have been removed, leaving only the pure text-generation LLM, which can be used independently for text-only tasks.

## Model Details

- **Base Model**: Qwen3-VL-8B-Instruct (language component only)
- **Model Type**: Qwen3ForCausalLM
- **Parameters**: ~8.2B (8,190,735,360)
- **Model Size**: ~16 GB
- **Precision**: bfloat16
- **License**: Apache 2.0

## Architecture

- **Hidden Size**: 4096
- **Intermediate Size**: 12288
- **Number of Layers**: 36
- **Attention Heads**: 32 (8 KV heads, GQA)
- **Head Dimension**: 128
- **Vocabulary Size**: 151,936
- **Max Position Embeddings**: 262,144
- **RoPE Theta**: 5,000,000

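These figures are mutually consistent with the reported parameter count. A minimal cross-check, assuming untied input embeddings and `lm_head`, Qwen3-style per-head RMSNorms on q/k, and no attention biases:

```python
# Reproduce the ~8.2B parameter count from the architecture numbers above.
# Assumes untied embeddings/lm_head and Qwen3-style q/k norms (head_dim each).
hidden, inter, layers = 4096, 12288, 36
heads, kv_heads, head_dim = 32, 8, 128
vocab = 151_936

attn = hidden * heads * head_dim          # q_proj
attn += 2 * hidden * kv_heads * head_dim  # k_proj + v_proj (GQA: 8 KV heads)
attn += heads * head_dim * hidden         # o_proj
attn += 2 * head_dim                      # q_norm + k_norm (RMSNorm over head_dim)

mlp = 3 * hidden * inter                  # gate_proj + up_proj + down_proj
norms = 2 * hidden                        # input + post-attention RMSNorm

per_layer = attn + mlp + norms
# transformer layers + embed_tokens + lm_head + final norm
total = layers * per_layer + 2 * vocab * hidden + hidden

print(f"{total:,}")                       # 8,190,735,360 — matches the card
print(f"{total * 2 / 1e9:.1f} GB in bfloat16")  # ~16.4 GB, matching ~16 GB
```

The total also explains the on-disk size: 8.19B parameters at 2 bytes each (bfloat16) is roughly 16.4 GB.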
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "alexchen4ai/Qwen3-8B-Instruct"

# Load the tokenizer and the bf16 weights, sharded across available devices
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
]

# Render the chat template into a plain prompt string
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer([text], return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True
)

# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.decode(
    outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True
)
print(response)
```

## Extraction Process

This model was extracted from Qwen3-VL-8B-Instruct by:
1. Loading all safetensors shards from the original model
2. Filtering and extracting only the `model.language_model.*` weights
3. Renaming keys to standard Qwen3 format (`model.*`)
4. Preserving the `lm_head` for token prediction
5. Creating a compatible Qwen3ForCausalLM config
6. Copying tokenizer files and generation config

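The filter-and-rename steps (2–4) can be sketched as follows. This is an illustrative reconstruction, not the actual extraction script; in practice the filter would run over each safetensors shard via `safetensors.torch.load_file`/`save_file`, and the function name here is an assumption:

```python
# Illustrative sketch of steps 2-4: keep only language-model weights,
# renamed to the plain Qwen3 layout, plus the lm_head.
def extract_text_keys(state_dict):
    out = {}
    prefix = "model.language_model."
    for key, tensor in state_dict.items():
        if key.startswith(prefix):
            # model.language_model.layers.0.* -> model.layers.0.*
            out["model." + key[len(prefix):]] = tensor
        elif key.startswith("lm_head."):
            out[key] = tensor  # preserved for token prediction
        # everything else (model.visual.*, projection layers) is dropped
    return out

# Toy state dict standing in for real tensors
toy = {
    "model.language_model.embed_tokens.weight": "E",
    "model.language_model.layers.0.self_attn.q_proj.weight": "Q",
    "model.visual.patch_embed.proj.weight": "V",
    "lm_head.weight": "H",
}
print(sorted(extract_text_keys(toy)))
```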
## Differences from Original

- **Removed**: All vision encoder components (`model.visual.*`)
- **Removed**: Vision-language projection layers
- **Kept**: Pure language model transformer layers
- **Kept**: Token embeddings and LM head
- **Kept**: All tokenizer files

## Use Cases

This extracted model is suitable for:
- Pure text-generation tasks
- Instruction following
- Chat applications
- Fine-tuning on text-only datasets
- Integration with frameworks that expect a standard causal LM
- Deployments where the lower memory footprint (no vision tower) matters

## Limitations

- This model does **not** support vision inputs (images/videos)
- For vision-language tasks, use the original [Qwen3-VL-8B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-8B-Instruct)

## Citation

If you use this model, please cite the original Qwen3-VL work:

```bibtex
@article{qwen3vl,
  title={Qwen3-VL: Towards Versatile Vision-Language Understanding},
  author={Qwen Team},
  year={2024}
}
```

## Acknowledgments

- Original model by the Qwen Team / Alibaba Cloud
- Extraction performed for easier deployment in text-only scenarios