freedomking committed
Commit a92daa2 · verified · 1 Parent(s): 2bb7a3a

Update README.md

Files changed (1)
  1. README.md +23 -6
README.md CHANGED
@@ -16,7 +16,15 @@ tags:
 
 [**中文**](./README_zh.md) | **English**
 
-This is a Hugging Face compatible version of the iFlytek Spark model, converted from Megatron-DeepSpeed weights by the community. It has been optimized for the `transformers` ecosystem.
+> ⚠️ **Note**: This is a relatively early version of the iFlytek Spark model (released in 2024). We converted it to Hugging Face format primarily for **research purposes** — to help the community study early LLM architectures, compare with modern models, and understand how the field has evolved.
+
+This is a community-converted Hugging Face compatible version of the iFlytek Spark 13B model. The original weights were converted from the official Megatron-DeepSpeed format to work seamlessly with the `transformers` ecosystem.
+
+## Source
+
+- **Original Weights**: [iFlytek Spark-13B on Gitee](https://gitee.com/iflytekopensource/iFlytekSpark-13B)
+- **Training Framework**: Megatron-DeepSpeed
+- **Release Date**: 2024
 
 ## Requirements
 
@@ -48,7 +56,7 @@ prompt = "<User> 你好,请自我介绍一下。<end><Bot>"
 inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
 
 outputs = model.generate(**inputs, max_new_tokens=512)
-print(tokenizer.decode(outputs[0], skip_special_tokens=False))
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
 
 ### Using `apply_chat_template` (Recommended)
@@ -75,7 +83,7 @@ outputs = model.generate(
     do_sample=True,
     repetition_penalty=1.02,
 )
-print(tokenizer.decode(outputs[0], skip_special_tokens=False))
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
 
 ### Multi-turn Conversation
@@ -95,7 +103,7 @@ inputs = tokenizer.apply_chat_template(
 ).to(model.device)
 
 outputs = model.generate(inputs, max_new_tokens=512)
-print(tokenizer.decode(outputs[0], skip_special_tokens=False))
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
 
 ## Model Details
@@ -110,7 +118,7 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=False))
 | Vocab Size | 60,000 |
 | Context Length | 32K |
 | RoPE Base (Theta) | 1,000,000 |
-| Activation | GeLU |
+| Activation | Fast GeLU |
 
 ## Generation Parameters
 
@@ -122,6 +130,15 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=False))
 | `do_sample` | True |
 | `repetition_penalty` | 1.02 |
 
+## Why This Conversion?
+
+This project serves several purposes for the research community:
+
+1. **Historical Reference**: Study the architecture of early Chinese LLMs
+2. **Benchmark Comparison**: Compare performance against modern models (Qwen, DeepSeek, etc.)
+3. **Educational Value**: Understand the evolution of LLM design choices
+4. **Ecosystem Compatibility**: Run the model using standard Hugging Face APIs
+
 ## Features
 
 - **Chat Template**: Supports `apply_chat_template` for multi-turn dialogues (`<User>...<end><Bot>...` format).
@@ -131,4 +148,4 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=False))
 
 ## License
 
-Please refer to iFlytek's official license agreement for usage terms.
+This project is licensed under the [Apache 2.0 License](https://gitee.com/iflytekopensource/iFlytekSpark-13B/blob/master/LICENSE).
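
A note on the recurring change in this commit: every `decode` call flips `skip_special_tokens` from `False` to `True`, so control markers such as `<end>` no longer leak into user-visible output. Conceptually, the flag does something like the minimal sketch below (an illustration only, not the `transformers` implementation; the marker set is assumed from the README's `<User>...<end><Bot>` format):

```python
# Assumed special-token set, taken from the chat format shown in the README.
SPECIAL_TOKENS = {"<User>", "<Bot>", "<end>"}

def strip_special_tokens(decoded: str) -> str:
    # Roughly what skip_special_tokens=True achieves: drop control markers
    # from the decoded string before it is shown to the user.
    for token in SPECIAL_TOKENS:
        decoded = decoded.replace(token, "")
    return decoded.strip()

raw = "<User> 你好<end><Bot> 你好!很高兴见到你。<end>"
print(strip_special_tokens(raw))  # markers are gone, only the text remains
```

With `skip_special_tokens=False`, the raw marker-laden string above is what the user would have seen, which is why the commit changes the default shown in the README.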
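The Features section references the `<User>...<end><Bot>...` turn format, which is also what the quick-start prompt string hand-builds. For multi-turn use without `apply_chat_template`, a chat history can be flattened the same way; the sketch below is a hypothetical helper that assumes the delimiter scheme shown in the README (exact whitespace handling is a guess):

```python
def build_spark_prompt(messages):
    """Flatten a chat history into the <User>...<end><Bot>...<end> format
    shown in the README (hypothetical helper; whitespace details assumed)."""
    parts = []
    for message in messages:
        tag = "<User>" if message["role"] == "user" else "<Bot>"
        parts.append(f"{tag} {message['content']}<end>")
    # A trailing <Bot> marker prompts the model to generate the next reply.
    parts.append("<Bot>")
    return "".join(parts)

history = [
    {"role": "user", "content": "你好"},
    {"role": "assistant", "content": "你好!有什么可以帮你?"},
    {"role": "user", "content": "请自我介绍一下。"},
]
print(build_spark_prompt(history))
```

In practice, `tokenizer.apply_chat_template` (the path the README recommends) should be preferred, since it applies whatever delimiter rules ship with the tokenizer config rather than a hand-rolled approximation.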
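The Model Details table lists a RoPE base (theta) of 1,000,000, a large base that keeps rotary position frequencies slowly varying across the 32K context. As a quick illustration of where theta enters the standard RoPE frequency formula (the common formulation, not necessarily this model's exact code):

```python
def rope_inverse_frequencies(head_dim: int, theta: float = 1_000_000.0):
    # Standard RoPE: inverse frequency theta^(-2i/d) for each dimension pair i.
    return [theta ** (-2 * i / head_dim) for i in range(head_dim // 2)]

freqs = rope_inverse_frequencies(8)
# A larger theta shrinks the higher inverse frequencies, so per-position
# rotations advance more slowly and distant positions (up to 32K here)
# remain distinguishable.
print(freqs)
```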