Update README.md
README.md

@@ -16,7 +16,15 @@ tags:
 
 [**中文**](./README_zh.md) | **English**
 
-This is a
+> ⚠️ **Note**: This is a relatively early version of the iFlytek Spark model (released in 2024). We converted it to Hugging Face format primarily for **research purposes** — to help the community study early LLM architectures, compare with modern models, and understand how the field has evolved.
+
+This is a community-converted Hugging Face compatible version of the iFlytek Spark 13B model. The original weights were converted from the official Megatron-DeepSpeed format to work seamlessly with the `transformers` ecosystem.
+
+## Source
+
+- **Original Weights**: [iFlytek Spark-13B on Gitee](https://gitee.com/iflytekopensource/iFlytekSpark-13B)
+- **Training Framework**: Megatron-DeepSpeed
+- **Release Date**: 2024
 
 ## Requirements
 
@@ -48,7 +56,7 @@ prompt = "<User> 你好,请自我介绍一下。<end><Bot>"
 inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
 
 outputs = model.generate(**inputs, max_new_tokens=512)
-print(tokenizer.decode(outputs[0], skip_special_tokens=False))
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
 
 ### Using `apply_chat_template` (Recommended)
@@ -75,7 +83,7 @@ outputs = model.generate(
     do_sample=True,
     repetition_penalty=1.02,
 )
-print(tokenizer.decode(outputs[0], skip_special_tokens=False))
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
 
 ### Multi-turn Conversation
@@ -95,7 +103,7 @@ inputs = tokenizer.apply_chat_template(
 ).to(model.device)
 
 outputs = model.generate(inputs, max_new_tokens=512)
-print(tokenizer.decode(outputs[0], skip_special_tokens=False))
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
 
 ## Model Details
@@ -110,7 +118,7 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=False))
 | Vocab Size        | 60,000    |
 | Context Length    | 32K       |
 | RoPE Base (Theta) | 1,000,000 |
-| Activation        | GeLU      |
+| Activation        | Fast GeLU |
 
 ## Generation Parameters
 
@@ -122,6 +130,15 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=False))
 | `do_sample`          | True |
 | `repetition_penalty` | 1.02 |
 
+## Why This Conversion?
+
+This project serves several purposes for the research community:
+
+1. **Historical Reference**: Study the architecture of early Chinese LLMs
+2. **Benchmark Comparison**: Compare performance against modern models (Qwen, DeepSeek, etc.)
+3. **Educational Value**: Understand the evolution of LLM design choices
+4. **Ecosystem Compatibility**: Run the model using standard Hugging Face APIs
+
 ## Features
 
 - **Chat Template**: Supports `apply_chat_template` for multi-turn dialogues (`<User>...<end><Bot>...` format).
@@ -131,4 +148,4 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=False))
 
 ## License
 
-
+This project is licensed under the [Apache 2.0 License](https://gitee.com/iflytekopensource/iFlytekSpark-13B/blob/master/LICENSE).
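The `<User>...<end><Bot>...` format that the updated README documents can be sketched in plain Python. This is a minimal illustration, not the model's actual chat template: the role markers and spacing are inferred from the `prompt = "<User> 你好,请自我介绍一下。<end><Bot>"` example in the diff, and the assistant-turn markup is an assumption.

```python
def build_spark_prompt(messages):
    """Assemble a Spark-style prompt from chat messages.

    Sketch of the `<User>...<end><Bot>...` format described in the README;
    the exact spacing and assistant-turn markup are assumptions.
    """
    parts = []
    for msg in messages:
        if msg["role"] == "user":
            parts.append(f"<User> {msg['content']}<end>")
        elif msg["role"] == "assistant":
            parts.append(f"<Bot> {msg['content']}<end>")
    parts.append("<Bot>")  # trailing tag cues the model to generate a reply
    return "".join(parts)

print(build_spark_prompt([{"role": "user", "content": "你好,请自我介绍一下。"}]))
# -> <User> 你好,请自我介绍一下。<end><Bot>
```

In practice the tokenizer's own `apply_chat_template` (which the README recommends) is authoritative; a helper like this is only useful for inspecting or debugging what the rendered prompt should look like.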
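The table correction from GeLU to Fast GeLU is a one-word change, but the distinction is concrete: "Fast GeLU" conventionally refers to the tanh approximation of the exact erf-based GeLU (this is what `transformers` calls `FastGELUActivation`). A sketch of both, assuming that convention is what the README means:

```python
import math

def gelu_exact(x: float) -> float:
    # exact GeLU: x * Phi(x), where Phi is the standard normal CDF
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_fast(x: float) -> float:
    # tanh approximation ("fast" GeLU), cheaper than erf on most hardware
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

# the two agree closely over typical activation ranges
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    assert abs(gelu_exact(x) - gelu_fast(x)) < 1e-3
```

The difference rarely matters for inference quality, but using the wrong variant when converting weights between frameworks can introduce small, hard-to-diagnose output drift, which is presumably why the table was corrected.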
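The recommended `repetition_penalty=1.02` is a mild setting. The standard mechanism (as implemented in `transformers`) divides positive logits and multiplies negative logits of already-generated tokens by the penalty, making repeats slightly less likely either way. A minimal sketch of that rule, not the library's actual code:

```python
def penalize_repetition(logits, generated_ids, penalty=1.02):
    """Dampen logits of tokens that already appeared (CTRL-style penalty)."""
    out = list(logits)
    for tok in set(generated_ids):
        # positive logits shrink, negative logits grow more negative
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out

# an exaggerated penalty of 2.0 halves a positive logit and doubles a negative one
print(penalize_repetition([2.0, -1.0, 0.5], generated_ids=[0, 1], penalty=2.0))
# -> [1.0, -2.0, 0.5]
```

At 1.02 the effect is a roughly 2% nudge per repeated token, enough to discourage degenerate loops without noticeably distorting the distribution.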