---
license: apache-2.0
language:
- zh
- en
pipeline_tag: text-generation
tags:
- spark
- iflytek
- chat
- pytorch
- causal-lm
---

# OpenSpark-13B-Chat

[**中文**](./README_zh.md) | **English**

> ⚠️ **Note**: This is a relatively early version of the iFlytek Spark model (released in 2024). We converted it to Hugging Face format primarily for **research purposes** — to help the community study early LLM architectures, compare with modern models, and understand how the field has evolved.

This is a community-converted, Hugging Face-compatible version of the iFlytek Spark-13B model. The original weights were converted from the official Megatron-DeepSpeed checkpoint format to work seamlessly with the `transformers` ecosystem.

## Source

- **Original Weights**: [iFlytek Spark-13B on Gitee](https://gitee.com/iflytekopensource/iFlytekSpark-13B)
- **Training Framework**: Megatron-DeepSpeed
- **Release Date**: 2024

## Requirements

```bash
pip install torch transformers sentencepiece
```

## Usage

You can load this model with the `transformers` library. Pass `trust_remote_code=True` to both `from_pretrained` calls so the custom model and tokenizer code bundled with the repository is loaded.

### Basic Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "freedomking/OpenSpark-13B-Chat"

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path, 
    torch_dtype=torch.bfloat16, 
    device_map="auto", 
    trust_remote_code=True
)

prompt = "<User> 你好,请自我介绍一下。<end><Bot>"  # "Hello, please introduce yourself."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Using `apply_chat_template` (Recommended)

For multi-turn conversations, use the built-in chat template:

```python
messages = [
    {"role": "user", "content": "你好,请自我介绍一下。"}  # "Hello, please introduce yourself."
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=8192,
    temperature=0.7,
    top_k=1,
    do_sample=True,
    repetition_penalty=1.02,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Multi-turn Conversation

```python
messages = [
    {"role": "user", "content": "什么是人工智能?"},  # "What is artificial intelligence?"
    {"role": "assistant", "content": "人工智能是一种模拟人类智能的技术..."},  # "AI is a technology that simulates human intelligence..."
    {"role": "user", "content": "它有哪些应用场景?"}  # "What are its application scenarios?"
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Model Details

| Parameter | Value |
|---|---|
| Architecture | Transformer Decoder (Spark) |
| Parameters | ~13B |
| Hidden Size | 5120 |
| Layers | 40 |
| Attention Heads | 40 |
| Vocab Size | 60,000 |
| Context Length | 32K |
| RoPE Base (Theta) | 1,000,000 |
| Activation | Fast GeLU |

## Generation Parameters

| Parameter | Recommended Value |
|---|---|
| `max_new_tokens` | 8192 |
| `temperature` | 0.7 |
| `top_k` | 1 |
| `do_sample` | True |
| `repetition_penalty` | 1.02 |

## Why This Conversion?

This project serves several purposes for the research community:

1. **Historical Reference**: Study the architecture of early Chinese LLMs
2. **Benchmark Comparison**: Compare performance against modern models (Qwen, DeepSeek, etc.)
3. **Educational Value**: Understand the evolution of LLM design choices
4. **Ecosystem Compatibility**: Run the model using standard Hugging Face APIs

## Features

- **Chat Template**: Supports `apply_chat_template` for multi-turn dialogues (`<User>...<end><Bot>...` format).
- **Standardized Naming**: Consistent with mainstream models like Qwen and Llama.
- **Custom Tokenizer**: Handles Chinese punctuation, tab formatting, and special tokens (`<ret>`, `<end>`).
- **BFloat16 Support**: Optimized for modern GPUs with BF16 precision.
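
For reference, the prompt format can be approximated in plain Python. The renderer below is an illustration inferred from the Basic Usage prompt (`<User> ...<end><Bot>`), not the repository's actual chat template; prefer `tokenizer.apply_chat_template` in real use:

```python
# Illustrative renderer for the <User>/<Bot> prompt format.
# Inferred from the Basic Usage example; the repository's real chat template
# (via tokenizer.apply_chat_template) is authoritative.
def render_prompt(messages, add_generation_prompt=True):
    tag = {"user": "<User>", "assistant": "<Bot>"}
    text = "".join(f"{tag[m['role']]} {m['content']}<end>" for m in messages)
    if add_generation_prompt:
        text += "<Bot>"  # cue the model to start its reply
    return text

print(render_prompt([{"role": "user", "content": "你好,请自我介绍一下。"}]))
# -> <User> 你好,请自我介绍一下。<end><Bot>
```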

## License

This project is licensed under the [Apache 2.0 License](https://gitee.com/iflytekopensource/iFlytekSpark-13B/blob/master/LICENSE).