---
license: apache-2.0
language:
- zh
library_name: transformers
tags:
- snn
- spiking-neural-network
- text-generation
- neuromorphic
pipeline_tag: text-generation
---
# NeuronSpark-0.9B
## Introduction
**NeuronSpark-0.9B** is a **0.87-billion-parameter language model built entirely on Spiking Neural Networks (SNNs)**. Unlike conventional Transformer-based LLMs, which rely on attention mechanisms, NeuronSpark replaces the entire computation backbone with biologically inspired spiking neurons and performs language modeling through membrane potential dynamics, surrogate gradient training, and adaptive computation (PonderNet).
This is the **pretrained base model** (85,000 steps on a small subset of the Seq-Monkey corpus).
> **Note on training data**: Due to limited compute resources (single DGX Spark), this model was trained for only **~85K steps on a small fraction (~1.4B tokens) of the full Seq-Monkey 10B-token corpus**. Even with this limited training data, the model shows basic language generation ability, which supports the architectural viability of pure SNN language models. We plan to continue scaling with more data and compute in future work.
For the instruction-tuned chat version, see [NeuronSpark-0.9B-Chat](https://huggingface.co/Brain2nd/NeuronSpark-0.9B-Chat).
## Model Details
| Attribute | Value |
|-----------|-------|
| Parameters | 874M |
| Architecture | SNN Hidden State Space Model |
| Hidden Dimension (D) | 896 |
| Layers | 20 |
| SNN Timesteps (K) | 16 (PonderNet adaptive) |
| State Expansion (N) | 8 |
| FFN Dimension | 2688 |
| Vocabulary | 6144 (custom BPE) |
| Context Length | 512 tokens |
| Training Data | Seq-Monkey (small subset, Chinese) |
| Training Tokens | ~1.4B (of ~10B available) |
| Precision | bfloat16 |
| License | Apache 2.0 |
## Architecture Highlights
- **Pure SNN**: No attention, no standard MLP — all computation via PLIF (Parametric Leaky Integrate-and-Fire) neurons
- **Membrane Potential Leakage Activation**: PLIFNode outputs `(1-β)·V_post` (leak current), naturally emphasizing fast-responding neurons over slow-memory neurons
- **Selective State Space**: Hidden neurons with input-dependent dynamic β(t), α(t), V_th(t) — analogous to selective state space models (Mamba)
- **PonderNet Adaptive K**: Each token dynamically decides how many SNN timesteps to use (from 1 to K), with geometric distribution weighting
- **Triton Fused Kernels**: Custom PLIF forward/backward kernels, single-pass sequential scan replacing 3-phase approach
- **Pre-LN Residual Stream**: Continuous residual flow with RMSNorm, matching Qwen3/LLaMA architecture pattern
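To make the PLIF and PonderNet bullets above more concrete, here is a minimal pure-Python sketch of a single neuron. It is not the model's implementation: the integration rule, the soft reset, the fixed `beta` and `v_th`, and the function names `plif_step` and `ponder_weights` are all assumptions for illustration. The card itself only states the `(1-β)·V_post` output and the geometric weighting over timesteps, and in the actual model β and V_th are input-dependent.

```python
def plif_step(v_prev, x, beta, v_th=1.0):
    """One PLIF update (assumed form): leaky integration, threshold, soft reset."""
    v = beta * v_prev + (1.0 - beta) * x   # leaky integration of the input current
    spike = 1.0 if v >= v_th else 0.0      # fire when the potential reaches V_th
    v_post = v - spike * v_th              # soft reset (assumption): subtract V_th on a spike
    out = (1.0 - beta) * v_post            # leak-current output; smaller beta gives a larger scale
    return v_post, spike, out


def ponder_weights(halt_probs):
    """Geometric halting distribution over K steps, PonderNet-style.

    p_k = lambda_k * prod_{j<k} (1 - lambda_j); the last step absorbs the
    remaining probability mass so the weights sum to 1.
    """
    weights, remaining = [], 1.0
    for lam in halt_probs[:-1]:
        weights.append(remaining * lam)
        remaining *= 1.0 - lam
    weights.append(remaining)
    return weights


# Run K timesteps for one token and combine the per-step outputs.
K, beta = 4, 0.9
v, outs = 0.0, []
for _ in range(K):
    v, spike, out = plif_step(v, x=2.0, beta=beta)
    outs.append(out)

w = ponder_weights([0.5] * K)              # constant halting probability, for illustration
y = sum(wk * ok for wk, ok in zip(w, outs))
print(w, y)
```

With a constant halting probability the weights decay geometrically, so early timesteps dominate the combined output; a learned, token-dependent halting probability would let some tokens spend more timesteps than others.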
## Quickstart
```python
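from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code=True lets transformers load the custom model code from the Hub repository
model = AutoModelForCausalLM.from_pretrained(
    "Brain2nd/NeuronSpark-0.9B",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("Brain2nd/NeuronSpark-0.9B")

# Text completion; the prompt means "The development of artificial intelligence"
text = f"{tokenizer.bos_token}人工智能的发展"
input_ids = tokenizer(text, return_tensors="pt")["input_ids"]

# Note: the stock transformers generate() applies temperature/top_k only when
# do_sample=True; this model's remote code may handle sampling differently.
output_ids = model.generate(
    input_ids,
    max_new_tokens=128,
    temperature=0.8,
    top_k=50,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))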
```
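**Example Output** (roughly: "The development of artificial intelligence has provided new opportunities for the future development of humanity. In the future, artificial intelligence will be an important direction for the future development of artificial intelligence."):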
```
人工智能的发展,为人类的未来发展提供了新的机遇。在未来,人工智能将是未来人工智能发展的重要方向。
```
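The Model Details table lists a context length of 512 tokens, so a long prompt plus `max_new_tokens` can exceed it. The sketch below shows one way to trim a prompt before generation. The helper name `clip_prompt` is hypothetical, and whether the model enforces or requires this limit at inference time is an assumption; the card does not say.

```python
def clip_prompt(input_ids, max_new_tokens, context_length=512):
    """Keep the most recent prompt tokens so prompt + generation fits the context."""
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than context_length")
    return input_ids[-budget:]


# Works on a plain list of token ids.
prompt_ids = list(range(600))                   # stand-in for a 600-token prompt
clipped = clip_prompt(prompt_ids, max_new_tokens=128)
print(len(clipped))                             # 384 tokens remain
```

For the batched tensor returned in the Quickstart, the equivalent slice would be `input_ids[:, -budget:]`. Note that trimming from the left also drops the leading BOS token, which may need to be re-added.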
## Requirements
```bash
pip install torch transformers spikingjelly safetensors
# For Triton kernels (GPU): pip install triton
```
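As a quick sanity check after installation, the following sketch reports which of the packages listed above can be imported, and whether the optional `triton` package is present. The helper name `has_module` is hypothetical, and the card does not state how the model behaves when Triton is missing, so this only reports availability.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module `name` can be found in this environment."""
    return importlib.util.find_spec(name) is not None


required = ["torch", "transformers", "spikingjelly", "safetensors"]
missing = [m for m in required if not has_module(m)]

print("missing required packages:", missing or "none")
print("triton available:", has_module("triton"))
```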
## Training
Trained on a single NVIDIA DGX Spark (GB10, 128GB unified memory) with 4-GPU DDP.
Due to compute constraints, training used only a small subset of the full corpus (~85K steps, ~1.4B tokens of ~10B available). Even with this limited data budget, the model acquires basic language generation ability, demonstrating the architectural viability of pure SNN language modeling.
```bash
torchrun --nproc_per_node=4 train_ddp.py \
    --D 896 --D_ff 2688 --K 16 --num_layers 20 \
    --batch_size 8 --accumulation_steps 8 \
    --learning_rate 2e-4 --warmup_iters 1000
```
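The step count and token count quoted above can be related with a short calculation. This is one reading under which the figures line up, not something the card states: it assumes each of the ~85K steps is one forward/backward pass per process (not one optimizer update), that `--batch_size` is per process, and that every sequence fills the 512-token context.

```python
per_gpu_batch = 8       # --batch_size (assumed to be per process)
num_procs = 4           # --nproc_per_node
context_length = 512    # from the Model Details table
accumulation = 8        # --accumulation_steps
steps = 85_000          # assumed to count forward/backward passes

tokens_per_step = per_gpu_batch * num_procs * context_length
total_tokens = tokens_per_step * steps
optimizer_updates = steps // accumulation

print(tokens_per_step)      # 16384
print(total_tokens)         # 1392640000, i.e. about 1.4B
print(optimizer_updates)    # 10625
```

Under these assumptions the total matches the ~1.4B training tokens listed in the table. If steps instead counted optimizer updates, the same flags would imply roughly eight times as many tokens.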
## Citation
```bibtex
@misc{neuronspark2025,
title={NeuronSpark: A Spiking Neural Network Language Model with Selective State Space Dynamics},
author={Zhengzheng Tang},
year={2025},
url={https://github.com/Brain2nd/NeuronSpark}
}
```
## Contact
- **Author**: Zhengzheng Tang
- **Email**: zztangbu@bu.edu
- **GitHub**: [Brain2nd/NeuronSpark](https://github.com/Brain2nd/NeuronSpark)