---
license: apache-2.0
language:
  - zh
library_name: transformers
tags:
  - snn
  - spiking-neural-network
  - text-generation
  - neuromorphic
pipeline_tag: text-generation
---

# NeuronSpark-0.9B

## Introduction

**NeuronSpark-0.9B** is a **0.87-billion-parameter language model built entirely on Spiking Neural Networks (SNNs)**. Unlike conventional Transformer-based LLMs that rely on attention mechanisms, NeuronSpark replaces the entire computation backbone with biologically inspired spiking neurons, achieving language modeling through membrane potential dynamics, surrogate gradient training, and adaptive computation (PonderNet).

This is the **pretrained base model** (85,000 steps on a small subset of the Seq-Monkey corpus).

> **Note on training data**: Due to limited compute resources (a single DGX Spark), this model was trained for only **~85K steps on a small fraction of the full Seq-Monkey 10B-token corpus**. Despite the minimal training data, the model demonstrates emergent language capabilities, validating the architectural viability of pure SNN language models. We plan to continue scaling with more data and compute in future work.

For the instruction-tuned chat version, see [NeuronSpark-0.9B-Chat](https://huggingface.co/Brain2nd/NeuronSpark-0.9B-Chat).
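The spiking mechanics mentioned above (leaky membrane integration, thresholded firing, surrogate gradients) can be sketched in a few lines of PyTorch. This is an illustrative toy, not the repository's Triton implementation; the class names, the surrogate shape, and all constants are assumptions for demonstration only.

```python
import torch


class SpikeSurrogate(torch.autograd.Function):
    """Heaviside spike in the forward pass; smooth sigmoid surrogate in the backward pass."""

    @staticmethod
    def forward(ctx, v_minus_th, alpha=4.0):
        ctx.save_for_backward(v_minus_th)
        ctx.alpha = alpha
        return (v_minus_th >= 0).to(v_minus_th)  # hard threshold: spike if V >= V_th

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        sg = torch.sigmoid(ctx.alpha * x)
        # Derivative of a steep sigmoid stands in for the Dirac delta of the Heaviside step.
        return grad_out * ctx.alpha * sg * (1 - sg), None


class PLIF(torch.nn.Module):
    """Parametric LIF neuron: the leak factor beta = sigmoid(w) is learned."""

    def __init__(self, v_th=1.0):
        super().__init__()
        self.w = torch.nn.Parameter(torch.zeros(1))  # beta = sigmoid(0) = 0.5 initially
        self.v_th = v_th

    def forward(self, inputs):  # inputs: (T, B, D), one slice per SNN timestep
        beta = torch.sigmoid(self.w)
        v = torch.zeros_like(inputs[0])
        spikes = []
        for x in inputs:
            v = beta * v + (1 - beta) * x          # leaky membrane integration
            s = SpikeSurrogate.apply(v - self.v_th)
            v = v - s * self.v_th                  # soft reset after a spike
            spikes.append(s)
        return torch.stack(spikes)
```

With a constant supra-threshold drive, such a neuron fires on every timestep, and gradients flow to the input and to the learnable leak through the surrogate.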
## Model Details

| Attribute | Value |
|-----------|-------|
| Parameters | 874M |
| Architecture | SNN Hidden State Space Model |
| Hidden Dimension (D) | 896 |
| Layers | 20 |
| SNN Timesteps (K) | 16 (PonderNet adaptive) |
| State Expansion (N) | 8 |
| FFN Dimension | 2688 |
| Vocabulary | 6144 (custom BPE) |
| Context Length | 512 tokens |
| Training Data | Seq-Monkey (small subset, Chinese) |
| Training Tokens | ~1.4B (of ~10B available) |
| Precision | bfloat16 |
| License | Apache 2.0 |

## Architecture Highlights

- **Pure SNN**: No attention, no standard MLP; all computation is performed by PLIF (Parametric Leaky Integrate-and-Fire) neurons
- **Membrane Potential Leakage Activation**: PLIFNode outputs `(1-β)·V_post` (the leak current), naturally emphasizing fast-responding neurons over slow-memory neurons
- **Selective State Space**: Hidden neurons with input-dependent dynamic β(t), α(t), and V_th(t), analogous to selective state space models (Mamba)
- **PonderNet Adaptive K**: Each token dynamically decides how many SNN timesteps (1 to K) to use, with geometric-distribution weighting
- **Triton Fused Kernels**: Custom PLIF forward/backward kernels; a single-pass sequential scan replaces the 3-phase approach
- **Pre-LN Residual Stream**: Continuous residual flow with RMSNorm, matching the Qwen3/LLaMA architecture pattern

## Quickstart

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Brain2nd/NeuronSpark-0.9B",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("Brain2nd/NeuronSpark-0.9B")

# Text completion
text = f"{tokenizer.bos_token}人工智能的发展"  # prompt: "The development of artificial intelligence"
input_ids = tokenizer(text, return_tensors="pt")["input_ids"]
output_ids = model.generate(
    input_ids,
    max_new_tokens=128,
    temperature=0.8,
    top_k=50,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

**Example Output:**

```
人工智能的发展,为人类的未来发展提供了新的机遇。在未来,人工智能将是未来人工智能发展的重要方向。
```

(Roughly: "The development of artificial intelligence provides new opportunities for humanity's future. In the future, artificial intelligence will be an important direction for the future development of artificial intelligence.")

## Requirements

```bash
pip install torch transformers spikingjelly safetensors

# For Triton kernels (GPU):
pip install triton
```

## Training

Trained on a single NVIDIA DGX Spark (GB10, 128GB unified memory) with 4-GPU DDP. Due to compute constraints, training used only a small subset of the full corpus (~85K steps, ~1.4B tokens of the ~10B available). Even with this limited data budget, the model acquires basic language generation ability, demonstrating the architectural viability of pure SNN language modeling.

```bash
torchrun --nproc_per_node=4 train_ddp.py \
    --D 896 --D_ff 2688 --K 16 --num_layers 20 \
    --batch_size 8 --accumulation_steps 8 \
    --learning_rate 2e-4 --warmup_iters 1000
```

## Citation

```bibtex
@misc{neuronspark2025,
  title={NeuronSpark: A Spiking Neural Network Language Model with Selective State Space Dynamics},
  author={Zhengzheng Tang},
  year={2025},
  url={https://github.com/Brain2nd/NeuronSpark}
}
```

## Contact

- **Author**: Zhengzheng Tang
- **Email**: zztangbu@bu.edu
- **GitHub**: [Brain2nd/NeuronSpark](https://github.com/Brain2nd/NeuronSpark)
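As a closing illustration of the PonderNet-style adaptive-K mechanism described in the architecture highlights: timesteps can be weighted by a geometric halting prior, so earlier steps carry more of the expected loss. This is a minimal sketch under assumed choices (λ = 0.3, renormalization over a fixed K, and a simple expected-loss objective); it is not taken from the repository.

```python
import torch


def geometric_halting_weights(K: int, lam: float = 0.3) -> torch.Tensor:
    """Geometric prior p_k ∝ lam * (1 - lam)^(k-1) over K timesteps, renormalized to sum to 1."""
    k = torch.arange(K, dtype=torch.float32)
    p = lam * (1.0 - lam) ** k
    return p / p.sum()


def pondered_loss(per_step_losses: torch.Tensor, lam: float = 0.3) -> torch.Tensor:
    """Expected loss under the halting distribution: sum_k p_k * L_k,
    where L_k is the prediction loss if computation halts after step k."""
    p = geometric_halting_weights(per_step_losses.shape[0], lam)
    return (p * per_step_losses).sum()
```

Because the weights decay geometrically, a model trained under this objective is encouraged to produce usable predictions early, which is what allows each token to spend fewer than the maximum K = 16 SNN timesteps at inference.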