Instructions to use Brain2nd/NeuronSpark-V4-1.16B-Pretrain with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Brain2nd/NeuronSpark-V4-1.16B-Pretrain with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Brain2nd/NeuronSpark-V4-1.16B-Pretrain", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("Brain2nd/NeuronSpark-V4-1.16B-Pretrain", trust_remote_code=True, dtype="auto")


vLLM

How to use Brain2nd/NeuronSpark-V4-1.16B-Pretrain with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Brain2nd/NeuronSpark-V4-1.16B-Pretrain"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Brain2nd/NeuronSpark-V4-1.16B-Pretrain",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
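The same request can be built from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send` are hypothetical and are not part of vLLM or of this repository. It assumes a server is already listening on the address used in the curl call above.

```python
import json
import urllib.request


def build_chat_request(prompt,
                       base_url="http://localhost:8000",
                       model="Brain2nd/NeuronSpark-V4-1.16B-Pretrain"):
    """Build (but do not send) a POST to the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_chat_request("What is the capital of France?")
    print(req.full_url)
    # Uncomment once a server is running locally:
    # print(send(req))
```

For a server on a different port, pass `base_url`, for example `base_url="http://localhost:30000"`.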

Use Docker

docker model run hf.co/Brain2nd/NeuronSpark-V4-1.16B-Pretrain

SGLang

How to use Brain2nd/NeuronSpark-V4-1.16B-Pretrain with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Brain2nd/NeuronSpark-V4-1.16B-Pretrain" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Brain2nd/NeuronSpark-V4-1.16B-Pretrain",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Brain2nd/NeuronSpark-V4-1.16B-Pretrain" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Brain2nd/NeuronSpark-V4-1.16B-Pretrain",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Brain2nd/NeuronSpark-V4-1.16B-Pretrain with Docker Model Runner:
```
docker model run hf.co/Brain2nd/NeuronSpark-V4-1.16B-Pretrain
```

Brain2nd committed 9 days ago

Commit 95a79b1 (verified)

1 parent: 23666d5

Upload V4 1.16B pretrain checkpoint step10500

Files changed (12)

README.md +64 -0
chat_template.jinja +85 -0
config.json +38 -0
configuration_neuronspark.py +86 -0
deepspeed/mp_rank_00_model_states.pt +3 -0
generation_config.json +9 -0
latest +1 -0
model.safetensors +3 -0
modeling_neuronspark.py +0 -0
tokenizer.json +0 -0
tokenizer_config.json +31 -0
training_state.pth +3 -0

README.md ADDED

	@@ -0,0 +1,64 @@

+---
+library_name: transformers
+tags:
+- neuronspark
+- snn
+- causal-lm
+- pretrain
+- deepspeed
+- checkpoint
+---
+# NeuronSpark-V4-1.16B-Pretrain
+NeuronSpark V4 autoregressive pretraining checkpoint.
+This repository contains a complete training checkpoint for continued pretraining, not only inference weights.
+## Checkpoint
+- Architecture: NeuronSpark V4 causal language model
+- Scale: 1.16B parameters
+- Checkpoint step: 10500
+- Tokens seen: 2,063,372,760 supervised tokens
+- Sequence length: 2048
+- Training mode: autoregressive pretraining
+- Optimizer: Muon + Adam + Lion
+- DeepSpeed: ZeRO-0
+- Precision: bf16 training path
+## Included Files
+- `model.safetensors`: Hugging Face model weights for loading/evaluation.
+- `config.json`, `configuration_neuronspark.py`, `modeling_neuronspark.py`: self-contained custom model code/config.
+- `tokenizer.json`, `tokenizer_config.json`, `chat_template.jinja`: tokenizer assets.
+- `training_state.pth`: saved training step and token counter.
+- `deepspeed/`: DeepSpeed checkpoint state for continued training.
+## Continue Training
+Download or snapshot this repository, then resume with the original training script:
+```bash
+deepspeed --num_gpus=8 train_pretrain.py \
+  --config_json configs/smoke_1p16b.json \
+  --data_path <pretokenized_data_dir> \
+  --tokenizer_path tokenizer_v3 \
+  --out_dir <new_output_dir> \
+  --deepspeed_config configs/ds_zero0_v4.json \
+  --max_length 2048 \
+  --batch_size 12 \
+  --accumulation_steps 1 \
+  --optimizer muon_adam_lion \
+  --learning_rate 2e-4 \
+  --muon_lr 0.005 \
+  --lion_lr 1e-4 \
+  --warmup_iters 500 \
+  --grad_clip 0.5 \
+  --resume <downloaded_checkpoint_dir>
+```
+## Provenance
+This is a V4 pretraining checkpoint from the current NeuronSpark V4 branch. It is not the historical V2.5/V3 checkpoint family.
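As a rough consistency check on the README figures, the reported token count can be compared with the step count and the batch settings from the resume command. This is a sketch under stated assumptions, not something the README claims: it assumes `--batch_size 12` is per GPU, that the original run also used 8 GPUs with no gradient accumulation, and that each 2048-token sequence supervises at most 2047 next-token targets.

```python
# Sanity check of "Tokens seen" against the checkpoint step (assumptions inline).
STEPS = 10_500                   # "Checkpoint step" from the README
SEQ_LEN = 2048                   # --max_length
BATCH_PER_GPU = 12               # --batch_size, ASSUMED to be per GPU
NUM_GPUS = 8                     # --num_gpus, ASSUMED to match the original run
ACCUM_STEPS = 1                  # --accumulation_steps
REPORTED_TOKENS = 2_063_372_760  # "Tokens seen" from the README

# Total sequences processed up to the checkpoint.
sequences = STEPS * BATCH_PER_GPU * NUM_GPUS * ACCUM_STEPS

# Next-token prediction supervises at most SEQ_LEN - 1 positions per sequence.
max_supervised = sequences * (SEQ_LEN - 1)
shortfall = max_supervised - REPORTED_TOKENS

print(f"sequences processed : {sequences:,}")
print(f"max supervised      : {max_supervised:,}")
print(f"reported            : {REPORTED_TOKENS:,}")
print(f"shortfall           : {shortfall:,}")
```

Under these assumptions the upper bound is 2,063,376,000 supervised tokens, 3,240 more than the reported figure. The small gap may come from masked or padded positions, but the README does not say.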

chat_template.jinja ADDED

	@@ -0,0 +1,85 @@

+{%- if tools %}
+    {{- '<|im_start|>system\n' }}
+    {%- if messages[0].role == 'system' %}
+        {{- messages[0].content + '\n\n' }}
+    {%- endif %}
+    {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
+    {%- for tool in tools %}
+        {{- "\n" }}
+        {{- tool | tojson }}
+    {%- endfor %}
+    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
+{%- else %}
+    {%- if messages[0].role == 'system' %}
+        {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
+    {%- endif %}
+{%- endif %}
+{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
+{%- for message in messages[::-1] %}
+    {%- set index = (messages|length - 1) - loop.index0 %}
+    {%- if ns.multi_step_tool and message.role == "user" and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
+        {%- set ns.multi_step_tool = false %}
+        {%- set ns.last_query_index = index %}
+    {%- endif %}
+{%- endfor %}
+{%- for message in messages %}
+    {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
+        {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
+    {%- elif message.role == "assistant" %}
+        {%- set content = message.content %}
+        {%- set reasoning_content = '' %}
+        {%- if message.reasoning_content is defined and message.reasoning_content is not none %}
+            {%- set reasoning_content = message.reasoning_content %}
+        {%- else %}
+            {%- if '</think>' in message.content %}
+                {%- set content = message.content.split('</think>')[-1].lstrip('\n') %}
+                {%- set reasoning_content = message.content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
+            {%- endif %}
+        {%- endif %}
+        {%- if loop.index0 > ns.last_query_index %}
+            {%- if loop.last or (not loop.last and reasoning_content) %}
+                {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
+            {%- else %}
+                {{- '<|im_start|>' + message.role + '\n' + content }}
+            {%- endif %}
+        {%- else %}
+            {{- '<|im_start|>' + message.role + '\n' + content }}
+        {%- endif %}
+        {%- if message.tool_calls %}
+            {%- for tool_call in message.tool_calls %}
+                {%- if (loop.first and content) or (not loop.first) %}
+                    {{- '\n' }}
+                {%- endif %}
+                {%- if tool_call.function %}
+                    {%- set tool_call = tool_call.function %}
+                {%- endif %}
+                {{- '<tool_call>\n{"name": "' }}
+                {{- tool_call.name }}
+                {{- '", "arguments": ' }}
+                {%- if tool_call.arguments is string %}
+                    {{- tool_call.arguments }}
+                {%- else %}
+                    {{- tool_call.arguments | tojson }}
+                {%- endif %}
+                {{- '}\n</tool_call>' }}
+            {%- endfor %}
+        {%- endif %}
+        {{- '<|im_end|>\n' }}
+    {%- elif message.role == "tool" %}
+        {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
+            {{- '<|im_start|>user' }}
+        {%- endif %}
+        {{- '\n<tool_response>\n' }}
+        {{- message.content }}
+        {{- '\n</tool_response>' }}
+        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+            {{- '<|im_end|>\n' }}
+        {%- endif %}
+    {%- endif %}
+{%- endfor %}
+{%- if add_generation_prompt %}
+    {{- '<|im_start|>assistant\n' }}
+    {%- if enable_thinking is defined and enable_thinking is false %}
+        {{- '<think>\n\n</think>\n\n' }}
+    {%- endif %}
+{%- endif %}
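To make the template's output concrete, here is a minimal Python sketch of its simplest path: no tools, no tool responses, and no assistant turn after the last user query. The function `render_chatml` is a hypothetical helper written for illustration; it is not shipped in the repository, and real use should go through the tokenizer's chat-template support instead.

```python
def render_chatml(messages, add_generation_prompt=True, enable_thinking=None):
    """Mirror the tool-free path of chat_template.jinja for simple histories."""
    # Locate the last genuine user query (the template's ns.last_query_index).
    last_query = len(messages) - 1
    for i in range(len(messages) - 1, -1, -1):
        content = messages[i]["content"]
        is_tool_reply = (content.startswith("<tool_response>")
                         and content.endswith("</tool_response>"))
        if messages[i]["role"] == "user" and not is_tool_reply:
            last_query = i
            break

    parts = []
    for i, msg in enumerate(messages):
        role, content = msg["role"], msg["content"]
        if role in ("system", "user"):
            parts.append(f"<|im_start|>{role}\n{content}<|im_end|>\n")
        elif role == "assistant":
            if i > last_query or msg.get("tool_calls"):
                raise NotImplementedError("path not covered by this sketch")
            # Earlier assistant turns are replayed without their reasoning.
            if "</think>" in content:
                content = content.split("</think>")[-1].lstrip("\n")
            parts.append(f"<|im_start|>assistant\n{content}<|im_end|>\n")
        else:
            raise NotImplementedError(f"role {role!r} not covered by this sketch")

    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
        if enable_thinking is False:
            # An empty think block signals that reasoning is disabled.
            parts.append("<think>\n\n</think>\n\n")
    return "".join(parts)


if __name__ == "__main__":
    demo = [{"role": "system", "content": "Be brief."},
            {"role": "user", "content": "Hi"}]
    print(render_chatml(demo))
```

Passing `enable_thinking=False` appends the empty `<think>` block after the assistant header, as the final branch of the template does.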

config.json ADDED

	@@ -0,0 +1,38 @@

+{
+  "D": 1024,
+  "D_ff": 3072,
+  "D_key": 128,
+  "D_value": 128,
+  "K": 12,
+  "N": 8,
+  "ahp_init": 0.0,
+  "architectures": [
+    "NeuronSparkForCausalLM"
+  ],
+  "auto_map": {
+    "AutoConfig": "configuration_neuronspark.NeuronSparkConfig",
+    "AutoModelForCausalLM": "modeling_neuronspark.NeuronSparkForCausalLM"
+  },
+  "bias_balancing_ema": 0.99,
+  "bias_balancing_lr": 0.001,
+  "bos_token_id": 1,
+  "dtype": "bfloat16",
+  "eos_token_id": 2,
+  "eps_explore": 0.05,
+  "k_predictor_hidden": 256,
+  "memory_layer_interval": 4,
+  "model_type": "neuronspark",
+  "num_hidden_layers": 24,
+  "num_layers": 24,
+  "ponder_T_final": 0.3,
+  "ponder_T_init": 2.0,
+  "rope_layout": "transformer_interleaved",
+  "spike_mode": "quantal",
+  "surrogate_alpha": 4.0,
+  "transformers_version": "5.6.2",
+  "use_ahp": false,
+  "use_cache": false,
+  "v_th_min": 0.02,
+  "v_th_reg_weight": 0.0,
+  "vocab_size": 128387
+}

configuration_neuronspark.py ADDED

	@@ -0,0 +1,86 @@

+from transformers import PretrainedConfig
+class NeuronSparkConfig(PretrainedConfig):
+    model_type = "neuronspark"
+    def __init__(
+        self,
+        vocab_size=64002,
+        D=1024,
+        N=8,
+        K=12,
+        num_layers=24,
+        D_ff=3072,
+        v_th_min=0.02,  # selective threshold floor; mainline v4.1 uses quantal release
+        memory_layer_interval=4,
+        D_key=128,
+        D_value=128,
+        # SNNAttention RoPE channel pairing. New V4 runs use the mainstream
+        # Transformer even/odd interleaved layout; "native" is kept only for
+        # loading historical checkpoints trained with the old half-split layout.
+        rope_layout="transformer_interleaved",
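+        # Neuron firing mode (v4.1; see docs/v4_status_and_roadmap.md §神经元设计, "Neuron design")
+        #   "quantal" = current V4 architecture semantics: output = v_th·𝟙[V_pre>v_th],
+        #               V_post = V_pre - v_th·𝟙[...] (when use_ahp=False); the remainder stays in the membrane.
+        #   "supra"   = the historical bio-ReLU form from v3 / early v4, kept only for ablation comparisons.
+        spike_mode="quantal",
+        # Surrogate gradient α (sigmoid surrogate; used for the backward pass of the output only when spike_mode="quantal")
+        surrogate_alpha=4.0,
+        # After-hyperpolarization (AHP / refractory period): after firing, the membrane is pushed down by an extra ahp (learnable per channel), V_post -= ahp·𝟙[V_pre>v_th]
+        use_ahp=False,
+        ahp_init=0.0,  # initial value of the ahp parameter (per-channel scalar)
+        # Quadratic regularization pulling PLIFNode.v_th toward its init value (breaks the scale redundancy between v_th and the downstream W: v_th drifts to the floor → W_in blows up to compensate → SNNBlock membrane runaway → NaN). 0 = off.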
+        v_th_reg_weight=0.0,
+        # v3 PonderNet fields (input-conditioned KPredictor)
+        k_predictor_hidden=None,
+        ponder_T_init=2.0,
+        ponder_T_final=0.3,
+        eps_explore=0.05,
+        bias_balancing_lr=1e-3,
+        bias_balancing_ema=0.99,
+        bos_token_id=1,
+        eos_token_id=2,
+        **kwargs,
+    ):
+        self.vocab_size = vocab_size
+        self.D = D
+        self.N = N
+        self.K = K
+        self.num_layers = num_layers
+        # HF GenerationMixin / DynamicCache expect a num_hidden_layers field
+        self.num_hidden_layers = num_layers
+        # The SNN has no KV cache; disable it so HF does not try to build a DynamicCache
+        self.use_cache = False
+        self.D_ff = D_ff
+        self.v_th_min = v_th_min
+        self.memory_layer_interval = memory_layer_interval
+        self.D_key = D_key
+        self.D_value = D_value
+        self.rope_layout = rope_layout
+        self.spike_mode = spike_mode
+        self.surrogate_alpha = surrogate_alpha
+        self.use_ahp = use_ahp
+        self.ahp_init = ahp_init
+        self.v_th_reg_weight = v_th_reg_weight
+        # v3 PonderNet
+        self.k_predictor_hidden = k_predictor_hidden
+        self.ponder_T_init = ponder_T_init
+        self.ponder_T_final = ponder_T_final
+        self.eps_explore = eps_explore
+        self.bias_balancing_lr = bias_balancing_lr
+        self.bias_balancing_ema = bias_balancing_ema
+        # auto_map: HF two-part form, file path / class name (neuronspark/ subdirectory)
+        kwargs.setdefault("auto_map", {
+            "AutoConfig": "configuration_neuronspark.NeuronSparkConfig",
+            "AutoModelForCausalLM": "modeling_neuronspark.NeuronSparkForCausalLM",
+        })
+        kwargs.setdefault("architectures", ["NeuronSparkForCausalLM"])
+        kwargs.setdefault("dtype", "bfloat16")
+        super().__init__(
+            bos_token_id=bos_token_id,
+            eos_token_id=eos_token_id,
+            **kwargs,
+        )
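The "quantal" firing rule described in the config comments can be sketched for a single scalar channel as follows. This is an illustration assembled from those comments only, since `modeling_neuronspark.py` is not shown in this diff; the function names are hypothetical, and the exact form of the surrogate gradient (taken here as the derivative of a sigmoid with slope `alpha`) is an assumption.

```python
import math


def quantal_step(v_pre, v_th, ahp=0.0, use_ahp=False):
    """One firing step: emit a quantum of size v_th if V_pre exceeds v_th."""
    fired = 1.0 if v_pre > v_th else 0.0
    output = v_th * fired            # output = v_th * 1[V_pre > v_th]
    v_post = v_pre - v_th * fired    # the remainder stays in the membrane
    if use_ahp:
        v_post -= ahp * fired        # extra after-hyperpolarization push-down
    return output, v_post


def sigmoid_surrogate_grad(v_pre, v_th, alpha=4.0):
    """ASSUMED surrogate: d/dV_pre of sigmoid(alpha * (V_pre - v_th))."""
    s = 1.0 / (1.0 + math.exp(-alpha * (v_pre - v_th)))
    return alpha * s * (1.0 - s)


if __name__ == "__main__":
    print(quantal_step(0.5, 0.3))            # fires: emits v_th, keeps the rest
    print(quantal_step(0.1, 0.3))            # below threshold: no output
    print(sigmoid_surrogate_grad(0.3, 0.3))  # surrogate slope at the threshold
```

The hard threshold has zero gradient almost everywhere, which is why a smooth stand-in such as the sigmoid derivative is typically substituted in the backward pass.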

deepspeed/mp_rank_00_model_states.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9f4b85cadc39a51c86547603f28beaf6ec95c1f99d987ae52ccad8879a873bce
+size 3289813267

generation_config.json ADDED

	@@ -0,0 +1,9 @@

+{
+  "_from_model_config": true,
+  "bos_token_id": 1,
+  "eos_token_id": 2,
+  "output_attentions": false,
+  "output_hidden_states": false,
+  "transformers_version": "5.6.2",
+  "use_cache": false
+}

latest ADDED

	@@ -0,0 +1 @@

+deepspeed

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d3a22d378f69debb1e04ecfd6f77b1f98ad875968343098407c58124f3208fb2
+size 2471985328
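The three lines above are a Git LFS pointer: the real file is stored elsewhere and identified by its SHA-256 digest and byte size. A downloaded copy can be checked against the pointer with a short standard-library sketch like the one below. The helper names are hypothetical, and the local file path in the usage example is a placeholder.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer ("key value" lines) into algo, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Stream the file and compare its size and SHA-256 with the pointer."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so multi-gigabyte files never sit fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d3a22d378f69debb1e04ecfd6f77b1f98ad875968343098407c58124f3208fb2
size 2471985328
"""

if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER_TEXT)
    print(pointer["size"])
    # Placeholder path; point it at the downloaded file:
    # print(matches_pointer("model.safetensors", pointer))
```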

modeling_neuronspark.py ADDED

The diff for this file is too large to render. See raw diff

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer_config.json ADDED

	@@ -0,0 +1,31 @@

+{
+  "_origin": "filtered from Qwen/Qwen3-1.7B-Base: dropped non-EN/ZH language tokens (23282 of 151643)",
+  "add_prefix_space": false,
+  "backend": "tokenizers",
+  "bos_token": null,
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|endoftext|>",
+  "errors": "replace",
+  "extra_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "is_local": true,
+  "local_files_only": false,
+  "model_max_length": 131072,
+  "pad_token": "<|endoftext|>",
+  "split_special_tokens": false,
+  "tokenizer_class": "Qwen2Tokenizer",
+  "unk_token": null
+}
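The `_origin` note can be cross-checked against `vocab_size` in `config.json` with a few lines of arithmetic. This is a sketch, and its interpretation is an assumption: the remaining slots are presumably special or added tokens (13 are listed under `extra_special_tokens`), but the source does not itemize them.

```python
# Cross-check the filtered vocabulary against config.json (interpretation assumed).
BASE_VOCAB = 151_643    # token count of the source tokenizer, per "_origin"
DROPPED = 23_282        # non-EN/ZH tokens removed, per "_origin"
CONFIG_VOCAB = 128_387  # "vocab_size" in config.json

kept = BASE_VOCAB - DROPPED
extra = CONFIG_VOCAB - kept  # slots not explained by the filtered base vocabulary

print(f"kept after filtering : {kept:,}")
print(f"config vocab_size    : {CONFIG_VOCAB:,}")
print(f"unexplained slots    : {extra}")
```

The filtered base vocabulary accounts for 128,361 entries, leaving 26 slots in the configured vocabulary.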

training_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:28b7ad0dcb88d196ebb5109088780dd0e90a4932ffed12a49cb9cf2325830ae1
+size 1367