---
license: cc-by-nc-4.0
language:
- en
- fr
tags:
- complexity-deep
- transformer
- moe
- token-routed
- inl-dynamics
- mu-guided
- causal-lm
- chat
- conversational
- sft
pipeline_tag: text-generation
library_name: complexity-deep
base_model: Pacific-Prime/pacific-prime
model-index:
- name: chat-node
  results: []
---

# Chat-Node 1.5B

> **Conversational chat model built on Pacific-Prime 1.5B with Mu-Guided Attention and Token-Routed MLP**

Chat-Node is a conversational variant of [Pacific-Prime 1.5B](https://huggingface.co/Pacific-Prime/pacific-prime), fine-tuned for general-purpose chat on the Alpaca-Cleaned dataset. It is part of the Pacific-Prime node architecture for modular AI agents.

## Generation Example (Epoch 350)



---

## Model Details

| Attribute | Value |
|-----------|-------|
| Base Model | Pacific-Prime 1.5B v0.13.0 |
| Parameters | ~1.52B |
| Fine-tuning | SFT (Supervised Fine-Tuning) |
| Base Checkpoint | pacific-prime-python epoch 450 |
| Dataset | [yahma/alpaca-cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned) (20K samples) |
| Current Epoch | 350 |
| Precision | F32 |
| Hardware | H100 80GB |
| Context Length | 2048 tokens |

### Training Hyperparameters

| Parameter | Value |
|-----------|-------|
| Learning Rate | 2e-5 |
| Batch Size | 4 |
| Gradient Accumulation | 8 (effective batch: 32) |
| Weight Decay | 0.01 |
| Warmup Ratio | 3% |
| Gradient Checkpointing | Enabled |

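As a quick sanity check on the table, the effective batch size is simply the per-device batch size multiplied by the gradient-accumulation steps:

```python
# Effective batch size = per-device batch x gradient-accumulation steps.
per_device_batch = 4
grad_accum_steps = 8
effective_batch = per_device_batch * grad_accum_steps
print(effective_batch)  # → 32
```
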
---

## Chat Format

Chat-Node uses a simple User / Assistant prompt format with an optional system message:

```text
User: Give three tips for staying healthy.

Assistant:
```

### Chat Template (Jinja)

The model includes a chat template compatible with HuggingFace's `apply_chat_template`:

```jinja
{% if messages[0]['role'] == 'system' %}{{ messages[0]['content'] }}
{% set messages = messages[1:] %}{% endif %}
{% for message in messages %}
{% if message['role'] == 'user' %}User: {{ message['content'] }}
{% elif message['role'] == 'assistant' %}Assistant: {{ message['content'] }}
{% endif %}
{% endfor %}
```

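For scripts that do not go through `apply_chat_template`, the same format can be mirrored in plain Python. This is an illustrative sketch only (the `build_prompt` helper is not part of the library); it uses blank-line separators and appends a trailing `Assistant:` generation cue to match the example prompt shown above:

```python
def build_prompt(messages):
    """Render a list of {"role", "content"} dicts into the Chat-Node
    User/Assistant prompt format, ending with an "Assistant:" cue."""
    parts = []
    # Optional system message goes first, unprefixed.
    if messages and messages[0]["role"] == "system":
        parts.append(messages[0]["content"])
        messages = messages[1:]
    for m in messages:
        if m["role"] == "user":
            parts.append(f"User: {m['content']}")
        elif m["role"] == "assistant":
            parts.append(f"Assistant: {m['content']}")
    parts.append("Assistant:")  # cue the model to respond
    return "\n\n".join(parts)
```
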
---

## Architecture

| Parameter | Value |
|-----------|-------|
| Hidden Size | 2048 |
| Intermediate Size | 5632 |
| Layers | 24 |
| Attention Heads | 16 |
| KV Heads (GQA) | 8 |
| Max Position | 2048 |
| Vocab Size | 32,000 |
| Experts (Token-Routed MLP) | 4 |

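The GQA figures above mean each KV head is shared by 16 / 8 = 2 query heads. A minimal sketch of that head-to-head mapping (illustrative only, not library code):

```python
NUM_HEADS = 16     # query heads (from the table above)
NUM_KV_HEADS = 8   # shared key/value heads
GROUP_SIZE = NUM_HEADS // NUM_KV_HEADS  # 2 query heads per KV head

def kv_head_for(query_head):
    """Return the KV head that a given query head attends with under GQA."""
    return query_head // GROUP_SIZE
```

Halving the KV heads halves the KV cache, which is where the inference efficiency comes from.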
### Key Innovations (v0.13.0)

- **Mu-Guided KQV** - Learned equilibrium parameter biases K, Q, and V projections
- **Mu-Guided Expert Routing** - mu influences MLP expert selection
- **Mu Residual Highway** - Accumulated context across layers
- **Token-Routed MLP** - Deterministic 4-expert MoE with zero routing overhead
- **INL Dynamics** - Velocity tracking for temporal coherence (alpha=0.9, beta=0.1)
- **Grouped Query Attention** - 16 heads / 8 KV heads for efficient inference
- **QK Normalization** + **Flash Attention (SDPA)**
- **RoPE** positional embeddings

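The "zero routing overhead" of the Token-Routed MLP comes from expert choice being a fixed function of the token rather than a learned router. A minimal sketch of one such deterministic scheme (an assumption for illustration; the actual Complexity-Deep routing function may differ):

```python
NUM_EXPERTS = 4  # matches the architecture table

def route(token_ids):
    """Deterministic token -> expert assignment: no router network,
    no softmax over experts, no load-balancing loss at inference time."""
    return [t % NUM_EXPERTS for t in token_ids]
```

Because the mapping is fixed, every forward pass routes a given token to the same expert, trading router flexibility for speed and reproducibility.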
---

## Usage

### CLI (generate.py)

```bash
python generate.py -c ./checkpoints/pacific-prime-chat -m 300 -t 0.3 \
    $'User: Give three tips for staying healthy.\n\nAssistant:'
```

### Python

```python
from complexity_deep import DeepForCausalLM
from tokenizers import Tokenizer
import torch

model = DeepForCausalLM.from_pretrained("Pacific-Prime/chat-node")
tokenizer = Tokenizer.from_file("tokenizer.json")

prompt = "User: Explain what a neural network is.\n\nAssistant:"

input_ids = torch.tensor([tokenizer.encode(prompt).ids])
output = model.generate(input_ids, max_new_tokens=300, temperature=0.3)
print(tokenizer.decode(output[0].tolist()))
```

---

## Files

| File | Description |
|------|-------------|
| `checkpoint_epoch350.pt` | Model weights (F32) |
| `config.json` | Architecture configuration |
| `tokenizer.json` | BPE tokenizer (32K vocab) |
| `tokenizer_config.json` | Tokenizer settings |
| `special_tokens_map.json` | Special tokens |
| `chat_template.jinja` | Chat prompt template |

---

## Limitations

- **In development**: Training is ongoing; not yet production-ready
- **English-focused**: The Alpaca dataset is primarily English
- **Instruction following**: May overshoot requested list lengths
- **Context window**: Limited to 2048 tokens

---

## Links

- [Paper - Zenodo](https://zenodo.org/records/18293026)
- [Base Model - Pacific-Prime 1.5B](https://huggingface.co/Pacific-Prime/pacific-prime)
- [GitHub - complexity-deep](https://github.com/Complexity-ML/complexity-deep)
- [PyPI - complexity-deep](https://pypi.org/project/complexity-deep/)
- [GitHub - mu-inference](https://github.com/Complexity-ML/mu-inference)

---

## License

**CC-BY-NC-4.0** (Creative Commons Attribution-NonCommercial 4.0)

---

## Citation

```bibtex
@misc{chat-node-2025,
  title={Chat-Node: A Conversational 1.5B Model with Mu-Guided Attention},
  author={Boris Peyriguere},
  year={2025},
  url={https://huggingface.co/Pacific-Prime/chat-node}
}
```