---
license: apache-2.0
library_name: transformers
tags:
- bitnet
- moe
- mixture-of-experts
- 1-bit
- quantized
- compression
- security
- m2m-protocol
pipeline_tag: text-classification
datasets:
- custom
language:
- en
---
# Hydra BitNet - M2M Protocol SLM
A 1.58-bit quantized Mixture-of-Experts model that selects a compression algorithm and detects prompt-injection attempts for LLM API traffic.
## Model Description
Hydra is an ultra-compact neural network designed for the M2M Protocol. It uses:
- **BitNet 1.58-bit quantization**: Weights are ternary {-1, 0, +1}
- **Mixture-of-Experts**: 4 specialized experts with top-2 routing
- **Task-specific heads**: Compression routing and security detection
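
To make the ternary weight format concrete, here is a minimal sketch of absmean quantization as described for BitNet b1.58: divide by the mean absolute weight, round, and clip to {-1, 0, +1}. Whether Hydra uses exactly this scheme is an assumption, and the function name `ternary_quantize` is illustrative only.

```python
def ternary_quantize(weights, eps=1e-8):
    """Quantize a flat list of floats to {-1, 0, +1} with one shared scale.

    Assumption: absmean scaling as in BitNet b1.58; Hydra's actual
    quantizer may differ.
    """
    # The scale is the mean absolute weight (eps avoids division by zero).
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    # Round each scaled weight to the nearest integer, then clip to [-1, 1].
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


weights = [0.9, -0.05, 0.4, -1.2, 0.02]
q, scale = ternary_quantize(weights)
print(q)      # [1, 0, 1, -1, 0]
print(scale)  # approximate dequantization: q[i] * scale
```

Small weights collapse to 0 and large ones saturate at ±1, so each weight carries at most three states.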
## Model Details
| Property | Value |
|----------|-------|
| Parameters | ~9.7M |
| Model Size | ~3.7 MB (1.58-bit) |
| Hidden Size | 192 |
| Layers | 4 |
| Experts | 4 |
| Vocab Size | 32000 |
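
The "1.58-bit" label comes from log2(3) ≈ 1.585, the information content of one ternary weight. The back-of-envelope check below relates the table's parameter count to its file size, assuming 1 MB = 10^6 bytes. The reported size works out to more bits per parameter than pure ternary storage would need, which would be consistent with some tensors (for example embeddings or scales) being kept at higher precision; the card does not say, so treat that as a guess.

```python
import math

params = 9.7e6          # "~9.7M" from the table
reported_bytes = 3.7e6  # "~3.7 MB", assuming 1 MB = 10**6 bytes

bits_ternary = math.log2(3)                  # bits needed per ternary weight
ideal_mb = params * bits_ternary / 8 / 1e6   # every weight at log2(3) bits
packed_mb = params * 2 / 8 / 1e6             # every weight packed into 2 bits
implied_bits = reported_bytes * 8 / params   # average bits per parameter on disk

print(f"{bits_ternary:.3f} bits per ternary weight")
print(f"{ideal_mb:.2f} MB at log2(3) bits, {packed_mb:.2f} MB at 2 bits")
print(f"{implied_bits:.2f} bits per parameter implied by the reported size")
```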
## Performance
### Compression Routing
- **Task**: Predict optimal compression algorithm (NONE, BPE, BROTLI, ZLIB)
- **Accuracy**: 99.4%
- **Latency**: < 5 ms on GPU
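
As an illustration of how the four-way compression head could be read, the sketch below maps classifier logits to an algorithm name by argmax. The label order is an assumption (it follows the order in which the task lists the algorithms), and `pick_algorithm` is a hypothetical helper, not part of the m2m-protocol package.

```python
# Assumption: class indices follow the order in which the card lists the algorithms.
COMPRESSION_LABELS = ["NONE", "BPE", "BROTLI", "ZLIB"]


def pick_algorithm(logits):
    """Return the label whose logit is largest."""
    if len(logits) != len(COMPRESSION_LABELS):
        raise ValueError("expected one logit per compression label")
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    return COMPRESSION_LABELS[best_index]


print(pick_algorithm([0.1, -1.3, 2.7, 0.4]))  # BROTLI
```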
### Security Detection
- **Task**: Detect prompt injection and jailbreak attempts
- **Accuracy**: 96.2%
- **Latency**: < 5 ms on GPU
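
The security head has two outputs, so a decision can be sketched as a two-class softmax followed by a threshold. The class order (index 0 safe, index 1 unsafe), the 0.5 threshold, and both function names are assumptions made for illustration.

```python
import math


def injection_probability(logits):
    """Softmax over two logits; returns the probability of the unsafe class.

    Assumption: index 0 is 'safe' and index 1 is 'unsafe'.
    """
    safe_logit, unsafe_logit = logits
    # Subtract the max before exponentiating for numerical stability.
    m = max(safe_logit, unsafe_logit)
    e_safe = math.exp(safe_logit - m)
    e_unsafe = math.exp(unsafe_logit - m)
    return e_unsafe / (e_safe + e_unsafe)


def is_blocked(logits, threshold=0.5):
    """Flag the message when the unsafe probability reaches the threshold."""
    return injection_probability(logits) >= threshold


print(is_blocked([2.0, -1.0]))  # False
print(is_blocked([-0.5, 1.5]))  # True
```

Raising the threshold trades missed attacks for fewer false alarms; the right value depends on the deployment.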
## Usage
```python
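from safetensors.torch import load_file

# Option 1: load the raw weights (a dict mapping tensor names to torch tensors)
weights = load_file("model.safetensors")

# Option 2: use the model through the m2m-protocol package
from m2m_protocol import M2MClient

client = M2MClient(target_model="gpt-4")
your_message = "Hello, world"  # placeholder; replace with the message to process
result = client.process(your_message)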
```
## Training
- **Compression Expert**: Trained with Direct Preference Optimization (DPO) on 100K message pairs
- **Security Expert**: Fine-tuned on 60K security samples (prompt injection, jailbreak, safe)
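
For readers unfamiliar with DPO, the sketch below computes the standard DPO loss for a single preference pair from sequence log-probabilities under the trained policy and a frozen reference model. It illustrates the general objective only; the value `beta=0.1` and the function name are assumptions, and the card does not describe Hydra's actual training setup.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair, given sequence log-probabilities.

    beta=0.1 is an illustrative value, not taken from the model card.
    """
    # How much more the policy favours each response than the reference does.
    chosen_margin = policy_chosen - ref_chosen
    rejected_margin = policy_rejected - ref_rejected
    logit = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logit)), evaluated in a numerically stable way.
    if logit >= 0:
        return math.log1p(math.exp(-logit))
    return -logit + math.log1p(math.exp(logit))


# Policy identical to the reference: loss is log(2).
print(dpo_loss(-5.0, -7.0, -5.0, -7.0))
# Policy shifted toward the chosen response: loss drops below log(2).
print(dpo_loss(-4.0, -8.0, -5.0, -7.0))
```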
## Architecture
```
HydraBitNet(
  (embeddings): Embedding(256, 256)
  (encoder): ModuleList(
    (0-5): 6 x TaskSpecializedMoELayer(
      (gate): Linear(256, 4)
      (experts): ModuleList(
        (0): CompressionExpert
        (1): SecurityExpert
        (2): SemanticExpert
        (3): GeneralExpert
      )
    )
  )
  (classifier): ModuleDict(
    (compression): BitLinear(256, 4)
    (security): BitLinear(256, 2)
  )
)
```
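
The gate in each MoE layer produces one score per expert, and top-2 routing keeps only the two best. The dependency-free sketch below shows one common formulation: softmax over the gate logits, keep the top two, renormalize their weights, and mix the expert outputs. The renormalization step, the toy experts, and the function names are assumptions; Hydra's `TaskSpecializedMoELayer` may be implemented differently.

```python
import math


def top2_route(gate_logits):
    """Return [(expert_index, weight), ...] for the two highest-scoring experts."""
    # Softmax over all experts (max subtracted for numerical stability).
    m = max(gate_logits)
    exps = [math.exp(x - m) for x in gate_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the two most probable experts and renormalize so the weights sum to 1.
    top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]
    norm = sum(probs[i] for i in top2)
    return [(i, probs[i] / norm) for i in top2]


def moe_forward(x, experts, gate_logits):
    """Weighted sum of the selected experts' outputs for one token vector x."""
    out = [0.0] * len(x)
    for index, weight in top2_route(gate_logits):
        y = experts[index](x)
        out = [o + weight * v for o, v in zip(out, y)]
    return out


# Toy experts standing in for the four experts in the layer.
experts = [
    lambda x: [v * 1.0 for v in x],
    lambda x: [v * 2.0 for v in x],
    lambda x: [v * 3.0 for v in x],
    lambda x: [v * 4.0 for v in x],
]
routes = top2_route([0.0, 2.0, 2.0, -1.0])
mixed = moe_forward([1.0, -1.0], experts, [0.0, 2.0, 2.0, -1.0])
print(routes)
print(mixed)
```

Only two of the four experts run per token, which is what keeps compute low relative to the total parameter count.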
## Citation
```bibtex
@software{hydra_bitnet,
  title = {Hydra BitNet: Ultra-Compact MoE for M2M Protocol},
  author = {M2M Protocol Team},
  year = {2026},
  url = {https://github.com/infernet-org/m2m-protocol}
}
```
## License
Apache 2.0