---
license: apache-2.0
library_name: transformers
tags:
  - bitnet
  - moe
  - mixture-of-experts
  - 1-bit
  - quantized
  - compression
  - security
  - m2m-protocol
pipeline_tag: text-classification
datasets:
  - custom
language:
  - en
---

# Hydra BitNet - M2M Protocol SLM

A 1.58-bit quantized Mixture-of-Experts model for LLM API optimization.

## Model Description

Hydra is an ultra-compact neural network designed for the M2M Protocol. It uses:
- **BitNet 1.58-bit quantization**: Weights are ternary {-1, 0, +1}
- **Mixture-of-Experts**: 4 specialized experts with top-2 routing
- **Task-specific heads**: Compression routing and security detection
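
The 1.58-bit scheme above can be sketched concretely: BitNet-style models quantize each weight matrix to the ternary set {-1, 0, +1} with a per-matrix "absmean" scale. The function below is an illustrative sketch of that recipe, not code from this repository:

```python
# Sketch of BitNet b1.58-style "absmean" ternary quantization.
# Function and variable names here are illustrative, not from this repo.

def absmean_quantize(weights, eps=1e-8):
    """Map real-valued weights to ternary {-1, 0, +1} plus a scale."""
    # Scale is the mean absolute value of the weights (plus eps for safety).
    gamma = sum(abs(w) for w in weights) / max(len(weights), 1) + eps
    # Round each scaled weight, then clip into the ternary range.
    ternary = [max(-1, min(1, round(w / gamma))) for w in weights]
    return ternary, gamma

w = [0.8, -0.05, -1.2, 0.3]
q, scale = absmean_quantize(w)
# q contains only values from {-1, 0, +1}; scale recovers magnitude at inference
```

Each ternary weight needs log2(3) ≈ 1.58 bits, which is where the headline size reduction comes from.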

## Model Details

| Property | Value |
|----------|-------|
| Parameters | ~9.7M |
| Model Size | ~3.7 MB (1.58-bit) |
| Hidden Size | 192 |
| Layers | 4 |
| Experts | 4 |
| Vocab Size | 32000 |

## Performance

### Compression Routing
- **Task**: Predict optimal compression algorithm (NONE, BPE, BROTLI, ZLIB)
- **Accuracy**: 99.4%
- **Latency**: <5ms on GPU
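
Downstream of the router, the predicted label dispatches to a matching codec. A minimal sketch of that dispatch, using only the stdlib `zlib` backend (the BPE and Brotli backends are assumed to exist in the real pipeline and are omitted here):

```python
# Illustrative label-to-codec dispatch; the callables are an assumption,
# not this repo's API. Brotli is a third-party package, so this sketch
# only wires up the stdlib zlib backend and a passthrough.
import zlib

CODECS = {
    "NONE": lambda data: data,
    "ZLIB": lambda data: zlib.compress(data),
    # "BPE" and "BROTLI" would plug in here in the real pipeline.
}

def route_and_compress(label, payload):
    """Apply the codec chosen by the router; unknown labels pass through."""
    codec = CODECS.get(label, CODECS["NONE"])
    return codec(payload)

msg = b"hello " * 100
packed = route_and_compress("ZLIB", msg)
# packed round-trips through zlib.decompress back to msg
```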

### Security Detection  
- **Task**: Detect prompt injection and jailbreak attempts
- **Accuracy**: 96.2%
- **Latency**: <5ms on GPU

## Usage

```python
import torch
from safetensors.torch import load_file

# Load model
weights = load_file("model.safetensors")

# Or use with the m2m-protocol package
from m2m_protocol import M2MClient

client = M2MClient(target_model="gpt-4")
your_message = "Summarize this thread in two sentences."
result = client.process(your_message)
```

## Training

- **Compression Expert**: Trained with DPO on 100K message pairs
- **Security Expert**: Fine-tuned on 60K security samples (prompt injection, jailbreak, safe)
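
For reference, the per-pair DPO objective mentioned above fits in a few lines. This is the standard DPO formulation shown as a sketch, not this repository's training code:

```python
# Standard DPO loss for one (chosen, rejected) pair of log-probabilities.
# Argument names are illustrative; beta is the usual KL-strength knob.
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """-log sigmoid(beta * (chosen log-ratio - rejected log-ratio))."""
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# At initialization (policy == reference) the margin is 0,
# so the loss starts at log 2 per pair and falls as the policy
# learns to prefer the chosen message over the rejected one.
```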

## Architecture

```
HydraBitNet(
  (embeddings): Embedding(256, 256)
  (encoder): ModuleList(
    (0-5): 6 x TaskSpecializedMoELayer(
      (gate): Linear(256, 4)
      (experts): ModuleList(
        (0): CompressionExpert
        (1): SecurityExpert  
        (2): SemanticExpert
        (3): GeneralExpert
      )
    )
  )
  (classifier): ModuleDict(
    (compression): BitLinear(256, 4)
    (security): BitLinear(256, 2)
  )
)
```
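
The gate shown above scores all four experts per token; top-2 routing then keeps the two best and renormalizes their weights. A minimal pure-Python sketch of that routing step (illustrative, not the repo's implementation):

```python
# Top-2 MoE routing sketch: softmax the gate logits, keep the two
# highest-scoring experts, renormalize so their weights sum to 1.
import math

def top2_route(gate_logits):
    """Return [(expert_index, weight), ...] for the two chosen experts."""
    m = max(gate_logits)                          # stabilize the softmax
    exps = [math.exp(x - m) for x in gate_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    top2 = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:2]
    z = probs[top2[0]] + probs[top2[1]]
    return [(i, probs[i] / z) for i in top2]

routing = top2_route([2.0, 0.5, 1.0, -1.0])
# routing holds the two selected expert indices with renormalized weights
```

Only the two selected experts run for a given token, so compute per token stays roughly constant even as the expert count grows.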

## Citation

```bibtex
@software{hydra_bitnet,
  title = {Hydra BitNet: Ultra-Compact MoE for M2M Protocol},
  author = {M2M Protocol Team},
  year = {2026},
  url = {https://github.com/infernet-org/m2m-protocol}
}
```

## License

Apache 2.0