FlameF0X committed on
Commit ea46f3f · verified
1 Parent(s): 8d68b88

Upload folder using huggingface_hub

Files changed (4)
  1. README.md +45 -0
  2. config.json +23 -0
  3. pytorch_model.bin +3 -0
  4. tokenizer.json +0 -0
README.md ADDED
@@ -0,0 +1,45 @@
+ # i3 Hybrid Chat Model
+
+ This is a chat-tuned version of the i3 hybrid architecture with latent context compression.
+
+ ## Model Details
+
+ - **Architecture**: RWKV + Attention Hybrid with Latent Compression
+ - **Parameters**: ~342.4M
+ - **Context Window**: 4096 tokens (via compression)
+ - **Inference Window**: 4096 tokens
+ - **Kernel Size**: 512 tokens
+ - **Training Data**: HuggingFaceH4/ultrachat_200k
+
+ ## Usage
+
+ ```python
+ import torch
+ from tokenizers import Tokenizer
+
+ # Load model (the .bin is a pickled full module, so PyTorch >= 2.6 needs weights_only=False)
+ model = torch.load("pytorch_model.bin", weights_only=False)
+ tokenizer = Tokenizer.from_file("tokenizer.json")
+
+ # Format the conversation with the model's special tokens
+ conversation = "<BOS><|user|>\nHello!\n<|assistant|>\n"
+ tokens = torch.tensor([tokenizer.encode(conversation).ids])
+
+ # Generate a reply
+ output = model.generate(tokens, max_new_tokens=100, temperature=0.8)
+ response = tokenizer.decode(output[0].tolist())
+ ```
+
+ ## Capabilities
+
+ - Multi-turn conversations
+ - Long-context understanding via latent compression
+ - Efficient inference with RWKV base layers
+ - Ready for chain-of-thought fine-tuning
+
+ ## Training
+
+ Fine-tuned on the UltraChat 200k dataset with:
+ - Learning rate: 1e-05
+ - Batch size: 4, with 4-step gradient accumulation
+ - Sequence length: 512
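The single-turn prompt in the README's usage example generalizes to multi-turn chats by repeating the role headers. A minimal sketch, assuming the multi-turn layout simply concatenates turns in the same format (the helper name and the exact layout are extrapolations from the single-turn example and the special tokens declared in config.json, not documented by the repo):

```python
# Hypothetical helper illustrating the chat format implied by the README
# and config.json special tokens; the multi-turn layout is an assumption.
BOS, EOS = "<BOS>", "<EOS>"
USER, ASSISTANT = "<|user|>", "<|assistant|>"

def format_conversation(turns):
    """Render (role, text) turns into the model's prompt format.

    `turns` is a list of ("user" | "assistant", text) pairs; the string
    ends with an open assistant header so generation continues the reply.
    """
    parts = [BOS]
    for role, text in turns:
        token = USER if role == "user" else ASSISTANT
        parts.append(f"{token}\n{text}\n")
    parts.append(f"{ASSISTANT}\n")  # prompt the model to answer next
    return "".join(parts)

prompt = format_conversation([("user", "Hello!")])
print(prompt)
```

For the single-turn case this reproduces the exact `conversation` string used in the README's usage example.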
config.json ADDED
@@ -0,0 +1,23 @@
+ {
+   "architectures": [
+     "i3HybridChatModel"
+   ],
+   "model_type": "i3-chat",
+   "d_model": 1180,
+   "n_layers": 14,
+   "rwkv_layers": 12,
+   "attn_layers": 2,
+   "vocab_size": 32000,
+   "kernel_size": 512,
+   "max_latent_context": 4096,
+   "inference_context_window": 4096,
+   "compression_enabled": true,
+   "num_latent_tokens": 32,
+   "task": "chat",
+   "special_tokens": {
+     "bos_token": "<BOS>",
+     "eos_token": "<EOS>",
+     "user_token": "<|user|>",
+     "assistant_token": "<|assistant|>"
+   }
+ }
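The config values above can be cross-checked against each other and against the README's parameter count. A small sketch (the config dict is inlined from the JSON above so the snippet is self-contained; the parameter-share remark is a back-of-envelope estimate, not a figure from the repo):

```python
# Sanity-check the i3-chat config values (inlined from config.json).
config = {
    "d_model": 1180, "n_layers": 14, "rwkv_layers": 12,
    "attn_layers": 2, "vocab_size": 32000,
}

# Layer counts must agree: 12 RWKV blocks + 2 attention blocks = 14 layers.
assert config["rwkv_layers"] + config["attn_layers"] == config["n_layers"]

# The token embedding table alone is vocab_size x d_model parameters
# (~37.8M), a sizeable share of the ~342.4M total reported in the README.
embedding_params = config["vocab_size"] * config["d_model"]
print(f"embedding parameters: {embedding_params:,}")
```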
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c80e15b5de785422713557ae4cd724c198f0e2fba5716c65beb2f7d4fab8ada6
+ size 1369667759
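The three lines above are a Git LFS pointer, not the weights themselves: Git stores this small text file, and LFS fetches the real 1.37 GB object, which should match the pointer's `oid` (a SHA-256 digest) and `size`. A hedged verification sketch (the local file path is an assumption):

```python
# Verify a downloaded pytorch_model.bin against the LFS pointer above.
import hashlib
import os

EXPECTED_OID = "c80e15b5de785422713557ae4cd724c198f0e2fba5716c65beb2f7d4fab8ada6"
EXPECTED_SIZE = 1369667759  # bytes, from the pointer's `size` line

def verify_lfs_object(path):
    """Return True if `path` matches the pointer's size and sha256 oid."""
    if os.path.getsize(path) != EXPECTED_SIZE:
        return False  # cheap check first: size mismatch means wrong file
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # 1 MiB chunks
            digest.update(chunk)
    return digest.hexdigest() == EXPECTED_OID
```

Checking the size before hashing avoids reading a gigabyte-scale file when the download is obviously truncated.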
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff