Committed by qox
Commit 578c1ba · verified · 1 Parent(s): ee8aa26

Initial upload: KnowForge Encoder (131K params)

Files changed (7)
  1. LICENSE +21 -0
  2. README.md +102 -0
  3. best_model.safetensors +3 -0
  4. inference.py +173 -0
  5. model_config.json +8 -0
  6. requirements.txt +2 -0
  7. vocab.json +1 -0
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 KnowForge
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md ADDED
@@ -0,0 +1,102 @@
+ ---
+ language:
+ - en
+ - vi
+ license: mit
+ pipeline_tag: text-classification
+ tags:
+ - text-classification
+ - compositional-reasoning
+ - knowforge
+ - tiny-model
+ ---
+
+ # KnowForge Encoder
+
+ A tiny (131K parameter) text classifier trained from scratch on the KnowForge dataset.
+
+ Given a natural-language input prompt, it predicts:
+ - **`transform_type`**: which reasoning operation is required
+ - **`answer_type`**: what kind of answer to expect
+
+ This model is a fast routing component, not a generative model. It is designed to run in milliseconds on CPU, making it suitable for pre-filtering or routing in a KnowForge inference pipeline.
+
+ ---
+
+ ## Quick Start
+
+ ```bash
+ pip install -r requirements.txt
+ python inference.py "A is taller than B. B is taller than C. Is A taller than C?"
+ # Transform: relation_to_graph (99.12%)
+ # Answer type: exact_answer (87.34%)
+ ```
+
+ ```python
+ from inference import predict
+
+ result = predict("A is taller than B. B is taller than C. Is A taller than C?")
+ print(result["transform_type"]) # "relation_to_graph"
+ print(result["transform_confidence"]) # 0.9912
+ print(result["answer_type"]) # "exact_answer"
+ ```
+
+ ---
+
+ ## What It Classifies
+
+ ### Transform types (3 classes)
+
+ | Class | Meaning |
+ |---|---|
+ | `linear_to_cyclic` | Modular arithmetic in cyclic domains (clocks, calendars, wrap-around) |
+ | `relation_to_graph` | Transitive relation query over a directed entity graph |
+ | `relation_property_check` | Structural property check on a declared relation system |
+
+ ### Answer types (4 classes)
+
+ | Class | Meaning |
+ |---|---|
+ | `exact_answer` | A single definite value follows from the rules |
+ | `conditional_answer` | Answer depends on an unstated condition |
+ | `need_more_rule` | Insufficient rules to determine the answer |
+ | `unresolvable_without_observation` | Answer requires empirical observation not in the rules |
+
+ ---
+
+ ## Architecture
+
+ Conv1d text classifier trained entirely from scratch, with no pretrained embeddings.
+
+ | Component | Detail |
+ |---|---|
+ | Embedding | 808 × 64 (word-level, learned) |
+ | Encoder | 2 × Conv1d(kernel=3) + ReLU, output dim 128 |
+ | Pooling | Global max pooling over sequence |
+ | Heads | transform (3), answer_type (4), plus auxiliary heads |
+ | Parameters | **131,888** |
+ | Training time | ~25 min on CPU |
+
+ ---
+
+ ## Performance
+
+ Evaluated on the dev set after 28 epochs (best checkpoint by dev loss):
+
+ | Metric | Score |
+ |---|---|
+ | **transform_acc (dev)** | **99.55%** |
+ | **atype_acc (dev)** | **99.19%** |
+ | transform_acc (train) | 99.66% |
+ | atype_acc (train) | 99.37% |
+
+ Transform accuracy on the full test pipeline evaluation: **99.64%**.
+
+ ---
+
+ ## Limitations
+
+ - **Vocabulary size 808.** Trained on KnowForge synthetic text only. Out-of-domain vocabulary falls back to `<UNK>`. Accuracy degrades on very different phrasings.
+ - **No context.** The model sees only the raw input text, not the rule structure. It classifies by surface patterns learned from training data.
+ - **Not a reasoning model.** This classifier routes queries; it does not solve them. Use KnowForge-0.6B for full answer generation.
+ - **Synthetic distribution only.** Tested exclusively on procedurally generated KnowForge examples. Behaviour on real-world inputs is not evaluated.
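Review note on the `<UNK>` limitation: the sketch below reuses the tokenizer regex from `inference.py` with a small hand-copied excerpt of `vocab.json` to show how out-of-vocabulary tokens collapse to id 1. `VOCAB_EXCERPT`, `encode`, and `unk_rate` are illustrative names, not part of this commit. In the `vocab.json` shown in this commit I could not find `is`, `taller`, or `than`, so the English Quick Start prompt appears to be half `<UNK>`, while the Vietnamese prompt from the `inference.py` docstring maps entirely to known ids.

```python
import re

# Same pattern as _TOK_RE in inference.py: word-character runs or single punctuation marks.
TOK_RE = re.compile(r"[\w]+|[^\w\s]", re.UNICODE)

# Hand-copied excerpt of vocab.json from this commit (NOT the full 808 entries).
VOCAB_EXCERPT = {
    "<PAD>": 0, "<UNK>": 1, ".": 11, ",": 16, "có": 21, "không": 22,
    "?": 23, "a": 45, "b": 49, "c": 51, "hơn": 54, "cao": 201,
}


def encode(text, vocab):
    """Lower-case, tokenize, and map tokens to ids with <UNK> fallback."""
    tokens = TOK_RE.findall(text.lower())
    ids = [vocab.get(tok, vocab["<UNK>"]) for tok in tokens]
    return tokens, ids


def unk_rate(ids, unk_id=1):
    """Fraction of token ids that fell back to <UNK>."""
    return sum(i == unk_id for i in ids) / len(ids) if ids else 0.0


vi_tokens, vi_ids = encode("A cao hơn B, B cao hơn C. A có cao hơn C không?", VOCAB_EXCERPT)
en_tokens, en_ids = encode("A is taller than B. B is taller than C. Is A taller than C?", VOCAB_EXCERPT)

print("vi unk rate:", unk_rate(vi_ids))
print("en unk rate:", unk_rate(en_ids))
```

A check like `unk_rate` could be run before trusting a confidence score: a high `<UNK>` share means the classifier is seeing mostly placeholder tokens.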
best_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a978b3722eecc47b2dc780214f20b26788408c43fc46e836eb8cc855f3e7698b
+ size 528768
inference.py ADDED
@@ -0,0 +1,173 @@
+ """
+ KnowForge Encoder: standalone inference.
+
+ Predicts transform_type and answer_type from a KnowForge input prompt.
+
+ CLI: python inference.py "A cao hơn B, B cao hơn C. A có cao hơn C không?"
+ API: from inference import predict; result = predict("A cao hơn B...")
+ """
+ import json
+ import re
+ import sys
+ from pathlib import Path
+ from typing import Optional
+
+ import torch
+ import torch.nn as nn
+ import torch.nn.functional as F
+
+ _HERE = Path(__file__).parent
+
+ # ── Label maps (must match training) ────────────────────────────────────────
+
+ TRANSFORM_LABELS = ["linear_to_cyclic", "relation_property_check", "relation_to_graph"]
+ ATYPE_LABELS = ["conditional_answer", "exact_answer", "need_more_rule",
+                 "unresolvable_without_observation"]
+
+ # ── Tokenizer ────────────────────────────────────────────────────────────────
+
+ _TOK_RE = re.compile(r"[\w]+|[^\w\s]", re.UNICODE)
+
+
+ def _tokenize(text: str) -> list:
+     return _TOK_RE.findall(text.lower())
+
+
+ # ── Model architecture ───────────────────────────────────────────────────────
+
+ class _MultiTaskEncoder(nn.Module):
+     def __init__(self, vocab_size: int, embed_dim: int = 64,
+                  hidden_dim: int = 64, n_layers: int = 2, dropout: float = 0.3):
+         super().__init__()
+         enc_dim = hidden_dim * 2  # 128
+
+         self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
+         self.dropout = nn.Dropout(dropout)
+
+         conv_layers = []
+         in_ch = embed_dim
+         for _ in range(n_layers):
+             conv_layers += [nn.Conv1d(in_ch, enc_dim, 3, padding=1), nn.ReLU()]
+             in_ch = enc_dim
+         self.encoder = nn.Sequential(*conv_layers)
+
+         self.transform_head = nn.Linear(enc_dim, len(TRANSFORM_LABELS))
+         self.atype_head = nn.Linear(enc_dim, len(ATYPE_LABELS))
+         # Unused heads included so state_dict keys match exactly
+         self.etype_head = nn.Linear(enc_dim, 24)
+         self.uncertainty_head = nn.Linear(enc_dim, 5)
+         self.bio_head = nn.Linear(enc_dim, 12)
+
+     def forward(self, token_ids: torch.Tensor) -> dict:
+         x = self.embedding(token_ids)  # (B, L, E)
+         x = self.dropout(x)
+         out = self.encoder(x.transpose(1, 2)).transpose(1, 2)  # (B, L, 128)
+         # Global max pooling over sequence dim
+         pooled = out.max(dim=1).values  # (B, 128)
+         return {
+             "transform": self.transform_head(pooled),
+             "atype": self.atype_head(pooled),
+         }
+
+
+ # ── Lazy singleton loader ────────────────────────────────────────────────────
+
+ _encoder: Optional[_MultiTaskEncoder] = None
+ _vocab: Optional[dict] = None
+
+
+ def _load():
+     global _encoder, _vocab
+     if _encoder is not None:
+         return _encoder, _vocab
+
+     vocab_path = _HERE / "vocab.json"
+     cfg_path = _HERE / "model_config.json"
+     sf_path = _HERE / "best_model.safetensors"
+     pt_path = _HERE / "best_model.pt"
+
+     if not vocab_path.exists():
+         raise FileNotFoundError(f"vocab.json not found at {vocab_path}")
+
+     _vocab = json.loads(vocab_path.read_text(encoding="utf-8"))  # explicit UTF-8: keys are Vietnamese
+
+     cfg = json.loads(cfg_path.read_text(encoding="utf-8")) if cfg_path.exists() else {}
+     model = _MultiTaskEncoder(
+         vocab_size=cfg.get("vocab_size", len(_vocab)),
+         embed_dim=cfg.get("embed_dim", 64),
+         hidden_dim=cfg.get("hidden_dim", 64),
+         n_layers=cfg.get("n_layers", 2),
+         dropout=cfg.get("dropout", 0.3),
+     )
+
+     if sf_path.exists():
+         from safetensors.torch import load_file
+         state = load_file(str(sf_path))
+     elif pt_path.exists():
+         state = torch.load(str(pt_path), map_location="cpu", weights_only=True)
+     else:
+         raise FileNotFoundError(f"No model weights found at {sf_path} or {pt_path}")
+
+     model.load_state_dict(state)
+     model.eval()
+     _encoder = model
+     return _encoder, _vocab
+
+
+ # ── Public API ───────────────────────────────────────────────────────────────
+
+ def predict(text: str) -> dict:
+     """
+     Predict transform_type and answer_type for a KnowForge input.
+
+     Args:
+         text: Natural-language input (rules + question or question alone).
+
+     Returns:
+         {
+             "transform_type": str, one of linear_to_cyclic /
+                               relation_property_check /
+                               relation_to_graph,
+             "transform_confidence": float, softmax probability in [0, 1],
+             "answer_type": str, one of conditional_answer /
+                            exact_answer /
+                            need_more_rule /
+                            unresolvable_without_observation,
+             "atype_confidence": float,
+         }
+     """
+     model, vocab = _load()
+
+     toks = _tokenize(text)
+     ids = [vocab.get(t, vocab.get("<UNK>", 1)) for t in toks] or [0]  # empty input -> single <PAD>
+     x = torch.tensor([ids], dtype=torch.long)  # (1, L)
+
+     with torch.no_grad():
+         logits = model(x)
+
+     t_probs = F.softmax(logits["transform"][0], dim=-1)
+     a_probs = F.softmax(logits["atype"][0], dim=-1)
+
+     t_idx = int(t_probs.argmax())
+     a_idx = int(a_probs.argmax())
+
+     return {
+         "transform_type": TRANSFORM_LABELS[t_idx],
+         "transform_confidence": round(float(t_probs[t_idx]), 4),
+         "answer_type": ATYPE_LABELS[a_idx],
+         "atype_confidence": round(float(a_probs[a_idx]), 4),
+     }
+
+
+ def _main():
+     if len(sys.argv) < 2:
+         print("Usage: python inference.py \"<input text>\"")
+         sys.exit(1)
+     text = " ".join(sys.argv[1:])
+     result = predict(text)
+     print(f"Transform: {result['transform_type']} ({result['transform_confidence']:.2%})")
+     print(f"Answer type: {result['answer_type']} ({result['atype_confidence']:.2%})")
+
+
+ if __name__ == "__main__":
+     _main()
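Review note on downstream use: the README describes this model as a routing component, but the commit does not show how a caller would act on `predict()`'s output. The sketch below is one possible shape, under stated assumptions: `route`, `MIN_CONFIDENCE`, and the route names `"fallback"` and `"ask_for_more_input"` are hypothetical and do not appear anywhere in the KnowForge code. It only consumes a dict with the four keys that `predict()` returns.

```python
from typing import Dict

# Hypothetical cut-off: predictions below it are not trusted for routing.
MIN_CONFIDENCE = 0.80

# Answer types that (by their README descriptions) signal missing information.
UNDERDETERMINED = ("need_more_rule", "unresolvable_without_observation")


def route(prediction: Dict[str, object], min_confidence: float = MIN_CONFIDENCE) -> str:
    """Pick a route name from a dict shaped like the output of predict()."""
    t_conf = float(prediction["transform_confidence"])
    a_conf = float(prediction["atype_confidence"])
    # Either head being unsure is enough to avoid routing on the prediction.
    if min(t_conf, a_conf) < min_confidence:
        return "fallback"
    if prediction["answer_type"] in UNDERDETERMINED:
        return "ask_for_more_input"
    return str(prediction["transform_type"])


# Values taken from the README Quick Start output, used here as a stand-in for predict().
example = {
    "transform_type": "relation_to_graph",
    "transform_confidence": 0.9912,
    "answer_type": "exact_answer",
    "atype_confidence": 0.8734,
}
print(route(example))
```

Taking the minimum of the two confidences is a deliberately conservative choice; a real pipeline might weight the heads differently or calibrate the threshold on held-out data.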
model_config.json ADDED
@@ -0,0 +1,8 @@
+ {
+   "vocab_size": 808,
+   "embed_dim": 64,
+   "hidden_dim": 64,
+   "n_layers": 2,
+   "dropout": 0.3,
+   "param_count": 131888
+ }
+ }
requirements.txt ADDED
@@ -0,0 +1,2 @@
+ torch>=2.0.0
+ safetensors>=0.4.0
vocab.json ADDED
@@ -0,0 +1 @@
+ {"<PAD>": 0, "<UNK>": 1, "tính": 2, "chất": 3, ":": 4, "nhóm": 5, "1": 6, "liên": 7, "quan": 8, "đến": 9, "2": 10, ".": 11, "3": 12, "theo": 13, "định": 14, "nghĩa": 15, ",": 16, "bắc": 17, "cầu": 18, "áp": 19, "dụng": 20, "có": 21, "không": 22, "?": 23, "chỉ": 24, "cho": 25, "cặp": 26, "trực": 27, "tiếp": 28, "màn": 29, "hình": 30, "6": 31, "kênh": 32, "(": 33, "–": 34, ")": 35, "qua": 36, "về": 37, "đang": 38, "ở": 39, "chuyển": 40, "4": 41, "giá": 42, "trị": 43, "khu": 44, "a": 45, "=": 46, "76": 47, ">": 48, "b": 49, "72": 50, "c": 51, "67": 52, "nhiều": 53, "hơn": 54, "thắng": 55, "trong": 56, "trận": 57, "đấu": 58, "lần": 59, "hai": 60, "độc": 61, "lập": 62, "chắc": 63, "bộ": 64, "đếm": 65, "trạng": 66, "thái": 67, "sau": 68, "là": 69, "thêm": 70, "bước": 71, "đâu": 72, "tăng": 73, "đều": 74, "nhà": 75, "máy": 76, "<": 77, "chiến": 78, "kết": 79, "nối": 80, "→": 81, "biến": 82, "số": 83, "z": 84, ";": 85, "đảm": 86, "bảo": 87, "đợt": 88, "luân": 89, "phiên": 90, "đơn": 91, "vị": 92, "làm": 93, "nào": 94, "thứ": 95, "tự": 96, "xuất": 97, "hiện": 98, "đội": 99, "đầu": 100, "tiên": 101, "đó": 102, "rồi": 103, "lượt": 104, "và": 105, "ban": 106, "so": 107, "hiệu": 108, "quả": 109, "sang": 110, "mức": 111, "tiêu": 112, "thụ": 113, "năng": 114, "lượng": 115, "thước": 116, "đo": 117, "thay": 118, "đổi": 119, "trình": 120, "lặp": 121, "8": 122, "5": 123, "7": 124, "trí": 125, "tiến": 126, "hệ": 127, "tuyến": 128, "phần": 129, "tử": 130, "ngang": 131, "nhau": 132, "mọi": 133, "sánh": 134, "được": 135, "hùng": 136, "hàng": 137, "kiên": 138, "lâm": 139, "phản": 140, "ví": 141, "dụ": 142, "nhưng": 143, "trường": 144, "hợp": 145, "p": 146, "hỏi": 147, "10": 148, "liệu": 149, "phong": 150, "97": 151, "quang": 152, "93": 153, "sơn": 154, "88": 155, "vòng": 156, "vượt": 157, "mỗi": 158, "điều": 159, "kiện": 160, "riêng": 161, "phép": 162, "toán": 163, "-": 164, "+": 165, "mod": 166, "mốc": 167, "thời": 168, "gian": 169, "nam": 170, "trước": 171, "nhất": 172, "minh": 173, 
"long": 174, "gần": 175, "nằm": 176, "ổn": 177, "cũng": 178, "đã": 179, "xảy": 180, "ra": 181, "chu": 182, "kỳ": 183, "toàn": 184, "tập": 185, "hoa": 186, "lan": 187, "mai": 188, "vực": 189, "phân": 190, "xưởng": 191, "y": 192, "bối": 193, "cảnh": 194, "cục": 195, "tròn": 196, "từ": 197, "đi": 198, "chiều": 199, "thuận": 200, "cao": 201, "thấp": 202, "uy": 203, "dẫn": 204, "vinh": 205, "xuân": 206, "cuối": 207, "bảng": 208, "dùng": 209, "khả": 210, "mở": 211, "rộng": 212, "kinh": 213, "tế": 214, "quán": 215, "q1": 216, "81": 217, "q2": 218, "75": 219, "q3": 220, "70": 221, "điểm": 222, "thống": 223, "kê": 224, "mẫu": 225, "x": 226, "khác": 227, "công": 228, "thức": 229, "mới": 230, "với": 231, "căn": 232, "ven": 233, "sông": 234, "dài": 235, "điện": 236, "thoại": 237, "đời": 238, "cũ": 239, "chiếc": 240, "xe": 241, "màu": 242, "đỏ": 243, "thiết": 244, "bị": 245, "thử": 246, "nghiệm": 247, "thể": 248, "luận": 249, "chưa": 250, "chuỗi": 251, "bình": 252, "an": 253, "đảo": 254, "ngược": 255, "luôn": 256, "giành": 257, "tích": 258, "linh": 259, "hoạt": 260, "độ": 261, "chính": 262, "xác": 263, "ghế": 264, "9": 265, "dịch": 266, "khoảng": 267, "cách": 268, "mùa": 269, "hè": 270, "môi": 271, "nói": 272, "chung": 273, "dãy": 274, "tại": 275, "của": 276, "xét": 277, "rõ": 278, "/": 279, "cộng": 280, "nếu": 281, "thì": 282, "quy": 283, "tắc": 284, "đề": 285, "cập": 286, "ca": 287, "xoay": 288, "tuyên": 289, "bố": 290, "ràng": 291, "tam": 292, "đoạn": 293, "bất": 294, "ưu": 295, "hoặc": 296, "total": 297, "order": 298, "trọng": 299, "tỷ": 300, "lệ": 301, "thành": 302, "biên": 303, "tốc": 304, "chậm": 305, "m1": 306, "m2": 307, "m3": 308, "m4": 309, "lịch": 310, "ghi": 311, "nhận": 312, "thuần": 313, "sát": 314, "suy": 315, "circular": 316, "array": 317, "index": 318, "phía": 319, "xếp": 320, "hạng": 321, "đạt": 322, "phú": 323, "giang": 324, "yếu": 325, "tố": 326, "trung": 327, "m": 328, "top": 329, "nút": 330, "gây": 331, "tầng": 332, "tổng": 333, "họa": 334, "82": 335, 
"78": 336, "73": 337, "nhỏ": 338, "lớn": 339, "ghép": 340, "lẻ": 341, "sân": 342, "khấu": 343, "chỗ": 344, "ngồi": 345, "84": 346, "khái": 347, "quát": 348, "trội": 349, "địa": 350, "dự": 351, "án": 352, "bắt": 353, "89": 354, "83": 355, "suất": 356, "đồng": 357, "hồ": 358, "quay": 359, "lại": 360, "77": 361, "74": 362, "slot": 363, "chi": 364, "phí": 365, "nguyên": 366, "này": 367, "nhớ": 368, "ô": 369, "con": 370, "trỏ": 371, "cố": 372, "cứ": 373, "item": 374, "80": 375, "68": 376, "tình": 377, "huống": 378, "tác": 379, "động": 380, "gián": 381, "thực": 382, "mô": 383, "lý": 384, "thuyết": 385, "ước": 386, "0": 387, "điệu": 388, "loại": 389, "chênh": 390, "kéo": 391, "xây": 392, "dựng": 393, "lệch": 394, "giữa": 395, "tuy": 396, "nhiên": 397, "automaton": 398, "cạnh": 399, "biển": 400, "sâu": 401, "alpha": 402, "sao": 403, "hay": 404, "gió": 405, "lửa": 406, "núi": 407, "tới": 408, "xử": 409, "kho": 410, "sắp": 411, "ngọc": 412, "đứng": 413, "trên": 414, "phát": 415, "quỳnh": 416, "fsm": 417, "tuần": 418, "hoàn": 419, "giảm": 420, "dần": 421, "sử": 422, "năm": 423, "nữa": 424, "ảnh": 425, "hưởng": 426, "cấp": 427, "thẳng": 428, "thanh": 429, "tràn": 430, "đường": 431, "đua": 432, "dừng": 433, "biết": 434, "gói": 435, "enterprise": 436, "ít": 437, "model": 438, "basic": 439, "pro": 440, "clb": 441, "đất": 442, "rừng": 443, "khi": 444, "giống": 445, "p1": 446, "98": 447, "p2": 448, "95": 449, "p3": 450, "87": 451, "2019": 452, "2011": 453, "2006": 454, "bền": 455, "đây": 456, "phải": 457, "thắm": 458, "uyên": 459, "vân": 460, "thiếu": 461, "khoản": 462, "tú": 463, "ánh": 464, "tất": 465, "cả": 466, "q10": 467, "q6": 468, "vào": 469, "team": 470, "gamma": 471, "beta": 472, "91": 473, "92": 474, "đúng": 475, "giai": 476, "cùng": 477, "mảng": 478, "[": 479, "]": 480, "danh": 481, "sách": 482, "khép": 483, "kín": 484, "mục": 485, "nhảy": 486, "hữu": 487, "hạn": 488, "ký": 489, "nên": 490, "đầy": 491, "đủ": 492, "luật": 493, "lô": 494, "đặc": 495, "liền": 496, "cơ": 
497, "chế": 498, "≠": 499, "từng": 500, "đỉnh": 501, "đáy": 502, "2010": 503, "2001": 504, "1994": 505, "dữ": 506, "thô": 507, "thấy": 508, "phụ": 509, "thuộc": 510, "đối": 511, "bằng": 512, "chứng": 513, "vs": 514, "11": 515, "thông": 516, "nhân": 517, "t": 518, "đêm": 519, "2018": 520, "2015": 521, "2007": 522, "gì": 523, "học": 524, "tài": 525, "đánh": 526, "bại": 527, "gặp": 528, "q9": 529, "kế": 530, "thừa": 531, "79": 532, "71": 533, "cần": 534, "thơ": 535, "nặng": 536, "đà": 537, "lạt": 538, "hải": 539, "phòng": 540, "tre": 541, "d": 542, "r": 543, "mây": 544, "nova": 545, "quân": 546, "rẻ": 547, "dũng": 548, "nha": 549, "trang": 550, "buôn": 551, "ma": 552, "thuột": 553, "hà": 554, "nội": 555, "buổi": 556, "sáng": 557, "thí": 558, "85": 559, "đông": 560, "102": 561, "94": 562, "ngoài": 563, "12": 564, "dòng": 565, "2009": 566, "2013": 567, "v": 568, "áo": 569, "len": 570, "xa": 571, "tủ": 572, "đồ": 573, "gỗ": 574, "sồi": 575, "lược": 576, "ii": 577, "option": 578, "hoạch": 579, "phức": 580, "tạp": 581, "vận": 582, "hành": 583, "hạnh": 584, "thảo": 585, "nóng": 586, "bản": 587, "2028": 588, "2020": 589, "2012": 590, "q": 591, "nhanh": 592, "nẵng": 593, "xanh": 594, "q8": 595, "q5": 596, "90": 597, "phương": 598, "truyền": 599, "đệm": 600, "nêu": 601, "2014": 602, "2004": 603, "2002": 604, "max": 605, "version": 606, "a17": 607, "việt": 608, "xứng": 609, "2022": 610, "2003": 611, "2021": 612, "k": 613, "96": 614, "huế": 615, "ngắn": 616, "nhơn": 617, "c1": 618, "c2": 619, "c3": 620, "c4": 621, "thu": 622, "86": 623, "69": 624, "phúc": 625, "ngoại": 626, "khía": 627, "chuẩn": 628, "nano": 629, "bao": 630, "gồm": 631, "đắt": 632, "tiền": 633, "quý": 634, "hiếm": 635, "2017": 636, "2008": 637, "99": 638, "2027": 639, "1997": 640, "64": 641, "q0": 642, "tháng": 643, "chẵn": 644, "ba": 645, "100": 646, "tốt": 647, "v1": 648, "v2": 649, "v3": 650, "v4": 651, "tả": 652, "k1": 653, "k2": 654, "k3": 655, "k4": 656, "66": 657, "13": 658, "2016": 659, "vụ": 660, 
"doanh": 661, "nghiệp": 662, "hàm": 663, "subset": 664, "lớp": 665, "1996": 666, "2000": 667, "65": 668, "tuổi": 669, "nhi": 670, "giờ": 671, "thường": 672, "biệt": 673, "2024": 674, "quyển": 675, "bìa": 676, "cứng": 677, "1998": 678, "ngày": 679, "muộn": 680, "khởi": 681, "diễn": 682, "bán": 683, "nhẹ": 684, "1995": 685, "dày": 686, "dạn": 687, "ngân": 688, "oanh": 689, "i": 690, "tùng": 691, "101": 692, "sự": 693, "q7": 694, "q4": 695, "g1": 696, "g2": 697, "g3": 698, "g4": 699, "2025": 700, "g": 701, "h": 702, "1990": 703, "1993": 704, "2026": 705, "sớm": 706, "xế": 707, "tối": 708, "r1": 709, "r2": 710, "r3": 711, "r4": 712, "một": 713, "thang": 714, "duy": 715, "'": 716, "quen": 717, "trò": 718, "chơi": 719, "trục": 720, "để": 721, "—": 722, "ngữ": 723, "bài": 724, "tranh": 725, "strict": 726, "partial": 727, "cung": 728, "tin": 729, "transitivity": 730, "kiểm": 731, "soát": 732, "hạ": 733, "matchup": 734, "trễ": 735, "transitive": 736, "thích": 737, "yêu": 738, "các": 739, "hỗ": 740, "trợ": 741, "closure": 742, "thi": 743, "bổ": 744, "sung": 745, "tưởng": 746, "cycle": 747, "quyết": 748, "chốt": 749, "u": 750, "w": 751, "tường": 752, "j": 753, "cụ": 754, "trưa": 755, "thua": 756, "21": 757, "24": 758, "nhiêu": 759, "lùi": 760, "32": 761, "24h": 762, "00": 763, "tiếng": 764, "mấy": 765, "buffer": 766, "15": 767, "bội": 768, "bây": 769, "56": 770, "18": 771, "30": 772, "kim": 773, "vạch": 774, "23": 775, "thúc": 776, "22": 777, "46": 778, "25": 779, "26": 780, "ring": 781, "17": 782, "16": 783, "đèn": 784, "báo": 785, "36": 786, "60": 787, "hướng": 788, "la": 789, "bàn": 790, "°": 791, "kwh": 792, "12h": 793, "14": 794, "20": 795, "bậc": 796, "lên": 797, "nấc": 798, "47": 799, "19": 800, "33": 801, "âm": 802, "tục": 803, "sẽ": 804, "28": 805, "vàng": 806, "tây": 807}