ModerRAS committed on
Commit
f712f4b
·
1 Parent(s): 76e084f

Train thin anime filename parser

MAINTENANCE.md CHANGED
@@ -117,8 +117,16 @@ Validate / 验证:
117
  ```powershell
118
  uv run python evaluate_parser_cases.py --model-dir . --case-file data/parser_regression_cases.json --output case_metrics.json
119
  uv run python onnx_inference.py "[GM-Team][国漫][神印王座][Throne of Seal][2022][200][AVC][GB][1080P].mp4"
 
120
  ```
121
 
122
  ## Dataset Submodule / 数据集子模块
123
 
124
  If `datasets/AnimeName` changed, commit and push it first:
@@ -184,4 +192,3 @@ scraper/src/main/assets/anime_parser/anime_filename_parser.onnx
184
  scraper/src/main/assets/anime_parser/config.json
185
  scraper/src/main/assets/anime_parser/vocab.json
186
  ```
187
-
 
117
  ```powershell
118
  uv run python evaluate_parser_cases.py --model-dir . --case-file data/parser_regression_cases.json --output case_metrics.json
119
  uv run python onnx_inference.py "[GM-Team][国漫][神印王座][Throne of Seal][2022][200][AVC][GB][1080P].mp4"
120
+ uv run python benchmark_inference.py --model-dir . --onnx exports/anime_filename_parser.onnx --case-file data/parser_regression_cases.json --repeat 20 --warmup 20 --torch-threads 1 --ort-threads 1 --output benchmark_results.json
121
  ```
122
 
123
+ The default parser path is thin runtime: model logits, constrained BIO, entity
124
+ aggregation, and light string/number normalization. `--rule-assist` is a
125
+ compatibility/diagnostic mode only; do not use it as the primary quality metric.
126
+
127
+ 默认解析路径是薄层运行时:模型 logits、约束 BIO、实体聚合和轻量字符串/数字规范化。
128
+ `--rule-assist` 只是兼容/诊断模式,不作为主质量指标。
129
+
130
  ## Dataset Submodule / 数据集子模块
131
 
132
  If `datasets/AnimeName` changed, commit and push it first:
 
192
  scraper/src/main/assets/anime_parser/config.json
193
  scraper/src/main/assets/anime_parser/vocab.json
194
  ```
 
README.md CHANGED
@@ -25,7 +25,10 @@ model-index:
25
  type: parser-regression
26
  metrics:
27
  - type: accuracy
28
- name: Fixed parser full-match accuracy
 
 
 
29
  value: 1.0
30
  ---
31
 
@@ -52,9 +55,9 @@ This repository is the Hugging Face model repo used by MiruPlay as `tools/anime_
52
  | Default checkpoint / 默认权重 | Repository root files (`config.json`, `model.safetensors`, `vocab.json`, `tokenizer_config.json`) |
53
  | ONNX export / ONNX 导出 | `exports/anime_filename_parser.onnx` |
54
 
55
- **中文**:根目录就是发布 checkpoint,不再保留旧的 `model/` 重复副本。完整解析请使用本仓库的 `inference.py` 或复用 `tokenizer.py`、BIO decode 字段聚合逻辑;直接 `from_pretrained()` 只能加载 token-classification 权重。
56
 
57
- **English**: The repository root is the published checkpoint. The old duplicate `model/` directory is intentionally not used. For end-to-end parsing, use `inference.py` or reuse this repo's tokenizer, BIO decoder, and field aggregation logic; `from_pretrained()` only loads token-classification weights.
58
 
59
  ## Intended Use / 使用场景
60
 
@@ -114,7 +117,7 @@ The ONNX graph outputs token logits only. A complete parser still needs:
114
 
115
  1. custom character tokenization,
116
  2. constrained BIO decoding,
117
- 3. field aggregation and high-confidence structural cleanup.
118
 
119
  本仓库提供最小可运行示例:
120
 
@@ -136,15 +139,17 @@ Current published checkpoint:
136
 
137
  | Metric / 指标 | Value / 数值 |
138
  | --- | --- |
139
- | Fixed real-case regression / 固定真实回归 | 26/26 full match |
140
- | ONNX parity / ONNX 误差 | max abs diff `2.6703e-05` |
141
- | Token/entity eval after focus tuning / focus 微调后实体评估 | F1 `0.9666`, token accuracy `0.9904` |
142
- | Focus parse eval / focus 解析评估 | 385/512 full match |
143
- | CPU end-to-end latency / CPU 端到端延迟 | ONNX avg `30.35 ms`, P95 `34.44 ms` |
 
 
144
 
145
- **中文**:当前发布模型是“全量重标注 char 模型 + special-code focus 微调”。固定回归集覆盖真实用户反馈样式;focus eval 是偏向困难样本的评估,不等同于全随机 DMHY 评估
146
 
147
- **English**: The published checkpoint is the full-relabel character model plus a targeted special-code focus fine-tune. The fixed regression set covers real user-reported patterns; focus eval is intentionally biased toward hard examples and is not equivalent to a broad random DMHY evaluation.
148
 
149
  Run regression:
150
 
@@ -162,21 +167,21 @@ Benchmark command:
162
  uv run python benchmark_inference.py --model-dir . --onnx exports/anime_filename_parser.onnx --case-file data/parser_regression_cases.json --repeat 20 --warmup 20 --torch-threads 1 --ort-threads 1 --output benchmark_results.json
163
  ```
164
 
165
- Local CPU benchmark on the 26 fixed real-world cases, single-threaded, including
166
- tokenization, model/session forward, constrained BIO decoding, and field
167
- postprocessing:
168
 
169
- 本地 CPU 单线程测试,使用 26 条固定真实 case,包含 tokenizer、模型/session
170
- 前向、约束 BIO 解码和字段后处理
171
 
172
  | Backend / 后端 | Load ms / 加载 ms | Avg ms / 平均 ms | P50 ms | P95 ms | P99 ms | files/s |
173
  | --- | ---: | ---: | ---: | ---: | ---: | ---: |
174
- | PyTorch | 64.63 | 32.86 | 32.43 | 38.42 | 41.09 | 30.4 |
175
- | ONNX Runtime | 898.63 | 30.35 | 30.12 | 34.44 | 36.86 | 33.0 |
176
 
177
- **中文**:这是完整 parser 的端到端延迟,不是只测模型 forward。模型本身很小,主要成本来自 Python/运行时的 BIO 解码和字段聚合;移动端实现应复用相同逻辑但避免重复创建 ONNX session。
178
 
179
- **English**: This is end-to-end parser latency, not model-forward-only timing. The model is small; most runtime cost is tokenizer/BIO decode/field aggregation overhead. Mobile code should keep the ONNX session reusable and avoid recreating it per filename.
180
 
181
  ## Training / 训练
182
 
@@ -200,6 +205,7 @@ uv run python train.py --tokenizer char `
200
  --checkpoint-steps 1000 `
201
  --save-total-limit 3 `
202
  --parse-eval-limit 2048 `
 
203
  --seed 52 `
204
  --experiment-name dmhy-char-full
205
  ```
@@ -271,11 +277,11 @@ See [`MAINTENANCE.md`](MAINTENANCE.md) for release steps, LFS order, dataset sub
271
  **中文**
272
 
273
  - 发布命名没有统一标准,极端 OCR 噪声、乱码、非动画命名仍可能失败。
274
- - ONNX 只包含模型 logits,不包含 tokenizer 和后处理;移动端必须保持 tokenizer/vocab/config 一致。
275
  - `source` 当前是单值字段,复杂文件名里可能同时存在平台、发布源、编码器和语言标签。
276
 
277
  **English**
278
 
279
  - Anime release names are not standardized; extreme OCR noise, mojibake, or non-anime names can still fail.
280
- - ONNX contains logits only. Mobile runtimes must keep tokenizer, vocabulary, config, BIO decode, and postprocessing in sync.
281
  - `source` is currently a single field, while real filenames may contain platform, release source, codec, and language tags together.
 
25
  type: parser-regression
26
  metrics:
27
  - type: accuracy
28
+ name: Fixed parser model-only full-match accuracy
29
+ value: 0.9615
30
+ - type: accuracy
31
+ name: Fixed parser thin-runtime full-match accuracy
32
  value: 1.0
33
  ---
34
 
 
55
  | Default checkpoint / 默认权重 | Repository root files (`config.json`, `model.safetensors`, `vocab.json`, `tokenizer_config.json`) |
56
  | ONNX export / ONNX 导出 | `exports/anime_filename_parser.onnx` |
57
 
58
+ **中文**:根目录就是发布 checkpoint,不再保留旧的 `model/` 重复副本。默认解析路径是“模型 logits + 约束 BIO + 字段规范化”,不再默认启用重结构规则;直接 `from_pretrained()` 只能加载 token-classification 权重。
59
 
60
+ **English**: The repository root is the published checkpoint. The default parser is model logits + constrained BIO + thin field normalization; heavy structural assist is not enabled by default. `from_pretrained()` only loads token-classification weights.
61
 
62
  ## Intended Use / 使用场景
63
 
 
117
 
118
  1. custom character tokenization,
119
  2. constrained BIO decoding,
120
+ 3. field aggregation and thin string/number normalization.
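The post-logit steps listed above can be sketched as follows. This is a minimal illustration only, assuming labels of the form `O`, `B-X`, `I-X`, a greedy left-to-right constraint, and a "first span wins" rule per field; the function names are hypothetical and the repository's own decoder and normalization may differ.

```python
from typing import Dict, List, Sequence, Tuple


def is_legal(prev: str, label: str) -> bool:
    """I-X may only follow B-X or I-X of the same type; O and B-X are always allowed."""
    if not label.startswith("I-"):
        return True
    kind = label[2:]
    return prev in ("B-" + kind, "I-" + kind)


def constrained_bio_decode(logits: Sequence[Sequence[float]], id2label: Dict[int, str]) -> List[str]:
    """Greedy decode: per character, take the best-scoring label that is legal after the previous one."""
    labels: List[str] = []
    prev = "O"
    for row in logits:
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        for idx in ranked:
            if is_legal(prev, id2label[idx]):
                prev = id2label[idx]
                break
        else:
            prev = "O"  # fallback if the label set offers no legal choice
        labels.append(prev)
    return labels


def aggregate_entities(text: str, labels: Sequence[str]) -> List[Tuple[str, str]]:
    """Merge consecutive B-X / I-X characters into (type, substring) pairs."""
    entities: List[Tuple[str, str]] = []
    start, kind = None, ""
    for i, label in enumerate(list(labels) + ["O"]):  # sentinel flushes the last span
        if start is not None and not label.startswith("I-"):
            entities.append((kind, text[start:i]))
            start = None
        if label.startswith("B-"):
            start, kind = i, label[2:]
    return entities


def normalize_fields(entities: Sequence[Tuple[str, str]]) -> Dict[str, object]:
    """Light normalization: trim strings, turn numeric episode/season values into ints."""
    fields: Dict[str, object] = {}
    for kind, value in entities:
        value = value.strip()
        if kind in ("EPISODE", "SEASON") and value.isascii() and value.isdigit():
            fields.setdefault(kind.lower(), int(value))
        else:
            fields.setdefault(kind.lower(), value)  # assumption: first span wins
    return fields
```

With a three-label set and the text `a07`, a stray `I-EPISODE` on the first character is rejected in favor of `O`, so only `07` is aggregated and normalized to the integer 7.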
121
 
122
  本仓库提供最小可运行示例:
123
 
 
139
 
140
  | Metric / 指标 | Value / 数值 |
141
  | --- | --- |
142
+ | Fixed regression, model-only / 固定回归,纯模型聚合 | 25/26 full match = `96.15%` |
143
+ | Fixed regression, default thin runtime / 固定回归,默认薄层运行时 | 26/26 full match = `100%` |
144
+ | Focus held-out, model-only / 困难抽样,纯模型聚合 | 1014/1024 full match = `99.02%` |
145
+ | Focus held-out, default thin runtime / 困难抽样,默认薄层运行时 | 1017/1024 full match = `99.32%` |
146
+ | Token/entity eval / token/entity 评估 | F1 `0.9972`, token accuracy `0.9995` |
147
+ | ONNX parity / ONNX 误差 | max abs diff `4.0531e-05` |
148
+ | CPU thin-runtime latency / CPU 薄层运行时延迟 | ONNX avg `13.08 ms`, P95 `15.95 ms` |
149
 
150
+ **中文**:当前发布模型是“全量重标注 char 模型 + thin hard-case focus 微调”。README 主指标以 `model-only` 和默认薄层 `normalized-only` 为准;`--rule-assist` 只保留为兼容/诊断对照,不再作为模型质量标准
151
 
152
+ **English**: The published checkpoint is the full-relabel character model plus a thin hard-case focus fine-tune. README quality numbers prioritize `model-only` and the default thin `normalized-only` runtime; `--rule-assist` is retained only for compatibility/diagnostics.
153
 
154
  Run regression:
155
 
 
167
  uv run python benchmark_inference.py --model-dir . --onnx exports/anime_filename_parser.onnx --case-file data/parser_regression_cases.json --repeat 20 --warmup 20 --torch-threads 1 --ort-threads 1 --output benchmark_results.json
168
  ```
169
 
170
+ Local CPU benchmark on the 26 fixed real-world cases, single-threaded, using the
171
+ default thin runtime: tokenization, model/session forward, constrained BIO
172
+ decoding, entity aggregation, and light string/number normalization:
173
 
174
+ 本地 CPU 单线程测试,使用 26 条固定真实 case,默认薄层运行时,包含 tokenizer、
175
+ 模型/session 前向、约束 BIO 解码、实体聚合和轻量字符串/数字规范化
176
 
177
  | Backend / 后端 | Load ms / 加载 ms | Avg ms / 平均 ms | P50 ms | P95 ms | P99 ms | files/s |
178
  | --- | ---: | ---: | ---: | ---: | ---: | ---: |
179
+ | PyTorch | 49.07 | 15.16 | 14.87 | 18.50 | 21.91 | 66.0 |
180
+ | ONNX Runtime | 568.85 | 13.08 | 12.82 | 15.95 | 20.19 | 76.5 |
181
 
182
+ **中文**:这是完整薄层 parser 的端到端延迟,不是只测模型 forward。移动端实现应复用 ONNX session,并保持 tokenizer/BIO/薄规范化逻辑一致
183
 
184
+ **English**: This is end-to-end thin-parser latency, not model-forward-only timing. Mobile code should keep the ONNX session reusable and keep tokenizer/BIO/thin-normalization behavior aligned.
185
 
186
  ## Training / 训练
187
 
 
205
  --checkpoint-steps 1000 `
206
  --save-total-limit 3 `
207
  --parse-eval-limit 2048 `
208
+ --case-eval-file data/parser_regression_cases.json `
209
  --seed 52 `
210
  --experiment-name dmhy-char-full
211
  ```
 
277
  **中文**
278
 
279
  - 发布命名没有统一标准,极端 OCR 噪声、乱码、非动画命名仍可能失败。
280
+ - ONNX 只包含模型 logits,不包含 tokenizer、BIO decode 和薄字段规范化;移动端必须保持 tokenizer/vocab/config 一致。
281
  - `source` 当前是单值字段,复杂文件名里可能同时存在平台、发布源、编码器和语言标签。
282
 
283
  **English**
284
 
285
  - Anime release names are not standardized; extreme OCR noise, mojibake, or non-anime names can still fail.
286
+ - ONNX contains logits only. Mobile runtimes must keep tokenizer, vocabulary, config, BIO decode, and thin normalization in sync.
287
  - `source` is currently a single field, while real filenames may contain platform, release source, codec, and language tags together.
benchmark_inference.py CHANGED
@@ -95,7 +95,8 @@ def main() -> None:
95
  parser.add_argument("--torch-threads", type=int, default=1, help="torch intra-op thread count")
96
  parser.add_argument("--ort-threads", type=int, default=1, help="ONNX Runtime intra/inter-op thread count")
97
  parser.add_argument("--no-constrained-bio", action="store_true", help="Use greedy labels for PyTorch backend")
98
- parser.add_argument("--no-rule-assist", action="store_true", help="Disable structural postprocessing")
 
99
  parser.add_argument("--output", default=None, help="Optional JSON output path")
100
  args = parser.parse_args()
101
 
@@ -127,7 +128,7 @@ def main() -> None:
127
  id2label,
128
  max_length=resolved_max_length,
129
  debug=False,
130
- use_rules=not args.no_rule_assist,
131
  constrain_bio=not args.no_constrained_bio,
132
  )
133
 
@@ -149,7 +150,7 @@ def main() -> None:
149
  load_ms = (time.perf_counter() - load_start) * 1000.0
150
 
151
  def parse_onnx(filename: str) -> Dict:
152
- return onnx_parser.parse(filename, use_rules=not args.no_rule_assist)
153
 
154
  raw = run_benchmark("onnxruntime", parse_onnx, filenames, args.warmup, args.repeat)
155
  results.append(summarize(raw["name"], load_ms, raw["latencies_ms"]))
@@ -163,7 +164,7 @@ def main() -> None:
163
  "warmup": args.warmup,
164
  "torch_threads": args.torch_threads,
165
  "ort_threads": args.ort_threads,
166
- "use_rules": not args.no_rule_assist,
167
  "constrain_bio": not args.no_constrained_bio,
168
  "results": results,
169
  }
 
95
  parser.add_argument("--torch-threads", type=int, default=1, help="torch intra-op thread count")
96
  parser.add_argument("--ort-threads", type=int, default=1, help="ONNX Runtime intra/inter-op thread count")
97
  parser.add_argument("--no-constrained-bio", action="store_true", help="Use greedy labels for PyTorch backend")
98
+ parser.add_argument("--rule-assist", action="store_true", help="Enable legacy structural postprocessing")
99
+ parser.add_argument("--no-rule-assist", action="store_true", help=argparse.SUPPRESS)
100
  parser.add_argument("--output", default=None, help="Optional JSON output path")
101
  args = parser.parse_args()
102
 
 
128
  id2label,
129
  max_length=resolved_max_length,
130
  debug=False,
131
+ use_rules=args.rule_assist and not args.no_rule_assist,
132
  constrain_bio=not args.no_constrained_bio,
133
  )
134
 
 
150
  load_ms = (time.perf_counter() - load_start) * 1000.0
151
 
152
  def parse_onnx(filename: str) -> Dict:
153
+ return onnx_parser.parse(filename, use_rules=args.rule_assist and not args.no_rule_assist)
154
 
155
  raw = run_benchmark("onnxruntime", parse_onnx, filenames, args.warmup, args.repeat)
156
  results.append(summarize(raw["name"], load_ms, raw["latencies_ms"]))
 
164
  "warmup": args.warmup,
165
  "torch_threads": args.torch_threads,
166
  "ort_threads": args.ort_threads,
167
+ "use_rules": args.rule_assist and not args.no_rule_assist,
168
  "constrain_bio": not args.no_constrained_bio,
169
  "results": results,
170
  }
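The flag change in this file flips rule assist from opt-out to opt-in while still accepting the old `--no-rule-assist` spelling as a hidden option. A minimal standalone sketch of that argparse pattern follows; `build_parser` and `resolve_use_rules` are hypothetical helper names, not functions from the script.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # New opt-in flag: rule assist is off unless explicitly requested.
    parser.add_argument("--rule-assist", action="store_true",
                        help="Enable legacy structural postprocessing")
    # Old flag kept so existing command lines still parse; hidden from --help.
    parser.add_argument("--no-rule-assist", action="store_true",
                        help=argparse.SUPPRESS)
    return parser


def resolve_use_rules(argv: Optional[List[str]] = None) -> bool:
    args = build_parser().parse_args(argv)
    # Same expression as in the diff: the legacy opt-out wins if both are given.
    return args.rule_assist and not args.no_rule_assist
```

Under this pattern, an empty command line and a lone `--no-rule-assist` both resolve to `False`, so older invocations keep working but now run the thin runtime.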
benchmark_results.json CHANGED
@@ -3,34 +3,36 @@
3
  "onnx": "exports/anime_filename_parser.onnx",
4
  "case_file": "data/parser_regression_cases.json",
5
  "case_count": 26,
6
- "repeat": 50,
7
  "warmup": 20,
8
  "torch_threads": 1,
9
  "ort_threads": 1,
 
 
10
  "results": [
11
  {
12
  "name": "pytorch",
13
- "load_ms": 48.104200046509504,
14
- "runs": 1300,
15
- "avg_ms": 240.13151522954175,
16
- "p50_ms": 211.5633500216063,
17
- "p95_ms": 460.0564300373662,
18
- "p99_ms": 638.7356059905142,
19
- "min_ms": 55.40569999720901,
20
- "max_ms": 673.8430999685079,
21
- "throughput_fps": 4.164384666644442
22
  },
23
  {
24
  "name": "onnxruntime",
25
- "load_ms": 830.1237999694422,
26
- "runs": 1300,
27
- "avg_ms": 253.9665275382308,
28
- "p50_ms": 255.0988500006497,
29
- "p95_ms": 445.8765349787427,
30
- "p99_ms": 584.5061249908758,
31
- "min_ms": 52.04109998885542,
32
- "max_ms": 738.4270000038669,
33
- "throughput_fps": 3.937526766591181
34
  }
35
  ]
36
  }
 
3
  "onnx": "exports/anime_filename_parser.onnx",
4
  "case_file": "data/parser_regression_cases.json",
5
  "case_count": 26,
6
+ "repeat": 20,
7
  "warmup": 20,
8
  "torch_threads": 1,
9
  "ort_threads": 1,
10
+ "use_rules": false,
11
+ "constrain_bio": true,
12
  "results": [
13
  {
14
  "name": "pytorch",
15
+ "load_ms": 49.07089995685965,
16
+ "runs": 520,
17
+ "avg_ms": 15.156135000646687,
18
+ "p50_ms": 14.874850050546229,
19
+ "p95_ms": 18.50034496746957,
20
+ "p99_ms": 21.91202303394671,
21
+ "min_ms": 11.207600007764995,
22
+ "max_ms": 26.899200049228966,
23
+ "throughput_fps": 65.97988207134152
24
  },
25
  {
26
  "name": "onnxruntime",
27
+ "load_ms": 568.8452000031248,
28
+ "runs": 520,
29
+ "avg_ms": 13.076459232475967,
30
+ "p50_ms": 12.81869993545115,
31
+ "p95_ms": 15.947990084532643,
32
+ "p99_ms": 20.187044028425575,
33
+ "min_ms": 10.0586999906227,
34
+ "max_ms": 22.88920001592487,
35
+ "throughput_fps": 76.4733007782761
36
  }
37
  ]
38
  }
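The summary fields in this file are related: `runs` equals `case_count` times `repeat` (26 × 20 = 520), and `throughput_fps` equals 1000 divided by `avg_ms` (1000 / 15.156 ≈ 65.98). The sketch below shows one way such a summary could be computed from raw latencies. The linear-interpolation percentile and the name `summarize_latencies` are assumptions; the script's own `summarize` is not shown in this diff and may differ.

```python
from typing import Dict, List, Sequence


def percentile(sorted_values: Sequence[float], q: float) -> float:
    """Percentile with linear interpolation between neighbouring ranks (assumed method)."""
    if not sorted_values:
        raise ValueError("no samples")
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1.0 - frac) + sorted_values[upper] * frac


def summarize_latencies(name: str, load_ms: float, latencies_ms: List[float]) -> Dict[str, object]:
    ordered = sorted(latencies_ms)
    avg_ms = sum(ordered) / len(ordered)
    return {
        "name": name,
        "load_ms": load_ms,
        "runs": len(ordered),
        "avg_ms": avg_ms,
        "p50_ms": percentile(ordered, 50),
        "p95_ms": percentile(ordered, 95),
        "p99_ms": percentile(ordered, 99),
        "min_ms": ordered[0],
        "max_ms": ordered[-1],
        # One file per call, so files per second is the inverse of the mean latency.
        "throughput_fps": 1000.0 / avg_ms,
    }
```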
build_repair_focus_dataset.py CHANGED
@@ -61,6 +61,94 @@ def char_item(filename: str, spans: List[tuple[str, str]]) -> dict:
61
 
62
 
63
  def manual_cases() -> Iterable[dict]:
64
  yield char_item(
65
  "[AI-Raws] 炎炎の消防隊 弐ノ章 #13 (BD HEVC 1920x1080 yuv444p10le FLAC)[FC74A2D5].mkv",
66
  [
 
61
 
62
 
63
  def manual_cases() -> Iterable[dict]:
64
+ yield char_item(
65
+ "One.Piece.1110.1080p.WEB-DL.AAC2.0.H.264",
66
+ [
67
+ ("One.Piece", "TITLE"),
68
+ ("1110", "EPISODE"),
69
+ ("1080p", "RESOLUTION"),
70
+ ("WEB-DL", "SOURCE"),
71
+ ],
72
+ )
73
+ yield char_item(
74
+ "One.Piece.1111.1080p.WEB-DL.AAC2.0.H.264",
75
+ [
76
+ ("One.Piece", "TITLE"),
77
+ ("1111", "EPISODE"),
78
+ ("1080p", "RESOLUTION"),
79
+ ("WEB-DL", "SOURCE"),
80
+ ],
81
+ )
82
+ yield char_item(
83
+ "【喵萌奶茶屋】★04月新番★[葬送的芙莉莲][01][1080P][HEVC]",
84
+ [
85
+ ("喵萌奶茶屋", "GROUP"),
86
+ ("葬送的芙莉莲", "TITLE"),
87
+ ("01", "EPISODE"),
88
+ ("1080P", "RESOLUTION"),
89
+ ("HEVC", "SOURCE"),
90
+ ],
91
+ )
92
+ yield char_item(
93
+ "【喵萌奶茶屋】★10月新番★[药屋少女的呢喃][02][1080P][HEVC]",
94
+ [
95
+ ("喵萌奶茶屋", "GROUP"),
96
+ ("药屋少女的呢喃", "TITLE"),
97
+ ("02", "EPISODE"),
98
+ ("1080P", "RESOLUTION"),
99
+ ("HEVC", "SOURCE"),
100
+ ],
101
+ )
102
+ yield char_item(
103
+ "[Billion Meta Lab] 魔法姊妹露露莉莉 Mahou no Shimai Rurutto Riryi [07][1080P][CHT&JPN][檢索:魔法姊妹露露特莉莉].mp4",
104
+ [
105
+ ("Billion Meta Lab", "GROUP"),
106
+ ("魔法姊妹露露莉莉 Mahou no Shimai Rurutto Riryi", "TITLE"),
107
+ ("07", "EPISODE"),
108
+ ("1080P", "RESOLUTION"),
109
+ ("CHT&JPN", "SOURCE"),
110
+ ("檢索:魔法姊妹露露特莉莉", "SPECIAL"),
111
+ ],
112
+ )
113
+ yield char_item(
114
+ "[Billion Meta Lab] 魔法姊妹露露莉莉 Mahou no Shimai Rurutto Riryi [08][1080P][CHT&JPN][检索:魔法姊妹露露特莉莉].mp4",
115
+ [
116
+ ("Billion Meta Lab", "GROUP"),
117
+ ("魔法姊妹露露莉莉 Mahou no Shimai Rurutto Riryi", "TITLE"),
118
+ ("08", "EPISODE"),
119
+ ("1080P", "RESOLUTION"),
120
+ ("CHT&JPN", "SOURCE"),
121
+ ("检索:魔法姊妹露露特莉莉", "SPECIAL"),
122
+ ],
123
+ )
124
+ yield char_item(
125
+ "[LoliHouse] Kakuriyo no Yadomeshi Ni - 12 [WebRip 1080p HEVC-10bit AAC SRTx2].mkv",
126
+ [
127
+ ("LoliHouse", "GROUP"),
128
+ ("Kakuriyo no Yadomeshi", "TITLE"),
129
+ ("Ni", "SEASON"),
130
+ ("12", "EPISODE"),
131
+ ("WebRip", "SOURCE"),
132
+ ("1080p", "RESOLUTION"),
133
+ ("HEVC", "SOURCE"),
134
+ ("AAC", "SOURCE"),
135
+ ("SRTx2", "SOURCE"),
136
+ ],
137
+ )
138
+ yield char_item(
139
+ "[LoliHouse] Kakuriyo no Yadomeshi Ni - 13 [WebRip 1080p HEVC-10bit AAC SRTx2].mkv",
140
+ [
141
+ ("LoliHouse", "GROUP"),
142
+ ("Kakuriyo no Yadomeshi", "TITLE"),
143
+ ("Ni", "SEASON"),
144
+ ("13", "EPISODE"),
145
+ ("WebRip", "SOURCE"),
146
+ ("1080p", "RESOLUTION"),
147
+ ("HEVC", "SOURCE"),
148
+ ("AAC", "SOURCE"),
149
+ ("SRTx2", "SOURCE"),
150
+ ],
151
+ )
152
  yield char_item(
153
  "[AI-Raws] 炎炎の消防隊 弐ノ章 #13 (BD HEVC 1920x1080 yuv444p10le FLAC)[FC74A2D5].mkv",
154
  [
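Each manual case above pairs a filename with an ordered list of `(substring, label)` spans. One plausible way to turn such spans into per-character BIO tags is to search for each substring from a moving cursor, so repeated labels (for example several `SOURCE` spans) are placed in order. `spans_to_char_bio` is a hypothetical helper written for illustration and is not the repository's `char_item`.

```python
from typing import List, Sequence, Tuple


def spans_to_char_bio(filename: str, spans: Sequence[Tuple[str, str]]) -> List[str]:
    """Assign B-/I- tags to each span's characters; everything else stays O."""
    tags = ["O"] * len(filename)
    cursor = 0
    for text, label in spans:
        if not text:
            raise ValueError("empty span text")
        # Search only to the right of the previous span to keep spans ordered.
        start = filename.find(text, cursor)
        if start < 0:
            raise ValueError(f"span {text!r} not found after position {cursor}")
        end = start + len(text)
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
        cursor = end
    return tags
```

For `One.Piece.1110.1080p.WEB-DL.AAC2.0.H.264`, the separator dots between spans and the untagged trailing `.AAC2.0.H.264` remain `O`, while the dot inside `One.Piece` is tagged as part of the title.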
case_metrics.json CHANGED
@@ -1,567 +1,1739 @@
1
  {
2
- "model_dir": ".",
3
- "case_file": "data/parser_regression_cases.json",
4
- "tokenizer_variant": "char",
5
- "max_length": 128,
6
- "use_rules": true,
7
- "constrain_bio": true,
8
- "case_count": 26,
9
- "full_correct": 26,
10
- "full_accuracy": 1.0,
11
- "field_correct": {
12
- "group": 22,
13
- "title": 26,
14
- "episode": 26,
15
- "resolution": 26,
16
- "source": 19,
17
- "season": 9,
18
- "special": 5
19
- },
20
- "field_total": {
21
- "group": 22,
22
- "title": 26,
23
- "episode": 26,
24
- "resolution": 26,
25
- "source": 19,
26
- "season": 9,
27
- "special": 5
28
- },
29
- "field_accuracy": {
30
- "episode": 1.0,
31
- "group": 1.0,
32
- "resolution": 1.0,
33
- "season": 1.0,
34
- "source": 1.0,
35
- "special": 1.0,
36
- "title": 1.0
37
- },
38
- "failures": [],
39
- "results": [
40
- {
41
- "id": "lolihouse_dash_episode",
42
- "filename": "[LoliHouse] Yomi no Tsugai - 07 [WebRip 1080p HEVC-10bit AAC ASSx2]",
43
- "ok": true,
44
- "errors": {},
45
- "expected": {
46
- "group": "LoliHouse",
47
- "title": "Yomi no Tsugai",
48
- "episode": 7,
49
- "resolution": "1080p",
50
- "source": "WebRip"
51
  },
52
- "pred": {
53
- "episode": 7,
54
- "group": "LoliHouse",
55
- "resolution": "1080p",
56
- "source": "WebRip",
57
- "title": "Yomi no Tsugai"
58
- }
59
- },
60
- {
61
- "id": "dot_season_episode_no_group",
62
- "filename": "Witch.Hat.Atelier.S01E07.1080p.NF.WEB-DL.JPN.AAC2.0.H.264.MSubs-ToonsHub",
63
- "ok": true,
64
- "errors": {},
65
- "expected": {
66
- "title": "Witch.Hat.Atelier",
67
- "season": 1,
68
- "episode": 7,
69
- "group": null,
70
- "resolution": "1080p",
71
- "source": "NF"
72
- },
73
- "pred": {
74
- "episode": 7,
75
- "group": null,
76
- "resolution": "1080p",
77
- "season": 1,
78
- "source": "NF",
79
- "title": "Witch.Hat.Atelier"
80
- }
81
- },
82
- {
83
- "id": "ani_cjk_season_dash_episode",
84
- "filename": "[ANi] 異世界悠閒農家 2 - 06 [1080P][Baha][WEB-DL][AAC AVC][CHT]",
85
- "ok": true,
86
- "errors": {},
87
- "expected": {
88
- "group": "ANi",
89
- "title": "異世界悠閒農家",
90
- "season": 2,
91
- "episode": 6,
92
- "resolution": "1080P",
93
- "source": "Baha"
94
- },
95
- "pred": {
96
- "episode": 6,
97
- "group": "ANi",
98
- "resolution": "1080P",
99
- "season": 2,
100
- "source": "Baha",
101
- "title": "異世界悠閒農家"
102
- }
103
- },
104
- {
105
- "id": "kisssub_bracket_title_episode",
106
- "filename": "[KissSub][Shunkashuutou Daikousha - Haru no Mai][05][1080P][GB][MP4]",
107
- "ok": true,
108
- "errors": {},
109
- "expected": {
110
- "group": "KissSub",
111
- "title": "Shunkashuutou Daikousha - Haru no Mai",
112
- "episode": 5,
113
- "resolution": "1080P",
114
- "source": "GB"
115
- },
116
- "pred": {
117
- "episode": 5,
118
- "group": "KissSub",
119
- "resolution": "1080P",
120
- "source": "GB",
121
- "title": "Shunkashuutou Daikousha - Haru no Mai"
122
- }
123
- },
124
- {
125
- "id": "airotabracket_title_episode",
126
- "filename": "[Airota][Sousou no Frieren][29][1080p AVC AAC][CHT]",
127
- "ok": true,
128
- "errors": {},
129
- "expected": {
130
- "group": "Airota",
131
- "title": "Sousou no Frieren",
132
- "episode": 29,
133
- "resolution": "1080p",
134
- "source": "CHT"
135
- },
136
- "pred": {
137
- "episode": 29,
138
- "group": "Airota",
139
- "resolution": "1080p",
140
- "source": "CHT",
141
- "title": "Sousou no Frieren"
142
- }
143
- },
144
- {
145
- "id": "subsplease_parenthesized_resolution",
146
- "filename": "[SubsPlease] Mushoku Tensei - 12 (1080p) [x265][AAC]",
147
- "ok": true,
148
- "errors": {},
149
- "expected": {
150
- "group": "SubsPlease",
151
- "title": "Mushoku Tensei",
152
- "episode": 12,
153
- "resolution": "1080p"
154
- },
155
- "pred": {
156
- "episode": 12,
157
- "group": "SubsPlease",
158
- "resolution": "1080p",
159
- "title": "Mushoku Tensei"
160
- }
161
- },
162
- {
163
- "id": "vcb_bracket_episode",
164
- "filename": "[VCB-Studio] Girls Band Cry [01][Ma10p_1080p][x265_flac]",
165
- "ok": true,
166
- "errors": {},
167
- "expected": {
168
- "group": "VCB-Studio",
169
- "title": "Girls Band Cry",
170
- "episode": 1,
171
- "resolution": "1080p"
172
- },
173
- "pred": {
174
- "episode": 1,
175
- "group": "VCB-Studio",
176
- "resolution": "1080p",
177
- "title": "Girls Band Cry"
178
- }
179
- },
180
- {
181
- "id": "numeric_title_not_episode",
182
- "filename": "86 Eighty Six - 01 [1080P][Baha]",
183
- "ok": true,
184
- "errors": {},
185
- "expected": {
186
- "title": "86 Eighty Six",
187
- "episode": 1,
188
- "resolution": "1080P",
189
- "source": "Baha"
190
- },
191
- "pred": {
192
- "episode": 1,
193
- "resolution": "1080P",
194
- "source": "Baha",
195
- "title": "86 Eighty Six"
196
- }
197
- },
198
- {
199
- "id": "erai_raws_dash_episode",
200
- "filename": "[Erai-raws] Sousou no Frieren - 01 [1080p][Multiple Subtitle][ENG]",
201
- "ok": true,
202
- "errors": {},
203
- "expected": {
204
- "group": "Erai-raws",
205
- "title": "Sousou no Frieren",
206
- "episode": 1,
207
- "resolution": "1080p"
208
- },
209
- "pred": {
210
- "episode": 1,
211
- "group": "Erai-raws",
212
- "resolution": "1080p",
213
- "title": "Sousou no Frieren"
214
- }
215
- },
216
- {
217
- "id": "nekomoe_space_group",
218
- "filename": "[Nekomoe kissaten][Watashi no Shiawase na Kekkon][01][1080p][JPSC]",
219
- "ok": true,
220
- "errors": {},
221
- "expected": {
222
- "group": "Nekomoe kissaten",
223
- "title": "Watashi no Shiawase na Kekkon",
224
- "episode": 1,
225
- "resolution": "1080p"
226
- },
227
- "pred": {
228
- "episode": 1,
229
- "group": "Nekomoe kissaten",
230
- "resolution": "1080p",
231
- "title": "Watashi no Shiawase na Kekkon"
232
- }
233
- },
234
- {
235
- "id": "long_running_episode",
236
- "filename": "One.Piece.1110.1080p.WEB-DL.AAC2.0.H.264",
237
- "ok": true,
238
- "errors": {},
239
- "expected": {
240
- "title": "One.Piece",
241
- "episode": 1110,
242
- "resolution": "1080p",
243
- "source": "WEB-DL"
244
- },
245
- "pred": {
246
- "episode": 1110,
247
- "resolution": "1080p",
248
- "source": "WEB-DL",
249
- "title": "One.Piece"
250
- }
251
- },
252
- {
253
- "id": "season_episode_amzn",
254
- "filename": "Example.Show.S02E03.2160p.AMZN.WEB-DL.DDP5.1.H.265",
255
- "ok": true,
256
- "errors": {},
257
- "expected": {
258
- "title": "Example.Show",
259
- "season": 2,
260
- "episode": 3,
261
- "resolution": "2160p",
262
- "source": "AMZN"
263
- },
264
- "pred": {
265
- "episode": 3,
266
- "resolution": "2160p",
267
- "season": 2,
268
- "source": "AMZN",
269
- "title": "Example.Show"
270
- }
271
- },
272
- {
273
- "id": "cjk_group_with_prefix_tag",
274
- "filename": "【喵萌奶茶屋】★04月新番★[葬送的芙莉莲][01][1080P][HEVC]",
275
- "ok": true,
276
- "errors": {},
277
- "expected": {
278
- "group": "喵萌奶茶屋",
279
- "title": "葬送的芙莉莲",
280
- "episode": 1,
281
- "resolution": "1080P"
282
- },
283
- "pred": {
284
- "episode": 1,
285
- "group": "喵萌奶茶屋",
286
- "resolution": "1080P",
287
- "title": "葬送的芙莉莲"
288
- }
289
- },
290
- {
291
- "id": "leading_meta_not_group",
292
- "filename": "[1080p] Witch Watch - 15 [CHS]",
293
- "ok": true,
294
- "errors": {},
295
- "expected": {
296
- "group": null,
297
- "title": "Witch Watch",
298
- "episode": 15,
299
- "resolution": "1080p",
300
- "source": "CHS"
301
- },
302
- "pred": {
303
- "episode": 15,
304
- "group": null,
305
- "resolution": "1080p",
306
- "source": "CHS",
307
- "title": "Witch Watch"
308
- }
309
- },
310
- {
311
- "id": "sakurato_group_language_source",
312
- "filename": "[Sakurato] Witch Watch - 15 [1080p][CHS]",
313
- "ok": true,
314
- "errors": {},
315
- "expected": {
316
- "group": "Sakurato",
317
- "title": "Witch Watch",
318
- "episode": 15,
319
- "resolution": "1080p",
320
- "source": "CHS"
321
- },
322
- "pred": {
323
- "episode": 15,
324
- "group": "Sakurato",
325
- "resolution": "1080p",
326
- "source": "CHS",
327
- "title": "Witch Watch"
328
- }
329
- },
330
- {
331
- "id": "billion_meta_lab_search_special",
332
- "filename": "[Billion Meta Lab] 魔法姊妹露露莉莉 Mahou no Shimai Rurutto Riryi [07][1080P][CHT&JPN][檢索:魔法姊妹露露特莉莉].mp4",
333
- "ok": true,
334
- "errors": {},
335
- "expected": {
336
- "group": "Billion Meta Lab",
337
- "title": "魔法姊妹露露莉莉 Mahou no Shimai Rurutto Riryi",
338
- "episode": 7,
339
- "resolution": "1080P",
340
- "source": "CHT&JPN",
341
- "special": "檢索:魔法姊妹露露特莉莉"
342
- },
343
- "pred": {
344
- "episode": 7,
345
- "group": "Billion Meta Lab",
346
- "resolution": "1080P",
347
- "source": "CHT&JPN",
348
- "special": "檢索:魔法姊妹露露特莉莉",
349
- "title": "魔法姊妹露露莉莉 Mahou no Shimai Rurutto Riryi"
350
- }
351
- },
352
- {
353
- "id": "studio_greentea_s2_bracket_episode",
354
- "filename": "[Studio GreenTea] Otonari no Tenshi-sama ni Itsunomanika Dame Ningen ni Sareteita Ken S2 [06][WebRip][HEVC-10bit 1080p AAC][JPSC].mp4",
355
- "ok": true,
356
- "errors": {},
357
- "expected": {
358
- "group": "Studio GreenTea",
359
- "title": "Otonari no Tenshi-sama ni Itsunomanika Dame Ningen ni Sareteita Ken",
360
- "season": 2,
361
- "episode": 6,
362
- "resolution": "1080p",
363
- "source": "WebRip"
364
- },
365
- "pred": {
366
- "episode": 6,
367
- "group": "Studio GreenTea",
368
- "resolution": "1080p",
369
- "season": 2,
370
- "source": "WebRip",
371
- "title": "Otonari no Tenshi-sama ni Itsunomanika Dame Ningen ni Sareteita Ken"
372
- }
373
- },
374
- {
375
- "id": "lolihouse_kakuriyo_bare_ni_season",
376
- "filename": "[LoliHouse] Kakuriyo no Yadomeshi Ni - 12 [WebRip 1080p HEVC-10bit AAC SRTx2].mkv",
377
- "ok": true,
378
- "errors": {},
379
- "expected": {
380
- "group": "LoliHouse",
381
- "title": "Kakuriyo no Yadomeshi",
382
- "season": 2,
383
- "episode": 12,
384
- "resolution": "1080p",
385
- "source": "WebRip"
386
- },
387
- "pred": {
388
- "episode": 12,
389
- "group": "LoliHouse",
390
- "resolution": "1080p",
391
- "season": 2,
392
- "source": "WebRip",
393
- "title": "Kakuriyo no Yadomeshi"
394
- }
395
- },
396
- {
397
- "id": "ani_kakuriyo_traditional_ni",
398
- "filename": "[ANi] 妖怪旅館營業中 貳 - 11 [1080P][Baha][WEB-DL][AAC AVC][CHT].mp4",
399
- "ok": true,
400
- "errors": {},
401
- "expected": {
402
- "group": "ANi",
403
- "title": "妖怪旅館營業中",
404
- "season": 2,
405
- "episode": 11,
406
- "resolution": "1080P",
407
- "source": "Baha"
408
  },
409
- "pred": {
410
- "episode": 11,
411
- "group": "ANi",
412
- "resolution": "1080P",
413
- "season": 2,
414
- "source": "Baha",
415
- "title": "妖怪旅館營業中"
416
- }
417
- },
418
- {
419
- "id": "jibaketa_shokugeki_ni_no_sara",
420
- "filename": "[jibaketa]Shokugeki no Souma Ni no Sara - 13 END [BD 1920x1080 x264 AACx2 SRT TVB CHT].mkv",
421
- "ok": true,
422
- "errors": {},
423
- "expected": {
424
- "group": "jibaketa",
425
- "title": "Shokugeki no Souma",
426
- "season": 2,
427
- "episode": 13,
428
- "resolution": "1920x1080"
429
  },
430
- "pred": {
431
- "episode": 13,
432
- "group": "jibaketa",
433
- "resolution": "1920x1080",
434
- "season": 2,
435
- "title": "Shokugeki no Souma"
436
- }
437
  },
438
- {
439
- "id": "ai_raws_fire_force_cjk_season_hash_episode",
440
- "filename": "[AI-Raws] 炎炎の消防隊 弐ノ章 #13 (BD HEVC 1920x1080 yuv444p10le FLAC)[FC74A2D5].mkv",
441
- "ok": true,
442
- "errors": {},
443
- "expected": {
444
- "group": "AI-Raws",
445
- "title": "炎炎の消防隊",
446
- "season": 2,
447
- "episode": 13,
448
- "resolution": "1920x1080"
449
  },
450
- "pred": {
451
- "episode": 13,
452
- "group": "AI-Raws",
453
- "resolution": "1920x1080",
454
- "season": 2,
455
- "title": "炎炎の消防隊"
456
- }
457
- },
458
- {
459
- "id": "gm_team_guoman_bilingual_s2",
460
- "filename": "[GM-Team][国漫][逆天邪神 第2季][Against the Gods Ⅱ][2026][04][HEVC][GB][4K].mp4",
461
- "ok": true,
462
- "errors": {},
463
- "expected": {
464
- "group": "GM-Team",
465
- "title": "逆天邪神",
466
- "season": 2,
467
- "episode": 4,
468
- "resolution": "4K",
469
- "source": "GB"
470
  },
471
- "pred": {
472
- "episode": 4,
473
- "group": "GM-Team",
474
- "resolution": "4K",
475
- "season": 2,
476
- "source": "GB",
477
- "title": "逆天邪神"
478
- }
479
- },
480
- {
481
- "id": "vcb_special_iv_not_episode",
482
- "filename": "[YYDM&VCB-Studio] Shinsekai Yori [IV05][Ma10p_1080p][x265_aac].mkv",
483
- "ok": true,
484
- "errors": {},
485
- "expected": {
486
- "group": "YYDM&VCB-Studio",
487
- "title": "Shinsekai Yori",
488
- "episode": null,
489
- "resolution": "1080p",
490
- "source": "x265_aac",
491
- "special": "IV05"
492
  },
493
- "pred": {
494
- "episode": null,
495
- "group": "YYDM&VCB-Studio",
496
- "resolution": "1080p",
497
- "source": "x265_aac",
498
- "special": "IV05",
499
- "title": "Shinsekai Yori"
500
- }
501
  },
502
- {
503
- "id": "vcb_nced_not_episode",
504
- "filename": "[YYDM&VCB-Studio] Shinsekai Yori [NCED02][Ma10p_1080p][x265_flac].mkv",
505
- "ok": true,
506
- "errors": {},
507
- "expected": {
508
- "group": "YYDM&VCB-Studio",
509
- "title": "Shinsekai Yori",
510
- "episode": null,
511
- "resolution": "1080p",
512
- "source": "x265_flac",
513
- "special": "NCED02"
514
  },
515
- "pred": {
516
- "episode": null,
517
- "group": "YYDM&VCB-Studio",
518
- "resolution": "1080p",
519
- "source": "x265_flac",
520
- "special": "NCED02",
521
- "title": "Shinsekai Yori"
522
- }
523
- },
524
- {
525
- "id": "dot_nced_suffix_not_episode",
526
- "filename": "InuYasha.2000.NCED02.BDrip.AV1.10Bit.DTS.1080p-CalChi",
527
- "ok": true,
528
- "errors": {},
529
- "expected": {
530
- "title": "InuYasha",
531
- "episode": null,
532
- "resolution": "1080p",
533
- "source": "BDrip",
534
- "special": "NCED02"
535
  },
536
- "pred": {
537
- "episode": null,
538
- "resolution": "1080p",
539
- "source": "BDrip",
540
- "special": "NCED02",
541
- "title": "InuYasha"
542
- }
543
- },
544
- {
545
- "id": "vcb_numeric_title_nced",
546
- "filename": "[VCB-Studio] Yamada-kun to 7-nin no Majo [NCED][Ma10p_1080p][x265_flac]",
547
- "ok": true,
548
- "errors": {},
549
- "expected": {
550
- "group": "VCB-Studio",
551
- "title": "Yamada-kun to 7-nin no Majo",
552
- "episode": null,
553
- "resolution": "1080p",
554
- "source": "x265_flac",
555
- "special": "NCED"
556
  },
557
- "pred": {
558
- "episode": null,
559
- "group": "VCB-Studio",
560
- "resolution": "1080p",
561
- "source": "x265_flac",
562
- "special": "NCED",
563
- "title": "Yamada-kun to 7-nin no Majo"
564
- }
565
  }
566
- ]
567
  }
 
1
  {
2
+ "primary_metric": "normalized_only",
3
+ "modes": {
4
+ "model_only": {
5
+ "model_dir": ".",
6
+ "case_file": "data/parser_regression_cases.json",
7
+ "tokenizer_variant": "char",
8
+ "max_length": 128,
9
+ "use_rules": false,
10
+ "constrain_bio": false,
11
+ "case_count": 26,
12
+ "full_correct": 25,
13
+ "full_accuracy": 0.9615384615384616,
14
+ "field_correct": {
15
+ "group": 22,
16
+ "title": 26,
17
+ "episode": 26,
18
+ "resolution": 25,
19
+ "source": 19,
20
+ "season": 9,
21
+ "special": 5
22
  },
23
+ "field_total": {
24
+ "group": 22,
25
+ "title": 26,
26
+ "episode": 26,
27
+ "resolution": 26,
28
+ "source": 19,
29
+ "season": 9,
30
+ "special": 5
31
  },
32
+ "field_accuracy": {
33
+ "episode": 1.0,
34
+ "group": 1.0,
35
+ "resolution": 0.9615384615384616,
36
+ "season": 1.0,
37
+ "source": 1.0,
38
+ "special": 1.0,
39
+ "title": 1.0
40
  },
41
+ "failures": [
42
+ {
43
+ "id": "studio_greentea_s2_bracket_episode",
44
+ "filename": "[Studio GreenTea] Otonari no Tenshi-sama ni Itsunomanika Dame Ningen ni Sareteita Ken S2 [06][WebRip][HEVC-10bit 1080p AAC][JPSC].mp4",
45
+ "ok": false,
46
+ "errors": {
47
+ "resolution": {
48
+ "expected": "1080p",
49
+ "pred": "P"
50
+ }
51
+ },
52
+ "expected": {
53
+ "group": "Studio GreenTea",
54
+ "title": "Otonari no Tenshi-sama ni Itsunomanika Dame Ningen ni Sareteita Ken",
55
+ "season": 2,
56
+ "episode": 6,
57
+ "resolution": "1080p",
58
+ "source": "WebRip"
59
+ },
60
+ "pred": {
61
+ "episode": 6,
62
+ "group": "Studio GreenTea",
63
+ "resolution": "P",
64
+ "season": 2,
65
+ "source": "WebRip",
66
+ "title": "Otonari no Tenshi-sama ni Itsunomanika Dame Ningen ni Sareteita Ken"
67
+ }
68
+ }
69
+ ],
70
+ "results": [
71
+ {
72
+ "id": "lolihouse_dash_episode",
73
+ "filename": "[LoliHouse] Yomi no Tsugai - 07 [WebRip 1080p HEVC-10bit AAC ASSx2]",
74
+ "ok": true,
75
+ "errors": {},
76
+ "expected": {
77
+ "group": "LoliHouse",
78
+ "title": "Yomi no Tsugai",
79
+ "episode": 7,
80
+ "resolution": "1080p",
81
+ "source": "WebRip"
82
+ },
83
+ "pred": {
84
+ "episode": 7,
85
+ "group": "LoliHouse",
86
+ "resolution": "1080p",
87
+ "source": "WebRip",
88
+ "title": "Yomi no Tsugai"
89
+ }
90
+ },
91
+ {
92
+ "id": "dot_season_episode_no_group",
93
+ "filename": "Witch.Hat.Atelier.S01E07.1080p.NF.WEB-DL.JPN.AAC2.0.H.264.MSubs-ToonsHub",
94
+ "ok": true,
95
+ "errors": {},
96
+ "expected": {
97
+ "title": "Witch.Hat.Atelier",
98
+ "season": 1,
99
+ "episode": 7,
100
+ "group": null,
101
+ "resolution": "1080p",
102
+ "source": "NF"
103
+ },
104
+ "pred": {
105
+ "episode": 7,
106
+ "group": null,
107
+ "resolution": "1080p",
108
+ "season": 1,
109
+ "source": "NF",
110
+ "title": "Witch.Hat.Atelier"
111
+ }
112
+ },
113
+ {
114
+ "id": "ani_cjk_season_dash_episode",
115
+ "filename": "[ANi] 異世界悠閒農家 2 - 06 [1080P][Baha][WEB-DL][AAC AVC][CHT]",
116
+ "ok": true,
117
+ "errors": {},
118
+ "expected": {
119
+ "group": "ANi",
120
+ "title": "異世界悠閒農家",
121
+ "season": 2,
122
+ "episode": 6,
123
+ "resolution": "1080P",
124
+ "source": "Baha"
125
+ },
126
+ "pred": {
127
+ "episode": 6,
128
+ "group": "ANi",
129
+ "resolution": "1080P",
130
+ "season": 2,
131
+ "source": "Baha",
132
+ "title": "異世界悠閒農家"
133
+ }
134
+ },
135
+ {
136
+ "id": "kisssub_bracket_title_episode",
137
+ "filename": "[KissSub][Shunkashuutou Daikousha - Haru no Mai][05][1080P][GB][MP4]",
138
+ "ok": true,
139
+ "errors": {},
140
+ "expected": {
141
+ "group": "KissSub",
142
+ "title": "Shunkashuutou Daikousha - Haru no Mai",
143
+ "episode": 5,
144
+ "resolution": "1080P",
145
+ "source": "GB"
146
+ },
147
+ "pred": {
148
+ "episode": 5,
149
+ "group": "KissSub",
150
+ "resolution": "1080P",
151
+ "source": "GB",
152
+ "title": "Shunkashuutou Daikousha - Haru no Mai"
153
+ }
154
+ },
155
+ {
156
+ "id": "airotabracket_title_episode",
157
+ "filename": "[Airota][Sousou no Frieren][29][1080p AVC AAC][CHT]",
158
+ "ok": true,
159
+ "errors": {},
160
+ "expected": {
161
+ "group": "Airota",
162
+ "title": "Sousou no Frieren",
163
+ "episode": 29,
164
+ "resolution": "1080p",
165
+ "source": "CHT"
166
+ },
167
+ "pred": {
168
+ "episode": 29,
169
+ "group": "Airota",
170
+ "resolution": "1080p",
171
+ "source": "CHT",
172
+ "title": "Sousou no Frieren"
173
+ }
174
+ },
175
+ {
176
+ "id": "subsplease_parenthesized_resolution",
177
+ "filename": "[SubsPlease] Mushoku Tensei - 12 (1080p) [x265][AAC]",
178
+ "ok": true,
179
+ "errors": {},
180
+ "expected": {
181
+ "group": "SubsPlease",
182
+ "title": "Mushoku Tensei",
183
+ "episode": 12,
184
+ "resolution": "1080p"
185
+ },
186
+ "pred": {
187
+ "episode": 12,
188
+ "group": "SubsPlease",
189
+ "resolution": "1080p",
190
+ "title": "Mushoku Tensei"
191
+ }
192
+ },
193
+ {
194
+ "id": "vcb_bracket_episode",
195
+ "filename": "[VCB-Studio] Girls Band Cry [01][Ma10p_1080p][x265_flac]",
196
+ "ok": true,
197
+ "errors": {},
198
+ "expected": {
199
+ "group": "VCB-Studio",
200
+ "title": "Girls Band Cry",
201
+ "episode": 1,
202
+ "resolution": "1080p"
203
+ },
204
+ "pred": {
205
+ "episode": 1,
206
+ "group": "VCB-Studio",
207
+ "resolution": "1080p",
208
+ "title": "Girls Band Cry"
209
+ }
210
+ },
211
+ {
212
+ "id": "numeric_title_not_episode",
213
+ "filename": "86 Eighty Six - 01 [1080P][Baha]",
214
+ "ok": true,
215
+ "errors": {},
216
+ "expected": {
217
+ "title": "86 Eighty Six",
218
+ "episode": 1,
219
+ "resolution": "1080P",
220
+ "source": "Baha"
221
+ },
222
+ "pred": {
223
+ "episode": 1,
224
+ "resolution": "1080P",
225
+ "source": "Baha",
226
+ "title": "86 Eighty Six"
227
+ }
228
+ },
229
+ {
230
+ "id": "erai_raws_dash_episode",
231
+ "filename": "[Erai-raws] Sousou no Frieren - 01 [1080p][Multiple Subtitle][ENG]",
232
+ "ok": true,
233
+ "errors": {},
234
+ "expected": {
235
+ "group": "Erai-raws",
236
+ "title": "Sousou no Frieren",
237
+ "episode": 1,
238
+ "resolution": "1080p"
239
+ },
240
+ "pred": {
241
+ "episode": 1,
242
+ "group": "Erai-raws",
243
+ "resolution": "1080p",
244
+ "title": "Sousou no Frieren"
245
+ }
246
+ },
247
+ {
248
+ "id": "nekomoe_space_group",
249
+ "filename": "[Nekomoe kissaten][Watashi no Shiawase na Kekkon][01][1080p][JPSC]",
250
+ "ok": true,
251
+ "errors": {},
252
+ "expected": {
253
+ "group": "Nekomoe kissaten",
254
+ "title": "Watashi no Shiawase na Kekkon",
255
+ "episode": 1,
256
+ "resolution": "1080p"
257
+ },
258
+ "pred": {
259
+ "episode": 1,
260
+ "group": "Nekomoe kissaten",
261
+ "resolution": "1080p",
262
+ "title": "Watashi no Shiawase na Kekkon"
263
+ }
264
+ },
265
+ {
266
+ "id": "long_running_episode",
267
+ "filename": "One.Piece.1110.1080p.WEB-DL.AAC2.0.H.264",
268
+ "ok": true,
269
+ "errors": {},
270
+ "expected": {
271
+ "title": "One.Piece",
272
+ "episode": 1110,
273
+ "resolution": "1080p",
274
+ "source": "WEB-DL"
275
+ },
276
+ "pred": {
277
+ "episode": 1110,
278
+ "resolution": "1080p",
279
+ "source": "WEB-DL",
280
+ "title": "One.Piece"
281
+ }
282
+ },
283
+ {
284
+ "id": "season_episode_amzn",
285
+ "filename": "Example.Show.S02E03.2160p.AMZN.WEB-DL.DDP5.1.H.265",
286
+ "ok": true,
287
+ "errors": {},
288
+ "expected": {
289
+ "title": "Example.Show",
290
+ "season": 2,
291
+ "episode": 3,
292
+ "resolution": "2160p",
293
+ "source": "AMZN"
294
+ },
295
+ "pred": {
296
+ "episode": 3,
297
+ "resolution": "2160p",
298
+ "season": 2,
299
+ "source": "AMZN",
300
+ "title": "Example.Show"
301
+ }
302
+ },
303
+ {
304
+ "id": "cjk_group_with_prefix_tag",
305
+ "filename": "【喵萌奶茶屋】★04月新番★[葬送的芙莉莲][01][1080P][HEVC]",
306
+ "ok": true,
307
+ "errors": {},
308
+ "expected": {
309
+ "group": "喵萌奶茶屋",
310
+ "title": "葬送的芙莉莲",
311
+ "episode": 1,
312
+ "resolution": "1080P"
313
+ },
314
+ "pred": {
315
+ "episode": 1,
316
+ "group": "喵萌奶茶屋",
317
+ "resolution": "1080P",
318
+ "title": "葬送的芙莉莲"
319
+ }
320
+ },
321
+ {
322
+ "id": "leading_meta_not_group",
323
+ "filename": "[1080p] Witch Watch - 15 [CHS]",
324
+ "ok": true,
325
+ "errors": {},
326
+ "expected": {
327
+ "group": null,
328
+ "title": "Witch Watch",
329
+ "episode": 15,
330
+ "resolution": "1080p",
331
+ "source": "CHS"
332
+ },
333
+ "pred": {
334
+ "episode": 15,
335
+ "group": null,
336
+ "resolution": "1080p",
337
+ "source": "CHS",
338
+ "title": "Witch Watch"
339
+ }
340
+ },
341
+ {
342
+ "id": "sakurato_group_language_source",
343
+ "filename": "[Sakurato] Witch Watch - 15 [1080p][CHS]",
344
+ "ok": true,
345
+ "errors": {},
346
+ "expected": {
347
+ "group": "Sakurato",
348
+ "title": "Witch Watch",
349
+ "episode": 15,
350
+ "resolution": "1080p",
351
+ "source": "CHS"
352
+ },
353
+ "pred": {
354
+ "episode": 15,
355
+ "group": "Sakurato",
356
+ "resolution": "1080p",
357
+ "source": "CHS",
358
+ "title": "Witch Watch"
359
+ }
360
+ },
361
+ {
362
+ "id": "billion_meta_lab_search_special",
363
+ "filename": "[Billion Meta Lab] 魔法姊妹露露莉莉 Mahou no Shimai Rurutto Riryi [07][1080P][CHT&JPN][檢索:魔法姊妹露露特莉莉].mp4",
364
+ "ok": true,
365
+ "errors": {},
366
+ "expected": {
367
+ "group": "Billion Meta Lab",
368
+ "title": "魔法姊妹露露莉莉 Mahou no Shimai Rurutto Riryi",
369
+ "episode": 7,
370
+ "resolution": "1080P",
371
+ "source": "CHT&JPN",
372
+ "special": "檢索:魔法姊妹露露特莉莉"
373
+ },
374
+ "pred": {
375
+ "episode": 7,
376
+ "group": "Billion Meta Lab",
377
+ "resolution": "1080P",
378
+ "source": "CHT&JPN",
379
+ "special": "檢索:魔法姊妹露露特莉莉",
380
+ "title": "魔法姊妹露露莉莉 Mahou no Shimai Rurutto Riryi"
381
+ }
382
+ },
383
+ {
384
+ "id": "studio_greentea_s2_bracket_episode",
385
+ "filename": "[Studio GreenTea] Otonari no Tenshi-sama ni Itsunomanika Dame Ningen ni Sareteita Ken S2 [06][WebRip][HEVC-10bit 1080p AAC][JPSC].mp4",
386
+ "ok": false,
387
+ "errors": {
388
+ "resolution": {
389
+ "expected": "1080p",
390
+ "pred": "P"
391
+ }
392
+ },
393
+ "expected": {
394
+ "group": "Studio GreenTea",
395
+ "title": "Otonari no Tenshi-sama ni Itsunomanika Dame Ningen ni Sareteita Ken",
396
+ "season": 2,
397
+ "episode": 6,
398
+ "resolution": "1080p",
399
+ "source": "WebRip"
400
+ },
401
+ "pred": {
402
+ "episode": 6,
403
+ "group": "Studio GreenTea",
404
+ "resolution": "P",
405
+ "season": 2,
406
+ "source": "WebRip",
407
+ "title": "Otonari no Tenshi-sama ni Itsunomanika Dame Ningen ni Sareteita Ken"
408
+ }
409
+ },
410
+ {
411
+ "id": "lolihouse_kakuriyo_bare_ni_season",
412
+ "filename": "[LoliHouse] Kakuriyo no Yadomeshi Ni - 12 [WebRip 1080p HEVC-10bit AAC SRTx2].mkv",
413
+ "ok": true,
414
+ "errors": {},
415
+ "expected": {
416
+ "group": "LoliHouse",
417
+ "title": "Kakuriyo no Yadomeshi",
418
+ "season": 2,
419
+ "episode": 12,
420
+ "resolution": "1080p",
421
+ "source": "WebRip"
422
+ },
423
+ "pred": {
424
+ "episode": 12,
425
+ "group": "LoliHouse",
426
+ "resolution": "1080p",
427
+ "season": 2,
428
+ "source": "WebRip",
429
+ "title": "Kakuriyo no Yadomeshi"
430
+ }
431
+ },
432
+ {
433
+ "id": "ani_kakuriyo_traditional_ni",
434
+ "filename": "[ANi] 妖怪旅館營業中 貳 - 11 [1080P][Baha][WEB-DL][AAC AVC][CHT].mp4",
435
+ "ok": true,
436
+ "errors": {},
437
+ "expected": {
438
+ "group": "ANi",
439
+ "title": "妖怪旅館營業中",
440
+ "season": 2,
441
+ "episode": 11,
442
+ "resolution": "1080P",
443
+ "source": "Baha"
444
+ },
445
+ "pred": {
446
+ "episode": 11,
447
+ "group": "ANi",
448
+ "resolution": "1080P",
449
+ "season": 2,
450
+ "source": "Baha",
451
+ "title": "妖怪旅館營業中"
452
+ }
453
+ },
454
+ {
455
+ "id": "jibaketa_shokugeki_ni_no_sara",
456
+ "filename": "[jibaketa]Shokugeki no Souma Ni no Sara - 13 END [BD 1920x1080 x264 AACx2 SRT TVB CHT].mkv",
457
+ "ok": true,
458
+ "errors": {},
459
+ "expected": {
460
+ "group": "jibaketa",
461
+ "title": "Shokugeki no Souma",
462
+ "season": 2,
463
+ "episode": 13,
464
+ "resolution": "1920x1080"
465
+ },
466
+ "pred": {
467
+ "episode": 13,
468
+ "group": "jibaketa",
469
+ "resolution": "1920x1080",
470
+ "season": 2,
471
+ "title": "Shokugeki no Souma"
472
+ }
473
+ },
474
+ {
475
+ "id": "ai_raws_fire_force_cjk_season_hash_episode",
476
+ "filename": "[AI-Raws] 炎炎の消防隊 弐ノ章 #13 (BD HEVC 1920x1080 yuv444p10le FLAC)[FC74A2D5].mkv",
477
+ "ok": true,
478
+ "errors": {},
479
+ "expected": {
480
+ "group": "AI-Raws",
481
+ "title": "炎炎の消防隊",
482
+ "season": 2,
483
+ "episode": 13,
484
+ "resolution": "1920x1080"
485
+ },
486
+ "pred": {
487
+ "episode": 13,
488
+ "group": "AI-Raws",
489
+ "resolution": "1920x1080",
490
+ "season": 2,
491
+ "title": "炎炎の消防隊"
492
+ }
493
+ },
494
+ {
495
+ "id": "gm_team_guoman_bilingual_s2",
496
+ "filename": "[GM-Team][国漫][逆天邪神 第2季][Against the Gods Ⅱ][2026][04][HEVC][GB][4K].mp4",
497
+ "ok": true,
498
+ "errors": {},
499
+ "expected": {
500
+ "group": "GM-Team",
501
+ "title": "逆天邪神",
502
+ "season": 2,
503
+ "episode": 4,
504
+ "resolution": "4K",
505
+ "source": "GB"
506
+ },
507
+ "pred": {
508
+ "episode": 4,
509
+ "group": "GM-Team",
510
+ "resolution": "4K",
511
+ "season": 2,
512
+ "source": "GB",
513
+ "title": "逆天邪神"
514
+ }
515
+ },
516
+ {
517
+ "id": "vcb_special_iv_not_episode",
518
+ "filename": "[YYDM&VCB-Studio] Shinsekai Yori [IV05][Ma10p_1080p][x265_aac].mkv",
519
+ "ok": true,
520
+ "errors": {},
521
+ "expected": {
522
+ "group": "YYDM&VCB-Studio",
523
+ "title": "Shinsekai Yori",
524
+ "episode": null,
525
+ "resolution": "1080p",
526
+ "source": "x265_aac",
527
+ "special": "IV05"
528
+ },
529
+ "pred": {
530
+ "episode": null,
531
+ "group": "YYDM&VCB-Studio",
532
+ "resolution": "1080p",
533
+ "source": "x265-aac",
534
+ "special": "IV05",
535
+ "title": "Shinsekai Yori"
536
+ }
537
+ },
538
+ {
539
+ "id": "vcb_nced_not_episode",
540
+ "filename": "[YYDM&VCB-Studio] Shinsekai Yori [NCED02][Ma10p_1080p][x265_flac].mkv",
541
+ "ok": true,
542
+ "errors": {},
543
+ "expected": {
544
+ "group": "YYDM&VCB-Studio",
545
+ "title": "Shinsekai Yori",
546
+ "episode": null,
547
+ "resolution": "1080p",
548
+ "source": "x265_flac",
549
+ "special": "NCED02"
550
+ },
551
+ "pred": {
552
+ "episode": null,
553
+ "group": "YYDM&VCB-Studio",
554
+ "resolution": "1080p",
555
+ "source": "x265-flac",
556
+ "special": "NCED02",
557
+ "title": "Shinsekai Yori"
558
+ }
559
+ },
560
+ {
561
+ "id": "dot_nced_suffix_not_episode",
562
+ "filename": "InuYasha.2000.NCED02.BDrip.AV1.10Bit.DTS.1080p-CalChi",
563
+ "ok": true,
564
+ "errors": {},
565
+ "expected": {
566
+ "title": "InuYasha",
567
+ "episode": null,
568
+ "resolution": "1080p",
569
+ "source": "BDrip",
570
+ "special": "NCED02"
571
+ },
572
+ "pred": {
573
+ "episode": null,
574
+ "resolution": "1080p",
575
+ "source": "BDrip",
576
+ "special": "NCED02",
577
+ "title": "InuYasha"
578
+ }
579
+ },
580
+ {
581
+ "id": "vcb_numeric_title_nced",
582
+ "filename": "[VCB-Studio] Yamada-kun to 7-nin no Majo [NCED][Ma10p_1080p][x265_flac]",
583
+ "ok": true,
584
+ "errors": {},
585
+ "expected": {
586
+ "group": "VCB-Studio",
587
+ "title": "Yamada-kun to 7-nin no Majo",
588
+ "episode": null,
589
+ "resolution": "1080p",
590
+ "source": "x265_flac",
591
+ "special": "NCED"
592
+ },
593
+ "pred": {
594
+ "episode": null,
595
+ "group": "VCB-Studio",
596
+ "resolution": "1080p",
597
+ "source": "x265-flac",
598
+ "special": "NCED",
599
+ "title": "Yamada-kun to 7-nin no Majo"
600
+ }
601
+ }
602
+ ]
603
  },
604
+ "normalized_only": {
605
+ "model_dir": ".",
606
+ "case_file": "data/parser_regression_cases.json",
607
+ "tokenizer_variant": "char",
608
+ "max_length": 128,
609
+ "use_rules": false,
610
+ "constrain_bio": true,
611
+ "case_count": 26,
612
+ "full_correct": 26,
613
+ "full_accuracy": 1.0,
614
+ "field_correct": {
615
+ "group": 22,
616
+ "title": 26,
617
+ "episode": 26,
618
+ "resolution": 26,
619
+ "source": 19,
620
+ "season": 9,
621
+ "special": 5
622
  },
623
+ "field_total": {
624
+ "group": 22,
625
+ "title": 26,
626
+ "episode": 26,
627
+ "resolution": 26,
628
+ "source": 19,
629
+ "season": 9,
630
+ "special": 5
631
  },
632
+ "field_accuracy": {
633
+ "episode": 1.0,
634
+ "group": 1.0,
635
+ "resolution": 1.0,
636
+ "season": 1.0,
637
+ "source": 1.0,
638
+ "special": 1.0,
639
+ "title": 1.0
640
  },
641
+ "failures": [],
642
+ "results": [
643
+ {
644
+ "id": "lolihouse_dash_episode",
645
+ "filename": "[LoliHouse] Yomi no Tsugai - 07 [WebRip 1080p HEVC-10bit AAC ASSx2]",
646
+ "ok": true,
647
+ "errors": {},
648
+ "expected": {
649
+ "group": "LoliHouse",
650
+ "title": "Yomi no Tsugai",
651
+ "episode": 7,
652
+ "resolution": "1080p",
653
+ "source": "WebRip"
654
+ },
655
+ "pred": {
656
+ "episode": 7,
657
+ "group": "LoliHouse",
658
+ "resolution": "1080p",
659
+ "source": "WebRip",
660
+ "title": "Yomi no Tsugai"
661
+ }
662
+ },
663
+ {
664
+ "id": "dot_season_episode_no_group",
665
+ "filename": "Witch.Hat.Atelier.S01E07.1080p.NF.WEB-DL.JPN.AAC2.0.H.264.MSubs-ToonsHub",
666
+ "ok": true,
667
+ "errors": {},
668
+ "expected": {
669
+ "title": "Witch.Hat.Atelier",
670
+ "season": 1,
671
+ "episode": 7,
672
+ "group": null,
673
+ "resolution": "1080p",
674
+ "source": "NF"
675
+ },
676
+ "pred": {
677
+ "episode": 7,
678
+ "group": null,
679
+ "resolution": "1080p",
680
+ "season": 1,
681
+ "source": "NF",
682
+ "title": "Witch.Hat.Atelier"
683
+ }
684
+ },
685
+ {
686
+ "id": "ani_cjk_season_dash_episode",
687
+ "filename": "[ANi] 異世界悠閒農家 2 - 06 [1080P][Baha][WEB-DL][AAC AVC][CHT]",
688
+ "ok": true,
689
+ "errors": {},
690
+ "expected": {
691
+ "group": "ANi",
692
+ "title": "異世界悠閒農家",
693
+ "season": 2,
694
+ "episode": 6,
695
+ "resolution": "1080P",
696
+ "source": "Baha"
697
+ },
698
+ "pred": {
699
+ "episode": 6,
700
+ "group": "ANi",
701
+ "resolution": "1080P",
702
+ "season": 2,
703
+ "source": "Baha",
704
+ "title": "異世界悠閒農家"
705
+ }
706
+ },
707
+ {
708
+ "id": "kisssub_bracket_title_episode",
709
+ "filename": "[KissSub][Shunkashuutou Daikousha - Haru no Mai][05][1080P][GB][MP4]",
710
+ "ok": true,
711
+ "errors": {},
712
+ "expected": {
713
+ "group": "KissSub",
714
+ "title": "Shunkashuutou Daikousha - Haru no Mai",
715
+ "episode": 5,
716
+ "resolution": "1080P",
717
+ "source": "GB"
718
+ },
719
+ "pred": {
720
+ "episode": 5,
721
+ "group": "KissSub",
722
+ "resolution": "1080P",
723
+ "source": "GB",
724
+ "title": "Shunkashuutou Daikousha - Haru no Mai"
725
+ }
726
+ },
727
+ {
728
+ "id": "airotabracket_title_episode",
729
+ "filename": "[Airota][Sousou no Frieren][29][1080p AVC AAC][CHT]",
730
+ "ok": true,
731
+ "errors": {},
732
+ "expected": {
733
+ "group": "Airota",
734
+ "title": "Sousou no Frieren",
735
+ "episode": 29,
736
+ "resolution": "1080p",
737
+ "source": "CHT"
738
+ },
739
+ "pred": {
740
+ "episode": 29,
741
+ "group": "Airota",
742
+ "resolution": "1080p",
743
+ "source": "CHT",
744
+ "title": "Sousou no Frieren"
745
+ }
746
+ },
747
+ {
748
+ "id": "subsplease_parenthesized_resolution",
749
+ "filename": "[SubsPlease] Mushoku Tensei - 12 (1080p) [x265][AAC]",
750
+ "ok": true,
751
+ "errors": {},
752
+ "expected": {
753
+ "group": "SubsPlease",
754
+ "title": "Mushoku Tensei",
755
+ "episode": 12,
756
+ "resolution": "1080p"
757
+ },
758
+ "pred": {
759
+ "episode": 12,
760
+ "group": "SubsPlease",
761
+ "resolution": "1080p",
762
+ "title": "Mushoku Tensei"
763
+ }
764
+ },
765
+ {
766
+ "id": "vcb_bracket_episode",
767
+ "filename": "[VCB-Studio] Girls Band Cry [01][Ma10p_1080p][x265_flac]",
768
+ "ok": true,
769
+ "errors": {},
770
+ "expected": {
771
+ "group": "VCB-Studio",
772
+ "title": "Girls Band Cry",
773
+ "episode": 1,
774
+ "resolution": "1080p"
775
+ },
776
+ "pred": {
777
+ "episode": 1,
778
+ "group": "VCB-Studio",
779
+ "resolution": "1080p",
780
+ "title": "Girls Band Cry"
781
+ }
782
+ },
783
+ {
784
+ "id": "numeric_title_not_episode",
785
+ "filename": "86 Eighty Six - 01 [1080P][Baha]",
786
+ "ok": true,
787
+ "errors": {},
788
+ "expected": {
789
+ "title": "86 Eighty Six",
790
+ "episode": 1,
791
+ "resolution": "1080P",
792
+ "source": "Baha"
793
+ },
794
+ "pred": {
795
+ "episode": 1,
796
+ "resolution": "1080P",
797
+ "source": "Baha",
798
+ "title": "86 Eighty Six"
799
+ }
800
+ },
801
+ {
802
+ "id": "erai_raws_dash_episode",
803
+ "filename": "[Erai-raws] Sousou no Frieren - 01 [1080p][Multiple Subtitle][ENG]",
804
+ "ok": true,
805
+ "errors": {},
806
+ "expected": {
807
+ "group": "Erai-raws",
808
+ "title": "Sousou no Frieren",
809
+ "episode": 1,
810
+ "resolution": "1080p"
811
+ },
812
+ "pred": {
813
+ "episode": 1,
814
+ "group": "Erai-raws",
815
+ "resolution": "1080p",
816
+ "title": "Sousou no Frieren"
817
+ }
818
+ },
819
+ {
820
+ "id": "nekomoe_space_group",
821
+ "filename": "[Nekomoe kissaten][Watashi no Shiawase na Kekkon][01][1080p][JPSC]",
822
+ "ok": true,
823
+ "errors": {},
824
+ "expected": {
825
+ "group": "Nekomoe kissaten",
826
+ "title": "Watashi no Shiawase na Kekkon",
827
+ "episode": 1,
828
+ "resolution": "1080p"
829
+ },
830
+ "pred": {
831
+ "episode": 1,
832
+ "group": "Nekomoe kissaten",
833
+ "resolution": "1080p",
834
+ "title": "Watashi no Shiawase na Kekkon"
835
+ }
836
+ },
837
+ {
838
+ "id": "long_running_episode",
839
+ "filename": "One.Piece.1110.1080p.WEB-DL.AAC2.0.H.264",
840
+ "ok": true,
841
+ "errors": {},
842
+ "expected": {
843
+ "title": "One.Piece",
844
+ "episode": 1110,
845
+ "resolution": "1080p",
846
+ "source": "WEB-DL"
847
+ },
848
+ "pred": {
849
+ "episode": 1110,
850
+ "resolution": "1080p",
851
+ "source": "WEB-DL",
852
+ "title": "One.Piece"
853
+ }
854
+ },
855
+ {
856
+ "id": "season_episode_amzn",
857
+ "filename": "Example.Show.S02E03.2160p.AMZN.WEB-DL.DDP5.1.H.265",
858
+ "ok": true,
859
+ "errors": {},
860
+ "expected": {
861
+ "title": "Example.Show",
862
+ "season": 2,
863
+ "episode": 3,
864
+ "resolution": "2160p",
865
+ "source": "AMZN"
866
+ },
867
+ "pred": {
868
+ "episode": 3,
869
+ "resolution": "2160p",
870
+ "season": 2,
871
+ "source": "AMZN",
872
+ "title": "Example.Show"
873
+ }
874
+ },
875
+ {
876
+ "id": "cjk_group_with_prefix_tag",
877
+ "filename": "【喵萌奶茶屋】★04月新番★[葬送的芙莉莲][01][1080P][HEVC]",
878
+ "ok": true,
879
+ "errors": {},
880
+ "expected": {
881
+ "group": "喵萌奶茶屋",
882
+ "title": "葬送的芙莉莲",
883
+ "episode": 1,
884
+ "resolution": "1080P"
885
+ },
886
+ "pred": {
887
+ "episode": 1,
888
+ "group": "喵萌奶茶屋",
889
+ "resolution": "1080P",
890
+ "title": "葬送的芙莉莲"
891
+ }
892
+ },
893
+ {
894
+ "id": "leading_meta_not_group",
895
+ "filename": "[1080p] Witch Watch - 15 [CHS]",
896
+ "ok": true,
897
+ "errors": {},
898
+ "expected": {
899
+ "group": null,
900
+ "title": "Witch Watch",
901
+ "episode": 15,
902
+ "resolution": "1080p",
903
+ "source": "CHS"
904
+ },
905
+ "pred": {
906
+ "episode": 15,
907
+ "group": null,
908
+ "resolution": "1080p",
909
+ "source": "CHS",
910
+ "title": "Witch Watch"
911
+ }
912
+ },
913
+ {
914
+ "id": "sakurato_group_language_source",
915
+ "filename": "[Sakurato] Witch Watch - 15 [1080p][CHS]",
916
+ "ok": true,
917
+ "errors": {},
918
+ "expected": {
919
+ "group": "Sakurato",
920
+ "title": "Witch Watch",
921
+ "episode": 15,
922
+ "resolution": "1080p",
923
+ "source": "CHS"
924
+ },
925
+ "pred": {
926
+ "episode": 15,
927
+ "group": "Sakurato",
928
+ "resolution": "1080p",
929
+ "source": "CHS",
930
+ "title": "Witch Watch"
931
+ }
932
+ },
933
+ {
934
+ "id": "billion_meta_lab_search_special",
935
+ "filename": "[Billion Meta Lab] 魔法姊妹露露莉莉 Mahou no Shimai Rurutto Riryi [07][1080P][CHT&JPN][檢索:魔法姊妹露露特莉莉].mp4",
936
+ "ok": true,
937
+ "errors": {},
938
+ "expected": {
939
+ "group": "Billion Meta Lab",
940
+ "title": "魔法姊妹露露莉莉 Mahou no Shimai Rurutto Riryi",
941
+ "episode": 7,
942
+ "resolution": "1080P",
943
+ "source": "CHT&JPN",
944
+ "special": "檢索:魔法姊妹露露特莉莉"
945
+ },
946
+ "pred": {
947
+ "episode": 7,
948
+ "group": "Billion Meta Lab",
949
+ "resolution": "1080P",
950
+ "source": "CHT&JPN",
951
+ "special": "檢索:魔法姊妹露露特莉莉",
952
+ "title": "魔法姊妹露露莉莉 Mahou no Shimai Rurutto Riryi"
953
+ }
954
+ },
955
+ {
956
+ "id": "studio_greentea_s2_bracket_episode",
957
+ "filename": "[Studio GreenTea] Otonari no Tenshi-sama ni Itsunomanika Dame Ningen ni Sareteita Ken S2 [06][WebRip][HEVC-10bit 1080p AAC][JPSC].mp4",
958
+ "ok": true,
959
+ "errors": {},
960
+ "expected": {
961
+ "group": "Studio GreenTea",
962
+ "title": "Otonari no Tenshi-sama ni Itsunomanika Dame Ningen ni Sareteita Ken",
963
+ "season": 2,
964
+ "episode": 6,
965
+ "resolution": "1080p",
966
+ "source": "WebRip"
967
+ },
968
+ "pred": {
969
+ "episode": 6,
970
+ "group": "Studio GreenTea",
971
+ "resolution": "1080p",
972
+ "season": 2,
973
+ "source": "WebRip",
974
+ "title": "Otonari no Tenshi-sama ni Itsunomanika Dame Ningen ni Sareteita Ken"
975
+ }
976
+ },
977
+ {
978
+ "id": "lolihouse_kakuriyo_bare_ni_season",
979
+ "filename": "[LoliHouse] Kakuriyo no Yadomeshi Ni - 12 [WebRip 1080p HEVC-10bit AAC SRTx2].mkv",
980
+ "ok": true,
981
+ "errors": {},
982
+ "expected": {
983
+ "group": "LoliHouse",
984
+ "title": "Kakuriyo no Yadomeshi",
985
+ "season": 2,
986
+ "episode": 12,
987
+ "resolution": "1080p",
988
+ "source": "WebRip"
989
+ },
990
+ "pred": {
991
+ "episode": 12,
992
+ "group": "LoliHouse",
993
+ "resolution": "1080p",
994
+ "season": 2,
995
+ "source": "WebRip",
996
+ "title": "Kakuriyo no Yadomeshi"
997
+ }
998
+ },
999
+ {
1000
+ "id": "ani_kakuriyo_traditional_ni",
1001
+ "filename": "[ANi] 妖怪旅館營業中 貳 - 11 [1080P][Baha][WEB-DL][AAC AVC][CHT].mp4",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "ANi",
+ "title": "妖怪旅館營業中",
+ "season": 2,
+ "episode": 11,
+ "resolution": "1080P",
+ "source": "Baha"
+ },
+ "pred": {
+ "episode": 11,
+ "group": "ANi",
+ "resolution": "1080P",
+ "season": 2,
+ "source": "Baha",
+ "title": "妖怪旅館營業中"
+ }
+ },
+ {
+ "id": "jibaketa_shokugeki_ni_no_sara",
+ "filename": "[jibaketa]Shokugeki no Souma Ni no Sara - 13 END [BD 1920x1080 x264 AACx2 SRT TVB CHT].mkv",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "jibaketa",
+ "title": "Shokugeki no Souma",
+ "season": 2,
+ "episode": 13,
+ "resolution": "1920x1080"
+ },
+ "pred": {
+ "episode": 13,
+ "group": "jibaketa",
+ "resolution": "1920x1080",
+ "season": 2,
+ "title": "Shokugeki no Souma"
+ }
+ },
+ {
+ "id": "ai_raws_fire_force_cjk_season_hash_episode",
+ "filename": "[AI-Raws] 炎炎の消防隊 弐ノ章 #13 (BD HEVC 1920x1080 yuv444p10le FLAC)[FC74A2D5].mkv",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "AI-Raws",
+ "title": "炎炎の消防隊",
+ "season": 2,
+ "episode": 13,
+ "resolution": "1920x1080"
+ },
+ "pred": {
+ "episode": 13,
+ "group": "AI-Raws",
+ "resolution": "1920x1080",
+ "season": 2,
+ "title": "炎炎の消防隊"
+ }
+ },
+ {
+ "id": "gm_team_guoman_bilingual_s2",
+ "filename": "[GM-Team][国漫][逆天邪神 第2季][Against the Gods Ⅱ][2026][04][HEVC][GB][4K].mp4",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "GM-Team",
+ "title": "逆天邪神",
+ "season": 2,
+ "episode": 4,
+ "resolution": "4K",
+ "source": "GB"
+ },
+ "pred": {
+ "episode": 4,
+ "group": "GM-Team",
+ "resolution": "4K",
+ "season": 2,
+ "source": "GB",
+ "title": "逆天邪神"
+ }
+ },
+ {
+ "id": "vcb_special_iv_not_episode",
+ "filename": "[YYDM&VCB-Studio] Shinsekai Yori [IV05][Ma10p_1080p][x265_aac].mkv",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "YYDM&VCB-Studio",
+ "title": "Shinsekai Yori",
+ "episode": null,
+ "resolution": "1080p",
+ "source": "x265_aac",
+ "special": "IV05"
+ },
+ "pred": {
+ "episode": null,
+ "group": "YYDM&VCB-Studio",
+ "resolution": "1080p",
+ "source": "x265-aac",
+ "special": "IV05",
+ "title": "Shinsekai Yori"
+ }
+ },
+ {
+ "id": "vcb_nced_not_episode",
+ "filename": "[YYDM&VCB-Studio] Shinsekai Yori [NCED02][Ma10p_1080p][x265_flac].mkv",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "YYDM&VCB-Studio",
+ "title": "Shinsekai Yori",
+ "episode": null,
+ "resolution": "1080p",
+ "source": "x265_flac",
+ "special": "NCED02"
+ },
+ "pred": {
+ "episode": null,
+ "group": "YYDM&VCB-Studio",
+ "resolution": "1080p",
+ "source": "x265-flac",
+ "special": "NCED02",
+ "title": "Shinsekai Yori"
+ }
+ },
+ {
+ "id": "dot_nced_suffix_not_episode",
+ "filename": "InuYasha.2000.NCED02.BDrip.AV1.10Bit.DTS.1080p-CalChi",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "title": "InuYasha",
+ "episode": null,
+ "resolution": "1080p",
+ "source": "BDrip",
+ "special": "NCED02"
+ },
+ "pred": {
+ "episode": null,
+ "resolution": "1080p",
+ "source": "BDrip",
+ "special": "NCED02",
+ "title": "InuYasha"
+ }
+ },
+ {
+ "id": "vcb_numeric_title_nced",
+ "filename": "[VCB-Studio] Yamada-kun to 7-nin no Majo [NCED][Ma10p_1080p][x265_flac]",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "VCB-Studio",
+ "title": "Yamada-kun to 7-nin no Majo",
+ "episode": null,
+ "resolution": "1080p",
+ "source": "x265_flac",
+ "special": "NCED"
+ },
+ "pred": {
+ "episode": null,
+ "group": "VCB-Studio",
+ "resolution": "1080p",
+ "source": "x265-flac",
+ "special": "NCED",
+ "title": "Yamada-kun to 7-nin no Majo"
+ }
+ }
+ ]
  },
+ "rule_assisted": {
+ "model_dir": ".",
+ "case_file": "data/parser_regression_cases.json",
+ "tokenizer_variant": "char",
+ "max_length": 128,
+ "use_rules": true,
+ "constrain_bio": true,
+ "case_count": 26,
+ "full_correct": 26,
+ "full_accuracy": 1.0,
+ "field_correct": {
+ "group": 22,
+ "title": 26,
+ "episode": 26,
+ "resolution": 26,
+ "source": 19,
+ "season": 9,
+ "special": 5
  },
+ "field_total": {
+ "group": 22,
+ "title": 26,
+ "episode": 26,
+ "resolution": 26,
+ "source": 19,
+ "season": 9,
+ "special": 5
  },
+ "field_accuracy": {
+ "episode": 1.0,
+ "group": 1.0,
+ "resolution": 1.0,
+ "season": 1.0,
+ "source": 1.0,
+ "special": 1.0,
+ "title": 1.0
  },
+ "failures": [],
+ "results": [
+ {
+ "id": "lolihouse_dash_episode",
+ "filename": "[LoliHouse] Yomi no Tsugai - 07 [WebRip 1080p HEVC-10bit AAC ASSx2]",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "LoliHouse",
+ "title": "Yomi no Tsugai",
+ "episode": 7,
+ "resolution": "1080p",
+ "source": "WebRip"
+ },
+ "pred": {
+ "episode": 7,
+ "group": "LoliHouse",
+ "resolution": "1080p",
+ "source": "WebRip",
+ "title": "Yomi no Tsugai"
+ }
+ },
+ {
+ "id": "dot_season_episode_no_group",
+ "filename": "Witch.Hat.Atelier.S01E07.1080p.NF.WEB-DL.JPN.AAC2.0.H.264.MSubs-ToonsHub",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "title": "Witch.Hat.Atelier",
+ "season": 1,
+ "episode": 7,
+ "group": null,
+ "resolution": "1080p",
+ "source": "NF"
+ },
+ "pred": {
+ "episode": 7,
+ "group": null,
+ "resolution": "1080p",
+ "season": 1,
+ "source": "NF",
+ "title": "Witch.Hat.Atelier"
+ }
+ },
+ {
+ "id": "ani_cjk_season_dash_episode",
+ "filename": "[ANi] 異世界悠閒農家 2 - 06 [1080P][Baha][WEB-DL][AAC AVC][CHT]",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "ANi",
+ "title": "異世界悠閒農家",
+ "season": 2,
+ "episode": 6,
+ "resolution": "1080P",
+ "source": "Baha"
+ },
+ "pred": {
+ "episode": 6,
+ "group": "ANi",
+ "resolution": "1080P",
+ "season": 2,
+ "source": "Baha",
+ "title": "異世界悠閒農家"
+ }
+ },
+ {
+ "id": "kisssub_bracket_title_episode",
+ "filename": "[KissSub][Shunkashuutou Daikousha - Haru no Mai][05][1080P][GB][MP4]",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "KissSub",
+ "title": "Shunkashuutou Daikousha - Haru no Mai",
+ "episode": 5,
+ "resolution": "1080P",
+ "source": "GB"
+ },
+ "pred": {
+ "episode": 5,
+ "group": "KissSub",
+ "resolution": "1080P",
+ "source": "GB",
+ "title": "Shunkashuutou Daikousha - Haru no Mai"
+ }
+ },
+ {
+ "id": "airotabracket_title_episode",
+ "filename": "[Airota][Sousou no Frieren][29][1080p AVC AAC][CHT]",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "Airota",
+ "title": "Sousou no Frieren",
+ "episode": 29,
+ "resolution": "1080p",
+ "source": "CHT"
+ },
+ "pred": {
+ "episode": 29,
+ "group": "Airota",
+ "resolution": "1080p",
+ "source": "CHT",
+ "title": "Sousou no Frieren"
+ }
+ },
+ {
+ "id": "subsplease_parenthesized_resolution",
+ "filename": "[SubsPlease] Mushoku Tensei - 12 (1080p) [x265][AAC]",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "SubsPlease",
+ "title": "Mushoku Tensei",
+ "episode": 12,
+ "resolution": "1080p"
+ },
+ "pred": {
+ "episode": 12,
+ "group": "SubsPlease",
+ "resolution": "1080p",
+ "title": "Mushoku Tensei"
+ }
+ },
+ {
+ "id": "vcb_bracket_episode",
+ "filename": "[VCB-Studio] Girls Band Cry [01][Ma10p_1080p][x265_flac]",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "VCB-Studio",
+ "title": "Girls Band Cry",
+ "episode": 1,
+ "resolution": "1080p"
+ },
+ "pred": {
+ "episode": 1,
+ "group": "VCB-Studio",
+ "resolution": "1080p",
+ "title": "Girls Band Cry"
+ }
+ },
+ {
+ "id": "numeric_title_not_episode",
+ "filename": "86 Eighty Six - 01 [1080P][Baha]",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "title": "86 Eighty Six",
+ "episode": 1,
+ "resolution": "1080P",
+ "source": "Baha"
+ },
+ "pred": {
+ "episode": 1,
+ "resolution": "1080P",
+ "source": "Baha",
+ "title": "86 Eighty Six"
+ }
+ },
+ {
+ "id": "erai_raws_dash_episode",
+ "filename": "[Erai-raws] Sousou no Frieren - 01 [1080p][Multiple Subtitle][ENG]",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "Erai-raws",
+ "title": "Sousou no Frieren",
+ "episode": 1,
+ "resolution": "1080p"
+ },
+ "pred": {
+ "episode": 1,
+ "group": "Erai-raws",
+ "resolution": "1080p",
+ "title": "Sousou no Frieren"
+ }
+ },
+ {
+ "id": "nekomoe_space_group",
+ "filename": "[Nekomoe kissaten][Watashi no Shiawase na Kekkon][01][1080p][JPSC]",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "Nekomoe kissaten",
+ "title": "Watashi no Shiawase na Kekkon",
+ "episode": 1,
+ "resolution": "1080p"
+ },
+ "pred": {
+ "episode": 1,
+ "group": "Nekomoe kissaten",
+ "resolution": "1080p",
+ "title": "Watashi no Shiawase na Kekkon"
+ }
+ },
+ {
+ "id": "long_running_episode",
+ "filename": "One.Piece.1110.1080p.WEB-DL.AAC2.0.H.264",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "title": "One.Piece",
+ "episode": 1110,
+ "resolution": "1080p",
+ "source": "WEB-DL"
+ },
+ "pred": {
+ "episode": 1110,
+ "resolution": "1080p",
+ "source": "WEB-DL",
+ "title": "One.Piece"
+ }
+ },
+ {
+ "id": "season_episode_amzn",
+ "filename": "Example.Show.S02E03.2160p.AMZN.WEB-DL.DDP5.1.H.265",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "title": "Example.Show",
+ "season": 2,
+ "episode": 3,
+ "resolution": "2160p",
+ "source": "AMZN"
+ },
+ "pred": {
+ "episode": 3,
+ "resolution": "2160p",
+ "season": 2,
+ "source": "AMZN",
+ "title": "Example.Show"
+ }
+ },
+ {
+ "id": "cjk_group_with_prefix_tag",
+ "filename": "【喵萌奶茶屋】★04月新番★[葬送的芙莉莲][01][1080P][HEVC]",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "喵萌奶茶屋",
+ "title": "葬送的芙莉莲",
+ "episode": 1,
+ "resolution": "1080P"
+ },
+ "pred": {
+ "episode": 1,
+ "group": "喵萌奶茶屋",
+ "resolution": "1080P",
+ "title": "葬送的芙莉莲"
+ }
+ },
+ {
+ "id": "leading_meta_not_group",
+ "filename": "[1080p] Witch Watch - 15 [CHS]",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": null,
+ "title": "Witch Watch",
+ "episode": 15,
+ "resolution": "1080p",
+ "source": "CHS"
+ },
+ "pred": {
+ "episode": 15,
+ "group": null,
+ "resolution": "1080p",
+ "source": "CHS",
+ "title": "Witch Watch"
+ }
+ },
+ {
+ "id": "sakurato_group_language_source",
+ "filename": "[Sakurato] Witch Watch - 15 [1080p][CHS]",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "Sakurato",
+ "title": "Witch Watch",
+ "episode": 15,
+ "resolution": "1080p",
+ "source": "CHS"
+ },
+ "pred": {
+ "episode": 15,
+ "group": "Sakurato",
+ "resolution": "1080p",
+ "source": "CHS",
+ "title": "Witch Watch"
+ }
+ },
+ {
+ "id": "billion_meta_lab_search_special",
+ "filename": "[Billion Meta Lab] 魔法姊妹露露莉莉 Mahou no Shimai Rurutto Riryi [07][1080P][CHT&JPN][檢索:魔法姊妹露露特莉莉].mp4",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "Billion Meta Lab",
+ "title": "魔法姊妹露露莉莉 Mahou no Shimai Rurutto Riryi",
+ "episode": 7,
+ "resolution": "1080P",
+ "source": "CHT&JPN",
+ "special": "檢索:魔法姊妹露露特莉莉"
+ },
+ "pred": {
+ "episode": 7,
+ "group": "Billion Meta Lab",
+ "resolution": "1080P",
+ "source": "CHT&JPN",
+ "special": "檢索:魔法姊妹露露特莉莉",
+ "title": "魔法姊妹露露莉莉 Mahou no Shimai Rurutto Riryi"
+ }
+ },
+ {
+ "id": "studio_greentea_s2_bracket_episode",
+ "filename": "[Studio GreenTea] Otonari no Tenshi-sama ni Itsunomanika Dame Ningen ni Sareteita Ken S2 [06][WebRip][HEVC-10bit 1080p AAC][JPSC].mp4",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "Studio GreenTea",
+ "title": "Otonari no Tenshi-sama ni Itsunomanika Dame Ningen ni Sareteita Ken",
+ "season": 2,
+ "episode": 6,
+ "resolution": "1080p",
+ "source": "WebRip"
+ },
+ "pred": {
+ "episode": 6,
+ "group": "Studio GreenTea",
+ "resolution": "1080p",
+ "season": 2,
+ "source": "WebRip",
+ "title": "Otonari no Tenshi-sama ni Itsunomanika Dame Ningen ni Sareteita Ken"
+ }
+ },
+ {
+ "id": "lolihouse_kakuriyo_bare_ni_season",
+ "filename": "[LoliHouse] Kakuriyo no Yadomeshi Ni - 12 [WebRip 1080p HEVC-10bit AAC SRTx2].mkv",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "LoliHouse",
+ "title": "Kakuriyo no Yadomeshi",
+ "season": 2,
+ "episode": 12,
+ "resolution": "1080p",
+ "source": "WebRip"
+ },
+ "pred": {
+ "episode": 12,
+ "group": "LoliHouse",
+ "resolution": "1080p",
+ "season": 2,
+ "source": "WebRip",
+ "title": "Kakuriyo no Yadomeshi"
+ }
+ },
+ {
+ "id": "ani_kakuriyo_traditional_ni",
+ "filename": "[ANi] 妖怪旅館營業中 貳 - 11 [1080P][Baha][WEB-DL][AAC AVC][CHT].mp4",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "ANi",
+ "title": "妖怪旅館營業中",
+ "season": 2,
+ "episode": 11,
+ "resolution": "1080P",
+ "source": "Baha"
+ },
+ "pred": {
+ "episode": 11,
+ "group": "ANi",
+ "resolution": "1080P",
+ "season": 2,
+ "source": "Baha",
+ "title": "妖怪旅館營業中"
+ }
+ },
+ {
+ "id": "jibaketa_shokugeki_ni_no_sara",
+ "filename": "[jibaketa]Shokugeki no Souma Ni no Sara - 13 END [BD 1920x1080 x264 AACx2 SRT TVB CHT].mkv",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "jibaketa",
+ "title": "Shokugeki no Souma",
+ "season": 2,
+ "episode": 13,
+ "resolution": "1920x1080"
+ },
+ "pred": {
+ "episode": 13,
+ "group": "jibaketa",
+ "resolution": "1920x1080",
+ "season": 2,
+ "title": "Shokugeki no Souma"
+ }
+ },
+ {
+ "id": "ai_raws_fire_force_cjk_season_hash_episode",
+ "filename": "[AI-Raws] 炎炎の消防隊 弐ノ章 #13 (BD HEVC 1920x1080 yuv444p10le FLAC)[FC74A2D5].mkv",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "AI-Raws",
+ "title": "炎炎の消防隊",
+ "season": 2,
+ "episode": 13,
+ "resolution": "1920x1080"
+ },
+ "pred": {
+ "episode": 13,
+ "group": "AI-Raws",
+ "resolution": "1920x1080",
+ "season": 2,
+ "title": "炎炎の消防隊"
+ }
+ },
+ {
+ "id": "gm_team_guoman_bilingual_s2",
+ "filename": "[GM-Team][国漫][逆天邪神 第2季][Against the Gods Ⅱ][2026][04][HEVC][GB][4K].mp4",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "GM-Team",
+ "title": "逆天邪神",
+ "season": 2,
+ "episode": 4,
+ "resolution": "4K",
+ "source": "GB"
+ },
+ "pred": {
+ "episode": 4,
+ "group": "GM-Team",
+ "resolution": "4K",
+ "season": 2,
+ "source": "GB",
+ "title": "逆天邪神"
+ }
+ },
+ {
+ "id": "vcb_special_iv_not_episode",
+ "filename": "[YYDM&VCB-Studio] Shinsekai Yori [IV05][Ma10p_1080p][x265_aac].mkv",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "YYDM&VCB-Studio",
+ "title": "Shinsekai Yori",
+ "episode": null,
+ "resolution": "1080p",
+ "source": "x265_aac",
+ "special": "IV05"
+ },
+ "pred": {
+ "episode": null,
+ "group": "YYDM&VCB-Studio",
+ "resolution": "1080p",
+ "source": "x265-aac",
+ "special": "IV05",
+ "title": "Shinsekai Yori"
+ }
+ },
+ {
+ "id": "vcb_nced_not_episode",
+ "filename": "[YYDM&VCB-Studio] Shinsekai Yori [NCED02][Ma10p_1080p][x265_flac].mkv",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "YYDM&VCB-Studio",
+ "title": "Shinsekai Yori",
+ "episode": null,
+ "resolution": "1080p",
+ "source": "x265_flac",
+ "special": "NCED02"
+ },
+ "pred": {
+ "episode": null,
+ "group": "YYDM&VCB-Studio",
+ "resolution": "1080p",
+ "source": "x265-flac",
+ "special": "NCED02",
+ "title": "Shinsekai Yori"
+ }
+ },
+ {
+ "id": "dot_nced_suffix_not_episode",
+ "filename": "InuYasha.2000.NCED02.BDrip.AV1.10Bit.DTS.1080p-CalChi",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "title": "InuYasha",
+ "episode": null,
+ "resolution": "1080p",
+ "source": "BDrip",
+ "special": "NCED02"
+ },
+ "pred": {
+ "episode": null,
+ "resolution": "1080p",
+ "source": "BDrip",
+ "special": "NCED02",
+ "title": "InuYasha"
+ }
+ },
+ {
+ "id": "vcb_numeric_title_nced",
+ "filename": "[VCB-Studio] Yamada-kun to 7-nin no Majo [NCED][Ma10p_1080p][x265_flac]",
+ "ok": true,
+ "errors": {},
+ "expected": {
+ "group": "VCB-Studio",
+ "title": "Yamada-kun to 7-nin no Majo",
+ "episode": null,
+ "resolution": "1080p",
+ "source": "x265_flac",
+ "special": "NCED"
+ },
+ "pred": {
+ "episode": null,
+ "group": "VCB-Studio",
+ "resolution": "1080p",
+ "source": "x265-flac",
+ "special": "NCED",
+ "title": "Yamada-kun to 7-nin no Majo"
+ }
+ }
+ ]
  }
+ }
  }
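
The aggregate numbers in this metrics file (`full_correct`, `field_total`, `field_accuracy`) can be recomputed from the `results` list. The sketch below is a minimal illustration, not the repository's `evaluate_cases` implementation. The `canon` helper is an assumption inferred from entries where `expected.source` is `x265_aac`, `pred.source` is `x265-aac`, and `ok` is still `true`; the real comparison may normalize differently.

```python
import re
from collections import Counter
from typing import Any, Dict, List


def canon(value: Any) -> Any:
    """Assumed comparison form: trim strings and treat `_` and `-` alike."""
    if isinstance(value, str):
        return re.sub(r"[_-]", "-", value.strip())
    return value


def summarize(results: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Recompute per-field and full-match accuracy from a `results` list."""
    field_total: Counter = Counter()
    field_correct: Counter = Counter()
    full_correct = 0
    for case in results:
        pred = case.get("pred", {})
        case_ok = True
        # Every key in `expected` counts, including explicit nulls.
        for field, want in case["expected"].items():
            field_total[field] += 1
            if canon(pred.get(field)) == canon(want):
                field_correct[field] += 1
            else:
                case_ok = False
        full_correct += int(case_ok)
    return {
        "case_count": len(results),
        "full_correct": full_correct,
        "full_accuracy": full_correct / len(results) if results else 0.0,
        "field_accuracy": {
            field: field_correct[field] / total
            for field, total in field_total.items()
        },
    }


# Two hand-made cases: one separator-only difference, one real episode error.
sample = [
    {
        "expected": {"title": "Shinsekai Yori", "episode": None, "source": "x265_aac"},
        "pred": {"title": "Shinsekai Yori", "episode": None, "source": "x265-aac"},
    },
    {
        "expected": {"title": "One.Piece", "episode": 1110},
        "pred": {"title": "One.Piece", "episode": 111},
    },
]
summary = summarize(sample)
```

Counting explicit `null` expectations as scored fields matches the file above, where `group` has a total of 22 even though two of those cases expect `"group": null`.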
datasets/AnimeName CHANGED
@@ -1 +1 @@
- Subproject commit c40cb38963a390a61c6d375409031f8a6c5eb927
+ Subproject commit 255d53ecf84d339b87618c34a593c7f2f3a0040b
dmhy_dataset.py CHANGED
@@ -412,8 +412,44 @@ def is_title_token(token: str) -> bool:
      return True


+ def is_title_start_token(tokens: Sequence[str], idx: int, end: int) -> bool:
+     """Allow numeric title starts like `86 Eighty Six` without allowing episode tails."""
+     if is_title_token(tokens[idx]):
+         return True
+     clean = clean_bracket(tokens[idx])
+     if not re.fullmatch(r"\d{1,4}", clean):
+         return False
+     next_idx = idx + 1
+     while next_idx < end and is_separator_token(tokens[next_idx]):
+         next_idx += 1
+     return next_idx < end and is_title_token(tokens[next_idx])
+
+
+ def skip_leading_title_decoration(tokens: Sequence[str], start: int, end: int) -> int:
+     """Drop decorative release prefixes such as `★04月新番★` from title spans."""
+     while start < end:
+         token = clean_bracket(tokens[start])
+         if token not in {"★", "☆"}:
+             break
+         closing = None
+         for idx in range(start + 1, min(end, start + 12)):
+             if clean_bracket(tokens[idx]) == token:
+                 closing = idx
+                 break
+         if closing is None:
+             break
+         prefix_text = "".join(clean_bracket(piece) for piece in tokens[start:closing + 1])
+         if not re.search(r"(?:新番|月番|合集|合輯|全集|完结|完結)", prefix_text):
+             break
+         start = closing + 1
+         while start < end and is_separator_token(tokens[start]):
+             start += 1
+     return start
+
+
  def trim_title_span(tokens: Sequence[str], start: int, end: int) -> tuple[int, int]:
-     while start < end and not is_title_token(tokens[start]):
+     start = skip_leading_title_decoration(tokens, start, end)
+     while start < end and not is_title_start_token(tokens, start, end):
          start += 1
      while end > start and not is_title_token(tokens[end - 1]):
          end -= 1
@@ -556,6 +592,10 @@ def label_context_season_tokens(
              continue
          if is_context_season_token(tokens, idx, episode_idx):
              categories[idx] = "season"
+             prev_idx = idx - 1
+             while prev_idx >= 0 and is_separator_token(tokens[prev_idx]) and categories[prev_idx] == "title":
+                 categories[prev_idx] = "sep"
+                 prev_idx -= 1


  def label_special_index_sequences(tokens: Sequence[str], categories: List[str]) -> None:
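
To see what the two new title helpers change, the sketch below re-implements them against simplified stand-ins. `clean_bracket`, `is_separator_token`, and `is_title_token` here are assumptions written only for this example (the real ones live elsewhere in `dmhy_dataset.py` and are likely stricter), the final `return start, end` is assumed because the diff context stops before it, and `title_text` is a hypothetical convenience wrapper.

```python
import re
from typing import Sequence, Tuple

DECORATION_MARKS = {"★", "☆"}


# Simplified stand-ins (assumptions) for helpers defined elsewhere in the module.
def clean_bracket(token: str) -> str:
    return token.strip("[]()【】 ")


def is_separator_token(token: str) -> bool:
    return token.strip() in {"", "-", "_", "."}


def is_title_token(token: str) -> bool:
    clean = clean_bracket(token)
    if not clean or clean.isdigit() or clean in DECORATION_MARKS:
        return False
    return not is_separator_token(token)


def is_title_start_token(tokens: Sequence[str], idx: int, end: int) -> bool:
    """A short number may start a title only if a title token follows it."""
    if is_title_token(tokens[idx]):
        return True
    if not re.fullmatch(r"\d{1,4}", clean_bracket(tokens[idx])):
        return False
    next_idx = idx + 1
    while next_idx < end and is_separator_token(tokens[next_idx]):
        next_idx += 1
    return next_idx < end and is_title_token(tokens[next_idx])


def skip_leading_title_decoration(tokens: Sequence[str], start: int, end: int) -> int:
    """Skip a star-delimited release tag such as `★04月新番★`."""
    while start < end:
        token = clean_bracket(tokens[start])
        if token not in DECORATION_MARKS:
            break
        closing = None
        for idx in range(start + 1, min(end, start + 12)):
            if clean_bracket(tokens[idx]) == token:
                closing = idx
                break
        if closing is None:
            break
        prefix_text = "".join(clean_bracket(t) for t in tokens[start:closing + 1])
        if not re.search(r"(?:新番|月番|合集|合輯|全集|完结|完結)", prefix_text):
            break
        start = closing + 1
        while start < end and is_separator_token(tokens[start]):
            start += 1
    return start


def trim_title_span(tokens: Sequence[str], start: int, end: int) -> Tuple[int, int]:
    start = skip_leading_title_decoration(tokens, start, end)
    while start < end and not is_title_start_token(tokens, start, end):
        start += 1
    while end > start and not is_title_token(tokens[end - 1]):
        end -= 1
    return start, end  # assumed: the diff context ends before the return


def title_text(tokens: Sequence[str]) -> str:
    """Hypothetical wrapper: trim the whole token list and join what is left."""
    start, end = trim_title_span(tokens, 0, len(tokens))
    return "".join(tokens[start:end])


numeric_title = title_text(["86", " ", "Eighty", " ", "Six", " ", "-", " ", "01"])
decorated_title = title_text(["★", "04", "月", "新番", "★", "葬送的芙莉莲"])
episode_only = title_text(["-", " ", "01"])
```

With these stand-ins, a leading `86` survives because a title token follows it, while a trailing `01` is trimmed because nothing title-like comes after it. The example uses word-level tokens for readability; the model in this commit is configured with the `char` tokenizer, so real spans are character sequences.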
docs/onnx.md CHANGED
@@ -24,7 +24,7 @@ It does **not** contain:
  - token-to-id conversion / token 到 id 的转换
  - constrained BIO decoding / 约束 BIO 解码
  - field aggregation / 字段聚合
- - structural cleanup / 结构清理
+ - thin string and number normalization / 薄字符串和数字规范

  Those steps must stay aligned with `tokenizer.py`, `inference.py`, `config.json`,
  and `vocab.json`.
@@ -107,8 +107,15 @@ The runtime parser should do this:
     使用约束 BIO transition 解码标签。
  8. Aggregate labels into parser fields.
     聚合标签为结构化字段。
- 9. Apply high-confidence structural cleanup.
-    应用高置信结构修正。
+ 9. Apply thin normalization only: trim brackets/extensions and convert numeric
+    fields.
+    只做薄层规范化:裁剪括号/扩展名并转换数字字段。
+
+ The legacy structural assist layer is available only behind `--rule-assist` in
+ the Python tools. It is not part of the default ONNX reference runtime.
+
+ 旧结构辅助层只在 Python 工具的 `--rule-assist` 下显式启用,不属于默认 ONNX
+ 参考运行时。

  ## 5. Android Notes / Android 注意事项

@@ -162,18 +169,19 @@ Run:
  uv run python benchmark_inference.py --model-dir . --onnx exports/anime_filename_parser.onnx --case-file data/parser_regression_cases.json --repeat 20 --warmup 20 --torch-threads 1 --ort-threads 1 --output benchmark_results.json
  ```

- Local single-thread CPU result, measured on 26 real-world regression cases:
+ Local single-thread CPU result, measured on 26 real-world regression cases with
+ the default thin runtime:

- 本地 CPU 单线程结果,使用 26 条真实回归 case:
+ 本地 CPU 单线程结果,使用 26 条真实回归 case 和默认薄层运行时

  | Backend / 后端 | Load ms / 加载 ms | Avg ms / 平均 ms | P50 ms | P95 ms | P99 ms | files/s |
  | --- | ---: | ---: | ---: | ---: | ---: | ---: |
- | PyTorch | 64.63 | 32.86 | 32.43 | 38.42 | 41.09 | 30.4 |
- | ONNX Runtime | 898.63 | 30.35 | 30.12 | 34.44 | 36.86 | 33.0 |
+ | PyTorch | 49.07 | 15.16 | 14.87 | 18.50 | 21.91 | 66.0 |
+ | ONNX Runtime | 568.85 | 13.08 | 12.82 | 15.95 | 20.19 | 76.5 |

  The benchmark includes tokenization, model/session forward, constrained BIO
- decode, and postprocessing. It does not include repeatedly constructing the
- ONNX Runtime session inside the loop.
+ decode, entity aggregation, and thin normalization. It does not include
+ repeatedly constructing the ONNX Runtime session inside the loop.

- 该基准包含 tokenizer、模型/session 前向、约束 BIO 解码和后处理循环内不会重复创建
- ONNX Runtime session。
+ 该基准包含 tokenizer、模型/session 前向、约束 BIO 解码、实体聚合薄层规范化
+ 循环内不会重复创建 ONNX Runtime session。
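
Step 9 in `docs/onnx.md` describes thin normalization as trimming brackets and extensions and converting numeric fields. The sketch below shows one possible shape for that step. It is an illustration under assumptions, not the code in `inference.py`: the function name `thin_normalize`, the extension list, and the keep-first-span policy are all hypothetical.

```python
import re
from typing import Dict, List, Tuple, Union

FieldValue = Union[str, int, None]

# Assumed extension list and edge characters; the real runtime may differ.
MEDIA_EXTENSION = re.compile(r"\.(?:mkv|mp4|avi|ass|srt)$", re.IGNORECASE)
EDGE_CHARS = "[]()【】 "
NUMERIC_FIELDS = {"season", "episode"}


def thin_normalize(entities: List[Tuple[str, str]]) -> Dict[str, FieldValue]:
    """Turn (ENTITY_TYPE, raw_text) pairs into lightly cleaned fields."""
    result: Dict[str, FieldValue] = {}
    for entity_type, text in entities:
        field = entity_type.lower()
        if field in result:
            continue  # keep the first span per field in this sketch
        clean = MEDIA_EXTENSION.sub("", text.strip()).strip(EDGE_CHARS)
        if field in NUMERIC_FIELDS:
            match = re.search(r"\d+", clean)
            result[field] = int(match.group()) if match else None
        else:
            result[field] = clean or None
    return result


fields = thin_normalize(
    [
        ("GROUP", "[ANi]"),
        ("TITLE", " Sousou no Frieren "),
        ("EPISODE", "01"),
        ("RESOLUTION", "[1080p].mkv"),
    ]
)
```

This sketch only converts ASCII digits. Season markers such as `貳` or `Ni`, which the regression cases above map to season 2, would need extra handling that is not shown here.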
docs/training.md CHANGED
@@ -127,42 +127,58 @@ Training outputs:
127
  - `final/case_metrics.json`: fixed real-world case regression / 固定真实 case 回归
128
  - TensorBoard logs unless `--no-tensorboard` is set / 默认写 TensorBoard
129
 
130
- ## 6. Focus Fine-Tuning / 针对性微调
131
 
132
- Use focus fine-tuning only after a specific real-world failure pattern has been
133
- confirmed and added to `data/parser_regression_cases.json`.
 
134
 
135
- 只有在确认某类真实失败样式并加入 `data/parser_regression_cases.json` 后,才使用针对性微调。
 
136
 
137
  ```powershell
138
  uv run python build_repair_focus_dataset.py `
139
  --input datasets/AnimeName/dmhy_weak_char.jsonl `
140
- --output data/repair_focus_char.jsonl `
141
- --context-samples 50000 `
142
- --repeat-repaired 4 `
143
- --repeat-manual 24 `
144
- --seed 75
145
 
146
  uv run python train.py --tokenizer char `
147
- --data-file data/repair_focus_char.jsonl `
148
  --vocab-file datasets/AnimeName/vocab.char.json `
149
- --save-dir checkpoints/dmhy-char-special-focus `
150
  --init-model-dir . `
151
- --epochs 1 `
152
- --batch-size 64 `
153
- --learning-rate 0.00003 `
154
- --warmup-steps 50 `
155
  --max-seq-length 128 `
156
  --train-split 0.95 `
157
- --num-workers 0 `
158
- --checkpoint-steps 500 `
159
  --save-total-limit 2 `
160
- --parse-eval-limit 512 `
161
  --case-eval-file data/parser_regression_cases.json `
162
- --seed 75 `
163
- --experiment-name dmhy-char-special-focus
164
  ```
165
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
166
  ## 7. Publish to Repository Root / 发布到仓库根目录
167
 
168
  The repository root is the Hugging Face checkpoint surface.
 
127
  - `final/case_metrics.json`: fixed real-world case regression / 固定真实 case 回归
128
  - TensorBoard logs unless `--no-tensorboard` is set / 默认写 TensorBoard
129
 
130
+ ## 6. Thin Hard-Case Fine-Tuning / 薄层困难样本微调
131
 
132
+ Use hard-case fine-tuning only after a specific real-world failure pattern has
133
+ been confirmed, fixed in the weak labels, and added to
134
+ `data/parser_regression_cases.json`.
135
 
136
+ 只有在确认某类真实失败样式、修复弱标注并加入
137
+ `data/parser_regression_cases.json` 后,才使用困难样本微调。
138
 
139
  ```powershell
140
  uv run python build_repair_focus_dataset.py `
141
  --input datasets/AnimeName/dmhy_weak_char.jsonl `
142
+ --output data/thin_hard_focus_char.jsonl `
143
+ --context-samples 30000 `
144
+ --repeat-focus 3 `
145
+ --repeat-manual 240 `
146
+ --seed 57
147
 
148
  uv run python train.py --tokenizer char `
149
+ --data-file data/thin_hard_focus_char.jsonl `
150
  --vocab-file datasets/AnimeName/vocab.char.json `
151
+ --save-dir checkpoints/dmhy-char-thin-hardfocus `
152
  --init-model-dir . `
153
+ --epochs 2 `
154
+ --batch-size 256 `
155
+ --learning-rate 0.00004 `
156
+ --warmup-steps 80 `
157
  --max-seq-length 128 `
158
  --train-split 0.95 `
159
+ --num-workers 4 `
160
+ --checkpoint-steps 300 `
161
  --save-total-limit 2 `
162
+ --parse-eval-limit 1024 `
163
  --case-eval-file data/parser_regression_cases.json `
164
+ --seed 58 `
165
+ --experiment-name dmhy-char-thin-hardfocus
166
  ```
167
 
168
+ The default quality gate is model-led parsing:
169
+
170
+ 默认质量门槛以模型主导解析为准:
171
+
172
+ - fixed regression `model_only >= 85%`
173
+ - held-out parse `model_only >= 75%`
174
+ - `normalized_only` is the default thin runtime metric
175
+ - `rule_assisted` is compatibility/diagnostic only
176
+
177
+ - 固定回归 `model_only >= 85%`
178
+ - held-out 解析 `model_only >= 75%`
179
+ - `normalized_only` 是默认薄层运行时指标
180
+ - `rule_assisted` 只作为兼容/诊断对照
181
+
182
  ## 7. Publish to Repository Root / 发布到仓库根目录
183
 
184
  The repository root is the Hugging Face checkpoint surface.
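
The quality gate listed in `docs/training.md` can be checked mechanically before publishing. The sketch below is a hypothetical helper, not part of the repository. It assumes the case metrics follow the `{"primary_metric": ..., "modes": {...}}` layout returned by `evaluate_case_modes`, that the gate is measured on `full_accuracy`, and that the held-out `model_only` accuracy is supplied separately as a fraction.

```python
from typing import Any, Dict, List


def check_quality_gate(
    case_metrics: Dict[str, Any],
    heldout_model_only: float,
    fixed_min: float = 0.85,
    heldout_min: float = 0.75,
) -> List[str]:
    """Return a list of gate violations; an empty list means the gate passes."""
    problems: List[str] = []
    fixed = case_metrics["modes"]["model_only"]["full_accuracy"]
    if fixed < fixed_min:
        problems.append(
            f"fixed regression model_only {fixed:.2%} is below {fixed_min:.0%}"
        )
    if heldout_model_only < heldout_min:
        problems.append(
            f"held-out model_only {heldout_model_only:.2%} is below {heldout_min:.0%}"
        )
    return problems


# Illustrative numbers only; they are not measured results from this commit.
example_metrics = {
    "primary_metric": "normalized_only",
    "modes": {
        "model_only": {"full_accuracy": 0.88},
        "normalized_only": {"full_accuracy": 0.92},
        "rule_assisted": {"full_accuracy": 1.0},
    },
}
passing = check_quality_gate(example_metrics, heldout_model_only=0.80)
failing = check_quality_gate(example_metrics, heldout_model_only=0.70)
```

`rule_assisted` is deliberately ignored here, in line with the note that it is a compatibility/diagnostic mode rather than a quality metric.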
evaluate_parser_cases.py CHANGED
@@ -121,38 +121,89 @@ def evaluate_cases(
121
  }
122
 
123
 
124
- def main() -> None:
125
- parser = argparse.ArgumentParser(description="Evaluate parser on fixed filename regression cases")
126
- parser.add_argument("--model-dir", required=True)
127
- parser.add_argument("--case-file", default=DEFAULT_CASE_FILE)
128
- parser.add_argument("--tokenizer", choices=["regex", "char"], default=None)
129
- parser.add_argument("--max-length", type=int, default=None)
130
- parser.add_argument("--output", default=None, help="Optional JSON output path")
131
- parser.add_argument("--no-rule-assist", action="store_true")
132
- parser.add_argument("--no-constrained-bio", action="store_true")
133
- args = parser.parse_args()
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
134
 
135
- metrics = evaluate_cases(
136
- model_dir=args.model_dir,
137
- case_file=args.case_file,
138
- tokenizer_variant=args.tokenizer,
139
- max_length=args.max_length,
140
- use_rules=not args.no_rule_assist,
141
- constrain_bio=not args.no_constrained_bio,
142
- )
143
 
 
144
  print(
145
- f"Full case accuracy: {metrics['full_correct']}/{metrics['case_count']} "
146
  f"({metrics['full_accuracy']:.4f})"
147
  )
148
  for field, total in metrics["field_total"].items():
149
  correct = metrics["field_correct"].get(field, 0)
150
  print(f" {field}: {correct}/{total} ({correct / total:.4f})")
151
  if metrics["failures"]:
152
- print("\nFailures:")
153
  for failure in metrics["failures"]:
154
  print(json.dumps(failure, ensure_ascii=False))
155
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
156
  if args.output:
157
  os.makedirs(os.path.dirname(args.output) or ".", exist_ok=True)
158
  with open(args.output, "w", encoding="utf-8") as f:
 
121
  }
122
 
123
 
124
+ def evaluate_case_modes(
125
+ model_dir: str,
126
+ case_file: str,
127
+ tokenizer_variant: Optional[str],
128
+ max_length: Optional[int],
129
+ ) -> Dict:
130
+ modes = {
131
+ "model_only": {"use_rules": False, "constrain_bio": False},
132
+ "normalized_only": {"use_rules": False, "constrain_bio": True},
133
+ "rule_assisted": {"use_rules": True, "constrain_bio": True},
134
+ }
135
+ results = {
136
+ name: evaluate_cases(
137
+ model_dir=model_dir,
138
+ case_file=case_file,
139
+ tokenizer_variant=tokenizer_variant,
140
+ max_length=max_length,
141
+ use_rules=settings["use_rules"],
142
+ constrain_bio=settings["constrain_bio"],
143
+ )
144
+ for name, settings in modes.items()
145
+ }
146
+ return {
147
+ "primary_metric": "normalized_only",
148
+ "modes": results,
149
+ }
150
 
 
 
 
 
 
 
 
 
151
 
152
+ def print_metrics(name: str, metrics: Dict) -> None:
153
  print(
154
+ f"{name} full case accuracy: {metrics['full_correct']}/{metrics['case_count']} "
155
  f"({metrics['full_accuracy']:.4f})"
156
  )
157
  for field, total in metrics["field_total"].items():
158
  correct = metrics["field_correct"].get(field, 0)
159
  print(f" {field}: {correct}/{total} ({correct / total:.4f})")
160
  if metrics["failures"]:
161
+ print(f"\n{name} failures:")
162
  for failure in metrics["failures"]:
163
  print(json.dumps(failure, ensure_ascii=False))
164
 
165
+
166
+ def main() -> None:
167
+ parser = argparse.ArgumentParser(description="Evaluate parser on fixed filename regression cases")
168
+ parser.add_argument("--model-dir", required=True)
169
+ parser.add_argument("--case-file", default=DEFAULT_CASE_FILE)
170
+ parser.add_argument("--tokenizer", choices=["regex", "char"], default=None)
171
+ parser.add_argument("--max-length", type=int, default=None)
172
+ parser.add_argument("--output", default=None, help="Optional JSON output path")
173
+ parser.add_argument("--mode", choices=["all", "model-only", "normalized-only", "rule-assisted"], default="all")
174
+ parser.add_argument("--rule-assist", action="store_true", help="Shortcut for --mode rule-assisted")
175
+ parser.add_argument("--no-rule-assist", action="store_true", help=argparse.SUPPRESS)
176
+ parser.add_argument("--no-constrained-bio", action="store_true")
177
+ args = parser.parse_args()
178
+
179
+ if args.rule_assist:
180
+ args.mode = "rule-assisted"
181
+ if args.no_rule_assist and args.mode == "rule-assisted":
182
+ args.mode = "normalized-only"
183
+
184
+ if args.mode == "all" and not args.no_constrained_bio:
185
+ metrics = evaluate_case_modes(
186
+ model_dir=args.model_dir,
187
+ case_file=args.case_file,
188
+ tokenizer_variant=args.tokenizer,
189
+ max_length=args.max_length,
190
+ )
191
+ for name in ("model_only", "normalized_only", "rule_assisted"):
192
+ print_metrics(name, metrics["modes"][name])
193
+ print()
194
+ else:
195
+ use_rules = args.mode == "rule-assisted"
196
+ constrain_bio = not args.no_constrained_bio and args.mode != "model-only"
197
+ metrics = evaluate_cases(
198
+ model_dir=args.model_dir,
199
+ case_file=args.case_file,
200
+ tokenizer_variant=args.tokenizer,
201
+ max_length=args.max_length,
202
+ use_rules=use_rules,
203
+ constrain_bio=constrain_bio,
204
+ )
205
+ print_metrics(args.mode, metrics)
206
+
207
  if args.output:
208
  os.makedirs(os.path.dirname(args.output) or ".", exist_ok=True)
209
  with open(args.output, "w", encoding="utf-8") as f:
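The flag handling in `main()` above reduces to a small mapping from CLI options to the two switches passed to `evaluate_cases`. The following is a minimal sketch of that mapping, assuming only the flag semantics visible in this hunk; `resolve_mode` is a hypothetical helper for illustration, not a function in the repository.

```python
from typing import Tuple


def resolve_mode(
    mode: str = "all",
    rule_assist: bool = False,
    no_rule_assist: bool = False,
    no_constrained_bio: bool = False,
) -> Tuple[str, bool, bool]:
    """Return (mode, use_rules, constrain_bio) following the logic in main()."""
    # --rule-assist is a shortcut for --mode rule-assisted.
    if rule_assist:
        mode = "rule-assisted"
    # The hidden legacy flag downgrades rule-assisted to the thin default.
    if no_rule_assist and mode == "rule-assisted":
        mode = "normalized-only"
    use_rules = mode == "rule-assisted"
    # model-only always uses greedy decoding, whatever the BIO flag says.
    constrain_bio = not no_constrained_bio and mode != "model-only"
    return mode, use_rules, constrain_bio


if __name__ == "__main__":
    for kwargs in ({}, {"mode": "model-only"}, {"rule_assist": True}):
        print(kwargs, "->", resolve_mode(**kwargs))
```

Note that in the real script, `--mode all` without `--no-constrained-bio` takes the separate `evaluate_case_modes` path and reports all three modes; the tuple returned here only matters for the single-mode branch.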
exports/anime_filename_parser.metadata.json CHANGED
@@ -8,5 +8,5 @@
8
  128,
9
  15
10
  ],
11
- "max_abs_diff": 2.6702880859375e-05
12
  }
 
8
  128,
9
  15
10
  ],
11
+ "max_abs_diff": 4.0531158447265625e-05
12
  }
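The `max_abs_diff` field presumably records the largest element-wise gap between the PyTorch logits and the ONNX logits at export time. The export script is not part of this diff, so that reading is an assumption. A minimal sketch of such a parity check on nested lists follows; the function name and the `1e-4` tolerance are hypothetical choices, not values taken from the repository.

```python
def max_abs_diff(a, b) -> float:
    """Largest absolute element-wise difference between two nested lists."""
    if isinstance(a, (int, float)):
        return abs(a - b)
    if len(a) != len(b):
        raise ValueError("shape mismatch")
    return max((max_abs_diff(x, y) for x, y in zip(a, b)), default=0.0)


if __name__ == "__main__":
    # Toy stand-ins for reference and exported logits.
    reference = [[0.5, -1.25], [2.0, 0.0]]
    exported = [[0.5, -1.0], [2.0, 0.125]]
    diff = max_abs_diff(reference, exported)
    print(diff, "within tolerance" if diff < 1e-4 else "exceeds tolerance")
```

Under that assumed tolerance, both the old value (about 2.67e-05) and the new value (about 4.05e-05) in this file would pass.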
exports/anime_filename_parser.onnx CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:28ac9b1e17d0e70f31a986a1d677513d97e77748ccdf96c8d77245cadc54fa4e
3
- size 19652184
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:845e01ebdbf4a933610fcfb9f5be2fec2367e9db26d98308d827cbe23817b072
3
+ size 19645986
inference.py CHANGED
@@ -75,6 +75,36 @@ def extract_resolution(text: str) -> Optional[str]:
75
  return clean if clean else None
76
 
77
 
78
  def display_token(token: str) -> str:
79
  """Make whitespace tokens visible in debug output."""
80
  if token == " ":
@@ -210,7 +240,7 @@ def postprocess(
210
  labels: List[str],
211
  tokenizer: Optional[AnimeTokenizer] = None,
212
  filename: Optional[str] = None,
213
- use_rules: bool = True,
214
  ) -> Dict:
215
  """
216
  Convert BIO-labeled tokens into structured metadata.
@@ -230,45 +260,43 @@ def postprocess(
230
 
231
  entities = labels_to_entities(tokens, labels, tokenizer)
232
 
233
- # Fill result
234
  for entity_type, text in entities:
235
- if entity_type == "TITLE":
236
- result["title"] = result["title"] or trim_decorations(text)
237
- # If we find multiple title fragments, concatenate them
238
- # (handles "That" + ... + "Time" etc.)
239
- elif entity_type == "SEASON":
 
 
 
 
 
240
  season_num = extract_season_number(text)
241
  if season_num is not None:
242
- # Keep the highest/last season number if multiple
243
  result["season"] = season_num
244
- elif entity_type == "EPISODE":
 
245
  ep_num = extract_episode_number(text)
246
  if ep_num is not None:
247
  if result["episode"] is None:
248
  result["episode"] = ep_num
249
- elif entity_type == "GROUP":
250
- group = text.strip("[]()【】")
 
251
  if result["group"] is None:
252
  result["group"] = group
253
- elif entity_type == "SPECIAL":
254
- special = text.strip("[]()【】")
 
255
  result["special"] = special
256
- elif entity_type == "RESOLUTION":
 
257
  res = extract_resolution(text)
258
  if res:
259
  result["resolution"] = res
260
- elif entity_type == "SOURCE":
261
- src = text.strip("[]()【】")
262
- result["source"] = src
263
 
264
- # Handle multi-fragment titles: concatenate all TITLE fragments
265
- # (This is needed because O tokens between words break entity continuity)
266
- title_fragments = [t for e, t in entities if e == "TITLE"]
267
- if title_fragments:
268
- result["title"] = " ".join(
269
- trimmed for f in title_fragments
270
- if (trimmed := trim_decorations(f))
271
- )
272
 
273
  if use_rules and filename:
274
  result = apply_rule_assists(filename, result)
@@ -929,7 +957,7 @@ def parse_filename(
929
  id2label: Dict[int, str],
930
  max_length: int = 64,
931
  debug: bool = False,
932
- use_rules: bool = True,
933
  constrain_bio: bool = True,
934
  ) -> Dict:
935
  """
@@ -1025,6 +1053,7 @@ def parse_filename(
1025
  result["_debug"] = {
1026
  "tokenizer_variant": getattr(tokenizer, "tokenizer_variant", "regex"),
1027
  "decoder": "constrained_bio" if constrain_bio else "greedy",
 
1028
  "max_length": max_length,
1029
  "token_count": len(tokens),
1030
  "available_token_count": available,
@@ -1072,8 +1101,10 @@ def main():
1072
  help="Maximum sequence length")
1073
  parser.add_argument("--debug", action="store_true",
1074
  help="Include tokenizer, labels, scores, and entity spans in JSON output")
 
 
1075
  parser.add_argument("--no-rule-assist", action="store_true",
1076
- help="Disable high-confidence structural post-processing rules")
1077
  parser.add_argument("--no-constrained-bio", action="store_true",
1078
  help="Use greedy per-token decoding instead of constrained BIO Viterbi")
1079
  args = parser.parse_args()
@@ -1121,7 +1152,7 @@ def main():
1121
  id2label,
1122
  max_length,
1123
  debug=args.debug,
1124
- use_rules=not args.no_rule_assist,
1125
  constrain_bio=not args.no_constrained_bio,
1126
  )
1127
  result["_input"] = fn
 
75
  return clean if clean else None
76
 
77
 
78
+ def normalize_field_text(text: str) -> str:
79
+ return trim_decorations(text).strip(" \t-_.")
80
+
81
+
82
+ def thin_source_priority(source: str) -> int:
83
+ normalized = source.lower().replace("_", "-").replace(" ", "")
84
+ if normalized in {
85
+ "nf", "netflix", "amzn", "baha", "cr", "abema", "dsnp", "u-next", "hulu", "at-x",
86
+ "web-dl", "webdl", "webrip", "web-rip", "bdrip", "bluray", "bdmv", "bd",
87
+ "dvdrip", "dvd", "tvrip", "hdtv",
88
+ }:
89
+ return 90
90
+ if normalized in {"chs", "cht", "gb", "big5", "jpn", "jp", "jpsc", "jptc", "繁中", "简中"}:
91
+ return 70
92
+ if normalized in {
93
+ "x264", "x265", "h.264", "h264", "h.265", "h265", "hevc", "avc", "av1",
94
+ "aac", "flac", "mp3", "dts", "opus", "10bit", "8bit", "hi10p", "ma10p",
95
+ "srt", "srtx2", "ass", "assx2",
96
+ }:
97
+ return 20
98
+ return 40 if re.search(r"[&+/,]", source) else 30
99
+
100
+
101
+ def choose_thin_source(sources: List[str]) -> Optional[str]:
102
+ cleaned = [normalize_source_text(source) for source in sources if normalize_field_text(source)]
103
+ if not cleaned:
104
+ return None
105
+ return max(enumerate(cleaned), key=lambda item: (thin_source_priority(item[1]), -item[0]))[1]
106
+
107
+
108
  def display_token(token: str) -> str:
109
  """Make whitespace tokens visible in debug output."""
110
  if token == " ":
 
240
  labels: List[str],
241
  tokenizer: Optional[AnimeTokenizer] = None,
242
  filename: Optional[str] = None,
243
+ use_rules: bool = False,
244
  ) -> Dict:
245
  """
246
  Convert BIO-labeled tokens into structured metadata.
 
260
 
261
  entities = labels_to_entities(tokens, labels, tokenizer)
262
 
263
+ grouped_entities: Dict[str, List[str]] = {}
264
  for entity_type, text in entities:
265
+ grouped_entities.setdefault(entity_type, []).append(text)
266
+
267
+ title_fragments = [
268
+ cleaned for text in grouped_entities.get("TITLE", [])
269
+ if (cleaned := normalize_field_text(text))
270
+ ]
271
+ if title_fragments:
272
+ result["title"] = " ".join(title_fragments)
273
+
274
+ for text in grouped_entities.get("SEASON", []):
275
  season_num = extract_season_number(text)
276
  if season_num is not None:
 
277
  result["season"] = season_num
278
+
279
+ for text in grouped_entities.get("EPISODE", []):
280
  ep_num = extract_episode_number(text)
281
  if ep_num is not None:
282
  if result["episode"] is None:
283
  result["episode"] = ep_num
284
+
285
+ for text in grouped_entities.get("GROUP", []):
286
+ group = normalize_field_text(text)
287
  if result["group"] is None:
288
  result["group"] = group
289
+
290
+ for text in grouped_entities.get("SPECIAL", []):
291
+ special = normalize_field_text(text)
292
  result["special"] = special
293
+
294
+ for text in grouped_entities.get("RESOLUTION", []):
295
  res = extract_resolution(text)
296
  if res:
297
  result["resolution"] = res
 
 
 
298
 
299
+ result["source"] = choose_thin_source(grouped_entities.get("SOURCE", []))
 
 
 
 
 
 
 
300
 
301
  if use_rules and filename:
302
  result = apply_rule_assists(filename, result)
 
957
  id2label: Dict[int, str],
958
  max_length: int = 64,
959
  debug: bool = False,
960
+ use_rules: bool = False,
961
  constrain_bio: bool = True,
962
  ) -> Dict:
963
  """
 
1053
  result["_debug"] = {
1054
  "tokenizer_variant": getattr(tokenizer, "tokenizer_variant", "regex"),
1055
  "decoder": "constrained_bio" if constrain_bio else "greedy",
1056
+ "postprocess": "rule_assisted" if use_rules else "thin_normalize",
1057
  "max_length": max_length,
1058
  "token_count": len(tokens),
1059
  "available_token_count": available,
 
1101
  help="Maximum sequence length")
1102
  parser.add_argument("--debug", action="store_true",
1103
  help="Include tokenizer, labels, scores, and entity spans in JSON output")
1104
+ parser.add_argument("--rule-assist", action="store_true",
1105
+ help="Enable legacy structural post-processing rules")
1106
  parser.add_argument("--no-rule-assist", action="store_true",
1107
+ help=argparse.SUPPRESS)
1108
  parser.add_argument("--no-constrained-bio", action="store_true",
1109
  help="Use greedy per-token decoding instead of constrained BIO Viterbi")
1110
  args = parser.parse_args()
 
1152
  id2label,
1153
  max_length,
1154
  debug=args.debug,
1155
+ use_rules=args.rule_assist and not args.no_rule_assist,
1156
  constrain_bio=not args.no_constrained_bio,
1157
  )
1158
  result["_input"] = fn
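The source selection added in `choose_thin_source` can be illustrated in isolation. This is a simplified sketch, not the repository code: the tag sets are abbreviated, `normalize_source_text` is not shown in this diff and is replaced here by a plain `str.strip`, and the names `source_priority` and `choose_source` are hypothetical.

```python
import re
from typing import List, Optional

# Abbreviated stand-ins for the sets in thin_source_priority (assumption).
RELEASE_SOURCES = {"web-dl", "webrip", "bdrip", "bluray", "bd", "dvdrip", "dvd", "hdtv"}
LANGUAGE_TAGS = {"chs", "cht", "gb", "big5", "jpn"}
CODEC_TAGS = {"x264", "x265", "hevc", "avc", "aac", "flac", "10bit"}


def source_priority(source: str) -> int:
    normalized = source.lower().replace("_", "-").replace(" ", "")
    if normalized in RELEASE_SOURCES:
        return 90
    if normalized in LANGUAGE_TAGS:
        return 70
    if normalized in CODEC_TAGS:
        return 20
    # Compound tags such as "A&B" rank slightly above unknown single tags.
    return 40 if re.search(r"[&+/,]", source) else 30


def choose_source(sources: List[str]) -> Optional[str]:
    cleaned = [s.strip("[]() \t-_.") for s in sources]
    cleaned = [s for s in cleaned if s]
    if not cleaned:
        return None
    # Highest priority wins; -index breaks ties in favor of the earliest span.
    return max(enumerate(cleaned), key=lambda item: (source_priority(item[1]), -item[0]))[1]


if __name__ == "__main__":
    print(choose_source(["x265", "CHS", "BDRip"]))
```

The design point is that a release source such as `BDRip` outranks language and codec tags even when the model labels all of them as `SOURCE`, and ties go to the first span in the filename.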
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:9f251f8d4bbb750ba3bfd6fceffbec32eff3f32e9f07820bdab48294052d15a5
3
  size 19142604
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4c5c4e57443ed66fed3812c7c6fe2f14af292a8a32523017930fde2ac93d67ff
3
  size 19142604
onnx_inference.py CHANGED
@@ -59,7 +59,7 @@ def parse_with_onnx(
59
  model_dir: Path,
60
  onnx_path: Path,
61
  max_length: int,
62
- use_rules: bool = True,
63
  ) -> Dict:
64
  parser = OnnxFilenameParser(model_dir, onnx_path, max_length)
65
  return parser.parse(filename, use_rules=use_rules)
@@ -87,7 +87,7 @@ class OnnxFilenameParser:
87
  providers=providers or ["CPUExecutionProvider"],
88
  )
89
 
90
- def parse(self, filename: str, use_rules: bool = True) -> Dict:
91
  tokens, input_ids, attention_mask, available = encode(filename, self.tokenizer, self.max_length)
92
  logits = self.session.run(
93
  ["logits"],
@@ -111,7 +111,8 @@ def main() -> None:
111
  parser.add_argument("--model-dir", default=".", help="Directory containing vocab.json and config.json")
112
  parser.add_argument("--onnx", default="exports/anime_filename_parser.onnx", help="ONNX model path")
113
  parser.add_argument("--max-length", type=int, default=128, help="Static ONNX sequence length")
114
- parser.add_argument("--no-rule-assist", action="store_true", help="Disable structural postprocessing")
 
115
  args = parser.parse_args()
116
 
117
  result = parse_with_onnx(
@@ -119,7 +120,7 @@ def main() -> None:
119
  model_dir=Path(args.model_dir),
120
  onnx_path=Path(args.onnx),
121
  max_length=args.max_length,
122
- use_rules=not args.no_rule_assist,
123
  )
124
  print(json.dumps(result, ensure_ascii=False))
125
 
 
59
  model_dir: Path,
60
  onnx_path: Path,
61
  max_length: int,
62
+ use_rules: bool = False,
63
  ) -> Dict:
64
  parser = OnnxFilenameParser(model_dir, onnx_path, max_length)
65
  return parser.parse(filename, use_rules=use_rules)
 
87
  providers=providers or ["CPUExecutionProvider"],
88
  )
89
 
90
+ def parse(self, filename: str, use_rules: bool = False) -> Dict:
91
  tokens, input_ids, attention_mask, available = encode(filename, self.tokenizer, self.max_length)
92
  logits = self.session.run(
93
  ["logits"],
 
111
  parser.add_argument("--model-dir", default=".", help="Directory containing vocab.json and config.json")
112
  parser.add_argument("--onnx", default="exports/anime_filename_parser.onnx", help="ONNX model path")
113
  parser.add_argument("--max-length", type=int, default=128, help="Static ONNX sequence length")
114
+ parser.add_argument("--rule-assist", action="store_true", help="Enable legacy structural postprocessing")
115
+ parser.add_argument("--no-rule-assist", action="store_true", help=argparse.SUPPRESS)
116
  args = parser.parse_args()
117
 
118
  result = parse_with_onnx(
 
120
  model_dir=Path(args.model_dir),
121
  onnx_path=Path(args.onnx),
122
  max_length=args.max_length,
123
+ use_rules=args.rule_assist and not args.no_rule_assist,
124
  )
125
  print(json.dumps(result, ensure_ascii=False))
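Both CLIs now treat rule assistance as opt-in while still accepting the old `--no-rule-assist` flag. A minimal sketch of that compatibility pattern, using only standard `argparse` features; `build_parser` and `wants_rules` are hypothetical names for illustration.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="flag compatibility sketch")
    parser.add_argument("--rule-assist", action="store_true",
                        help="Enable legacy structural postprocessing")
    # Hidden from --help, but still accepted so old invocations keep working.
    parser.add_argument("--no-rule-assist", action="store_true",
                        help=argparse.SUPPRESS)
    return parser


def wants_rules(argv: List[str]) -> bool:
    args = build_parser().parse_args(argv)
    # Rules run only when explicitly requested and not vetoed by the legacy flag.
    return args.rule_assist and not args.no_rule_assist


if __name__ == "__main__":
    for argv in ([], ["--rule-assist"], ["--rule-assist", "--no-rule-assist"]):
        print(argv, "->", wants_rules(argv))
```

With this arrangement, scripts that still pass `--no-rule-assist` do not fail with an unknown-argument error, and the thin runtime remains the default when no flag is given.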
126
 
parse_eval_metrics.json CHANGED
@@ -1,583 +1,1159 @@
1
  {
2
- "sample_count": 512,
3
- "field_accuracy": {
4
- "group": 1.0,
5
- "title": 0.974609375,
6
- "season": 0.98046875,
7
- "episode": 0.806640625,
8
- "resolution": 1.0,
9
- "source": 0.998046875,
10
- "special": 0.96875
11
- },
12
- "field_correct": {
13
- "group": 512,
14
- "title": 499,
15
- "season": 502,
16
- "episode": 413,
17
- "resolution": 512,
18
- "source": 511,
19
- "special": 496
20
- },
21
- "field_total": {
22
- "group": 512,
23
- "title": 512,
24
- "season": 512,
25
- "episode": 512,
26
- "resolution": 512,
27
- "source": 512,
28
- "special": 512
29
- },
30
- "full_match_accuracy": 0.751953125,
31
- "full_match_correct": 385,
32
- "full_match_total": 512,
33
- "failures": [
34
- {
35
- "filename": "[ReinForce] Sword Art Online II - ED3 (BDRip 1920x1080 x264 FLAC)",
36
- "errors": {
37
- "season": {
38
- "gold": null,
39
- "pred": "2"
40
- }
41
- },
42
- "gold": {
43
- "group": "ReinForce",
44
- "title": "Sword Art Online II",
45
- "season": null,
46
- "episode": null,
47
- "resolution": "1920x1080",
48
- "source": "BDRip",
49
- "special": "ED3"
50
- },
51
- "pred": {
52
- "group": "ReinForce",
53
- "title": "Sword Art Online II",
54
- "season": 2,
55
- "episode": null,
56
- "resolution": "1920x1080",
57
- "source": "BDRip",
58
- "special": "ED3"
59
- }
60
- },
61
- {
62
- "filename": "[アニメ DVD] 銀装騎攻オーディアン ACT.06 特典映像 川田&榎本トーク (DVD 640x480 WMV9 QB90 30fps MP3 192kbps)",
63
- "errors": {
64
- "title": {
65
- "gold": "銀装騎攻オーディアン act.06 特典映像 川田&榎本トーク",
66
- "pred": "銀装騎攻オーディアン act.06 特典映像 川田"
67
- }
68
- },
69
- "gold": {
70
- "group": "アニメ DVD",
71
- "title": "銀装騎攻オーディアン ACT.06 特典映像 川田&榎本トーク",
72
- "season": null,
73
- "episode": null,
74
- "resolution": "640x480",
75
- "source": "DVD",
76
- "special": null
77
- },
78
- "pred": {
79
- "group": "アニメ DVD",
80
- "title": "銀装騎攻オーディアン ACT.06 特典映像 川田",
81
- "season": null,
82
- "episode": null,
83
- "resolution": "640x480",
84
- "source": "DVD",
85
- "special": null
86
- }
87
- },
88
- {
89
- "filename": "05-ラディアン 第2シリーズ_ED",
90
- "errors": {
91
- "title": {
92
- "gold": "05-ラディアン 第2シリーズ",
93
- "pred": "05-ラディアン 第2"
94
- }
95
- },
96
- "gold": {
97
- "group": null,
98
- "title": "05-ラディアン 第2シリーズ",
99
- "season": null,
100
- "episode": null,
101
- "resolution": null,
102
- "source": null,
103
- "special": "ED"
104
- },
105
- "pred": {
106
- "group": null,
107
- "title": "05-ラディアン 第2",
108
- "season": null,
109
- "episode": null,
110
- "resolution": null,
111
- "source": null,
112
- "special": "ED"
113
- }
114
- },
115
- {
116
- "filename": "[A.A] hinotori 03",
117
- "errors": {
118
- "title": {
119
- "gold": "hinotori 03",
120
- "pred": "hinotori"
121
- },
122
- "episode": {
123
- "gold": null,
124
- "pred": "3"
125
- }
126
- },
127
- "gold": {
128
- "group": "A.A",
129
- "title": "hinotori 03",
130
- "season": null,
131
- "episode": null,
132
- "resolution": null,
133
- "source": null,
134
- "special": null
135
- },
136
- "pred": {
137
- "group": "A.A",
138
- "title": "hinotori",
139
- "season": null,
140
- "episode": 3,
141
- "resolution": null,
142
- "source": null,
143
- "special": null
144
- }
145
- },
146
- {
147
- "filename": "[Nekomoe kissaten] Azur Lane Bisoku Zenshin! [ED][05][BDRip 1080p HEVC-10bit FLAC]",
148
- "errors": {
149
- "episode": {
150
- "gold": null,
151
- "pred": "5"
152
- }
153
- },
154
- "gold": {
155
- "group": "Nekomoe kissaten",
156
- "title": "Azur Lane Bisoku Zenshin! [ED",
157
- "season": null,
158
- "episode": null,
159
- "resolution": "1080p",
160
- "source": "BDRip",
161
- "special": "05"
162
- },
163
- "pred": {
164
- "group": "Nekomoe kissaten",
165
- "title": "Azur Lane Bisoku Zenshin! [ED",
166
- "season": null,
167
- "episode": 5,
168
- "resolution": "1080p",
169
- "source": "BDRip",
170
- "special": "05"
171
- }
172
- },
173
- {
174
- "filename": "[VCB-Studio] Danmachi IV [10][Ma10p_1080p][x265_flac]",
175
- "errors": {
176
- "season": {
177
- "gold": null,
178
- "pred": "4"
179
- },
180
- "episode": {
181
- "gold": null,
182
- "pred": "10"
183
- }
184
- },
185
- "gold": {
186
- "group": "VCB-Studio",
187
- "title": "Danmachi",
188
- "season": null,
189
- "episode": null,
190
- "resolution": "1080p",
191
- "source": "x265_flac",
192
- "special": "10"
193
- },
194
- "pred": {
195
- "group": "VCB-Studio",
196
- "title": "Danmachi",
197
- "season": 4,
198
- "episode": 10,
199
- "resolution": "1080p",
200
- "source": "x265_flac",
201
- "special": "10"
202
- }
203
- },
204
- {
205
- "filename": "[FZSD&DBD-Raws][King of Prism Dramatic Prism.1][PV][12][1080P][BDRip][HEVC-10bit][FLAC]",
206
- "errors": {
207
- "episode": {
208
- "gold": null,
209
- "pred": "12"
210
- }
211
  },
212
- "gold": {
213
- "group": "FZSD&DBD-Raws",
214
- "title": "King of Prism Dramatic Prism.1",
215
- "season": null,
216
- "episode": null,
217
- "resolution": "1080P",
218
- "source": "BDRip",
219
- "special": "12"
220
  },
221
- "pred": {
222
- "group": "FZSD&DBD-Raws",
223
- "title": "King of Prism Dramatic Prism.1",
224
- "season": null,
225
- "episode": 12,
226
- "resolution": "1080P",
227
- "source": "BDRip",
228
- "special": "12"
229
- }
230
- },
231
- {
232
- "filename": "[SAIO-Raws] Wakaokami wa Shougakusei! PV 02 [BD 1920x1080 HEVC-10bit OPUS]",
233
- "errors": {
234
- "episode": {
235
- "gold": null,
236
- "pred": "2"
237
- }
238
- },
239
- "gold": {
240
- "group": "SAIO-Raws",
241
- "title": "Wakaokami wa Shougakusei! PV 02",
242
- "season": null,
243
- "episode": null,
244
- "resolution": "1920x1080",
245
- "source": "BD",
246
- "special": "PV 02"
247
- },
248
- "pred": {
249
- "group": "SAIO-Raws",
250
- "title": "Wakaokami wa Shougakusei! PV 02",
251
- "season": null,
252
- "episode": 2,
253
- "resolution": "1920x1080",
254
- "source": "BD",
255
- "special": "PV 02"
256
- }
257
- },
258
- {
259
- "filename": "[DBD-Raws][Hime-sama Goumon no Jikan Desu][PV][01][1080P][BDRip][HEVC-10bit][FLAC]",
260
- "errors": {
261
- "episode": {
262
- "gold": null,
263
- "pred": "1"
264
- }
265
- },
266
- "gold": {
267
- "group": "DBD-Raws",
268
- "title": "Hime-sama Goumon no Jikan Desu",
269
- "season": null,
270
- "episode": null,
271
- "resolution": "1080P",
272
- "source": "BDRip",
273
- "special": "01"
274
  },
275
- "pred": {
276
- "group": "DBD-Raws",
277
- "title": "Hime-sama Goumon no Jikan Desu",
278
- "season": null,
279
- "episode": 1,
280
- "resolution": "1080P",
281
- "source": "BDRip",
282
- "special": "01"
283
- }
284
- },
285
- {
286
- "filename": "[DBD-Raws][Tenshi no 3P!][PV][03][1080P][BDRip][HEVC-10bit][FLAC]",
287
- "errors": {
288
- "episode": {
289
- "gold": null,
290
- "pred": "3"
291
- }
292
- },
293
- "gold": {
294
- "group": "DBD-Raws",
295
- "title": "Tenshi no 3P!",
296
- "season": null,
297
- "episode": null,
298
- "resolution": "1080P",
299
- "source": "BDRip",
300
- "special": "03"
301
- },
302
- "pred": {
303
- "group": "DBD-Raws",
304
- "title": "Tenshi no 3P!",
305
- "season": null,
306
- "episode": 3,
307
- "resolution": "1080P",
308
- "source": "BDRip",
309
- "special": "03"
310
- }
311
- },
312
- {
313
- "filename": "[Suzu-Kaze] DanMachi IV 21 [WebRip 1920x1080 HEVC YUV420P10 AAC]",
314
- "errors": {
315
- "episode": {
316
- "gold": null,
317
- "pred": "21"
318
- }
319
- },
320
- "gold": {
321
- "group": "Suzu-Kaze",
322
- "title": "DanMachi IV 21",
323
- "season": null,
324
- "episode": null,
325
- "resolution": "1920x1080",
326
- "source": "WebRip",
327
- "special": "IV 21"
328
- },
329
- "pred": {
330
- "group": "Suzu-Kaze",
331
- "title": "DanMachi IV 21",
332
- "season": null,
333
- "episode": 21,
334
- "resolution": "1920x1080",
335
- "source": "WebRip",
336
- "special": "IV 21"
337
- }
338
- },
339
- {
340
- "filename": "[VCB-Studio] Log Horizon 2 [IV03][Ma10p_1080p][x265_aac]",
341
- "errors": {
342
- "season": {
343
- "gold": null,
344
- "pred": "2"
345
- }
346
- },
347
- "gold": {
348
- "group": "VCB-Studio",
349
- "title": "Log Horizon 2",
350
- "season": null,
351
- "episode": null,
352
- "resolution": "1080p",
353
- "source": "x265_aac",
354
- "special": "IV03"
355
- },
356
- "pred": {
357
- "group": "VCB-Studio",
358
- "title": "Log Horizon 2",
359
- "season": 2,
360
- "episode": null,
361
- "resolution": "1080p",
362
- "source": "x265_aac",
363
- "special": "IV03"
364
- }
365
- },
366
- {
367
- "filename": "[DBD-Raws][Mahou Shoujo Lyrical Nanoha The Movie 2nd A's][PV][06][1080P][BDRip][HEVC-10bit][FLAC]",
368
- "errors": {
369
- "episode": {
370
- "gold": null,
371
- "pred": "6"
372
- }
373
- },
374
- "gold": {
375
- "group": "DBD-Raws",
376
- "title": "Mahou Shoujo Lyrical Nanoha The Movie 2nd A's",
377
- "season": null,
378
- "episode": null,
379
- "resolution": "1080P",
380
- "source": "BDRip",
381
- "special": "06"
382
- },
383
- "pred": {
384
- "group": "DBD-Raws",
385
- "title": "Mahou Shoujo Lyrical Nanoha The Movie 2nd A's",
386
- "season": null,
387
- "episode": 6,
388
- "resolution": "1080P",
389
- "source": "BDRip",
390
- "special": "06"
391
- }
392
- },
393
- {
394
- "filename": "[DBD-Raws][Hana wa Saku, Shura no Gotoku][PV][11][1080P][BDRip][HEVC-10bit][FLAC]",
395
- "errors": {
396
- "episode": {
397
- "gold": null,
398
- "pred": "11"
399
  }
400
- },
401
- "gold": {
402
- "group": "DBD-Raws",
403
- "title": "Hana wa Saku, Shura no Gotoku",
404
- "season": null,
405
- "episode": null,
406
- "resolution": "1080P",
407
- "source": "BDRip",
408
- "special": "11"
409
- },
410
- "pred": {
411
- "group": "DBD-Raws",
412
- "title": "Hana wa Saku, Shura no Gotoku",
413
- "season": null,
414
- "episode": 11,
415
- "resolution": "1080P",
416
- "source": "BDRip",
417
- "special": "11"
418
- }
419
  },
420
- {
421
- "filename": "[Seed-Raws] Strike the Blood IV - OVA Vol.01 Menu 02 (BD 1280x720 AVC AAC)",
422
- "errors": {
423
- "season": {
424
- "gold": "4",
425
- "pred": null
426
- }
427
  },
428
- "gold": {
429
- "group": "Seed-Raws",
430
- "title": "Strike the Blood IV - OVA Vol.01 Menu 02",
431
- "season": 4,
432
- "episode": null,
433
- "resolution": "1280x720",
434
- "source": "BD",
435
- "special": "OVA"
436
  },
437
- "pred": {
438
- "group": "Seed-Raws",
439
- "title": "Strike the Blood IV - OVA Vol.01 Menu 02",
440
- "season": null,
441
- "episode": null,
442
- "resolution": "1280x720",
443
- "source": "BD",
444
- "special": "OVA"
445
- }
446
- },
447
- {
448
- "filename": "[DBD-Raws][Hametsu no Oukoku][PV][05][1080P][BDRip][HEVC-10bit][FLAC]",
449
- "errors": {
450
- "episode": {
451
- "gold": null,
452
- "pred": "5"
453
- }
454
  },
455
- "gold": {
456
- "group": "DBD-Raws",
457
- "title": "Hametsu no Oukoku",
458
- "season": null,
459
- "episode": null,
460
- "resolution": "1080P",
461
- "source": "BDRip",
462
- "special": "05"
463
- },
464
- "pred": {
465
- "group": "DBD-Raws",
466
- "title": "Hametsu no Oukoku",
467
- "season": null,
468
- "episode": 5,
469
- "resolution": "1080P",
470
- "source": "BDRip",
471
- "special": "05"
472
- }
473
- },
474
- {
475
- "filename": "[DBD-Raws][Tate no Yuusha no Nariagari S1][PV][03][1080P][BDRip][HEVC-10bit][FLAC]",
476
- "errors": {
477
- "episode": {
478
- "gold": null,
479
- "pred": "3"
480
  }
481
- },
482
- "gold": {
483
- "group": "DBD-Raws",
484
- "title": "Tate no Yuusha no Nariagari",
485
- "season": 1,
486
- "episode": null,
487
- "resolution": "1080P",
488
- "source": "BDRip",
489
- "special": "03"
490
- },
491
- "pred": {
492
- "group": "DBD-Raws",
493
- "title": "Tate no Yuusha no Nariagari",
494
- "season": 1,
495
- "episode": 3,
496
- "resolution": "1080P",
497
- "source": "BDRip",
498
- "special": "03"
499
- }
500
  },
501
- {
502
- "filename": "[DBD-Raws][Kimi no Iro][PV][12][1080P][BDRip][HEVC-10bit][FLAC]",
503
- "errors": {
504
- "episode": {
505
- "gold": null,
506
- "pred": "12"
507
- }
508
  },
509
- "gold": {
510
- "group": "DBD-Raws",
511
- "title": "Kimi no Iro",
512
- "season": null,
513
- "episode": null,
514
- "resolution": "1080P",
515
- "source": "BDRip",
516
- "special": "12"
517
  },
518
- "pred": {
519
- "group": "DBD-Raws",
520
- "title": "Kimi no Iro",
521
- "season": null,
522
- "episode": 12,
523
- "resolution": "1080P",
524
- "source": "BDRip",
525
- "special": "12"
526
- }
527
- },
528
- {
529
- "filename": "[DBD-Raws][Hime-sama Goumon no Jikan Desu][PV][02][1080P][BDRip][HEVC-10bit][FLAC]",
530
- "errors": {
531
- "episode": {
532
- "gold": null,
533
- "pred": "2"
534
- }
535
- },
536
- "gold": {
537
- "group": "DBD-Raws",
538
- "title": "Hime-sama Goumon no Jikan Desu",
539
- "season": null,
540
- "episode": null,
541
- "resolution": "1080P",
542
- "source": "BDRip",
543
- "special": "02"
544
  },
545
- "pred": {
546
- "group": "DBD-Raws",
547
- "title": "Hime-sama Goumon no Jikan Desu",
548
- "season": null,
549
- "episode": 2,
550
- "resolution": "1080P",
551
- "source": "BDRip",
552
- "special": "02"
553
- }
554
- },
555
- {
556
- "filename": "Mahou.no.Angel.Sweet.Mint.TV.1990.DVDRip-Hi.x264.AC3.1024.EP21-nezumi",
557
- "errors": {
558
- "title": {
559
- "gold": "mahou.no.angel.sweet.mint.tv.1990. -hi. .ac",
560
- "pred": "mahou.no.angel.sweet.mint.tv.1 -h"
561
  }
562
- },
563
- "gold": {
564
- "group": null,
565
- "title": "Mahou.no.Angel.Sweet.Mint.TV.1990. -Hi. .AC",
566
- "season": null,
567
- "episode": 21,
568
- "resolution": null,
569
- "source": "DVDRip",
570
- "special": null
571
- },
572
- "pred": {
573
- "group": null,
574
- "title": "Mahou.no.Angel.Sweet.Mint.TV.1 -H",
575
- "season": null,
576
- "episode": 21,
577
- "resolution": null,
578
- "source": "DVDRip",
579
- "special": null
580
- }
581
  }
582
- ]
583
  }
 
1
  {
2
+ "primary_metric": "normalized_only",
3
+ "modes": {
4
+ "model_only": {
5
+ "use_rules": false,
6
+ "constrain_bio": false,
7
+ "sample_count": 1024,
8
+ "field_accuracy": {
9
+ "group": 1.0,
10
+ "title": 0.9970703125,
11
+ "season": 1.0,
12
+ "episode": 1.0,
13
+ "resolution": 1.0,
14
+ "source": 0.9990234375,
15
+ "special": 0.994140625
16
  },
17
+ "field_correct": {
18
+ "group": 1024,
19
+ "title": 1021,
20
+ "season": 1024,
21
+ "episode": 1024,
22
+ "resolution": 1024,
23
+ "source": 1023,
24
+ "special": 1018
25
  },
26
+ "field_total": {
27
+ "group": 1024,
28
+ "title": 1024,
29
+ "season": 1024,
30
+ "episode": 1024,
31
+ "resolution": 1024,
32
+ "source": 1024,
33
+ "special": 1024
34
  },
35
+ "full_match_accuracy": 0.990234375,
36
+ "full_match_correct": 1014,
37
+ "full_match_total": 1024,
38
+ "failures": [
39
+ {
40
+ "filename": "BD Menu Vol.05",
41
+ "errors": {
42
+ "title": {
43
+ "gold": "menu",
44
+ "pred": "men"
45
+ }
46
+ },
47
+ "gold": {
48
+ "group": null,
49
+ "title": "Menu",
50
+ "season": null,
51
+ "episode": null,
52
+ "resolution": null,
53
+ "source": "BD",
54
+ "special": null
55
+ },
56
+ "pred": {
57
+ "group": null,
58
+ "title": "Men",
59
+ "season": null,
60
+ "episode": null,
61
+ "resolution": null,
62
+ "source": "BD",
63
+ "special": null
64
+ }
65
+ },
66
+ {
67
+ "filename": "[YYDM-11FANS][Shaman King][43][DVDRip][720P][X264-10bit_AACx2][ED490609]",
68
+ "errors": {
69
+ "special": {
70
+ "gold": "ed490609",
71
+ "pred": "ed"
72
+ }
73
+ },
74
+ "gold": {
75
+ "group": "YYDM-11FANS",
76
+ "title": "Shaman King",
77
+ "season": null,
78
+ "episode": 43,
79
+ "resolution": "720P",
80
+ "source": "DVDRip",
81
+ "special": "ED490609"
82
+ },
83
+ "pred": {
84
+ "group": "YYDM-11FANS",
85
+ "title": "Shaman King",
86
+ "season": null,
87
+ "episode": 43,
88
+ "resolution": "720P",
89
+ "source": "DVDRip",
90
+ "special": "ED"
91
+ }
92
+ },
93
+ {
94
+ "filename": "[UHA-WINGS&VCB-Studio] Karakai Jouzu no Takagi-san 2 [Binaural Sound EP09][Ma10p_1080p][x265_flac]",
95
+ "errors": {
96
+ "title": {
97
+ "gold": "uha-wings&vcb-studio karakai jouzu no takagi san 2 binaural sound ep",
98
+ "pred": "uha-wings&vcb-studio karakai jouzu no takagi san 2 binaural sound"
99
+ }
100
+ },
101
+ "gold": {
102
+ "group": null,
103
+ "title": "UHA-WINGS&VCB-Studio Karakai Jouzu no Takagi san 2 Binaural Sound EP",
104
+ "season": null,
105
+ "episode": 9,
106
+ "resolution": "1080p",
107
+ "source": "x265-flac",
108
+ "special": null
109
+ },
110
+ "pred": {
111
+ "group": null,
112
+ "title": "UHA-WINGS&VCB-Studio Karakai Jouzu no Takagi san 2 Binaural Sound",
113
+ "season": null,
114
+ "episode": 9,
115
+ "resolution": "1080p",
116
+ "source": "x265-flac",
117
+ "special": null
118
+ }
119
+ },
120
+ {
121
+ "filename": "終末なにしてますか?忙しいですか?救ってもらっていいですか? OP2 「DEAREST DROP」",
122
+ "errors": {
123
+ "special": {
124
+ "gold": "op2",
125
+ "pred": "p"
126
+ }
127
+ },
128
+ "gold": {
129
+ "group": null,
130
+ "title": "終末なにしてますか?忙しいですか?救ってもらっていいですか?",
131
+ "season": null,
132
+ "episode": null,
133
+ "resolution": null,
134
+ "source": null,
135
+ "special": "OP2"
136
+ },
137
+ "pred": {
138
+ "group": null,
139
+ "title": "終末なにしてますか?忙しいですか?救ってもらっていいですか?",
140
+ "season": null,
141
+ "episode": null,
142
+ "resolution": null,
143
+ "source": null,
144
+ "special": "P"
145
+ }
146
+ },
147
+ {
148
+ "filename": "[QTS] CITY HUNTER TV 3rd & '91 Series BD-BOX Eizou Tokuten - City Hunter 3 Housouchuu CM2 30sec (BD Hi10P 960x720 AAC)",
149
+ "errors": {
150
+ "special": {
151
+ "gold": "cm2",
152
+ "pred": "3"
153
+ }
154
+ },
155
+ "gold": {
156
+ "group": "QTS",
157
+ "title": "CITY HUNTER TV 3rd & '91 Series",
158
+ "season": null,
159
+ "episode": null,
160
+ "resolution": "960x720",
161
+ "source": "BD",
162
+ "special": "CM2"
163
+ },
164
+ "pred": {
165
+ "group": "QTS",
166
+ "title": "CITY HUNTER TV 3rd & '91 Series",
167
+ "season": null,
168
+ "episode": null,
169
+ "resolution": "960x720",
170
+ "source": "BD",
171
+ "special": "3"
172
+ }
173
+ },
174
+ {
175
+ "filename": "22話「最凶恶的“他”Part 1/最凶恶的“他”Part 2」",
176
+ "errors": {
177
+ "title": {
178
+ "gold": "22話「最凶恶的“他”part 1/最凶恶的“他”part 2」",
179
+ "pred": "2 話「最凶恶的“他”part 1/最凶恶的“他”part 2」"
180
+ }
181
+ },
182
+ "gold": {
183
+ "group": null,
184
+ "title": "22話「最凶恶的“他”Part 1/最凶恶的“他”Part 2」",
185
+ "season": null,
186
+ "episode": null,
187
+ "resolution": null,
188
+ "source": null,
189
+ "special": null
190
+ },
191
+ "pred": {
192
+ "group": null,
193
+ "title": "2 話「最凶恶的“他”Part 1/最凶恶的“他”Part 2」",
194
+ "season": null,
195
+ "episode": null,
196
+ "resolution": null,
197
+ "source": null,
198
+ "special": null
199
+ }
200
+ },
201
+ {
202
+ "filename": "[Judgment] Kareshi Kanojo no Jijou - NCED19.0 [6C5BD6E2]",
203
+ "errors": {
204
+ "source": {
205
+ "gold": "bd",
206
+ "pred": null
207
+ }
208
+ },
209
+ "gold": {
210
+ "group": "Judgment",
211
+ "title": "Kareshi Kanojo no Jijou",
212
+ "season": null,
213
+ "episode": null,
214
+ "resolution": null,
215
+ "source": "BD",
216
+ "special": "NCED19"
217
+ },
218
+ "pred": {
219
+ "group": "Judgment",
220
+ "title": "Kareshi Kanojo no Jijou",
221
+ "season": null,
222
+ "episode": null,
223
+ "resolution": null,
224
+ "source": null,
225
+ "special": "NCED19"
226
+ }
227
+ },
228
+ {
229
+ "filename": "[QTS] OVA QUIZ MAGIC ACADEMY The Original Animation OMAKE Eizou - EP1 OP + EP2 OPED (BD Hi10P 1920x1080 WV)",
230
+ "errors": {
231
+ "special": {
232
+ "gold": "op",
233
+ "pred": "d"
234
+ }
235
+ },
236
+ "gold": {
237
+ "group": "QTS",
238
+ "title": "QUIZ MAGIC ACADEMY The Original Animation OMAKE Eizou",
239
+ "season": null,
240
+ "episode": 1,
241
+ "resolution": "1920x1080",
242
+ "source": "BD",
243
+ "special": "OP"
244
+ },
245
+ "pred": {
246
+ "group": "QTS",
247
+ "title": "QUIZ MAGIC ACADEMY The Original Animation OMAKE Eizou",
248
+ "season": null,
249
+ "episode": 1,
250
+ "resolution": "1920x1080",
251
+ "source": "BD",
252
+ "special": "D"
253
+ }
254
+ },
255
+ {
256
+ "filename": "[Moozzi2] Ojamajo Doremi Movie - 01 [ Sharp ] (BD 1920x1032 x.264 Flac)",
257
+ "errors": {
258
+ "special": {
259
+ "gold": "movie",
260
+ "pred": "0"
261
+ }
262
+ },
263
+ "gold": {
264
+ "group": "Moozzi2",
265
+ "title": "Ojamajo Doremi",
266
+ "season": null,
267
+ "episode": 1,
268
+ "resolution": "1920x1032",
269
+ "source": "BD",
270
+ "special": "Movie"
271
+ },
272
+ "pred": {
273
+ "group": "Moozzi2",
274
+ "title": "Ojamajo Doremi",
275
+ "season": null,
276
+ "episode": 1,
277
+ "resolution": "1920x1032",
278
+ "source": "BD",
279
+ "special": "0"
280
+ }
281
+ },
282
+ {
283
+ "filename": "[Nekomoe kissaten&VCB-Studio] THE IDOLM@STER CINDERELLA GIRLS U149 [NCED07 お願い!シンデレラ][Ma10p_1080p][x265_flac]",
284
+ "errors": {
285
+ "special": {
286
+ "gold": "nced",
287
+ "pred": "7"
288
+ }
289
+ },
290
+ "gold": {
291
+ "group": "Nekomoe kissaten&VCB-Studio",
292
+ "title": "THE IDOLM@STER CINDERELLA GIRLS U",
293
+ "season": null,
294
+ "episode": 149,
295
+ "resolution": "1080p",
296
+ "source": "x265-flac",
297
+ "special": "NCED"
298
+ },
299
+ "pred": {
300
+ "group": "Nekomoe kissaten&VCB-Studio",
301
+ "title": "THE IDOLM@STER CINDERELLA GIRLS U",
302
+ "season": null,
303
+ "episode": 149,
304
+ "resolution": "1080p",
305
+ "source": "x265-flac",
306
+ "special": "7"
307
+ }
308
  }
309
+ ]
310
  },
311
+ "normalized_only": {
312
+ "use_rules": false,
313
+ "constrain_bio": true,
314
+ "sample_count": 1024,
315
+ "field_accuracy": {
316
+ "group": 1.0,
317
+ "title": 0.9970703125,
318
+ "season": 1.0,
319
+ "episode": 1.0,
320
+ "resolution": 1.0,
321
+ "source": 0.9990234375,
322
+ "special": 0.9970703125
323
  },
324
+ "field_correct": {
325
+ "group": 1024,
326
+ "title": 1021,
327
+ "season": 1024,
328
+ "episode": 1024,
329
+ "resolution": 1024,
330
+ "source": 1023,
331
+ "special": 1021
332
  },
333
+ "field_total": {
334
+ "group": 1024,
335
+ "title": 1024,
336
+ "season": 1024,
337
+ "episode": 1024,
338
+ "resolution": 1024,
339
+ "source": 1024,
340
+ "special": 1024
341
  },
342
+ "full_match_accuracy": 0.9931640625,
343
+ "full_match_correct": 1017,
344
+ "full_match_total": 1024,
345
+ "failures": [
346
+ {
347
+ "filename": "BD Menu Vol.05",
348
+ "errors": {
349
+ "title": {
350
+ "gold": "menu",
351
+ "pred": "men"
352
+ }
353
+ },
354
+ "gold": {
355
+ "group": null,
356
+ "title": "Menu",
357
+ "season": null,
358
+ "episode": null,
359
+ "resolution": null,
360
+ "source": "BD",
361
+ "special": null
362
+ },
363
+ "pred": {
364
+ "group": null,
365
+ "title": "Men",
366
+ "season": null,
367
+ "episode": null,
368
+ "resolution": null,
369
+ "source": "BD",
370
+ "special": null
371
+ }
372
+ },
373
+ {
374
+ "filename": "[YYDM-11FANS][Shaman King][43][DVDRip][720P][X264-10bit_AACx2][ED490609]",
375
+ "errors": {
376
+ "special": {
377
+ "gold": "ed490609",
378
+ "pred": "ed"
379
+ }
380
+ },
381
+ "gold": {
382
+ "group": "YYDM-11FANS",
383
+ "title": "Shaman King",
384
+ "season": null,
385
+ "episode": 43,
386
+ "resolution": "720P",
387
+ "source": "DVDRip",
388
+ "special": "ED490609"
389
+ },
390
+ "pred": {
391
+ "group": "YYDM-11FANS",
392
+ "title": "Shaman King",
393
+ "season": null,
394
+ "episode": 43,
395
+ "resolution": "720P",
396
+ "source": "DVDRip",
397
+ "special": "ED"
398
+ }
399
+ },
400
+ {
401
+ "filename": "[UHA-WINGS&VCB-Studio] Karakai Jouzu no Takagi-san 2 [Binaural Sound EP09][Ma10p_1080p][x265_flac]",
402
+ "errors": {
403
+ "title": {
404
+ "gold": "uha-wings&vcb-studio karakai jouzu no takagi san 2 binaural sound ep",
405
+ "pred": "uha-wings&vcb-studio karakai jouzu no takagi san 2 binaural sound"
406
+ }
407
+ },
408
+ "gold": {
409
+ "group": null,
410
+ "title": "UHA-WINGS&VCB-Studio Karakai Jouzu no Takagi san 2 Binaural Sound EP",
411
+ "season": null,
412
+ "episode": 9,
413
+ "resolution": "1080p",
414
+ "source": "x265-flac",
415
+ "special": null
416
+ },
417
+ "pred": {
418
+ "group": null,
419
+ "title": "UHA-WINGS&VCB-Studio Karakai Jouzu no Takagi san 2 Binaural Sound",
420
+ "season": null,
421
+ "episode": 9,
422
+ "resolution": "1080p",
423
+ "source": "x265-flac",
424
+ "special": null
425
+ }
426
+ },
427
+ {
428
+ "filename": "[QTS] CITY HUNTER TV 3rd & '91 Series BD-BOX Eizou Tokuten - City Hunter 3 Housouchuu CM2 30sec (BD Hi10P 960x720 AAC)",
429
+ "errors": {
430
+ "special": {
431
+ "gold": "cm2",
432
+ "pred": "cm2 3"
433
+ }
434
+ },
435
+ "gold": {
436
+ "group": "QTS",
437
+ "title": "CITY HUNTER TV 3rd & '91 Series",
438
+ "season": null,
439
+ "episode": null,
440
+ "resolution": "960x720",
441
+ "source": "BD",
442
+ "special": "CM2"
443
+ },
444
+ "pred": {
445
+ "group": "QTS",
446
+ "title": "CITY HUNTER TV 3rd & '91 Series",
447
+ "season": null,
448
+ "episode": null,
449
+ "resolution": "960x720",
450
+ "source": "BD",
451
+ "special": "CM2 3"
452
+ }
453
+ },
454
+ {
455
+ "filename": "22話「最凶恶的“他”Part 1/最凶恶的“他”Part 2」",
456
+ "errors": {
457
+ "title": {
458
+ "gold": "22話「最凶恶的“他”part 1/最凶恶的“他”part 2」",
459
+ "pred": "22 話「最凶恶的“他”part 1/最凶恶的“他”part 2」"
460
+ }
461
+ },
462
+ "gold": {
463
+ "group": null,
464
+ "title": "22話「最凶恶的“他”Part 1/最凶恶的“他”Part 2」",
465
+ "season": null,
466
+ "episode": null,
467
+ "resolution": null,
468
+ "source": null,
469
+ "special": null
470
+ },
471
+ "pred": {
472
+ "group": null,
473
+ "title": "22 話「最凶恶的“他”Part 1/最凶恶的“他”Part 2」",
474
+ "season": null,
475
+ "episode": null,
476
+ "resolution": null,
477
+ "source": null,
478
+ "special": null
479
+ }
480
+ },
481
+ {
482
+ "filename": "[Judgment] Kareshi Kanojo no Jijou - NCED19.0 [6C5BD6E2]",
483
+ "errors": {
484
+ "source": {
485
+ "gold": "bd",
486
+ "pred": null
487
+ }
488
+ },
489
+ "gold": {
490
+ "group": "Judgment",
491
+ "title": "Kareshi Kanojo no Jijou",
492
+ "season": null,
493
+ "episode": null,
494
+ "resolution": null,
495
+ "source": "BD",
496
+ "special": "NCED19"
497
+ },
498
+ "pred": {
499
+ "group": "Judgment",
500
+ "title": "Kareshi Kanojo no Jijou",
501
+ "season": null,
502
+ "episode": null,
503
+ "resolution": null,
504
+ "source": null,
505
+ "special": "NCED19"
506
+ }
507
+ },
508
+ {
509
+ "filename": "[Nekomoe kissaten&VCB-Studio] THE IDOLM@STER CINDERELLA GIRLS U149 [NCED07 お願い!シンデレラ][Ma10p_1080p][x265_flac]",
510
+ "errors": {
511
+ "special": {
512
+ "gold": "nced",
513
+ "pred": "nced07"
514
+ }
515
+ },
516
+ "gold": {
517
+ "group": "Nekomoe kissaten&VCB-Studio",
518
+ "title": "THE IDOLM@STER CINDERELLA GIRLS U",
519
+ "season": null,
520
+ "episode": 149,
521
+ "resolution": "1080p",
522
+ "source": "x265-flac",
523
+ "special": "NCED"
524
+ },
525
+ "pred": {
526
+ "group": "Nekomoe kissaten&VCB-Studio",
527
+ "title": "THE IDOLM@STER CINDERELLA GIRLS U",
528
+ "season": null,
529
+ "episode": 149,
530
+ "resolution": "1080p",
531
+ "source": "x265-flac",
532
+ "special": "NCED07"
533
+ }
534
  }
535
+ ]
536
  },
537
+ "rule_assisted": {
538
+ "use_rules": true,
539
+ "constrain_bio": true,
540
+ "sample_count": 1024,
541
+ "field_accuracy": {
542
+ "group": 0.9873046875,
543
+ "title": 0.7265625,
544
+ "season": 0.9912109375,
545
+ "episode": 0.7021484375,
546
+ "resolution": 1.0,
547
+ "source": 0.98046875,
548
+ "special": 0.951171875
549
  },
550
+ "field_correct": {
551
+ "group": 1011,
552
+ "title": 744,
553
+ "season": 1015,
554
+ "episode": 719,
555
+ "resolution": 1024,
556
+ "source": 1004,
557
+ "special": 974
558
  },
559
+ "field_total": {
560
+ "group": 1024,
561
+ "title": 1024,
562
+ "season": 1024,
563
+ "episode": 1024,
564
+ "resolution": 1024,
565
+ "source": 1024,
566
+ "special": 1024
567
  },
568
+ "full_match_accuracy": 0.5068359375,
569
+ "full_match_correct": 519,
570
+ "full_match_total": 1024,
571
+ "failures": [
572
+ {
573
+ "filename": "[DBD-Raws][Tokidoki Bosotto Russia-go de Dereru Tonari no Alya-san][PV][20][1080P][BDRip][HEVC-10bit][FLAC]",
574
+ "errors": {
575
+ "episode": {
576
+ "gold": null,
577
+ "pred": "20"
578
+ }
579
+ },
580
+ "gold": {
581
+ "group": "DBD-Raws",
582
+ "title": "Tokidoki Bosotto Russia-go de Dereru Tonari no Alya-san",
583
+ "season": null,
584
+ "episode": null,
585
+ "resolution": "1080P",
586
+ "source": "BDRip",
587
+ "special": "20"
588
+ },
589
+ "pred": {
590
+ "group": "DBD-Raws",
591
+ "title": "Tokidoki Bosotto Russia-go de Dereru Tonari no Alya-san",
592
+ "season": null,
593
+ "episode": 20,
594
+ "resolution": "1080P",
595
+ "source": "BDRip",
596
+ "special": "20"
597
+ }
598
+ },
599
+ {
600
+ "filename": "[DBD-Raws][我的英雄学院 第三季][PV][02][1080P][BDRip][HEVC-10bit][FLAC]",
601
+ "errors": {
602
+ "title": {
603
+ "gold": "我的英雄学院",
604
+ "pred": "我的英雄学院 第三季"
605
+ },
606
+ "episode": {
607
+ "gold": null,
608
+ "pred": "2"
609
+ }
610
+ },
611
+ "gold": {
612
+ "group": "DBD-Raws",
613
+ "title": "我的英雄学院",
614
+ "season": 3,
615
+ "episode": null,
616
+ "resolution": "1080P",
617
+ "source": "BDRip",
618
+ "special": "02"
619
+ },
620
+ "pred": {
621
+ "group": "DBD-Raws",
622
+ "title": "我的英雄学院 第三季",
623
+ "season": 3,
624
+ "episode": 2,
625
+ "resolution": "1080P",
626
+ "source": "BDRip",
627
+ "special": "02"
628
+ }
629
+ },
630
+ {
631
+ "filename": "[Moozzi2] Katanagatari [SP01] NCOP - 02 (BD 1920x1080 x.264 Flac)",
632
+ "errors": {
633
+ "episode": {
634
+ "gold": "1",
635
+ "pred": null
636
+ }
637
+ },
638
+ "gold": {
639
+ "group": "Moozzi2",
640
+ "title": "Katanagatari",
641
+ "season": null,
642
+ "episode": 1,
643
+ "resolution": "1920x1080",
644
+ "source": "BD",
645
+ "special": "NCOP - 02"
646
+ },
647
+ "pred": {
648
+ "group": "Moozzi2",
649
+ "title": "Katanagatari",
650
+ "season": null,
651
+ "episode": null,
652
+ "resolution": "1920x1080",
653
+ "source": "BD",
654
+ "special": "NCOP - 02"
655
+ }
656
+ },
657
+ {
658
+ "filename": "[DBD-Raws][Ijiranaide, Nagatoro-san 2nd Attack][PV][06][1080P][BDRip][HEVC-10bit][FLAC]",
659
+ "errors": {
660
+ "episode": {
661
+ "gold": null,
662
+ "pred": "6"
663
+ }
664
+ },
665
+ "gold": {
666
+ "group": "DBD-Raws",
667
+ "title": "Ijiranaide, Nagatoro-san 2nd Attack",
668
+ "season": null,
669
+ "episode": null,
670
+ "resolution": "1080P",
671
+ "source": "BDRip",
672
+ "special": "06"
673
+ },
674
+ "pred": {
675
+ "group": "DBD-Raws",
676
+ "title": "Ijiranaide, Nagatoro-san 2nd Attack",
677
+ "season": null,
678
+ "episode": 6,
679
+ "resolution": "1080P",
680
+ "source": "BDRip",
681
+ "special": "06"
682
+ }
683
+ },
684
+ {
685
+ "filename": "【枫叶字幕组】宠物小精灵XY&Z[第30(122)话][720P][MP4][GB_JP].mp4",
686
+ "errors": {
687
+ "title": {
688
+ "gold": "宠物小精灵xy&z",
689
+ "pred": "宠物小精灵xy&z[第30"
690
+ },
691
+ "episode": {
692
+ "gold": "30",
693
+ "pred": "122"
694
+ }
695
+ },
696
+ "gold": {
697
+ "group": "枫叶字幕组",
698
+ "title": "宠物小精灵XY&Z",
699
+ "season": null,
700
+ "episode": 30,
701
+ "resolution": "720P",
702
+ "source": "GB-JP",
703
+ "special": null
704
+ },
705
+ "pred": {
706
+ "group": "枫叶字幕组",
707
+ "title": "宠物小精灵XY&Z[第30",
708
+ "season": null,
709
+ "episode": 122,
710
+ "resolution": "720P",
711
+ "source": "GB-JP",
712
+ "special": null
713
+ }
714
+ },
715
+ {
716
+ "filename": "[Snow-Raws] グランベルム CM&PV10 (BD 1920x1080 HEVC-YUV420P10 FLAC)",
717
+ "errors": {
718
+ "title": {
719
+ "gold": "グランベルム",
720
+ "pred": "グランベルム cm&pv10"
721
+ }
722
+ },
723
+ "gold": {
724
+ "group": "Snow-Raws",
725
+ "title": "グランベルム",
726
+ "season": null,
727
+ "episode": null,
728
+ "resolution": "1920x1080",
729
+ "source": "BD",
730
+ "special": "PV10"
731
+ },
732
+ "pred": {
733
+ "group": "Snow-Raws",
734
+ "title": "グランベルム CM&PV10",
735
+ "season": null,
736
+ "episode": null,
737
+ "resolution": "1920x1080",
738
+ "source": "BD",
739
+ "special": "PV10"
740
+ }
741
+ },
742
+ {
743
+ "filename": "[Moozzi2] High School D×D New [SP02] NCED - 01 (BD 1920x1080 x.264 Flac)",
744
+ "errors": {
745
+ "episode": {
746
+ "gold": "2",
747
+ "pred": null
748
+ }
749
+ },
750
+ "gold": {
751
+ "group": "Moozzi2",
752
+ "title": "High School D×D New",
753
+ "season": null,
754
+ "episode": 2,
755
+ "resolution": "1920x1080",
756
+ "source": "BD",
757
+ "special": "NCED - 01"
758
+ },
759
+ "pred": {
760
+ "group": "Moozzi2",
761
+ "title": "High School D×D New",
762
+ "season": null,
763
+ "episode": null,
764
+ "resolution": "1920x1080",
765
+ "source": "BD",
766
+ "special": "NCED - 01"
767
+ }
768
+ },
769
+ {
770
+ "filename": "[SFEO-Raws] Koimonogatari - CM_01 (BD 720P x264 10bit AAC)[783E6EF2]",
771
+ "errors": {
772
+ "title": {
773
+ "gold": "koimonogatari",
774
+ "pred": "koimonogatari - cm_01"
775
+ }
776
+ },
777
+ "gold": {
778
+ "group": "SFEO-Raws",
779
+ "title": "Koimonogatari",
780
+ "season": null,
781
+ "episode": null,
782
+ "resolution": "720P",
783
+ "source": "BD",
784
+ "special": "CM_01"
785
+ },
786
+ "pred": {
787
+ "group": "SFEO-Raws",
788
+ "title": "Koimonogatari - CM_01",
789
+ "season": null,
790
+ "episode": null,
791
+ "resolution": "720P",
792
+ "source": "BD",
793
+ "special": "CM_01"
794
+ }
795
+ },
796
+ {
797
+ "filename": "[H720] Sangatsu no Lion CM01 (BD 1208x720 HEVC AAC)",
798
+ "errors": {
799
+ "group": {
800
+ "gold": null,
801
+ "pred": "h720"
802
+ },
803
+ "title": {
804
+ "gold": "h",
805
+ "pred": "sangatsu no lion"
806
+ },
807
+ "episode": {
808
+ "gold": "720",
809
+ "pred": null
810
+ },
811
+ "special": {
812
+ "gold": "cm",
813
+ "pred": "cm01"
814
+ }
815
+ },
816
+ "gold": {
817
+ "group": null,
818
+ "title": "H",
819
+ "season": null,
820
+ "episode": 720,
821
+ "resolution": "1208x720",
822
+ "source": "BD",
823
+ "special": "CM"
824
+ },
825
+ "pred": {
826
+ "group": "H720",
827
+ "title": "Sangatsu no Lion",
828
+ "season": null,
829
+ "episode": null,
830
+ "resolution": "1208x720",
831
+ "source": "BD",
832
+ "special": "CM01"
833
+ }
834
+ },
835
+ {
836
+ "filename": "[FZSD&DBD-Raws][King of Prism Dramatic Prism.1][PV][08][1080P][BDRip][HEVC-10bit][FLAC]",
837
+ "errors": {
838
+ "episode": {
839
+ "gold": null,
840
+ "pred": "8"
841
+ }
842
+ },
843
+ "gold": {
844
+ "group": "FZSD&DBD-Raws",
845
+ "title": "King of Prism Dramatic Prism.1",
846
+ "season": null,
847
+ "episode": null,
848
+ "resolution": "1080P",
849
+ "source": "BDRip",
850
+ "special": "08"
851
+ },
852
+ "pred": {
853
+ "group": "FZSD&DBD-Raws",
854
+ "title": "King of Prism Dramatic Prism.1",
855
+ "season": null,
856
+ "episode": 8,
857
+ "resolution": "1080P",
858
+ "source": "BDRip",
859
+ "special": "08"
860
+ }
861
+ },
862
+ {
863
+ "filename": "Robin Hood no Daibouken 49",
864
+ "errors": {
865
+ "episode": {
866
+ "gold": null,
867
+ "pred": "49"
868
+ }
869
+ },
870
+ "gold": {
871
+ "group": null,
872
+ "title": "Robin Hood no Daibouken 49",
873
+ "season": null,
874
+ "episode": null,
875
+ "resolution": null,
876
+ "source": null,
877
+ "special": null
878
+ },
879
+ "pred": {
880
+ "group": null,
881
+ "title": "Robin Hood no Daibouken 49",
882
+ "season": null,
883
+ "episode": 49,
884
+ "resolution": null,
885
+ "source": null,
886
+ "special": null
887
+ }
888
+ },
889
+ {
890
+ "filename": "[Moozzi2] Paniponi Dash! [SP02] NCED - 07 [ EP.07 ] (BD 1920x1080 x.264 Flac)",
891
+ "errors": {
892
+ "episode": {
893
+ "gold": "2",
894
+ "pred": null
895
+ }
896
+ },
897
+ "gold": {
898
+ "group": "Moozzi2",
899
+ "title": "Paniponi Dash!",
900
+ "season": null,
901
+ "episode": 2,
902
+ "resolution": "1920x1080",
903
+ "source": "BD",
904
+ "special": "NCED - 07"
905
+ },
906
+ "pred": {
907
+ "group": "Moozzi2",
908
+ "title": "Paniponi Dash!",
909
+ "season": null,
910
+ "episode": null,
911
+ "resolution": "1920x1080",
912
+ "source": "BD",
913
+ "special": "NCED - 07"
914
+ }
915
+ },
916
+ {
917
+ "filename": "[Moozzi2] Onegai My Melody [SP10] Kuromi Naration TV-CM - 01 [ 30Sec. ] (BD 1024x768 x.264 AAC)",
918
+ "errors": {
919
+ "title": {
920
+ "gold": "onegai my melody",
921
+ "pred": "onegai my melody [sp10] kuromi naration tv-cm"
922
+ },
923
+ "episode": {
924
+ "gold": "10",
925
+ "pred": "1"
926
+ }
927
+ },
928
+ "gold": {
929
+ "group": "Moozzi2",
930
+ "title": "Onegai My Melody",
931
+ "season": null,
932
+ "episode": 10,
933
+ "resolution": "1024x768",
934
+ "source": "BD",
935
+ "special": "CM - 01"
936
+ },
937
+ "pred": {
938
+ "group": "Moozzi2",
939
+ "title": "Onegai My Melody [SP10] Kuromi Naration TV-CM",
940
+ "season": null,
941
+ "episode": 1,
942
+ "resolution": "1024x768",
943
+ "source": "BD",
944
+ "special": "CM - 01"
945
+ }
946
+ },
947
+ {
948
+ "filename": "[DBD-Raws][Kuzu no Honkai][PV][02][1080P][BDRip][HEVC-10bit][FLAC]",
949
+ "errors": {
950
+ "episode": {
951
+ "gold": null,
952
+ "pred": "2"
953
+ }
954
+ },
955
+ "gold": {
956
+ "group": "DBD-Raws",
957
+ "title": "Kuzu no Honkai",
958
+ "season": null,
959
+ "episode": null,
960
+ "resolution": "1080P",
961
+ "source": "BDRip",
962
+ "special": "02"
963
+ },
964
+ "pred": {
965
+ "group": "DBD-Raws",
966
+ "title": "Kuzu no Honkai",
967
+ "season": null,
968
+ "episode": 2,
969
+ "resolution": "1080P",
970
+ "source": "BDRip",
971
+ "special": "02"
972
+ }
973
+ },
974
+ {
975
+ "filename": "[DBD-Raws][One Piece Wano Arc][Soushuuhen][03][1080P][BDRip][HEVC-10bit][FLAC]",
976
+ "errors": {
977
+ "title": {
978
+ "gold": "one piece wano arc soushuuhen",
979
+ "pred": "one piece wano arc"
980
+ }
981
+ },
982
+ "gold": {
983
+ "group": "DBD-Raws",
984
+ "title": "One Piece Wano Arc Soushuuhen",
985
+ "season": null,
986
+ "episode": 3,
987
+ "resolution": "1080P",
988
+ "source": "BDRip",
989
+ "special": null
990
+ },
991
+ "pred": {
992
+ "group": "DBD-Raws",
993
+ "title": "One Piece Wano Arc",
994
+ "season": null,
995
+ "episode": 3,
996
+ "resolution": "1080P",
997
+ "source": "BDRip",
998
+ "special": null
999
+ }
1000
+ },
1001
+ {
1002
+ "filename": "[LAC][Gintama][196][GB][R10]",
1003
+ "errors": {
1004
+ "group": {
1005
+ "gold": null,
1006
+ "pred": "lac"
1007
+ },
1008
+ "title": {
1009
+ "gold": "lac gintama 196 gb r",
1010
+ "pred": "gintama"
1011
+ },
1012
+ "episode": {
1013
+ "gold": "10",
1014
+ "pred": "196"
1015
+ },
1016
+ "source": {
1017
+ "gold": null,
1018
+ "pred": "gb"
1019
+ }
1020
+ },
1021
+ "gold": {
1022
+ "group": null,
1023
+ "title": "LAC Gintama 196 GB R",
1024
+ "season": null,
1025
+ "episode": 10,
1026
+ "resolution": null,
1027
+ "source": null,
1028
+ "special": null
1029
+ },
1030
+ "pred": {
1031
+ "group": "LAC",
1032
+ "title": "Gintama",
1033
+ "season": null,
1034
+ "episode": 196,
1035
+ "resolution": null,
1036
+ "source": "GB",
1037
+ "special": null
1038
+ }
1039
+ },
1040
+ {
1041
+ "filename": "[DBD-Raws][Date a Live][Director's Cut][PV][07][1080P][BDRip][HEVC-10bit][FLAC]",
1042
+ "errors": {
1043
+ "title": {
1044
+ "gold": "date a live director's cut",
1045
+ "pred": "date a live"
1046
+ },
1047
+ "episode": {
1048
+ "gold": null,
1049
+ "pred": "7"
1050
+ }
1051
+ },
1052
+ "gold": {
1053
+ "group": "DBD-Raws",
1054
+ "title": "Date a Live Director's Cut",
1055
+ "season": null,
1056
+ "episode": null,
1057
+ "resolution": "1080P",
1058
+ "source": "BDRip",
1059
+ "special": "07"
1060
+ },
1061
+ "pred": {
1062
+ "group": "DBD-Raws",
1063
+ "title": "Date a Live",
1064
+ "season": null,
1065
+ "episode": 7,
1066
+ "resolution": "1080P",
1067
+ "source": "BDRip",
1068
+ "special": "07"
1069
+ }
1070
+ },
1071
+ {
1072
+ "filename": "[DBD-Raws][Nageki no Bourei wa Intai Shitai][PV][09][1080P][BDRip][HEVC-10bit][FLAC]",
1073
+ "errors": {
1074
+ "episode": {
1075
+ "gold": null,
1076
+ "pred": "9"
1077
+ }
1078
+ },
1079
+ "gold": {
1080
+ "group": "DBD-Raws",
1081
+ "title": "Nageki no Bourei wa Intai Shitai",
1082
+ "season": null,
1083
+ "episode": null,
1084
+ "resolution": "1080P",
1085
+ "source": "BDRip",
1086
+ "special": "09"
1087
+ },
1088
+ "pred": {
1089
+ "group": "DBD-Raws",
1090
+ "title": "Nageki no Bourei wa Intai Shitai",
1091
+ "season": null,
1092
+ "episode": 9,
1093
+ "resolution": "1080P",
1094
+ "source": "BDRip",
1095
+ "special": "09"
1096
+ }
1097
+ },
1098
+ {
1099
+ "filename": "[RUELL-Next] Fruits Basket NCOP 1 (DVD 768x576 x264 AC3 384K) [FF1CA8EF]",
1100
+ "errors": {
1101
+ "title": {
1102
+ "gold": "fruits basket",
1103
+ "pred": "fruits basket ncop 1"
1104
+ },
1105
+ "special": {
1106
+ "gold": "ncop 1",
1107
+ "pred": "ncop1"
1108
+ }
1109
+ },
1110
+ "gold": {
1111
+ "group": "RUELL-Next",
1112
+ "title": "Fruits Basket",
1113
+ "season": null,
1114
+ "episode": null,
1115
+ "resolution": "768x576",
1116
+ "source": "DVD",
1117
+ "special": "NCOP 1"
1118
+ },
1119
+ "pred": {
1120
+ "group": "RUELL-Next",
1121
+ "title": "Fruits Basket NCOP 1",
1122
+ "season": null,
1123
+ "episode": null,
1124
+ "resolution": "768x576",
1125
+ "source": "DVD",
1126
+ "special": "NCOP1"
1127
+ }
1128
+ },
1129
+ {
1130
+ "filename": "[アニメ DVD] ミスター味っ子 第69話 「島巡り磯鍋競争!7包丁人・大石老師登場」 (640x480 WMV9)",
1131
+ "errors": {
1132
+ "source": {
1133
+ "gold": null,
1134
+ "pred": "dvd"
1135
+ }
1136
+ },
1137
+ "gold": {
1138
+ "group": "アニメ DVD",
1139
+ "title": "ミスター味っ子",
1140
+ "season": null,
1141
+ "episode": 69,
1142
+ "resolution": "640x480",
1143
+ "source": null,
1144
+ "special": null
1145
+ },
1146
+ "pred": {
1147
+ "group": "アニメ DVD",
1148
+ "title": "ミスター味っ子",
1149
+ "season": null,
1150
+ "episode": 69,
1151
+ "resolution": "640x480",
1152
+ "source": "DVD",
1153
+ "special": null
1154
+ }
1155
  }
1156
+ ]
1157
  }
1158
+ }
1159
  }
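As a sanity check on the numbers above, each `field_accuracy` entry should equal `field_correct / field_total`, and `full_match_accuracy` should equal `full_match_correct / full_match_total`. The following is a minimal sketch that recomputes them from the counts reported in the `normalized_only` and `rule_assisted` blocks. The helper name `accuracy_from_counts` is hypothetical and is not part of the repository.

```python
from typing import Dict


def accuracy_from_counts(correct: Dict[str, int], total: Dict[str, int]) -> Dict[str, float]:
    """Recompute per-field accuracy from raw counts (hypothetical helper).

    Fields with a zero total map to 0.0 instead of raising.
    """
    return {
        field: (correct[field] / total[field] if total[field] else 0.0)
        for field in total
    }


fields = ["group", "title", "season", "episode", "resolution", "source", "special"]
total = {field: 1024 for field in fields}

# Counts copied from the "field_correct" objects in the diff above.
normalized_only_correct = {
    "group": 1024, "title": 1021, "season": 1024, "episode": 1024,
    "resolution": 1024, "source": 1023, "special": 1021,
}
rule_assisted_correct = {
    "group": 1011, "title": 744, "season": 1015, "episode": 719,
    "resolution": 1024, "source": 1004, "special": 974,
}

normalized_only = accuracy_from_counts(normalized_only_correct, total)
rule_assisted = accuracy_from_counts(rule_assisted_correct, total)

for field in fields:
    print(
        f"{field:<10} normalized_only={normalized_only[field]:.4f} "
        f"rule_assisted={rule_assisted[field]:.4f}"
    )

# Full-match accuracy from the reported full_match_correct / full_match_total.
full_match_normalized = 1017 / 1024
full_match_rule_assisted = 519 / 1024
print(f"full_match normalized_only={full_match_normalized:.4f} "
      f"rule_assisted={full_match_rule_assisted:.4f}")
```

On this 1024-sample evaluation the rule-assisted mode loses most of its full-match accuracy on `title` and `episode`, which is in line with the commit treating `normalized_only` as the primary metric.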
run_metadata.json CHANGED
@@ -1,6 +1,6 @@
1
  {
2
- "experiment_name": "dmhy-char-special-focus2",
3
- "data_file": "data/repair_focus_char.jsonl",
4
  "tokenizer_variant": "char",
5
  "vocab_file": "datasets/AnimeName/vocab.char.json",
6
  "vocab_size": 6199,
@@ -9,15 +9,15 @@
9
  "num_hidden_layers": 4,
10
  "num_attention_heads": 8,
11
  "intermediate_size": 1024,
12
- "train_samples": 68939,
13
- "eval_samples": 3629,
14
- "epochs": 1.0,
15
- "batch_size": 64,
16
- "learning_rate": 3e-05,
17
- "warmup_steps": 50,
18
- "seed": 75,
19
- "device": "cpu",
20
- "fp16": false,
21
  "gradient_accumulation_steps": 1,
22
- "dataloader_num_workers": 0
23
  }
 
1
  {
2
+ "experiment_name": "dmhy-char-thin-hardfocus",
3
+ "data_file": "data/thin_hard_focus_char.jsonl",
4
  "tokenizer_variant": "char",
5
  "vocab_file": "datasets/AnimeName/vocab.char.json",
6
  "vocab_size": 6199,
 
9
  "num_hidden_layers": 4,
10
  "num_attention_heads": 8,
11
  "intermediate_size": 1024,
12
+ "train_samples": 117089,
13
+ "eval_samples": 6163,
14
+ "epochs": 2.0,
15
+ "batch_size": 256,
16
+ "learning_rate": 4e-05,
17
+ "warmup_steps": 80,
18
+ "seed": 58,
19
+ "device": "cuda",
20
+ "fp16": true,
21
  "gradient_accumulation_steps": 1,
22
+ "dataloader_num_workers": 4
23
  }
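To put the new `warmup_steps` value in context, the number of optimizer updates can be estimated from `train_samples`, `batch_size`, `epochs`, and `gradient_accumulation_steps`. This is a rough sketch that assumes a single device and that the last partial batch is kept; the training framework may count steps slightly differently, and `optimizer_steps` is a hypothetical helper.

```python
import math


def optimizer_steps(train_samples: int, batch_size: int, epochs: float, grad_accum: int = 1) -> int:
    """Estimate total optimizer updates (assumes one device, partial last batch kept)."""
    batches_per_epoch = math.ceil(train_samples / batch_size)
    updates_per_epoch = math.ceil(batches_per_epoch / grad_accum)
    return math.ceil(updates_per_epoch * epochs)


# Values copied from the old and new run_metadata.json.
old_steps = optimizer_steps(train_samples=68939, batch_size=64, epochs=1.0)
new_steps = optimizer_steps(train_samples=117089, batch_size=256, epochs=2.0)

print(f"old run: {old_steps} steps, warmup share {50 / old_steps:.1%}")
print(f"new run: {new_steps} steps, warmup share {80 / new_steps:.1%}")
```

Under these assumptions the new run performs fewer updates than the old one despite the larger dataset and second epoch, because the batch size grew fourfold.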
train.py CHANGED
@@ -230,6 +230,8 @@ def parse_exact_metrics(
230
  id2label: Dict[int, str],
231
  max_length: int,
232
  limit: Optional[int],
233
  ) -> Dict:
234
  """Evaluate end-to-end field exact match on filenames, not just token loss."""
235
  fields = ["group", "title", "season", "episode", "resolution", "source", "special"]
@@ -247,7 +249,7 @@ def parse_exact_metrics(
247
  available = max(0, max_length - 2)
248
  tokens = tokens[:available]
249
  gold_labels = gold_labels[:available]
250
- gold = postprocess(tokens, gold_labels, tokenizer=tokenizer, filename=filename, use_rules=True)
251
  gold_entities = {label.split("-", 1)[1] for label in gold_labels if label.startswith(("B-", "I-"))}
252
  for optional_field, entity in (("episode", "EPISODE"), ("season", "SEASON")):
253
  if entity not in gold_entities:
@@ -259,8 +261,8 @@ def parse_exact_metrics(
259
  id2label,
260
  max_length=max_length,
261
  debug=False,
262
- use_rules=True,
263
- constrain_bio=True,
264
  )
265
 
266
  full_match = True
@@ -296,6 +298,8 @@ def parse_exact_metrics(
296
  total = counter.get("full_total", 0)
297
  correct = counter.get("full_correct", 0)
298
  return {
299
  "sample_count": total,
300
  "field_accuracy": field_accuracy,
301
  "field_correct": {field: counter.get(f"{field}_correct", 0) for field in fields},
@@ -307,6 +311,37 @@ def parse_exact_metrics(
307
  }
308
 
309
 
310
  def remap_token_embeddings(
311
  model: BertForTokenClassification,
312
  old_vocab: Dict[str, int],
@@ -610,7 +645,7 @@ def main():
610
 
611
  if args.parse_eval_limit != 0:
612
  parse_limit = args.parse_eval_limit if args.parse_eval_limit and args.parse_eval_limit > 0 else None
613
- parse_metrics = parse_exact_metrics(
614
  eval_data,
615
  trainer.model,
616
  tokenizer,
@@ -621,38 +656,35 @@ def main():
621
  with open(os.path.join(final_save_path, "parse_eval_metrics.json"), "w", encoding="utf-8") as f:
622
  json.dump(parse_metrics, f, ensure_ascii=False, indent=2)
623
  print("\nParse exact-match evaluation:")
624
- print(
625
- f" full_match: {parse_metrics['full_match_correct']}/"
626
- f"{parse_metrics['full_match_total']} ({parse_metrics['full_match_accuracy']:.4f})"
627
- )
628
- for field, accuracy in parse_metrics["field_accuracy"].items():
629
- correct = parse_metrics["field_correct"][field]
630
- total = parse_metrics["field_total"][field]
631
- print(f" {field}: {correct}/{total} ({accuracy:.4f})")
632
 
633
  if not args.no_case_eval:
634
  if args.case_eval_file and os.path.isfile(args.case_eval_file):
635
- from evaluate_parser_cases import evaluate_cases
636
 
637
- case_metrics = evaluate_cases(
638
  model_dir=final_save_path,
639
  case_file=args.case_eval_file,
640
  tokenizer_variant=tokenizer_variant,
641
  max_length=config.max_seq_length,
642
- use_rules=True,
643
- constrain_bio=True,
644
  )
645
  case_output = args.case_eval_output or os.path.join(final_save_path, "case_metrics.json")
646
  os.makedirs(os.path.dirname(case_output) or ".", exist_ok=True)
647
  with open(case_output, "w", encoding="utf-8") as f:
648
  json.dump(case_metrics, f, ensure_ascii=False, indent=2)
649
  print("\nFixed case regression evaluation:")
650
- print(
651
- f" full_match: {case_metrics['full_correct']}/"
652
- f"{case_metrics['case_count']} ({case_metrics['full_accuracy']:.4f})"
653
- )
654
- if case_metrics["failures"]:
655
  print(f" failures: {len(case_metrics['failures'])} (see {case_output})")
656
  elif args.case_eval_file:
657
  print(f"\nSkipping fixed case regression evaluation; file not found: {args.case_eval_file}")
658
 
 
230
  id2label: Dict[int, str],
231
  max_length: int,
232
  limit: Optional[int],
233
+ use_rules: bool = False,
234
+ constrain_bio: bool = True,
235
  ) -> Dict:
236
  """Evaluate end-to-end field exact match on filenames, not just token loss."""
237
  fields = ["group", "title", "season", "episode", "resolution", "source", "special"]
 
249
  available = max(0, max_length - 2)
250
  tokens = tokens[:available]
251
  gold_labels = gold_labels[:available]
252
+ gold = postprocess(tokens, gold_labels, tokenizer=tokenizer, filename=filename, use_rules=False)
253
  gold_entities = {label.split("-", 1)[1] for label in gold_labels if label.startswith(("B-", "I-"))}
254
  for optional_field, entity in (("episode", "EPISODE"), ("season", "SEASON")):
255
  if entity not in gold_entities:
 
261
  id2label,
262
  max_length=max_length,
263
  debug=False,
264
+ use_rules=use_rules,
265
+ constrain_bio=constrain_bio,
266
  )
267
 
268
  full_match = True
 
298
  total = counter.get("full_total", 0)
299
  correct = counter.get("full_correct", 0)
300
  return {
301
+ "use_rules": use_rules,
302
+ "constrain_bio": constrain_bio,
303
  "sample_count": total,
304
  "field_accuracy": field_accuracy,
305
  "field_correct": {field: counter.get(f"{field}_correct", 0) for field in fields},
 
311
  }
312
 
313
 
314
+ def parse_exact_metrics_all_modes(
315
+     samples: List[Dict],
316
+     model: BertForTokenClassification,
317
+     tokenizer: AnimeTokenizer,
318
+     id2label: Dict[int, str],
319
+     max_length: int,
320
+     limit: Optional[int],
321
+ ) -> Dict:
322
+     modes = {
323
+         "model_only": {"use_rules": False, "constrain_bio": False},
324
+         "normalized_only": {"use_rules": False, "constrain_bio": True},
325
+         "rule_assisted": {"use_rules": True, "constrain_bio": True},
326
+     }
327
+     return {
328
+         "primary_metric": "normalized_only",
329
+         "modes": {
330
+             name: parse_exact_metrics(
331
+                 samples,
332
+                 model,
333
+                 tokenizer,
334
+                 id2label,
335
+                 max_length,
336
+                 limit,
337
+                 use_rules=settings["use_rules"],
338
+                 constrain_bio=settings["constrain_bio"],
339
+             )
340
+             for name, settings in modes.items()
341
+         },
342
+     }
343
+
344
+
345
  def remap_token_embeddings(
346
  model: BertForTokenClassification,
347
  old_vocab: Dict[str, int],
 
645
 
646
  if args.parse_eval_limit != 0:
647
  parse_limit = args.parse_eval_limit if args.parse_eval_limit and args.parse_eval_limit > 0 else None
648
+ parse_metrics = parse_exact_metrics_all_modes(
649
  eval_data,
650
  trainer.model,
651
  tokenizer,
 
656
  with open(os.path.join(final_save_path, "parse_eval_metrics.json"), "w", encoding="utf-8") as f:
657
  json.dump(parse_metrics, f, ensure_ascii=False, indent=2)
658
  print("\nParse exact-match evaluation:")
659
+ for mode_name, mode_metrics in parse_metrics["modes"].items():
660
+ print(
661
+ f" {mode_name}: {mode_metrics['full_match_correct']}/"
662
+ f"{mode_metrics['full_match_total']} ({mode_metrics['full_match_accuracy']:.4f})"
663
+ )
664
 
665
  if not args.no_case_eval:
666
  if args.case_eval_file and os.path.isfile(args.case_eval_file):
667
+ from evaluate_parser_cases import evaluate_case_modes
668
 
669
+ case_metrics = evaluate_case_modes(
670
  model_dir=final_save_path,
671
  case_file=args.case_eval_file,
672
  tokenizer_variant=tokenizer_variant,
673
  max_length=config.max_seq_length,
674
  )
675
  case_output = args.case_eval_output or os.path.join(final_save_path, "case_metrics.json")
676
  os.makedirs(os.path.dirname(case_output) or ".", exist_ok=True)
677
  with open(case_output, "w", encoding="utf-8") as f:
678
  json.dump(case_metrics, f, ensure_ascii=False, indent=2)
679
  print("\nFixed case regression evaluation:")
680
+ for mode_name, mode_metrics in case_metrics["modes"].items():
681
+ print(
682
+ f" {mode_name}: {mode_metrics['full_correct']}/"
683
+ f"{mode_metrics['case_count']} ({mode_metrics['full_accuracy']:.4f})"
684
+ )
685
+ primary = case_metrics["modes"][case_metrics["primary_metric"]]
686
+ if primary["failures"]:
687
+ print(f" primary failures: {len(primary['failures'])} (see {case_output})")
688
  elif args.case_eval_file:
689
  print(f"\nSkipping fixed case regression evaluation; file not found: {args.case_eval_file}")
690
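The multi-mode result written to `parse_eval_metrics.json` has the shape `{"primary_metric": ..., "modes": {name: metrics}}`. A downstream script could read it along the following lines. This is a sketch only: `summarize_modes` is a hypothetical helper, and the `example` dict merely mirrors the two modes whose full-match counts appear in the metrics diff above.

```python
from typing import Dict, List


def summarize_modes(metrics: Dict) -> List[str]:
    """Format one line per evaluation mode, flagging the primary one (hypothetical helper)."""
    primary = metrics["primary_metric"]
    lines = []
    for name, mode in metrics["modes"].items():
        marker = " (primary)" if name == primary else ""
        lines.append(
            f"{name}{marker}: {mode['full_match_correct']}/"
            f"{mode['full_match_total']} ({mode['full_match_accuracy']:.4f})"
        )
    return lines


# Illustrative input using the full-match counts reported in parse_eval_metrics.json.
example = {
    "primary_metric": "normalized_only",
    "modes": {
        "normalized_only": {
            "full_match_correct": 1017,
            "full_match_total": 1024,
            "full_match_accuracy": 0.9931640625,
        },
        "rule_assisted": {
            "full_match_correct": 519,
            "full_match_total": 1024,
            "full_match_accuracy": 0.5068359375,
        },
    },
}

summary = summarize_modes(example)
for line in summary:
    print(line)
```

Keeping `primary_metric` as a key inside the file means consumers do not need to hard-code which mode counts as the quality gate.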
 
trainer_eval_metrics.json CHANGED
@@ -1,11 +1,11 @@
1
  {
2
- "eval_loss": 0.03365034610033035,
3
- "eval_precision": 0.9612760834670947,
4
- "eval_recall": 0.9719629960236955,
5
- "eval_f1": 0.9665900012105072,
6
- "eval_accuracy": 0.990421109705404,
7
- "eval_runtime": 13.2008,
8
- "eval_samples_per_second": 274.908,
9
- "eval_steps_per_second": 4.318,
10
- "epoch": 1.0
11
  }
 
1
  {
2
+ "eval_loss": 0.001824101316742599,
3
+ "eval_precision": 0.996635213225075,
4
+ "eval_recall": 0.9977786457061953,
5
+ "eval_f1": 0.9972066016906769,
6
+ "eval_accuracy": 0.9994733938512463,
7
+ "eval_runtime": 30.171,
8
+ "eval_samples_per_second": 204.269,
9
+ "eval_steps_per_second": 0.829,
10
+ "epoch": 2.0
11
  }
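The reported values are internally consistent: `eval_f1` is the harmonic mean of `eval_precision` and `eval_recall`, and the throughput figures imply the `eval_samples` count from `run_metadata.json`. A short check, offered as a sketch; the evaluation batch size of 256 is an assumption inferred from the step count, not something the diff states.

```python
import math

# Values copied from the new trainer_eval_metrics.json.
precision = 0.996635213225075
recall = 0.9977786457061953
reported_f1 = 0.9972066016906769
eval_runtime = 30.171
samples_per_second = 204.269
steps_per_second = 0.829

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f"recomputed F1 = {f1:.10f}, reported = {reported_f1:.10f}")

# Throughput multiplied by runtime recovers the sample and step counts.
implied_samples = round(samples_per_second * eval_runtime)
implied_steps = round(steps_per_second * eval_runtime)
print(f"implied eval samples = {implied_samples}, implied eval steps = {implied_steps}")

# Assumption: an eval batch size of 256 would give the same step count.
assumed_eval_batch_size = 256
steps_at_assumed_batch = math.ceil(implied_samples / assumed_eval_batch_size)
print(f"steps at batch size {assumed_eval_batch_size} = {steps_at_assumed_batch}")
```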
training_args.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:b23b375ad7f991bc460e29c07b8250afa09ec2d62bad255e0fc6125f0982c56d
3
  size 5265
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d04646d7f2d38f632195034cf95c225cde16df9933d0fc48ecbdf375ab0e05b3
3
  size 5265