How to use ModerRAS/AniFileBERT with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("token-classification", model="ModerRAS/AniFileBERT")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForTokenClassification

    tokenizer = AutoTokenizer.from_pretrained("ModerRAS/AniFileBERT")
    model = AutoModelForTokenClassification.from_pretrained("ModerRAS/AniFileBERT")

Fix GM-Team bilingual title parsing
- MAINTENANCE.md +10 -10
- README.md +12 -12
- build_repair_focus_dataset.py +36 -0
- case_metrics.json +37 -15
- data/parser_regression_cases.json +12 -0
- datasets/AnimeName +1 -1
- dmhy_dataset.py +57 -2
- exports/anime_filename_parser.metadata.json +1 -1
- exports/anime_filename_parser.onnx +1 -1
- inference.py +63 -0
- model.safetensors +1 -1
- parse_eval_metrics.json +290 -321
- run_metadata.json +2 -2
- trainer_eval_metrics.json +8 -8
- training_args.bin +1 -1
MAINTENANCE.md
CHANGED
@@ -50,7 +50,7 @@ uv run python train.py \
   --tokenizer char \
   --data-file datasets/AnimeName/dmhy_weak_char.jsonl \
   --vocab-file datasets/AnimeName/vocab.char.json \
-  --save-dir checkpoints/dmhy-char-
+  --save-dir checkpoints/dmhy-char-guoman-relabel \
   --init-model-dir . \
   --epochs 2 \
   --batch-size 256 \
@@ -59,7 +59,7 @@ uv run python train.py \
   --max-seq-length 128 \
   --checkpoint-steps 1000 \
   --parse-eval-limit 2048 \
-  --seed
+  --seed 52
 ```
 
 ## Publish a New Checkpoint
@@ -67,15 +67,15 @@ uv run python train.py \
 Copy the final checkpoint to the repository root:
 
 ```powershell
-Copy-Item checkpoints/dmhy-char-
-Copy-Item checkpoints/dmhy-char-
-Copy-Item checkpoints/dmhy-char-
-Copy-Item checkpoints/dmhy-char-
-Copy-Item checkpoints/dmhy-char-
+Copy-Item checkpoints/dmhy-char-guoman-relabel/final/config.json . -Force
+Copy-Item checkpoints/dmhy-char-guoman-relabel/final/model.safetensors . -Force
+Copy-Item checkpoints/dmhy-char-guoman-relabel/final/tokenizer_config.json . -Force
+Copy-Item checkpoints/dmhy-char-guoman-relabel/final/training_args.bin . -Force
+Copy-Item checkpoints/dmhy-char-guoman-relabel/final/vocab.json . -Force
 Copy-Item datasets/AnimeName/vocab.char.json .\vocab.char.json -Force
-Copy-Item checkpoints/dmhy-char-
-Copy-Item checkpoints/dmhy-char-
-Copy-Item checkpoints/dmhy-char-
+Copy-Item checkpoints/dmhy-char-guoman-relabel/final/run_metadata.json . -Force
+Copy-Item checkpoints/dmhy-char-guoman-relabel/final/trainer_eval_metrics.json . -Force
+Copy-Item checkpoints/dmhy-char-guoman-relabel/final/parse_eval_metrics.json . -Force
 ```
 
 There is no tracked `model/` duplicate. The root checkpoint is the publishing
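Note: the publish step in MAINTENANCE.md is a list of PowerShell `Copy-Item` calls. For readers on other shells, the same copy could be sketched in Python roughly as follows. The function name `publish_checkpoint` and its parameters are hypothetical and not part of the repository; only the file names come from the diff above.

```python
import shutil
from pathlib import Path
from typing import List

# File names taken from the Copy-Item list in MAINTENANCE.md.
CHECKPOINT_FILES = [
    "config.json",
    "model.safetensors",
    "tokenizer_config.json",
    "training_args.bin",
    "vocab.json",
    "run_metadata.json",
    "trainer_eval_metrics.json",
    "parse_eval_metrics.json",
]


def publish_checkpoint(final_dir: Path, repo_root: Path, vocab_file: Path) -> List[str]:
    """Copy checkpoint artifacts into repo_root, overwriting existing files.

    Hypothetical helper: it mirrors `Copy-Item ... -Force` but refuses to
    start when any expected file is missing, so a partial publish is avoided.
    """
    missing = [name for name in CHECKPOINT_FILES if not (final_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing checkpoint files: {missing}")

    copied = []
    for name in CHECKPOINT_FILES:
        shutil.copy2(final_dir / name, repo_root / name)
        copied.append(name)

    # The char vocabulary is published under a fixed name at the root.
    shutil.copy2(vocab_file, repo_root / "vocab.char.json")
    copied.append("vocab.char.json")
    return copied
```

A call would look like `publish_checkpoint(Path("checkpoints/dmhy-char-guoman-relabel/final"), Path("."), Path("datasets/AnimeName/vocab.char.json"))`.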
README.md
CHANGED
@@ -59,21 +59,21 @@ dataset relabeling and diagnostics, but the root checkpoint loads as `char`.
 ## Evaluation
 
 Final full-relabel char training (`632002` DMHY rows, 2 epochs, batch size 256,
-seed
+seed 52):
 
 | Metric | Value |
 |--------|-------|
-| Eval loss | 0.
-| Entity precision | 0.
-| Entity recall | 0.
-| Entity F1 | 0.
-| Token accuracy | 0.
-| Held-out parse full match |
-| Fixed regression full match |
+| Eval loss | 0.0058 |
+| Entity precision | 0.9922 |
+| Entity recall | 0.9946 |
+| Entity F1 | 0.9934 |
+| Token accuracy | 0.9981 |
+| Held-out parse full match | 2029/2048 (0.9907) |
+| Fixed regression full match | 22/22 (1.0000) |
 
 The fixed regression set includes second-season aliases such as `Ni`,
-`Ni no Sara`, `貳`, and `弐ノ章`, plus
-blocks.
+`Ni no Sara`, `貳`, and `弐ノ章`, plus GM-Team bilingual Chinese animation
+bracket layouts, long-running episode IDs, and dense meta blocks.
 
 ## Usage
 
@@ -121,13 +121,13 @@ uv run python convert_to_char_dataset.py \
 uv run python train.py --tokenizer char \
   --data-file datasets/AnimeName/dmhy_weak_char.jsonl \
   --vocab-file datasets/AnimeName/vocab.char.json \
-  --save-dir checkpoints/dmhy-char-
+  --save-dir checkpoints/dmhy-char-guoman-relabel \
   --init-model-dir . \
   --epochs 2 --batch-size 256 \
   --learning-rate 0.00008 --warmup-steps 300 \
   --checkpoint-steps 1000 --save-total-limit 3 \
   --parse-eval-limit 2048 \
-  --max-seq-length 128 --seed
+  --max-seq-length 128 --seed 52
 ```
 
 The converter keeps source metadata and adds `tokenizer_variant`, source token
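Note: the new README table can be sanity-checked with a few lines of arithmetic. The sketch below assumes the reported entity F1 is the usual harmonic mean of precision and recall; the evaluation code itself is not shown in this commit, so the check only holds up to the four-decimal rounding of the inputs.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the updated README table.
entity_f1 = f1_score(0.9922, 0.9946)
held_out_full_match = 2029 / 2048
regression_full_match = 22 / 22

print(round(entity_f1, 4))            # 0.9934
print(round(held_out_full_match, 4))  # 0.9907
print(round(regression_full_match, 4))  # 1.0
```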
build_repair_focus_dataset.py
CHANGED
@@ -88,6 +88,42 @@ def manual_cases() -> Iterable[dict]:
             ("FLAC", "SOURCE"),
         ],
     )
+    yield char_item(
+        "[GM-Team][国漫][逆天邪神 第2季][Against the Gods Ⅱ][2026][04][HEVC][GB][4K].mp4",
+        [
+            ("GM-Team", "GROUP"),
+            ("逆天邪神", "TITLE"),
+            ("第2季", "SEASON"),
+            ("04", "EPISODE"),
+            ("HEVC", "SOURCE"),
+            ("GB", "SOURCE"),
+            ("4K", "RESOLUTION"),
+        ],
+    )
+    yield char_item(
+        "[GM-Team][国漫][剑来 第2季][Sword of Coming Ⅱ][2025][04][HEVC][GB][4K]",
+        [
+            ("GM-Team", "GROUP"),
+            ("剑来", "TITLE"),
+            ("第2季", "SEASON"),
+            ("04", "EPISODE"),
+            ("HEVC", "SOURCE"),
+            ("GB", "SOURCE"),
+            ("4K", "RESOLUTION"),
+        ],
+    )
+    yield char_item(
+        "[GM-Team][国漫][大主宰 第2季][The Great Ruler Ⅱ][2026][04][HEVC][GB][4K]",
+        [
+            ("GM-Team", "GROUP"),
+            ("大主宰", "TITLE"),
+            ("第2季", "SEASON"),
+            ("04", "EPISODE"),
+            ("HEVC", "SOURCE"),
+            ("GB", "SOURCE"),
+            ("4K", "RESOLUTION"),
+        ],
+    )
 
 
 def main() -> None:
case_metrics.json
CHANGED
@@ -1,29 +1,29 @@
 {
   "model_dir": ".",
-  "case_file": "data
+  "case_file": "data/parser_regression_cases.json",
   "tokenizer_variant": "char",
   "max_length": 128,
   "use_rules": true,
   "constrain_bio": true,
-  "case_count":
-  "full_correct":
+  "case_count": 22,
+  "full_correct": 22,
   "full_accuracy": 1.0,
   "field_correct": {
-    "group":
-    "title":
-    "episode":
-    "resolution":
-    "source":
-    "season":
+    "group": 19,
+    "title": 22,
+    "episode": 22,
+    "resolution": 22,
+    "source": 15,
+    "season": 9,
     "special": 1
   },
   "field_total": {
-    "group":
-    "title":
-    "episode":
-    "resolution":
-    "source":
-    "season":
+    "group": 19,
+    "title": 22,
+    "episode": 22,
+    "resolution": 22,
+    "source": 15,
+    "season": 9,
     "special": 1
   },
   "field_accuracy": {
@@ -454,6 +454,28 @@
         "season": 2,
         "title": "炎炎の消防隊"
       }
+    },
+    {
+      "id": "gm_team_guoman_bilingual_s2",
+      "filename": "[GM-Team][国漫][逆天邪神 第2季][Against the Gods Ⅱ][2026][04][HEVC][GB][4K].mp4",
+      "ok": true,
+      "errors": {},
+      "expected": {
+        "group": "GM-Team",
+        "title": "逆天邪神",
+        "season": 2,
+        "episode": 4,
+        "resolution": "4K",
+        "source": "GB"
+      },
+      "pred": {
+        "episode": 4,
+        "group": "GM-Team",
+        "resolution": "4K",
+        "season": 2,
+        "source": "GB",
+        "title": "逆天邪神"
+      }
     }
   ]
 }
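Note: the per-case `ok` / `errors` fields and the `field_correct` / `field_total` counters in case_metrics.json suggest a field-by-field comparison in which a field is only counted for cases that specify it (for example `group` totals 19 across 22 cases). The script that produces the file is not in this diff, so the sketch below is an assumed reconstruction; `compare_case` and `summarize` are hypothetical names.

```python
from collections import Counter
from typing import Dict, List, Tuple


def compare_case(expected: Dict, pred: Dict) -> Dict:
    """Compare only the fields listed in `expected` against the prediction."""
    errors = {}
    for field, gold in expected.items():
        if pred.get(field) != gold:
            errors[field] = {"gold": gold, "pred": pred.get(field)}
    return {"ok": not errors, "errors": errors}


def summarize(pairs: List[Tuple[Dict, Dict]]) -> Dict:
    """Aggregate full-match and per-field counts over (expected, pred) pairs."""
    field_total: Counter = Counter()
    field_correct: Counter = Counter()
    full_correct = 0
    for expected, pred in pairs:
        result = compare_case(expected, pred)
        full_correct += int(result["ok"])
        for field in expected:
            field_total[field] += 1
            if field not in result["errors"]:
                field_correct[field] += 1
    return {
        "case_count": len(pairs),
        "full_correct": full_correct,
        "field_correct": dict(field_correct),
        "field_total": dict(field_total),
    }


# The new regression case from this commit, copied from case_metrics.json.
expected = {"group": "GM-Team", "title": "逆天邪神", "season": 2,
            "episode": 4, "resolution": "4K", "source": "GB"}
pred = {"episode": 4, "group": "GM-Team", "resolution": "4K",
        "season": 2, "source": "GB", "title": "逆天邪神"}
result = compare_case(expected, pred)
summary = summarize([(expected, pred)])
```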
data/parser_regression_cases.json
CHANGED
@@ -228,5 +228,17 @@
       "episode": 13,
       "resolution": "1920x1080"
     }
+  },
+  {
+    "id": "gm_team_guoman_bilingual_s2",
+    "filename": "[GM-Team][国漫][逆天邪神 第2季][Against the Gods Ⅱ][2026][04][HEVC][GB][4K].mp4",
+    "expected": {
+      "group": "GM-Team",
+      "title": "逆天邪神",
+      "season": 2,
+      "episode": 4,
+      "resolution": "4K",
+      "source": "GB"
+    }
   }
 ]
datasets/AnimeName
CHANGED
@@ -1 +1 @@
-Subproject commit
+Subproject commit 004a8c08628b6820fb2d1b59a80fdcfe925ef095
dmhy_dataset.py
CHANGED
@@ -35,6 +35,10 @@ NOISE_BRACKETS = {
     "tc", "sc", "gb", "big5", "cht", "chs", "jpn", "jp", "jap", "eng",
     "繁中", "简中", "繁日", "简日", "日语", "日文", "外挂", "内封", "字幕",
 }
+CATEGORY_BRACKETS = {
+    "国漫", "國漫", "国产", "國產", "国产动漫", "國產動漫", "国产动画", "國產動畫",
+    "国创", "國創", "中国动漫", "中國動漫", "中国动画", "中國動畫",
+}
 
 SPECIAL_RE = re.compile(r"^(?:ova\d*|oad\d*|sp\d*|movie|the\s*movie|op|ed|pv|cm|ncop|nced|剧场版|劇場版|特别篇|特別篇)$", re.I)
 SPECIAL_SEARCH_RE = re.compile(r"^(?:檢索|检索|搜索|搜寻|搜尋|别名|別名|alias|search|keyword)\s*[::].+", re.I)
@@ -186,7 +190,8 @@ def is_source(token: str) -> bool:
         return True
     if has_wrapping_brackets(token):
         parts = [part for part in re.split(r"[\s&+/,._-]+", clean) if part]
-
+        has_source_part = any(SOURCE_RE.match(part) for part in parts)
+        return has_source_part and all(SOURCE_RE.match(part) or is_noise_bracket(part) for part in parts)
     return False
 
 
@@ -195,6 +200,11 @@ def is_special(token: str) -> bool:
     return bool(SPECIAL_RE.match(clean) or SPECIAL_SEARCH_RE.match(clean))
 
 
+def is_category_bracket(token: str) -> bool:
+    clean = re.sub(r"[\s._-]+", "", clean_bracket(token))
+    return has_wrapping_brackets(token) and clean in CATEGORY_BRACKETS
+
+
 def is_noise_bracket(token: str) -> bool:
     clean = clean_bracket(token)
     if not clean:
@@ -202,6 +212,8 @@ def is_noise_bracket(token: str) -> bool:
     normalized = re.sub(r"[\s._-]+", "", clean).lower()
     if normalized in NOISE_BRACKETS:
         return True
+    if is_category_bracket(token):
+        return True
     if DATE_RE.match(clean) or HASH_RE.match(clean):
         return True
     return False
@@ -335,6 +347,42 @@ def label_context_season_tokens(
             categories[idx] = "season"
 
 
+def repair_structured_bracket_title_aliases(
+    tokens: Sequence[str],
+    categories: List[str],
+    episode_idx: int,
+) -> None:
+    """Keep the primary title in category-prefixed bracket series.
+
+    GM-Team-style rows often look like:
+    [GROUP][国漫][中文标题 第2季][English Alias Ⅱ][2026][04][meta]
+    The category, alias, and year brackets are metadata for parsing purposes;
+    the first real title bracket after the category is the canonical title.
+    """
+    if not any(is_category_bracket(tokens[idx]) for idx in range(min(episode_idx, len(tokens)))):
+        return
+
+    title_candidates = [
+        idx
+        for idx in range(episode_idx)
+        if categories[idx] == "title"
+        and has_wrapping_brackets(tokens[idx])
+        and is_title_token(tokens[idx])
+    ]
+    if not title_candidates:
+        return
+
+    primary_idx = title_candidates[0]
+    for idx in title_candidates[1:]:
+        categories[idx] = "sep"
+
+    for idx in range(episode_idx):
+        if idx == primary_idx:
+            continue
+        if is_category_bracket(tokens[idx]) or DATE_RE.match(clean_bracket(tokens[idx])):
+            categories[idx] = "sep"
+
+
 def embedded_bracket_episode(token: str) -> Optional[tuple[str, str, str]]:
     """Split malformed tokens such as '[Group}Title[658]' into title + episode."""
     if episode_number(token) is not None:
@@ -390,6 +438,10 @@ def finalize_weak_sample(
             continue
         if is_explicit_season(token):
             expanded_categories[idx] = "season"
+            prev_idx = idx - 1
+            while prev_idx >= 0 and is_separator_token(expanded_tokens[prev_idx]) and expanded_categories[prev_idx] == "title":
+                expanded_categories[prev_idx] = "sep"
+                prev_idx -= 1
 
     labels = assign_iob2(expanded_categories)
     if len(expanded_tokens) != len(labels):
@@ -699,7 +751,9 @@ def weak_label_filename(filename: str, tokenizer: AnimeTokenizer) -> Optional[di
     for idx, token in enumerate(tokens):
         if categories[idx] == "group":
             continue
-        if
+        if is_category_bracket(token):
+            categories[idx] = "sep"
+        elif is_resolution(token):
             categories[idx] = "resolution"
         elif is_source(token):
             categories[idx] = "source"
@@ -715,6 +769,7 @@ def weak_label_filename(filename: str, tokenizer: AnimeTokenizer) -> Optional[di
         return fallback_embedded_episode_sample(tokens, tokenizer) or fallback_no_episode_sample(tokens, tokenizer)
     categories[episode_idx] = "episode"
     label_context_season_tokens(tokens, categories, episode_idx)
+    repair_structured_bracket_title_aliases(tokens, categories, episode_idx)
 
     # S01E07 is tokenized as S01 + E07 after tokenizer changes. If an older
     # token slips through, expand_tokens_and_categories will split it.
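Note: to make the effect of `repair_structured_bracket_title_aliases` concrete, here is a self-contained sketch that runs the same relabeling on one bracket-level token list. The helpers `has_wrapping_brackets`, `clean_bracket`, and `is_title_token` are simplified stand-ins (their real definitions are outside this diff), `DATE_RE` is borrowed from the inference.py change in the same commit, and the whole-bracket tokenization is an assumption made for readability.

```python
import re
from typing import List, Sequence

CATEGORY_BRACKETS = {
    "国漫", "國漫", "国产", "國產", "国产动漫", "國產動漫", "国产动画", "國產動畫",
    "国创", "國創", "中国动漫", "中國動漫", "中国动画", "中國動畫",
}
# Pattern as added to inference.py in this commit.
DATE_RE = re.compile(r"^(?:19|20)\d{2}(?:[.\-_年]?(?:0?[1-9]|1[0-2]))?(?:[.\-_月]?(?:0?[1-9]|[12]\d|3[01]))?日?$")


def has_wrapping_brackets(token: str) -> bool:
    # Stand-in: only square brackets are considered here.
    return len(token) >= 2 and token[0] == "[" and token[-1] == "]"


def clean_bracket(token: str) -> str:
    # Stand-in: drop one pair of wrapping brackets, if present.
    return token[1:-1].strip() if has_wrapping_brackets(token) else token.strip()


def is_title_token(token: str) -> bool:
    # Stand-in: anything containing a Latin or CJK letter may be a title.
    return bool(re.search(r"[A-Za-z\u3400-\u9fff]", clean_bracket(token)))


def is_category_bracket(token: str) -> bool:
    clean = re.sub(r"[\s._-]+", "", clean_bracket(token))
    return has_wrapping_brackets(token) and clean in CATEGORY_BRACKETS


def repair_aliases(tokens: Sequence[str], categories: List[str], episode_idx: int) -> None:
    """Keep the first bracketed title; demote aliases, category, and year to 'sep'."""
    if not any(is_category_bracket(tokens[i]) for i in range(min(episode_idx, len(tokens)))):
        return
    candidates = [
        i for i in range(episode_idx)
        if categories[i] == "title" and has_wrapping_brackets(tokens[i]) and is_title_token(tokens[i])
    ]
    if not candidates:
        return
    primary = candidates[0]
    for i in candidates[1:]:
        categories[i] = "sep"
    for i in range(episode_idx):
        if i != primary and (is_category_bracket(tokens[i]) or DATE_RE.match(clean_bracket(tokens[i]))):
            categories[i] = "sep"


tokens = ["[GM-Team]", "[国漫]", "[逆天邪神 第2季]", "[Against the Gods Ⅱ]", "[2026]", "[04]", "[HEVC]"]
categories = ["group", "sep", "title", "title", "title", "episode", "source"]
repair_aliases(tokens, categories, episode_idx=5)
print(categories)
# ['group', 'sep', 'title', 'sep', 'sep', 'episode', 'source']
```

Without a category bracket before the episode, the function returns early and leaves the labels untouched, so ordinary `[Group][Title][01]` rows are not affected.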
exports/anime_filename_parser.metadata.json
CHANGED
@@ -8,5 +8,5 @@
     128,
     15
   ],
-  "max_abs_diff":
+  "max_abs_diff": 5.65648078918457e-05
 }
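Note: `max_abs_diff` in the export metadata is presumably the largest element-wise gap between the reference model's outputs and the ONNX export's outputs on the same input; the export script is not part of this diff, so that reading is an assumption. A dependency-free sketch of such a check follows. The `1e-4` tolerance is illustrative only and does not come from the repository.

```python
from typing import Iterator, Sequence, Union

Nested = Union[float, Sequence["Nested"]]


def flatten(values: Nested) -> Iterator[float]:
    """Yield every scalar from an arbitrarily nested list/tuple."""
    if isinstance(values, (list, tuple)):
        for item in values:
            yield from flatten(item)
    else:
        yield float(values)


def max_abs_diff(reference: Nested, exported: Nested) -> float:
    """Largest absolute element-wise difference between two same-shaped outputs."""
    ref = list(flatten(reference))
    exp = list(flatten(exported))
    if len(ref) != len(exp):
        raise ValueError(f"size mismatch: {len(ref)} vs {len(exp)}")
    return max((abs(a - b) for a, b in zip(ref, exp)), default=0.0)


# Toy logits standing in for outputs of the two runtimes.
diff = max_abs_diff([[0.10, 2.00], [1.50, -0.25]], [[0.10, 2.50], [1.50, -0.25]])

# The value recorded in this commit, checked against an illustrative tolerance.
recorded = 5.65648078918457e-05
within_tolerance = recorded < 1e-4
```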
exports/anime_filename_parser.onnx
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:6d967c5c2305e6737c9e791956a174655deebef2cfa477e081890ebddd56e004
 size 19633926
inference.py
CHANGED
@@ -330,6 +330,11 @@ NOISE_META_RE = re.compile(
     r"Opus|ASS.*|CHS|CHT|BIG5|GB|JPN?|MP4|MKV|繁中|简中|内封|外挂)$",
     re.I,
 )
+DATE_RE = re.compile(r"^(?:19|20)\d{2}(?:[.\-_年]?(?:0?[1-9]|1[0-2]))?(?:[.\-_月]?(?:0?[1-9]|[12]\d|3[01]))?日?$")
+CATEGORY_BRACKETS = {
+    "国漫", "國漫", "国产", "國產", "国产动漫", "國產動漫", "国产动画", "國產動畫",
+    "国创", "國創", "中国动漫", "中國動漫", "中国动画", "中國動畫",
+}
 
 
 def cn_number_to_int(text: str) -> Optional[int]:
@@ -372,8 +377,11 @@ def looks_like_episode_or_meta(text: str) -> bool:
     if not text:
         return False
     clean = text.strip()
+    normalized = re.sub(r"[\s._-]+", "", clean)
     return bool(
         re.fullmatch(r"(?:EP?|#)?\d{1,4}(?:v\d+)?", clean, re.I)
+        or DATE_RE.fullmatch(clean)
+        or normalized in CATEGORY_BRACKETS
         or RESOLUTION_RE.search(clean)
         or SOURCE_TAG_RE.fullmatch(clean)
         or SOURCE_RE.search(clean)
@@ -492,6 +500,10 @@ def apply_rule_assists(filename: str, result: Dict) -> Dict:
     if repaired_title:
         repaired["title"] = repaired_title
 
+    structured_title = infer_structured_bracket_title(filename, group, repaired.get("episode"))
+    if structured_title:
+        repaired["title"] = structured_title
+
     if repaired.get("title") and repaired.get("season") is not None:
         repaired["title"] = strip_trailing_season_from_title(repaired["title"], repaired["season"])
 
@@ -584,6 +596,56 @@ def source_candidates(filename: str) -> List[str]:
     return [value for _priority, _neg_start, value in sorted(deduped.values(), reverse=True)]
 
 
+def is_category_text(text: str) -> bool:
+    normalized = re.sub(r"[\s._-]+", "", text.strip())
+    return normalized in CATEGORY_BRACKETS
+
+
+def infer_structured_bracket_title(
+    filename: str,
+    group: Optional[str],
+    episode: Optional[int],
+) -> Optional[str]:
+    """Pick the primary title from [group][category][title][alias][year][episode] rows."""
+    brackets = bracket_parts(filename)
+    if len(brackets) < 4 or episode is None:
+        return None
+
+    start_index = 0
+    if group and brackets and brackets[0][0] == group:
+        start_index = 1
+
+    search = brackets[start_index:]
+    if not search or not any(is_category_text(text) for text, _start, _end in search[:2]):
+        return None
+
+    episode_index = None
+    for idx, (text, _start, _end) in enumerate(brackets):
+        if re.fullmatch(rf"(?:EP?|#)?0*{episode}(?:v\d+)?", text.strip(), re.I):
+            episode_index = idx
+            break
+    if episode_index is None:
+        return None
+
+    candidates: List[Tuple[int, str]] = []
+    for idx in range(start_index, episode_index):
+        text = brackets[idx][0].strip()
+        if not text or looks_like_episode_or_meta(text):
+            continue
+        score = 0
+        if SEASON_RE.search(text) or TRAILING_SEQUEL_MARKER_RE.search(text):
+            score += 50
+        if re.search(r"[\u3400-\u9fff]", text):
+            score += 20
+        if idx > start_index:
+            score += 10
+        candidates.append((score, text))
+
+    if not candidates:
+        return None
+    return max(candidates, key=lambda item: item[0])[1]
+
+
 def best_structural_episode(filename: str) -> Optional[int]:
     priorities = {
         "season_episode": 1000,
@@ -635,6 +697,7 @@ def strip_trailing_season_from_title(title: str, season: int) -> str:
         rf"\s+[Ss]0*{season_text}$",
         rf"\s+Season\s*0*{season_text}$",
         rf"\s+0*{season_text}$",
+        rf"\s+第(?:0*{season_text}|{season_text})[季期部章]$",
     ]
     cleaned = title
     for pattern in patterns:
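Note: the scoring in `infer_structured_bracket_title` plus the new `第N季` pattern in `strip_trailing_season_from_title` can be traced end to end on the regression filename. The sketch below is a simplified, self-contained approximation: `bracket_parts` here returns plain strings instead of `(text, start, end)` tuples, `SEASON_RE` and `META_RE` are reduced stand-ins for patterns defined elsewhere in inference.py, and `TRAILING_SEQUEL_MARKER_RE` is omitted, so scores for alias brackets may differ from the real module.

```python
import re
from typing import List, Optional, Tuple

CATEGORY_BRACKETS = {
    "国漫", "國漫", "国产", "國產", "国产动漫", "國產動漫", "国产动画", "國產動畫",
    "国创", "國創", "中国动漫", "中國動漫", "中国动画", "中國動畫",
}
DATE_RE = re.compile(r"^(?:19|20)\d{2}(?:[.\-_年]?(?:0?[1-9]|1[0-2]))?(?:[.\-_月]?(?:0?[1-9]|[12]\d|3[01]))?日?$")
# Assumed stand-ins for patterns that live elsewhere in inference.py.
SEASON_RE = re.compile(r"第\s*\d+\s*[季期部章]|Season\s*\d+", re.I)
META_RE = re.compile(r"^(?:HEVC|AVC|GB|BIG5|CHS|CHT|4K|\d{3,4}p)$", re.I)


def bracket_parts(filename: str) -> List[str]:
    # Simplified: text of every [...] group, in order.
    return re.findall(r"\[([^\[\]]*)\]", filename)


def looks_like_episode_or_meta(text: str) -> bool:
    normalized = re.sub(r"[\s._-]+", "", text)
    return bool(
        re.fullmatch(r"(?:EP?|#)?\d{1,4}(?:v\d+)?", text, re.I)
        or DATE_RE.fullmatch(text)
        or normalized in CATEGORY_BRACKETS
        or META_RE.fullmatch(text)
    )


def pick_primary_title(filename: str, group: Optional[str], episode: Optional[int]) -> Optional[str]:
    parts = bracket_parts(filename)
    if len(parts) < 4 or episode is None:
        return None
    start = 1 if group and parts[0] == group else 0
    # Only rows with a category bracket near the front are handled.
    if not any(re.sub(r"[\s._-]+", "", t.strip()) in CATEGORY_BRACKETS for t in parts[start:start + 2]):
        return None
    episode_index = next(
        (i for i, t in enumerate(parts)
         if re.fullmatch(rf"(?:EP?|#)?0*{episode}(?:v\d+)?", t.strip(), re.I)),
        None,
    )
    if episode_index is None:
        return None
    candidates: List[Tuple[int, str]] = []
    for i in range(start, episode_index):
        text = parts[i].strip()
        if not text or looks_like_episode_or_meta(text):
            continue
        score = 0
        score += 50 if SEASON_RE.search(text) else 0            # carries a season marker
        score += 20 if re.search(r"[\u3400-\u9fff]", text) else 0  # contains CJK
        score += 10 if i > start else 0                          # not the first bracket
        candidates.append((score, text))
    return max(candidates, key=lambda item: item[0])[1] if candidates else None


def strip_trailing_season(title: str, season: int) -> str:
    # Only the pattern added in this commit.
    return re.sub(rf"\s+第(?:0*{season}|{season})[季期部章]$", "", title).strip()


name = "[GM-Team][国漫][逆天邪神 第2季][Against the Gods Ⅱ][2026][04][HEVC][GB][4K].mp4"
raw_title = pick_primary_title(name, "GM-Team", 4)
print(raw_title)                             # 逆天邪神 第2季
print(strip_trailing_season(raw_title, 2))   # 逆天邪神
```

In this trace the Chinese bracket scores 80 (season marker, CJK, position) against 10 for the English alias, which is why the alias no longer wins the `title` field.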
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:347b2f619fd63a71804c4742a069b20acd0cde870fc03cc2ac0f175b06586b72
 size 19142604
parse_eval_metrics.json
CHANGED
|
@@ -1,22 +1,22 @@
|
|
| 1 |
{
|
| 2 |
"sample_count": 2048,
|
| 3 |
"field_accuracy": {
|
| 4 |
-
"group":
|
| 5 |
-
"title": 0.
|
| 6 |
-
"season": 0.
|
| 7 |
-
"episode": 0.
|
| 8 |
-
"resolution":
|
| 9 |
-
"source": 0.
|
| 10 |
-
"special": 0.
|
| 11 |
},
|
| 12 |
"field_correct": {
|
| 13 |
-
"group":
|
| 14 |
-
"title":
|
| 15 |
-
"season":
|
| 16 |
-
"episode":
|
| 17 |
-
"resolution":
|
| 18 |
-
"source":
|
| 19 |
-
"special":
|
| 20 |
},
|
| 21 |
"field_total": {
|
| 22 |
"group": 2048,
|
|
@@ -27,487 +27,487 @@
|
|
| 27 |
"source": 2048,
|
| 28 |
"special": 2048
|
| 29 |
},
|
| 30 |
-
"full_match_accuracy": 0.
|
| 31 |
-
"full_match_correct":
|
| 32 |
"full_match_total": 2048,
|
| 33 |
"failures": [
|
| 34 |
{
|
| 35 |
-
"filename": "[
|
| 36 |
"errors": {
|
| 37 |
-
"
|
| 38 |
-
"gold":
|
| 39 |
-
"pred":
|
| 40 |
}
|
| 41 |
},
|
| 42 |
"gold": {
|
| 43 |
-
"group": "
|
| 44 |
-
"title": "
|
| 45 |
"season": null,
|
| 46 |
-
"episode":
|
| 47 |
-
"resolution": "
|
| 48 |
-
"source": "
|
| 49 |
"special": null
|
| 50 |
},
|
| 51 |
"pred": {
|
| 52 |
-
"group": "
|
| 53 |
-
"title": "
|
| 54 |
-
"season":
|
| 55 |
-
"episode":
|
| 56 |
-
"resolution": "
|
| 57 |
-
"source": "
|
| 58 |
"special": null
|
| 59 |
}
|
| 60 |
},
|
| 61 |
{
|
| 62 |
-
"filename": "[
|
| 63 |
"errors": {
|
| 64 |
-
"
|
| 65 |
-
"gold":
|
| 66 |
-
"pred":
|
| 67 |
}
|
| 68 |
},
|
| 69 |
"gold": {
|
| 70 |
-
"group": "
|
| 71 |
-
"title": "
|
| 72 |
"season": null,
|
| 73 |
"episode": 9,
|
| 74 |
-
"resolution": "
|
| 75 |
-
"source": "
|
| 76 |
-
"special":
|
| 77 |
},
|
| 78 |
"pred": {
|
| 79 |
-
"group": "
|
| 80 |
-
"title": "
|
| 81 |
-
"season":
|
| 82 |
"episode": 9,
|
| 83 |
-
"resolution": "
|
| 84 |
-
"source": "
|
| 85 |
"special": null
|
| 86 |
}
|
| 87 |
},
|
| 88 |
{
|
| 89 |
-
"filename": "
|
| 90 |
"errors": {
|
| 91 |
-
"
|
| 92 |
-
"gold": "
|
| 93 |
-
"pred": "
|
|
|
|
|
|
|
|
|
|
|
|
|
| 94 |
}
|
| 95 |
},
|
| 96 |
"gold": {
|
| 97 |
-
"group":
|
| 98 |
-
"title": "
|
| 99 |
"season": null,
|
| 100 |
-
"episode":
|
| 101 |
-
"resolution": "
|
| 102 |
-
"source": "
|
| 103 |
-
"special":
|
| 104 |
},
|
| 105 |
"pred": {
|
| 106 |
-
"group":
|
| 107 |
-
"title": "
|
| 108 |
"season": null,
|
| 109 |
-
"episode":
|
| 110 |
-
"resolution": "
|
| 111 |
-
"source": "
|
| 112 |
-
"special":
|
| 113 |
}
|
| 114 |
},
|
| 115 |
{
|
| 116 |
-
"filename": "[
|
| 117 |
"errors": {
|
| 118 |
"source": {
|
| 119 |
-
"gold": "
|
| 120 |
-
"pred": "
|
| 121 |
}
|
| 122 |
},
|
| 123 |
"gold": {
|
| 124 |
-
"group": "
|
| 125 |
-
"title": "
|
| 126 |
"season": null,
|
| 127 |
-
"episode":
|
| 128 |
-
"resolution": "
|
| 129 |
-
"source": "
|
| 130 |
"special": null
|
| 131 |
},
|
| 132 |
"pred": {
|
| 133 |
-
"group": "
|
| 134 |
-
"title": "
|
| 135 |
"season": null,
|
| 136 |
-
"episode":
|
| 137 |
-
"resolution": "
|
| 138 |
-
"source": "
|
| 139 |
"special": null
|
| 140 |
}
|
| 141 |
},
|
| 142 |
{
|
| 143 |
-
"filename": "[
|
| 144 |
"errors": {
|
| 145 |
-
"
|
| 146 |
-
"gold":
|
| 147 |
-
"pred": "
|
| 148 |
}
|
| 149 |
},
|
| 150 |
"gold": {
|
| 151 |
-
"group": "
|
| 152 |
-
"title": "
|
| 153 |
"season": null,
|
| 154 |
-
"episode":
|
| 155 |
"resolution": "1080p",
|
| 156 |
-
"source": "
|
| 157 |
-
"special":
|
| 158 |
},
|
| 159 |
"pred": {
|
| 160 |
-
"group": "
|
| 161 |
-
"title": "
|
| 162 |
-
"season":
|
| 163 |
-
"episode":
|
| 164 |
"resolution": "1080p",
|
| 165 |
-
"source": "
|
| 166 |
-
"special":
|
| 167 |
}
|
| 168 |
},
|
| 169 |
{
|
| 170 |
-
"filename": "
|
| 171 |
"errors": {
|
| 172 |
-
"
|
| 173 |
-
"gold": "
|
| 174 |
-
"pred": "
|
| 175 |
-
},
|
| 176 |
-
"season": {
|
| 177 |
-
"gold": null,
|
| 178 |
-
"pred": "4"
|
| 179 |
-
},
|
| 180 |
-
"episode": {
|
| 181 |
-
"gold": "7",
|
| 182 |
-
"pred": "2"
|
| 183 |
}
|
| 184 |
},
|
| 185 |
"gold": {
|
| 186 |
"group": null,
|
| 187 |
-
"title": "
|
| 188 |
"season": null,
|
| 189 |
-
"episode":
|
| 190 |
-
"resolution":
|
| 191 |
-
"source":
|
| 192 |
"special": null
|
| 193 |
},
|
| 194 |
"pred": {
|
| 195 |
"group": null,
|
| 196 |
-
"title": "
|
| 197 |
-
"season":
|
| 198 |
-
"episode":
|
| 199 |
-
"resolution":
|
| 200 |
-
"source":
|
| 201 |
"special": null
|
| 202 |
}
|
| 203 |
},
|
| 204 |
{
|
| 205 |
-
"filename": "[
|
| 206 |
"errors": {
|
| 207 |
-
"
|
| 208 |
-
"gold": "
|
| 209 |
-
"pred": "
|
| 210 |
}
|
| 211 |
},
|
| 212 |
"gold": {
|
| 213 |
-
"group": "
|
| 214 |
-
"title": "
|
| 215 |
"season": null,
|
| 216 |
-
"episode":
|
| 217 |
-
"resolution":
|
| 218 |
-
"source":
|
| 219 |
-
"special":
|
| 220 |
},
|
| 221 |
"pred": {
|
| 222 |
-
"group": "
|
| 223 |
-
"title": "
|
| 224 |
"season": null,
|
| 225 |
-
"episode":
|
| 226 |
-
"resolution":
|
| 227 |
-
"source":
|
| 228 |
-
"special":
|
| 229 |
}
|
| 230 |
},
|
| 231 |
{
|
| 232 |
-
"filename": "
|
| 233 |
"errors": {
|
| 234 |
-
"
|
| 235 |
-
"gold": "
|
| 236 |
-
"pred": "
|
| 237 |
}
|
| 238 |
},
|
| 239 |
"gold": {
|
| 240 |
-
"group":
|
| 241 |
-
"title": "
|
| 242 |
"season": null,
|
| 243 |
-
"episode":
|
| 244 |
-
"resolution":
|
| 245 |
-
"source":
|
| 246 |
-
"special":
|
| 247 |
},
|
| 248 |
"pred": {
|
| 249 |
-
"group":
|
| 250 |
-
"title": "
|
| 251 |
"season": null,
|
| 252 |
-
"episode":
|
| 253 |
-
"resolution":
|
| 254 |
-
"source":
|
| 255 |
-
"special":
|
| 256 |
}
|
| 257 |
},
|
| 258 |
{
|
| 259 |
-
"filename": "
|
| 260 |
"errors": {
|
| 261 |
-
"
|
| 262 |
-
"gold":
|
| 263 |
-
"pred": "
|
| 264 |
}
|
| 265 |
},
|
| 266 |
"gold": {
|
| 267 |
-
"group":
|
| 268 |
-
"title": "
|
| 269 |
"season": null,
|
| 270 |
-
"episode":
|
| 271 |
-
"resolution":
|
| 272 |
-
"source":
|
| 273 |
-
"special":
|
| 274 |
},
|
| 275 |
"pred": {
|
| 276 |
-
"group":
|
| 277 |
-
"title": "
|
| 278 |
-
"season":
|
| 279 |
-
"episode":
|
| 280 |
-
"resolution":
|
| 281 |
-
"source":
|
| 282 |
-
"special":
|
| 283 |
}
|
| 284 |
},
|
| 285 |
{
|
| 286 |
-
"filename": "
|
| 287 |
"errors": {
|
| 288 |
"season": {
|
| 289 |
-
"gold":
|
| 290 |
"pred": "1"
|
| 291 |
}
|
| 292 |
},
|
| 293 |
"gold": {
|
| 294 |
-
"group":
|
| 295 |
-
"title": "
|
| 296 |
-
"season":
|
| 297 |
-
"episode":
|
| 298 |
-
"resolution": "
|
| 299 |
-
"source": "
|
| 300 |
"special": null
|
| 301 |
},
|
| 302 |
"pred": {
|
| 303 |
-
"group":
|
| 304 |
-
"title": "
|
| 305 |
"season": 1,
|
| 306 |
-
"episode":
|
| 307 |
-
"resolution": "
|
| 308 |
-
"source": "
|
| 309 |
"special": null
|
| 310 |
}
|
| 311 |
},
|
| 312 |
{
|
| 313 |
-
"filename": "[
|
| 314 |
"errors": {
|
| 315 |
-
"
|
| 316 |
-
"gold":
|
| 317 |
-
"pred": "
|
| 318 |
},
|
| 319 |
-
"
|
| 320 |
-
"gold": "
|
| 321 |
-
"pred":
|
| 322 |
}
|
| 323 |
},
|
| 324 |
"gold": {
|
| 325 |
-
"group":
|
| 326 |
-
"title": "
|
| 327 |
-
"season":
|
| 328 |
"episode": 14,
|
| 329 |
-
"resolution": "
|
| 330 |
-
"source": "
|
| 331 |
"special": null
|
| 332 |
},
|
| 333 |
"pred": {
|
| 334 |
-
"group": "
|
| 335 |
-
"title": "
|
| 336 |
"season": null,
|
| 337 |
"episode": 14,
|
| 338 |
-
"resolution": "
|
| 339 |
-
"source": "
|
| 340 |
"special": null
|
| 341 |
}
|
| 342 |
},
|
| 343 |
{
|
| 344 |
-
"filename": "
|
| 345 |
"errors": {
|
| 346 |
"episode": {
|
| 347 |
-
"gold":
|
| 348 |
-
"pred": "
|
| 349 |
}
|
| 350 |
},
|
| 351 |
"gold": {
|
| 352 |
-
"group":
|
| 353 |
-
"title": "
|
| 354 |
"season": null,
|
| 355 |
-
"episode":
|
| 356 |
-
"resolution": "
|
| 357 |
"source": "BD",
|
| 358 |
"special": null
|
| 359 |
},
|
| 360 |
"pred": {
|
| 361 |
-
"group":
|
| 362 |
-
"title": "
|
| 363 |
"season": null,
|
| 364 |
-
"episode":
|
| 365 |
-
"resolution": "
|
| 366 |
"source": "BD",
|
| 367 |
"special": null
|
| 368 |
}
|
| 369 |
},
|
| 370 |
{
|
| 371 |
-
"filename": "
|
| 372 |
"errors": {
|
| 373 |
-
"
|
| 374 |
-
"gold":
|
| 375 |
-
"pred": "
|
| 376 |
}
|
| 377 |
},
|
| 378 |
"gold": {
|
| 379 |
-
"group":
|
| 380 |
-
"title": "
|
| 381 |
"season": null,
|
| 382 |
-
"episode":
|
| 383 |
"resolution": null,
|
| 384 |
-
"source":
|
| 385 |
"special": null
|
| 386 |
},
|
| 387 |
"pred": {
|
| 388 |
-
"group":
|
| 389 |
-
"title": "
|
| 390 |
"season": null,
|
| 391 |
-
"episode":
|
| 392 |
"resolution": null,
|
| 393 |
-
"source": "
|
| 394 |
"special": null
|
| 395 |
}
|
| 396 |
},
|
| 397 |
{
|
| 398 |
-
"filename": "(アニメ)
|
| 399 |
"errors": {
|
| 400 |
-
"
|
| 401 |
"gold": null,
|
| 402 |
-
"pred": "
|
| 403 |
}
|
| 404 |
},
|
| 405 |
"gold": {
|
| 406 |
"group": "アニメ",
|
| 407 |
-
"title": "
|
| 408 |
"season": null,
|
| 409 |
-
"episode":
|
| 410 |
-
"resolution":
|
| 411 |
-
"source":
|
| 412 |
"special": null
|
| 413 |
},
|
| 414 |
"pred": {
|
| 415 |
"group": "アニメ",
|
| 416 |
-
"title": "
|
| 417 |
-
"season":
|
| 418 |
-
"episode":
|
| 419 |
"resolution": "640x480",
|
| 420 |
-
"source":
|
| 421 |
"special": null
|
| 422 |
}
|
| 423 |
},
|
| 424 |
{
|
| 425 |
-
"filename": "
|
| 426 |
"errors": {
|
| 427 |
-
"
|
| 428 |
-
"gold": "
|
| 429 |
-
"pred":
|
| 430 |
}
|
| 431 |
},
|
| 432 |
"gold": {
|
| 433 |
-
"group":
|
| 434 |
-
"title": "
|
| 435 |
"season": null,
|
| 436 |
-
"episode":
|
| 437 |
-
"resolution":
|
| 438 |
-
"source": "
|
| 439 |
"special": null
|
| 440 |
},
|
| 441 |
"pred": {
|
| 442 |
-
"group":
|
| 443 |
-
"title": "
|
| 444 |
"season": null,
|
| 445 |
"episode": 6,
|
| 446 |
-
"resolution":
|
| 447 |
-
"source":
|
| 448 |
"special": null
|
| 449 |
}
|
| 450 |
},
|
| 451 |
{
|
| 452 |
-
"filename": "[
|
| 453 |
"errors": {
|
| 454 |
"season": {
|
| 455 |
-
"gold":
|
| 456 |
"pred": "1"
|
| 457 |
}
|
| 458 |
},
|
| 459 |
"gold": {
|
| 460 |
-
"group": "
|
| 461 |
-
"title": "
|
| 462 |
-
"season":
|
| 463 |
-
"episode":
|
| 464 |
"resolution": "1080p",
|
| 465 |
-
"source": "
|
| 466 |
"special": null
|
| 467 |
},
|
| 468 |
"pred": {
|
| 469 |
-
"group": "
|
| 470 |
-
"title": "
|
| 471 |
"season": 1,
|
| 472 |
-
"episode":
|
| 473 |
"resolution": "1080p",
|
| 474 |
-
"source": "
|
| 475 |
"special": null
|
| 476 |
}
|
| 477 |
},
|
| 478 |
{
|
| 479 |
-
"filename": "
|
| 480 |
"errors": {
|
| 481 |
-
"
|
| 482 |
-
"gold": null,
|
| 483 |
-
"pred": "1080p"
|
| 484 |
-
},
|
| 485 |
-
"source": {
|
| 486 |
"gold": null,
|
| 487 |
-
"pred": "
|
| 488 |
}
|
| 489 |
},
|
| 490 |
"gold": {
|
| 491 |
-
"group":
|
| 492 |
-
"title": "
|
| 493 |
"season": null,
|
| 494 |
-
"episode":
|
| 495 |
"resolution": null,
|
| 496 |
-
"source":
|
| 497 |
"special": null
|
| 498 |
},
|
| 499 |
"pred": {
|
| 500 |
-
"group":
|
| 501 |
-
"title": "
|
| 502 |
-
"season":
|
| 503 |
-
"episode":
|
| 504 |
-
"resolution":
|
| 505 |
-
"source": "
|
| 506 |
"special": null
|
| 507 |
}
|
| 508 |
},
|
| 509 |
{
|
| 510 |
-
"filename": "
|
| 511 |
"errors": {
|
| 512 |
"season": {
|
| 513 |
"gold": null,
|
|
@@ -515,80 +515,49 @@
|
|
| 515 |
}
|
| 516 |
},
|
| 517 |
"gold": {
|
| 518 |
-
"group": "
|
| 519 |
-
"title": "
|
| 520 |
"season": null,
|
| 521 |
-
"episode":
|
| 522 |
-
"resolution": "
|
| 523 |
-
"source":
|
| 524 |
"special": null
|
| 525 |
},
|
| 526 |
"pred": {
|
| 527 |
-
"group": "
|
| 528 |
-
"title": "
|
| 529 |
"season": 1,
|
| 530 |
-
"episode":
|
| 531 |
-
"resolution": "
|
| 532 |
-
"source":
|
| 533 |
"special": null
|
| 534 |
}
|
| 535 |
},
|
| 536 |
{
|
| 537 |
-
"filename": "[
|
| 538 |
"errors": {
|
| 539 |
-
"title": {
|
| 540 |
-
"gold": "dungeon ni deai wo motomeru no wa machigatteiru darou ka",
|
| 541 |
-
"pred": "dungeon ni deai wo motomeru no wa machigatteiru darou ka iv"
|
| 542 |
-
},
|
| 543 |
"season": {
|
| 544 |
-
"gold":
|
| 545 |
-
"pred":
|
| 546 |
-
}
|
| 547 |
-
},
|
| 548 |
-
"gold": {
|
| 549 |
-
"group": "KTXP",
|
| 550 |
-
"title": "Dungeon ni Deai wo Motomeru no wa Machigatteiru Darou ka",
|
| 551 |
-
"season": 4,
|
| 552 |
-
"episode": 13,
|
| 553 |
-
"resolution": "720P",
|
| 554 |
-
"source": "BIG5",
|
| 555 |
-
"special": null
|
| 556 |
-
},
|
| 557 |
-
"pred": {
|
| 558 |
-
"group": "KTXP",
|
| 559 |
-
"title": "Dungeon ni Deai wo Motomeru no wa Machigatteiru Darou ka IV",
|
| 560 |
-
"season": null,
|
| 561 |
-
"episode": 13,
|
| 562 |
-
"resolution": "720P",
|
| 563 |
-
"source": "BIG5",
|
| 564 |
-
"special": null
|
| 565 |
-
}
|
| 566 |
-
},
|
| 567 |
-
{
|
| 568 |
-
"filename": "[JyFanSub][Fate_Apocrypha][15][GB][1080]p",
|
| 569 |
-
"errors": {
|
| 570 |
-
"episode": {
|
| 571 |
-
"gold": "1080",
|
| 572 |
-
"pred": "15"
|
| 573 |
}
|
| 574 |
},
|
| 575 |
"gold": {
|
| 576 |
-
"group": "
|
| 577 |
-
"title": "
|
| 578 |
"season": null,
|
| 579 |
-
"episode":
|
| 580 |
-
"resolution":
|
| 581 |
-
"source": "
|
| 582 |
-
"special":
|
| 583 |
},
|
| 584 |
"pred": {
|
| 585 |
-
"group": "
|
| 586 |
-
"title": "
|
| 587 |
-
"season":
|
| 588 |
-
"episode":
|
| 589 |
-
"resolution":
|
| 590 |
-
"source": "
|
| 591 |
-
"special":
|
| 592 |
}
|
| 593 |
}
|
| 594 |
]
|
|
|
|
| 1 |
{
|
| 2 |
"sample_count": 2048,
|
| 3 |
"field_accuracy": {
|
| 4 |
+
"group": 0.99951171875,
|
| 5 |
+
"title": 0.99755859375,
|
| 6 |
+
"season": 0.99609375,
|
| 7 |
+
"episode": 0.998046875,
|
| 8 |
+
"resolution": 1.0,
|
| 9 |
+
"source": 0.99853515625,
|
| 10 |
+
"special": 0.9990234375
|
| 11 |
},
|
| 12 |
"field_correct": {
|
| 13 |
+
"group": 2047,
|
| 14 |
+
"title": 2043,
|
| 15 |
+
"season": 2040,
|
| 16 |
+
"episode": 2044,
|
| 17 |
+
"resolution": 2048,
|
| 18 |
+
"source": 2045,
|
| 19 |
+
"special": 2046
|
| 20 |
},
|
| 21 |
"field_total": {
|
| 22 |
"group": 2048,
|
|
|
|
| 27 |
"source": 2048,
|
| 28 |
"special": 2048
|
| 29 |
},
|
| 30 |
+
"full_match_accuracy": 0.99072265625,
|
| 31 |
+
"full_match_correct": 2029,
|
| 32 |
"full_match_total": 2048,
|
| 33 |
"failures": [
|
| 34 |
{
|
| 35 |
+
"filename": "[ig]Itai no wa Iya nano de Bougyoryoku ni Kyokufuri Shitai to Omoimasu[WebRip 1920x1080 AVC YUV420 8Bit 1080p AAC].03.TC",
|
| 36 |
"errors": {
|
| 37 |
+
"episode": {
|
| 38 |
+
"gold": "3",
|
| 39 |
+
"pred": null
|
| 40 |
}
|
| 41 |
},
|
| 42 |
"gold": {
|
| 43 |
+
"group": "ig",
|
| 44 |
+
"title": "Itai no wa Iya nano de Bougyoryoku ni Kyokufuri Shitai to Omoimasu",
|
| 45 |
"season": null,
|
| 46 |
+
"episode": 3,
|
| 47 |
+
"resolution": "1080p",
|
| 48 |
+
"source": "WebRip",
|
| 49 |
"special": null
|
| 50 |
},
|
| 51 |
"pred": {
|
| 52 |
+
"group": "ig",
|
| 53 |
+
"title": "Itai no wa Iya nano de Bougyoryoku ni Kyokufuri Shitai to Omoimasu",
|
| 54 |
+
"season": null,
|
| 55 |
+
"episode": null,
|
| 56 |
+
"resolution": "1080p",
|
| 57 |
+
"source": "WebRip",
|
| 58 |
"special": null
|
| 59 |
}
|
| 60 |
},
|
| 61 |
{
|
| 62 |
+
"filename": "[YYDM-11FANS][Nanana's Buried Treasure][preview][09][BDrip][720P][X264-10bit_AAC][34D29ED6]",
|
| 63 |
"errors": {
|
| 64 |
+
"special": {
|
| 65 |
+
"gold": "ed",
|
| 66 |
+
"pred": null
|
| 67 |
}
|
| 68 |
},
|
| 69 |
"gold": {
|
| 70 |
+
"group": "YYDM-11FANS",
|
| 71 |
+
"title": "Nanana's Buried Treasure",
|
| 72 |
"season": null,
|
| 73 |
"episode": 9,
|
| 74 |
+
"resolution": "720P",
|
| 75 |
+
"source": "BDrip",
|
| 76 |
+
"special": "ED"
|
| 77 |
},
|
| 78 |
"pred": {
|
| 79 |
+
"group": "YYDM-11FANS",
|
| 80 |
+
"title": "Nanana's Buried Treasure",
|
| 81 |
+
"season": null,
|
| 82 |
"episode": 9,
|
| 83 |
+
"resolution": "720P",
|
| 84 |
+
"source": "BDrip",
|
| 85 |
"special": null
|
| 86 |
}
|
| 87 |
},
|
| 88 |
{
|
| 89 |
+
"filename": "[Moozzi2] Madou King Granzort Saigo no Magical Taisen OVA - 01 [ 1990 ] (BD 1440x1080 x.264 Flac)",
|
| 90 |
"errors": {
|
| 91 |
+
"title": {
|
| 92 |
+
"gold": "madou king granzort saigo no magical taisen ova",
|
| 93 |
+
"pred": "madou king granzort saigo no magical taisen ova - 01 [ 1990"
|
| 94 |
+
},
|
| 95 |
+
"episode": {
|
| 96 |
+
"gold": "1",
|
| 97 |
+
"pred": "1990"
|
| 98 |
}
|
| 99 |
},
|
| 100 |
"gold": {
|
| 101 |
+
"group": "Moozzi2",
|
| 102 |
+
"title": "Madou King Granzort Saigo no Magical Taisen OVA",
|
| 103 |
"season": null,
|
| 104 |
+
"episode": 1,
|
| 105 |
+
"resolution": "1440x1080",
|
| 106 |
+
"source": "BD",
|
| 107 |
+
"special": "OVA"
|
| 108 |
},
|
| 109 |
"pred": {
|
| 110 |
+
"group": "Moozzi2",
|
| 111 |
+
"title": "Madou King Granzort Saigo no Magical Taisen OVA - 01 [ 1990 ",
|
| 112 |
"season": null,
|
| 113 |
+
"episode": 1990,
|
| 114 |
+
"resolution": "1440x1080",
|
| 115 |
+
"source": "BD",
|
| 116 |
+
"special": "OVA"
|
| 117 |
}
|
| 118 |
},
|
| 119 |
{
|
| 120 |
+
"filename": "[64bitsub][Tensui no Sakuna-hime][08][BDRIP_1920x1080][AVC_FLAC_SUP]",
|
| 121 |
"errors": {
|
| 122 |
"source": {
|
| 123 |
+
"gold": "flac",
|
| 124 |
+
"pred": "avc-flac"
|
| 125 |
}
|
| 126 |
},
|
| 127 |
"gold": {
|
| 128 |
+
"group": "64bitsub",
|
| 129 |
+
"title": "Tensui no Sakuna-hime",
|
| 130 |
"season": null,
|
| 131 |
+
"episode": 8,
|
| 132 |
+
"resolution": "1920x1080",
|
| 133 |
+
"source": "FLAC",
|
| 134 |
"special": null
|
| 135 |
},
|
| 136 |
"pred": {
|
| 137 |
+
"group": "64bitsub",
|
| 138 |
+
"title": "Tensui no Sakuna-hime",
|
| 139 |
"season": null,
|
| 140 |
+
"episode": 8,
|
| 141 |
+
"resolution": "1920x1080",
|
| 142 |
+
"source": "AVC_FLAC",
|
| 143 |
"special": null
|
| 144 |
}
|
| 145 |
},
|
| 146 |
{
|
| 147 |
+
"filename": "[VCB-Studio] Shingeki no Kyojin Movie 3 Kakusei no Houkou [Teaser_S3][Ma10p_1080p][x265_flac]",
|
| 148 |
"errors": {
|
| 149 |
+
"season": {
|
| 150 |
+
"gold": null,
|
| 151 |
+
"pred": "3"
|
| 152 |
}
|
| 153 |
},
|
| 154 |
"gold": {
|
| 155 |
+
"group": "VCB-Studio",
|
| 156 |
+
"title": "Shingeki no Kyojin Movie 3 Kakusei no Houkou [Teaser_S3",
|
| 157 |
"season": null,
|
| 158 |
+
"episode": 3,
|
| 159 |
"resolution": "1080p",
|
| 160 |
+
"source": "x265_flac",
|
| 161 |
+
"special": "Movie"
|
| 162 |
},
|
| 163 |
"pred": {
|
| 164 |
+
"group": "VCB-Studio",
|
| 165 |
+
"title": "Shingeki no Kyojin Movie 3 Kakusei no Houkou [Teaser_S3",
|
| 166 |
+
"season": 3,
|
| 167 |
+
"episode": 3,
|
| 168 |
"resolution": "1080p",
|
| 169 |
+
"source": "x265_flac",
|
| 170 |
+
"special": "Movie"
|
| 171 |
}
|
| 172 |
},
|
| 173 |
{
|
| 174 |
+
"filename": "FF:U ファイナルファンタジー:アンリミテッド ~異界の章~ #15 「ジェーン~うごきだすうみパズル」(DVD 640x480 DivX5 QB98 120fps lameVBR)[CRC_5FA44899]",
|
| 175 |
"errors": {
|
| 176 |
+
"source": {
|
| 177 |
+
"gold": "cr",
|
| 178 |
+
"pred": "dvd"
|
| 179 |
}
|
| 180 |
},
|
| 181 |
"gold": {
|
| 182 |
"group": null,
|
| 183 |
+
"title": "FF:U ファイナルファンタジー:アンリミテッド ~異界の章~",
|
| 184 |
"season": null,
|
| 185 |
+
"episode": 15,
|
| 186 |
+
"resolution": "640x480",
|
| 187 |
+
"source": "CR",
|
| 188 |
"special": null
|
| 189 |
},
|
| 190 |
"pred": {
|
| 191 |
"group": null,
|
| 192 |
+
"title": "FF:U ファイナルファンタジー:アンリミテッド ~異界の章~",
|
| 193 |
+
"season": null,
|
| 194 |
+
"episode": 15,
|
| 195 |
+
"resolution": "640x480",
|
| 196 |
+
"source": "DVD",
|
| 197 |
"special": null
|
| 198 |
}
|
| 199 |
},
|
| 200 |
{
|
| 201 |
+
"filename": "[OVA]GALLFORCE ガルフォース2 宇宙章 vol2 [DESTRUCTION]",
|
| 202 |
"errors": {
|
| 203 |
+
"title": {
|
| 204 |
+
"gold": "gallforce ガルフォース2 宇宙章 vol",
|
| 205 |
+
"pred": "gallforce ガルフォース2 宇宙"
|
| 206 |
}
|
| 207 |
},
|
| 208 |
"gold": {
|
| 209 |
+
"group": "OVA",
|
| 210 |
+
"title": "GALLFORCE ガルフォース2 宇宙章 vol",
|
| 211 |
"season": null,
|
| 212 |
+
"episode": 2,
|
| 213 |
+
"resolution": null,
|
| 214 |
+
"source": null,
|
| 215 |
+
"special": "OVA"
|
| 216 |
},
|
| 217 |
"pred": {
|
| 218 |
+
"group": "OVA",
|
| 219 |
+
"title": "GALLFORCE ガルフォース2 宇宙",
|
| 220 |
"season": null,
|
| 221 |
+
"episode": 2,
|
| 222 |
+
"resolution": null,
|
| 223 |
+
"source": null,
|
| 224 |
+
"special": "OVA"
|
| 225 |
}
|
| 226 |
},
|
| 227 |
{
|
| 228 |
+
"filename": "[病毒].[Fosky_Fansub][Virus_Buster_Serge][DVDrip][12][H264_AAC][640x480][GB&BIG5][F77551D0](ED2000.COM)",
|
| 229 |
"errors": {
|
| 230 |
+
"special": {
|
| 231 |
+
"gold": "ed",
|
| 232 |
+
"pred": "e"
|
| 233 |
}
|
| 234 |
},
|
| 235 |
"gold": {
|
| 236 |
+
"group": "病毒",
|
| 237 |
+
"title": "Fosky_Fansub",
|
| 238 |
"season": null,
|
| 239 |
+
"episode": 12,
|
| 240 |
+
"resolution": "640x480",
|
| 241 |
+
"source": "DVDrip",
|
| 242 |
+
"special": "ED"
|
| 243 |
},
|
| 244 |
"pred": {
|
| 245 |
+
"group": "病毒",
|
| 246 |
+
"title": "Fosky_Fansub",
|
| 247 |
"season": null,
|
| 248 |
+
"episode": 12,
|
| 249 |
+
"resolution": "640x480",
|
| 250 |
+
"source": "DVDrip",
|
| 251 |
+
"special": "E"
|
| 252 |
}
|
| 253 |
},
|
| 254 |
{
|
| 255 |
+
"filename": "[DBD-Raws][Shadows House S1][Gekijou][18][1080P][BDRip][HEVC-10bit][FLAC]",
|
| 256 |
"errors": {
|
| 257 |
+
"season": {
|
| 258 |
+
"gold": null,
|
| 259 |
+
"pred": "1"
|
| 260 |
}
|
| 261 |
},
|
| 262 |
"gold": {
|
| 263 |
+
"group": "DBD-Raws",
|
| 264 |
+
"title": "Shadows House",
|
| 265 |
"season": null,
|
| 266 |
+
"episode": 18,
|
| 267 |
+
"resolution": "1080P",
|
| 268 |
+
"source": "BDRip",
|
| 269 |
+
"special": null
|
| 270 |
},
|
| 271 |
"pred": {
|
| 272 |
+
"group": "DBD-Raws",
|
| 273 |
+
"title": "Shadows House",
|
| 274 |
+
"season": 1,
|
| 275 |
+
"episode": 18,
|
| 276 |
+
"resolution": "1080P",
|
| 277 |
+
"source": "BDRip",
|
| 278 |
+
"special": null
|
| 279 |
}
|
| 280 |
},
|
| 281 |
{
|
| 282 |
+
"filename": "Girls und Panzer - 10.5 (BD 1280x720 AVC AACx2)",
|
| 283 |
"errors": {
|
| 284 |
"season": {
|
| 285 |
+
"gold": "10",
|
| 286 |
"pred": "1"
|
| 287 |
}
|
| 288 |
},
|
| 289 |
"gold": {
|
| 290 |
+
"group": null,
|
| 291 |
+
"title": "Girls und Panzer - 10.5",
|
| 292 |
+
"season": 10,
|
| 293 |
+
"episode": 5,
|
| 294 |
+
"resolution": "1280x720",
|
| 295 |
+
"source": "BD",
|
| 296 |
"special": null
|
| 297 |
},
|
| 298 |
"pred": {
|
| 299 |
+
"group": null,
|
| 300 |
+
"title": "Girls und Panzer - 10.5",
|
| 301 |
"season": 1,
|
| 302 |
+
"episode": 5,
|
| 303 |
+
"resolution": "1280x720",
|
| 304 |
+
"source": "BD",
|
| 305 |
"special": null
|
| 306 |
}
|
| 307 |
},
|
| 308 |
{
|
| 309 |
+
"filename": "[POPGO&SumiSora&TxxZ] Ginga Eiyuu Densetsu Die Neue These - Seiran 14 (BDRip 1080P X265 Main10p TrueHDX2 Chap)[A4E18C32]",
|
| 310 |
"errors": {
|
| 311 |
+
"group": {
|
| 312 |
+
"gold": null,
|
| 313 |
+
"pred": "popgo&sumisora&txxz"
|
| 314 |
},
|
| 315 |
+
"title": {
|
| 316 |
+
"gold": "popgo&sumisora&txxz",
|
| 317 |
+
"pred": "ginga eiyuu densetsu die neue these - seiran 14"
|
| 318 |
}
|
| 319 |
},
|
| 320 |
"gold": {
|
| 321 |
+
"group": null,
|
| 322 |
+
"title": "POPGO&SumiSora&TxxZ",
|
| 323 |
+
"season": null,
|
| 324 |
"episode": 14,
|
| 325 |
+
"resolution": "1080P",
|
| 326 |
+
"source": "BDRip",
|
| 327 |
"special": null
|
| 328 |
},
|
| 329 |
"pred": {
|
| 330 |
+
"group": "POPGO&SumiSora&TxxZ",
|
| 331 |
+
"title": "Ginga Eiyuu Densetsu Die Neue These - Seiran 14",
|
| 332 |
"season": null,
|
| 333 |
"episode": 14,
|
| 334 |
+
"resolution": "1080P",
|
| 335 |
+
"source": "BDRip",
|
| 336 |
"special": null
|
| 337 |
}
|
| 338 |
},
|
| 339 |
{
|
| 340 |
+
"filename": "[アニメ BD] Serial Experiments Lain 映像特典 「trailer 01」 (1440x1080 x264 AAC 2ch)",
|
| 341 |
"errors": {
|
| 342 |
+
"title": {
|
| 343 |
+
"gold": "serial experiments lain 映像特典 「trailer 01」",
|
| 344 |
+
"pred": "serial experiments lain 映像特典 「trailer"
|
| 345 |
+
},
|
| 346 |
"episode": {
|
| 347 |
+
"gold": "2",
|
| 348 |
+
"pred": "1"
|
| 349 |
}
|
| 350 |
},
|
| 351 |
"gold": {
|
| 352 |
+
"group": "アニメ BD",
|
| 353 |
+
"title": "Serial Experiments Lain 映像特典 「trailer 01」",
|
| 354 |
"season": null,
|
| 355 |
+
"episode": 2,
|
| 356 |
+
"resolution": "1440x1080",
|
| 357 |
"source": "BD",
|
| 358 |
"special": null
|
| 359 |
},
|
| 360 |
"pred": {
|
| 361 |
+
"group": "アニメ BD",
|
| 362 |
+
"title": "Serial Experiments Lain 映像特典 「trailer",
|
| 363 |
"season": null,
|
| 364 |
+
"episode": 1,
|
| 365 |
+
"resolution": "1440x1080",
|
| 366 |
"source": "BD",
|
| 367 |
"special": null
|
| 368 |
}
|
| 369 |
},
|
| 370 |
{
|
| 371 |
+
"filename": "[AJZ&BLU][God Eater][05][BIG5][v2] (2)",
|
| 372 |
"errors": {
|
| 373 |
+
"episode": {
|
| 374 |
+
"gold": "2",
|
| 375 |
+
"pred": "5"
|
| 376 |
}
|
| 377 |
},
|
| 378 |
"gold": {
|
| 379 |
+
"group": "AJZ&BLU",
|
| 380 |
+
"title": "God Eater",
|
| 381 |
"season": null,
|
| 382 |
+
"episode": 2,
|
| 383 |
"resolution": null,
|
| 384 |
+
"source": "BIG5",
|
| 385 |
"special": null
|
| 386 |
},
|
| 387 |
"pred": {
|
| 388 |
+
"group": "AJZ&BLU",
|
| 389 |
+
"title": "God Eater",
|
| 390 |
"season": null,
|
| 391 |
+
"episode": 5,
|
| 392 |
"resolution": null,
|
| 393 |
+
"source": "BIG5",
|
| 394 |
"special": null
|
| 395 |
}
|
| 396 |
},
|
| 397 |
{
|
| 398 |
+
"filename": "(アニメ) YAT安心!宇宙旅行 第1期 第07話 「サバイバル!野生のカネア」 (LD 640x480 WMV9 QB90 24fps)",
|
| 399 |
"errors": {
|
| 400 |
+
"season": {
|
| 401 |
"gold": null,
|
| 402 |
+
"pred": "1"
|
| 403 |
}
|
| 404 |
},
|
| 405 |
"gold": {
|
| 406 |
"group": "アニメ",
|
| 407 |
+
"title": "YAT安心!宇宙旅行",
|
| 408 |
"season": null,
|
| 409 |
+
"episode": 7,
|
| 410 |
+
"resolution": "640x480",
|
| 411 |
+
"source": null,
|
| 412 |
"special": null
|
| 413 |
},
|
| 414 |
"pred": {
|
| 415 |
"group": "アニメ",
|
| 416 |
+
"title": "YAT安心!宇宙旅行",
|
| 417 |
+
"season": 1,
|
| 418 |
+
"episode": 7,
|
| 419 |
"resolution": "640x480",
|
| 420 |
+
"source": null,
|
| 421 |
"special": null
|
| 422 |
}
|
| 423 |
},
|
| 424 |
{
|
| 425 |
+
"filename": "Lord El-Melloi II-sei no Jikenbo 06 [1AAC021C]",
|
| 426 |
"errors": {
|
| 427 |
+
"source": {
|
| 428 |
+
"gold": "aac",
|
| 429 |
+
"pred": null
|
| 430 |
}
|
| 431 |
},
|
| 432 |
"gold": {
|
| 433 |
+
"group": null,
|
| 434 |
+
"title": "Lord El-Melloi II-sei no Jikenbo",
|
| 435 |
"season": null,
|
| 436 |
+
"episode": 6,
|
| 437 |
+
"resolution": null,
|
| 438 |
+
"source": "AAC",
|
| 439 |
"special": null
|
| 440 |
},
|
| 441 |
"pred": {
|
| 442 |
+
"group": null,
|
| 443 |
+
"title": "Lord El-Melloi II-sei no Jikenbo",
|
| 444 |
"season": null,
|
| 445 |
"episode": 6,
|
| 446 |
+
"resolution": null,
|
| 447 |
+
"source": null,
|
| 448 |
"special": null
|
| 449 |
}
|
| 450 |
},
|
| 451 |
{
|
| 452 |
+
"filename": "[Skymoon-Raws] Mashle 2nd Season - 01(13) [ViuTV][WEB-DL][1080p][AVC AAC]",
|
| 453 |
"errors": {
|
| 454 |
+
"title": {
|
| 455 |
+
"gold": "mashle 2nd season - 01",
|
| 456 |
+
"pred": "mashle 2nd season"
|
| 457 |
+
},
|
| 458 |
"season": {
|
| 459 |
+
"gold": "2",
|
| 460 |
"pred": "1"
|
| 461 |
}
|
| 462 |
},
|
| 463 |
"gold": {
|
| 464 |
+
"group": "Skymoon-Raws",
|
| 465 |
+
"title": "Mashle 2nd Season - 01",
|
| 466 |
+
"season": 2,
|
| 467 |
+
"episode": 13,
|
| 468 |
"resolution": "1080p",
|
| 469 |
+
"source": "WEB-DL",
|
| 470 |
"special": null
|
| 471 |
},
|
| 472 |
"pred": {
|
| 473 |
+
"group": "Skymoon-Raws",
|
| 474 |
+
"title": "Mashle 2nd Season",
|
| 475 |
"season": 1,
|
| 476 |
+
"episode": 13,
|
| 477 |
"resolution": "1080p",
|
| 478 |
+
"source": "WEB-DL",
|
| 479 |
"special": null
|
| 480 |
}
|
| 481 |
},
|
| 482 |
{
|
| 483 |
+
"filename": "【CXRAW】【S17】【Power Rangers RPM】【30】【End Game】【x264 Hi10p AAC】【MP4】",
|
| 484 |
"errors": {
|
| 485 |
+
"season": {
|
| 486 |
"gold": null,
|
| 487 |
+
"pred": "17"
|
| 488 |
}
|
| 489 |
},
|
| 490 |
"gold": {
|
| 491 |
+
"group": "CXRAW",
|
| 492 |
+
"title": "S17",
|
| 493 |
"season": null,
|
| 494 |
+
"episode": 30,
|
| 495 |
"resolution": null,
|
| 496 |
+
"source": "AAC",
|
| 497 |
"special": null
|
| 498 |
},
|
| 499 |
"pred": {
|
| 500 |
+
"group": "CXRAW",
|
| 501 |
+
"title": "S17",
|
| 502 |
+
"season": 17,
|
| 503 |
+
"episode": 30,
|
| 504 |
+
"resolution": null,
|
| 505 |
+
"source": "AAC",
|
| 506 |
"special": null
|
| 507 |
}
|
| 508 |
},
|
| 509 |
{
|
| 510 |
+
"filename": "(アニメ) YAT安心!宇宙旅行 第1期 第24話 「モーレツ!かあちゃん珍道中」 (LD 640x480 WMV9 QB90 24fps)",
|
| 511 |
"errors": {
|
| 512 |
"season": {
|
| 513 |
"gold": null,
|
|
|
|
| 515 |
}
|
| 516 |
},
|
| 517 |
"gold": {
|
| 518 |
+
"group": "アニメ",
|
| 519 |
+
"title": "YAT安心!宇宙旅行",
|
| 520 |
"season": null,
|
| 521 |
+
"episode": 24,
|
| 522 |
+
"resolution": "640x480",
|
| 523 |
+
"source": null,
|
| 524 |
"special": null
|
| 525 |
},
|
| 526 |
"pred": {
|
| 527 |
+
"group": "アニメ",
|
| 528 |
+
"title": "YAT安心!宇宙旅行",
|
| 529 |
"season": 1,
|
| 530 |
+
"episode": 24,
|
| 531 |
+
"resolution": "640x480",
|
| 532 |
+
"source": null,
|
| 533 |
"special": null
|
| 534 |
}
|
| 535 |
},
|
| 536 |
{
|
| 537 |
+
"filename": "[Snow-Raws] アイドルマスター シンデレラガールズ劇場 第2期 SP17 (DVD 1280x720 HEVC-YUV420P10 FLAC)",
|
| 538 |
"errors": {
|
| 539 |
"season": {
|
| 540 |
+
"gold": null,
|
| 541 |
+
"pred": "2"
|
| 542 |
}
|
| 543 |
},
|
| 544 |
"gold": {
|
| 545 |
+
"group": "Snow-Raws",
|
| 546 |
+
"title": "アイドルマスター シンデレラガールズ劇場 第2期 SP17",
|
| 547 |
"season": null,
|
| 548 |
+
"episode": 17,
|
| 549 |
+
"resolution": "1280x720",
|
| 550 |
+
"source": "DVD",
|
| 551 |
+
"special": "SP"
|
| 552 |
},
|
| 553 |
"pred": {
|
| 554 |
+
"group": "Snow-Raws",
|
| 555 |
+
"title": "アイドルマスター シンデレラガールズ劇場 第2期 SP17",
|
| 556 |
+
"season": 2,
|
| 557 |
+
"episode": 17,
|
| 558 |
+
"resolution": "1280x720",
|
| 559 |
+
"source": "DVD",
|
| 560 |
+
"special": "SP"
|
| 561 |
}
|
| 562 |
}
|
| 563 |
]
|
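The accuracy figures in the updated `parse_eval_metrics.json` can be cross-checked against the reported counts. The following is a minimal sketch, not the repository's evaluation code: it assumes each accuracy is simply correct divided by total, and that every field total is 2048 (the diff shows this for `group`, `source`, and `special`; the remaining rows are collapsed). The `accuracy` helper is hypothetical.

```python
# Counts copied from the new parse_eval_metrics.json in this commit.
SAMPLE_COUNT = 2048  # "sample_count" and "full_match_total"
FULL_MATCH_CORRECT = 2029  # "full_match_correct"

field_correct = {
    "group": 2047,
    "title": 2043,
    "season": 2040,
    "episode": 2044,
    "resolution": 2048,
    "source": 2045,
    "special": 2046,
}


def accuracy(correct: int, total: int) -> float:
    """Hypothetical helper: plain ratio of correct predictions to total."""
    return correct / total


# Assumption: every field is scored over the same 2048 samples.
field_accuracy = {
    name: accuracy(count, SAMPLE_COUNT) for name, count in field_correct.items()
}
full_match_accuracy = accuracy(FULL_MATCH_CORRECT, SAMPLE_COUNT)
failure_count = SAMPLE_COUNT - FULL_MATCH_CORRECT

for name, value in field_accuracy.items():
    print(f"{name}: {value}")
print(f"full_match_accuracy: {full_match_accuracy}")
print(f"failure_count: {failure_count}")
```

Under these assumptions the ratios reproduce the reported `field_accuracy` and `full_match_accuracy` values, and the difference between total and fully matched samples corresponds to the number of entries listed under `failures`.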
run_metadata.json
CHANGED
|
@@ -1,5 +1,5 @@
|
|
| 1 |
{
|
| 2 |
-
"experiment_name": "dmhy-char-
|
| 3 |
"data_file": "datasets/AnimeName/dmhy_weak_char.jsonl",
|
| 4 |
"tokenizer_variant": "char",
|
| 5 |
"vocab_file": "datasets/AnimeName/vocab.char.json",
|
|
@@ -15,7 +15,7 @@
|
|
| 15 |
"batch_size": 256,
|
| 16 |
"learning_rate": 8e-05,
|
| 17 |
"warmup_steps": 300,
|
| 18 |
-
"seed":
|
| 19 |
"device": "cuda",
|
| 20 |
"fp16": true,
|
| 21 |
"gradient_accumulation_steps": 1,
|
|
|
|
| 1 |
{
|
| 2 |
+
"experiment_name": "dmhy-char-guoman-relabel",
|
| 3 |
"data_file": "datasets/AnimeName/dmhy_weak_char.jsonl",
|
| 4 |
"tokenizer_variant": "char",
|
| 5 |
"vocab_file": "datasets/AnimeName/vocab.char.json",
|
|
|
|
| 15 |
"batch_size": 256,
|
| 16 |
"learning_rate": 8e-05,
|
| 17 |
"warmup_steps": 300,
|
| 18 |
+
"seed": 52,
|
| 19 |
"device": "cuda",
|
| 20 |
"fp16": true,
|
| 21 |
"gradient_accumulation_steps": 1,
|
trainer_eval_metrics.json
CHANGED
|
@@ -1,11 +1,11 @@
|
|
| 1 |
{
|
| 2 |
-
"eval_loss": 0.
|
| 3 |
-
"eval_precision": 0.
|
| 4 |
-
"eval_recall": 0.
|
| 5 |
-
"eval_f1": 0.
|
| 6 |
-
"eval_accuracy": 0.
|
| 7 |
-
"eval_runtime":
|
| 8 |
-
"eval_samples_per_second":
|
| 9 |
-
"eval_steps_per_second": 1.
|
| 10 |
"epoch": 2.0
|
| 11 |
}
|
|
|
|
| 1 |
{
|
| 2 |
+
"eval_loss": 0.005763721186667681,
|
| 3 |
+
"eval_precision": 0.9921522239605195,
|
| 4 |
+
"eval_recall": 0.9946191314105016,
|
| 5 |
+
"eval_f1": 0.9933841461473317,
|
| 6 |
+
"eval_accuracy": 0.9980711558885925,
|
| 7 |
+
"eval_runtime": 45.558,
|
| 8 |
+
"eval_samples_per_second": 277.471,
|
| 9 |
+
"eval_steps_per_second": 1.098,
|
| 10 |
"epoch": 2.0
|
| 11 |
}
|
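The new `trainer_eval_metrics.json` values can be sanity-checked in the same way. This sketch assumes `eval_f1` is the harmonic mean of the reported `eval_precision` and `eval_recall`, which is the usual definition for micro-averaged token-classification scores; the commit does not state which metric library produced them. The `f1_from` helper is hypothetical, and the sample and step estimates are approximations derived from the reported runtime and throughput.

```python
# Values copied from the new trainer_eval_metrics.json in this commit.
metrics = {
    "eval_precision": 0.9921522239605195,
    "eval_recall": 0.9946191314105016,
    "eval_f1": 0.9933841461473317,
    "eval_runtime": 45.558,
    "eval_samples_per_second": 277.471,
    "eval_steps_per_second": 1.098,
}


def f1_from(precision: float, recall: float) -> float:
    """Hypothetical helper: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


recomputed_f1 = f1_from(metrics["eval_precision"], metrics["eval_recall"])

# Rough size of the evaluation run, reconstructed from throughput figures.
approx_samples = metrics["eval_runtime"] * metrics["eval_samples_per_second"]
approx_steps = metrics["eval_runtime"] * metrics["eval_steps_per_second"]

print(f"recomputed F1: {recomputed_f1:.6f}")
print(f"approximate eval samples: {approx_samples:.0f}")
print(f"approximate eval steps: {approx_steps:.0f}")
```

If the recomputed F1 drifts noticeably from the stored `eval_f1`, the metrics were likely averaged differently (for example per label) than this sketch assumes.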
training_args.bin
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
size 5265
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:f01503ec029ec161063c2d78a00732c80072525b8d258c7c717b2e21f4f55d93
|
| 3 |
size 5265
|