ONNX Usage / ONNX 使用说明
AniFileBERT exports a static-shape ONNX graph for Android and local inference.
AniFileBERT 导出静态 shape 的 ONNX 图,用于 Android 和本地推理。
1. What ONNX Contains / ONNX 包含什么
The ONNX graph contains only the BERT token-classification forward pass:
ONNX 图只包含 BERT token-classification 前向计算:
input_ids int64[1,128]
attention_mask int64[1,128]
logits float32[1,128,15]
It does not contain:
它不包含:
- filename tokenization / 文件名分词
- token-to-id conversion / token 到 id 的转换
- constrained BIO decoding / 约束 BIO 解码
- field aggregation / 字段聚合
- thin string and number normalization / 薄字符串和数字规范化
Those steps must stay aligned with anifilebert/tokenizer.py, anifilebert/inference.py, config.json, and vocab.json.
这些步骤必须与 anifilebert/tokenizer.py、anifilebert/inference.py、config.json、vocab.json 保持一致。
2. Export / 导出
uv run python -m tools.export_onnx --model-dir . --output exports/anime_filename_parser.onnx --max-length 128
The exporter also writes:
导出器还会写入:
exports/anime_filename_parser.metadata.json
The metadata records the sample filename, the output shape, and the maximum absolute difference between the PyTorch and ONNX logits.
metadata 会记录样本文件名、输出 shape、PyTorch/ONNX logits 最大绝对误差。
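The maximum absolute difference is the largest element-wise gap between the two backends' logits for the same input. The sketch below shows one way such a value could be computed on plain nested lists; the helper names `flatten` and `max_abs_diff` are illustrative and are not taken from tools.export_onnx.

```python
def flatten(nested):
    """Yield every scalar from an arbitrarily nested list of numbers."""
    for item in nested:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item


def max_abs_diff(torch_logits, onnx_logits):
    """Largest element-wise |a - b| between two equally shaped nested lists."""
    a = list(flatten(torch_logits))
    b = list(flatten(onnx_logits))
    if len(a) != len(b):
        raise ValueError(f"element count mismatch: {len(a)} vs {len(b)}")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


# Toy logits with shape [1, 2, 3]; real outputs would be [1, 128, 15].
torch_out = [[[0.10, -1.20, 3.00], [0.50, 0.25, -0.75]]]
onnx_out = [[[0.10, -1.20, 3.00], [0.50, 0.25, -0.50]]]
print(max_abs_diff(torch_out, onnx_out))
```

A value close to zero suggests the exported graph reproduces the PyTorch forward pass; what counts as acceptable drift is a project decision and is not specified here.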
3. Local ONNX Inference / 本地 ONNX 推理
Use python -m tools.onnx_inference as the minimal runnable reference.
使用 python -m tools.onnx_inference 作为最小可运行参考实现。
uv run python -m tools.onnx_inference "[GM-Team][国漫][神印王座][Throne of Seal][2022][200][AVC][GB][1080P].mp4"
Expected:
期望输出:
{"title":"神印王座","season":null,"episode":200,"group":"GM-Team","resolution":"1080P","source":"GB","special":null}
Special-code example:
特典编号示例:
uv run python -m tools.onnx_inference "[YYDM&VCB-Studio] Shinsekai Yori [NCED02][Ma10p_1080p][x265_flac].mkv"
Expected:
期望输出:
{"title":"Shinsekai Yori","season":null,"episode":null,"group":"YYDM&VCB-Studio","resolution":"1080p","source":"x265_flac","special":"NCED02"}
4. Implementation Steps / 实现步骤
The runtime parser should do this:
运行时解析器应按以下步骤实现:
- Tokenize filename with the custom character tokenizer. 使用自定义字符 tokenizer 对文件名分词。
- Add [CLS] and [SEP], truncate to max_length - 2. 添加 [CLS] 和 [SEP],截断到 max_length - 2。
- Convert tokens to ids with vocab.json. 使用 vocab.json 转换 token id。
- Pad input_ids and attention_mask to exactly 128. 将 input_ids 和 attention_mask padding 到固定 128。
- Run ONNX Runtime. 执行 ONNX Runtime。
- Slice logits back to real token count, excluding [CLS] and [SEP]. 去掉 [CLS]/[SEP],只保留真实 token 的 logits。
- Decode labels with constrained BIO transitions. 使用约束 BIO transition 解码标签。
- Aggregate labels into parser fields. 聚合标签为结构化字段。
- Apply thin normalization only: trim brackets, normalize source text, and convert numeric fields. 只做薄层规范化:裁剪括号/扩展名并转换数字字段。
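The encoding steps above (tokenize, add special tokens, map to ids, pad) can be sketched in plain Python. This is only an illustration under stated assumptions: `char_tokenize` splits one token per character, which may be simpler than what AnimeTokenizer in anifilebert/tokenizer.py actually does, and `TOY_VOCAB` with its `[PAD]` and `[UNK]` entries is a hypothetical stand-in for vocab.json.

```python
MAX_LENGTH = 128  # must match the static shape of the exported graph

# Hypothetical toy vocabulary; the real mapping lives in vocab.json.
TOY_VOCAB = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
             "[": 4, "]": 5, "A": 6, "B": 7, "0": 8, "1": 9}


def char_tokenize(filename):
    """Stand-in tokenizer: one token per character (an assumption)."""
    return list(filename)


def encode(filename, vocab, max_length=MAX_LENGTH):
    """Return (input_ids, attention_mask, real_token_count) in [1, max_length] layout."""
    # Truncate first so that [CLS] and [SEP] always fit.
    tokens = char_tokenize(filename)[: max_length - 2]
    wrapped = ["[CLS]", *tokens, "[SEP]"]
    unk = vocab["[UNK]"]
    ids = [vocab.get(tok, unk) for tok in wrapped]
    mask = [1] * len(ids)
    pad = max_length - len(ids)
    ids += [vocab["[PAD]"]] * pad
    mask += [0] * pad
    # A batch dimension of 1 gives the [1, 128] layout the graph expects.
    return [ids], [mask], len(tokens)


input_ids, attention_mask, n_tokens = encode("[AB]01.mkv", TOY_VOCAB)
print(len(input_ids[0]), sum(attention_mask[0]), n_tokens)
```

The returned token count is what the later slicing step needs: positions 1 through n_tokens of the logits correspond to real filename tokens, position 0 is [CLS], and position n_tokens + 1 is [SEP].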
The ONNX reference runtime intentionally matches the Python thin runtime. It does not include structural filename regex assists.
ONNX 参考运行时有意与 Python 薄层运行时保持一致,不包含结构化文件名正则辅助。
5. Android Notes / Android 注意事项
Android must bundle these files together:
Android 端必须同时打包:
anime_filename_parser.onnx
vocab.json
config.json
When changing any of them, update all of them in the same commit.
只要其中任意一个变化,三者必须在同一次提交中一起更新。
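One way to catch a bundle where only some of the three files were updated is to record a digest for each file at export time and compare it against the files shipped in the app. The sketch below is a hypothetical check, not part of the repository; `bundle_manifest` and `bundle_matches` are illustrative names.

```python
import hashlib
from pathlib import Path

# The three files that must ship together.
BUNDLE_FILES = ("anime_filename_parser.onnx", "vocab.json", "config.json")


def bundle_manifest(asset_dir):
    """Map each required bundle file to its SHA-256 hex digest."""
    root = Path(asset_dir)
    missing = [name for name in BUNDLE_FILES if not (root / name).is_file()]
    if missing:
        raise FileNotFoundError(f"bundle incomplete, missing: {missing}")
    return {name: hashlib.sha256((root / name).read_bytes()).hexdigest()
            for name in BUNDLE_FILES}


def bundle_matches(asset_dir, expected):
    """True when every file in asset_dir has the digest recorded in expected."""
    return bundle_manifest(asset_dir) == expected
```

Comparing the manifest written next to the export with the one computed from the Android assets directory would flag a stale vocab.json or config.json before it reaches a device.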
6. Common Mistakes / 常见错误
Using a standard Hugging Face tokenizer
误用标准 Hugging Face tokenizer
This model uses AnimeTokenizer, not WordPiece/BPE.
本模型使用 AnimeTokenizer,不是 WordPiece/BPE。
Treating ONNX output as final fields
把 ONNX 输出当成最终字段
ONNX returns token logits. You still need BIO decode and field aggregation.
ONNX 返回 token logits,仍然需要 BIO 解码和字段聚合。
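To make the gap between logits and fields concrete, the sketch below decodes a small logits matrix under the BIO rule that an I- label may only follow a B- or I- label of the same field, then joins the tagged characters into fields. It uses a Viterbi search as one possible constrained decoder; the decoder in anifilebert/inference.py may work differently. The five-label list is a hypothetical subset (the real 15 labels come from config.json), and keeping only the first span per field is an assumption.

```python
NEG_INF = float("-inf")

# Hypothetical label subset; the real label list comes from config.json.
LABELS = ["O", "B-TITLE", "I-TITLE", "B-EPISODE", "I-EPISODE"]


def allowed(prev, cur):
    """BIO rule: I-X may only continue B-X or I-X; other labels are unrestricted."""
    if cur.startswith("I-"):
        return prev in ("B-" + cur[2:], "I-" + cur[2:])
    return True


def constrained_decode(logits, labels=LABELS):
    """Viterbi over per-token logits, skipping transitions that break BIO."""
    if not logits:
        return []
    n = len(labels)
    # A sequence may not start with an I- label.
    score = [NEG_INF if labels[j].startswith("I-") else logits[0][j] for j in range(n)]
    back = []
    for row in logits[1:]:
        new_score, pointers = [], []
        for j in range(n):
            best_i = max((i for i in range(n) if allowed(labels[i], labels[j])),
                         key=lambda i: score[i])
            new_score.append(score[best_i] + row[j])
            pointers.append(best_i)
        score = new_score
        back.append(pointers)
    j = max(range(n), key=lambda k: score[k])
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    return [labels[k] for k in reversed(path)]


def aggregate(chars, tags):
    """Join B-/I- runs into {field: text}; the first span per field wins (assumption)."""
    fields = {}
    current, buf = None, []
    # A trailing "O" sentinel flushes the last open span.
    for ch, tag in zip(list(chars) + [""], list(tags) + ["O"]):
        if tag.startswith("I-") and current == tag[2:]:
            buf.append(ch)
            continue
        if current is not None:
            fields.setdefault(current.lower(), "".join(buf))
        if tag.startswith("B-"):
            current, buf = tag[2:], [ch]
        else:
            current, buf = None, []
    return fields


# Row 3 prefers I-EPISODE, which is illegal after "O"; decoding picks B-EPISODE.
toy_logits = [
    [0, 5, 1, 0, 0],
    [0, 0, 5, 0, 0],
    [5, 0, 0, 0, 0],
    [0, 0, 0, 2, 3],
    [0, 0, 0, 0, 5],
]
tags = constrained_decode(toy_logits)
print(tags)
print(aggregate("Ab 07", tags))
```

A plain per-token argmax on the same logits would emit I-EPISODE directly after O, which is the kind of inconsistent span the constraint is meant to rule out.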
Changing max length without updating Android
改 max length 但没有同步 Android
The exported graph has a static shape, so input_ids and attention_mask must both be exactly [1,128] at runtime.
导出的图是静态 shape,运行时数组必须匹配 [1,128]。
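A cheap guard before calling the runtime can turn a shape mismatch into a readable error. The sketch below checks nested lists against the [1,128] shape; `shape_of` and `check_static_inputs` are hypothetical helper names, and a real caller would likely perform the same check on whatever array type its runtime binding uses.

```python
EXPECTED_SHAPE = (1, 128)  # static shape of the exported graph


def shape_of(rows):
    """Shape of a rectangular 2-D nested list; raises on ragged rows."""
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        raise ValueError(f"ragged rows: widths {sorted(widths)}")
    return (len(rows), widths.pop() if widths else 0)


def check_static_inputs(input_ids, attention_mask, expected=EXPECTED_SHAPE):
    """Fail fast if either array does not match the graph's static shape."""
    for name, rows in (("input_ids", input_ids), ("attention_mask", attention_mask)):
        shape = shape_of(rows)
        if shape != expected:
            raise ValueError(f"{name} has shape {shape}, expected {expected}")


# Passes silently for correctly padded inputs.
check_static_inputs([[0] * 128], [[1] * 128])
```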
7. Benchmark / 性能基准
Run:
运行:
uv run python -m tools.benchmark_inference --model-dir . --onnx exports/anime_filename_parser.onnx --case-file data/parser_regression_cases.json --repeat 20 --warmup 20 --torch-threads 1 --ort-threads 1 --output reports/benchmark_results.json
Local single-thread CPU result, measured on 26 real-world regression cases with the default thin runtime:
本地 CPU 单线程结果,使用 26 条真实回归 case 和默认薄层运行时:
| Backend / 后端 | Load ms / 加载 ms | Avg ms / 平均 ms | P50 ms | P95 ms | P99 ms | files/s |
|---|---|---|---|---|---|---|
| PyTorch | 46.35 | 15.36 | 14.25 | 22.27 | 29.75 | 65.1 |
| ONNX Runtime | 50.92 | 12.04 | 11.90 | 13.81 | 15.38 | 83.1 |
The benchmark times the full parse path: tokenization, the model or session forward pass, constrained BIO decoding, entity aggregation, and thin normalization. The ONNX Runtime session is created once, outside the timed loop.
该基准包含 tokenizer、模型/session 前向、约束 BIO 解码、实体聚合和薄层规范化; 循环内不会重复创建 ONNX Runtime session。
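The columns in the table can be derived from per-file latencies. The sketch below is a simplified stand-in for tools.benchmark_inference, not its actual code: it assumes nearest-rank percentiles, that each warmup pass runs over every file, and that `parse` is a callable wrapping an already constructed model or session.

```python
import math
import time


def percentile(sorted_ms, q):
    """Nearest-rank percentile of an ascending list, with q in (0, 100]."""
    rank = max(1, math.ceil(q / 100 * len(sorted_ms)))
    return sorted_ms[rank - 1]


def summarize(latencies_ms):
    """Average, P50/P95/P99 and throughput from per-file latencies in ms."""
    ordered = sorted(latencies_ms)
    avg = sum(ordered) / len(ordered)
    return {
        "avg_ms": avg,
        "p50_ms": percentile(ordered, 50),
        "p95_ms": percentile(ordered, 95),
        "p99_ms": percentile(ordered, 99),
        # Single-threaded throughput follows directly from the average latency.
        "files_per_s": 1000.0 / avg if avg > 0 else float("inf"),
    }


def benchmark(parse, filenames, repeat=20, warmup=20):
    """Time parse(filename) per call; the caller builds the session beforehand."""
    for _ in range(warmup):
        for name in filenames:
            parse(name)
    latencies = []
    for _ in range(repeat):
        for name in filenames:
            start = time.perf_counter()
            parse(name)
            latencies.append((time.perf_counter() - start) * 1000.0)
    return summarize(latencies)
```

The files/s column in the table is consistent with 1000 divided by the average latency: 1000 / 15.36 ≈ 65.1 for PyTorch and 1000 / 12.04 ≈ 83.1 for ONNX Runtime.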