# ONNX Usage / ONNX 使用说明
AniFileBERT exports a static-shape ONNX graph for Android and local inference.
AniFileBERT 导出静态 shape 的 ONNX 图,用于 Android 和本地推理。
## 1. What ONNX Contains / ONNX 包含什么
The ONNX graph contains only the BERT token-classification forward pass:
ONNX 图只包含 BERT token-classification 前向计算:
```text
input_ids int64[1,128]
attention_mask int64[1,128]
logits float32[1,128,15]
```
It does **not** contain:
它**不包含**:
- filename tokenization / 文件名分词
- token-to-id conversion / token 到 id 的转换
- constrained BIO decoding / 约束 BIO 解码
- field aggregation / 字段聚合
- thin string and number normalization / 薄字符串和数字规范化
Those steps must stay aligned with `anifilebert/tokenizer.py`, `anifilebert/inference.py`, `config.json`,
and `vocab.json`.
这些步骤必须与 `anifilebert/tokenizer.py`、`anifilebert/inference.py`、`config.json`、`vocab.json`
保持一致。
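As a rough illustration of the input/output contract above, the sketch below feeds two `int64[1,128]` arrays to ONNX Runtime and returns the logits of the single batch row. It assumes `numpy` and `onnxruntime` are installed and that the tensor names in the graph match the listing above. `check_fixed_length`, `load_session`, and `run_forward` are hypothetical helper names, not functions from this repository.

```python
MAX_LENGTH = 128  # static sequence length of the exported graph


def check_fixed_length(input_ids, attention_mask, max_length=MAX_LENGTH):
    """Raise ValueError unless both sequences hold exactly max_length entries."""
    for name, seq in (("input_ids", input_ids), ("attention_mask", attention_mask)):
        if len(seq) != max_length:
            raise ValueError(f"{name} has length {len(seq)}, expected {max_length}")


def load_session(model_path):
    """Create the ONNX Runtime session once and reuse it for every filename."""
    import onnxruntime as ort  # imported lazily; requires the onnxruntime package

    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


def run_forward(session, input_ids, attention_mask):
    """Run one forward pass and return the [128, 15] logits of the single batch row."""
    import numpy as np  # imported lazily; requires numpy

    check_fixed_length(input_ids, attention_mask)
    feed = {
        "input_ids": np.asarray([input_ids], dtype=np.int64),            # int64[1,128]
        "attention_mask": np.asarray([attention_mask], dtype=np.int64),  # int64[1,128]
    }
    (logits,) = session.run(["logits"], feed)  # float32[1,128,15]
    return logits[0]
```

A caller would typically build the session with `load_session("exports/anime_filename_parser.onnx")` and then pass already padded id and mask lists to `run_forward`. Everything before and after this call (tokenization, decoding, aggregation) still has to be done by the caller.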
## 2. Export / 导出
```powershell
uv run python -m tools.export_onnx --model-dir . --output exports/anime_filename_parser.onnx --max-length 128
```
The exporter also writes:
导出器还会写入:
```text
exports/anime_filename_parser.metadata.json
```
The metadata records the sample filename, the output shape, and the maximum
absolute difference between the PyTorch and ONNX logits.
metadata 会记录样本文件名、输出 shape、PyTorch/ONNX logits 最大绝对误差。
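To make the recorded metric concrete, here is a minimal sketch of a maximum absolute difference over two equally shaped logits arrays, written with plain nested lists. It only illustrates what the number means; `max_abs_diff` is a hypothetical name and the exporter may compute the value differently.

```python
def max_abs_diff(a, b):
    """Largest element-wise |a - b| over two equally shaped nested lists of floats."""
    if isinstance(a, (int, float)):
        return abs(a - b)
    if len(a) != len(b):
        raise ValueError("shape mismatch")
    return max((max_abs_diff(x, y) for x, y in zip(a, b)), default=0.0)


# Toy values, not real model output.
torch_logits = [[0.10, -2.00], [1.50, 0.25]]
onnx_logits = [[0.10, -2.00], [1.50, 0.75]]
print(max_abs_diff(torch_logits, onnx_logits))  # 0.5
```

A value close to zero indicates that the exported graph reproduces the PyTorch forward pass for the sample filename.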
## 3. Local ONNX Inference / 本地 ONNX 推理
Use `python -m tools.onnx_inference` as the minimal runnable reference.
使用 `python -m tools.onnx_inference` 作为最小可运行参考实现。
```powershell
uv run python -m tools.onnx_inference "[GM-Team][国漫][神印王座][Throne of Seal][2022][200][AVC][GB][1080P].mp4"
```
Expected:
期望输出:
```json
{"title":"神印王座","season":null,"episode":200,"group":"GM-Team","resolution":"1080P","source":"GB","special":null}
```
Special-code example:
特典编号示例:
```powershell
uv run python -m tools.onnx_inference "[YYDM&VCB-Studio] Shinsekai Yori [NCED02][Ma10p_1080p][x265_flac].mkv"
```
Expected:
期望输出:
```json
{"title":"Shinsekai Yori","season":null,"episode":null,"group":"YYDM&VCB-Studio","resolution":"1080p","source":"x265_flac","special":"NCED02"}
```
## 4. Implementation Steps / 实现步骤
The runtime parser should do this:
运行时解析器应按以下步骤实现:
1. Tokenize filename with the custom character tokenizer.
使用自定义字符 tokenizer 对文件名分词。
2. Truncate the tokens to `max_length - 2`, then add `[CLS]` and `[SEP]`.
先将 token 截断到 `max_length - 2`,再添加 `[CLS]` 和 `[SEP]`。
3. Convert tokens to ids with `vocab.json`.
使用 `vocab.json` 转换 token id。
4. Pad `input_ids` and `attention_mask` to exactly `128`.
将 `input_ids` 和 `attention_mask` padding 到固定 `128`。
5. Run ONNX Runtime.
执行 ONNX Runtime。
6. Slice logits back to real token count, excluding `[CLS]` and `[SEP]`.
去掉 `[CLS]` / `[SEP]`,只保留真实 token 的 logits。
7. Decode labels with constrained BIO transitions.
使用约束 BIO transition 解码标签。
8. Aggregate labels into parser fields.
聚合标签为结构化字段。
9. Apply thin normalization only: trim brackets and file extensions, normalize
source text, and convert numeric fields.
只做薄层规范化:裁剪括号/扩展名、规范化 source 文本并转换数字字段。
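Steps 1 to 4 can be sketched as follows. This is a simplified illustration under stated assumptions: splitting into single characters is only a stand-in for the real tokenizer in `anifilebert/tokenizer.py`, the `[PAD]` and `[UNK]` token names are assumed, and `encode` is a hypothetical helper name.

```python
def encode(filename, vocab, max_length=128):
    """Return (input_ids, attention_mask, real_token_count) padded to max_length."""
    # Stand-in for step 1: one token per character. The real tokenizer may
    # group characters differently.
    tokens = list(filename)[: max_length - 2]      # leave room for [CLS] and [SEP]
    pieces = ["[CLS]"] + tokens + ["[SEP]"]
    unk_id = vocab["[UNK]"]                         # assumed token name
    input_ids = [vocab.get(tok, unk_id) for tok in pieces]
    attention_mask = [1] * len(input_ids)
    pad_count = max_length - len(input_ids)
    input_ids += [vocab["[PAD]"]] * pad_count       # assumed token name
    attention_mask += [0] * pad_count
    return input_ids, attention_mask, len(tokens)


# Toy vocabulary for illustration; the real mapping comes from vocab.json.
toy_vocab = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3, "a": 4, "b": 5}
print(encode("ab?", toy_vocab, max_length=8))
# ([2, 4, 5, 1, 3, 0, 0, 0], [1, 1, 1, 1, 1, 0, 0, 0], 3)
```

The returned token count is what step 6 needs in order to slice the logits back to the real tokens.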
The ONNX reference runtime intentionally matches the Python thin runtime. It
does not include structural filename regex assists.
ONNX 参考运行时有意与 Python 薄层运行时保持一致,不包含结构化文件名正则辅助。
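Steps 6 and 7 can be illustrated with a small Viterbi-style search that forbids an `I-X` label unless the previous label is `B-X` or `I-X`. This is one common way to implement constrained BIO decoding, not necessarily the algorithm used in `anifilebert/inference.py`. The label names below are made up for the example; the real label list comes from `config.json`.

```python
def real_token_logits(logits, real_token_count):
    """Step 6: drop the [CLS] row, the [SEP] row and all padding rows."""
    return logits[1 : 1 + real_token_count]


def allowed(prev, cur):
    """BIO rule: I-X may only follow B-X or I-X; prev=None marks the sequence start."""
    if not cur.startswith("I-"):
        return True
    return prev is not None and prev[0] in "BI" and prev[2:] == cur[2:]


def decode_bio(logits, labels):
    """Step 7: best-scoring label path that obeys allowed(), found by Viterbi search."""
    if not logits:
        return []
    neg_inf = float("-inf")
    # score[j] is the best total logit of any valid path ending in labels[j].
    score = [s if allowed(None, lab) else neg_inf for lab, s in zip(labels, logits[0])]
    back = []
    for row in logits[1:]:
        new_score, pointers = [], []
        for j, cur in enumerate(labels):
            candidates = [i for i, prev in enumerate(labels) if allowed(prev, cur)]
            best_i = max(candidates, key=lambda i: score[i])
            new_score.append(score[best_i] + row[j])
            pointers.append(best_i)
        score = new_score
        back.append(pointers)
    j = max(range(len(labels)), key=lambda i: score[i])
    path = [j]
    for pointers in reversed(back):  # follow back-pointers from last token to first
        j = pointers[j]
        path.append(j)
    return [labels[j] for j in reversed(path)]


toy_labels = ["O", "B-title", "I-title"]
toy_logits = [[0.0, 1.0, 5.0], [0.0, 0.0, 3.0], [2.0, 0.0, 0.0]]
print(decode_bio(toy_logits, toy_labels))  # ['B-title', 'I-title', 'O']
```

In the toy input, a plain per-token argmax would start the sequence with `I-title`, which is not a valid BIO sequence; the constrained search picks `B-title` instead.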
## 5. Android Notes / Android 注意事项
Android must bundle these files together:
Android 端必须同时打包:
```text
anime_filename_parser.onnx
vocab.json
config.json
```
When changing any of them, update all of them in the same commit.
只要其中任意一个变化,三者必须在同一次提交中一起更新。
## 6. Common Mistakes / 常见错误
**Using a standard Hugging Face tokenizer**
**误用标准 Hugging Face tokenizer**
This model uses `AnimeTokenizer`, not WordPiece/BPE.
本模型使用 `AnimeTokenizer`,不是 WordPiece/BPE。
**Treating ONNX output as final fields**
**把 ONNX 输出当成最终字段**
ONNX returns token logits. You still need BIO decode and field aggregation.
ONNX 返回 token logits,仍然需要 BIO 解码和字段聚合。
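As a hedged sketch of what remains after decoding, the functions below aggregate BIO labels into fields and apply a thin normalization. The function names, the rule that the first span wins for each field, and the choice of `season` and `episode` as the numeric fields are assumptions for illustration; the reference behavior is defined by `anifilebert/inference.py`.

```python
def aggregate(tokens, labels):
    """Join consecutive B-X/I-X tokens into text; the first span wins per field."""
    fields, current, buf = {}, None, []
    # The trailing ("", "O") sentinel flushes a span that ends at the last token.
    for tok, lab in list(zip(tokens, labels)) + [("", "O")]:
        if lab.startswith("I-") and lab[2:] == current:
            buf.append(tok)
            continue
        if current is not None:
            fields.setdefault(current, "".join(buf))
        # Only a B- label opens a new span; O and stray I- labels are skipped.
        current, buf = (lab[2:], [tok]) if lab.startswith("B-") else (None, [])
    return fields


def normalize(fields):
    """Thin normalization: trim brackets and whitespace, convert numeric fields."""
    out = {key: value.strip(" []()") for key, value in fields.items()}
    for key in ("season", "episode"):  # assumed to be the numeric fields
        if out.get(key, "").isdigit():
            out[key] = int(out[key])
    return out


toy_tokens = list("[GM]200")
toy_tags = ["O", "B-group", "I-group", "O", "B-episode", "I-episode", "I-episode"]
print(normalize(aggregate(toy_tokens, toy_tags)))  # {'group': 'GM', 'episode': 200}
```

Fields that never appear in the label sequence are simply absent here; the reference runtime reports them as `null`.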
**Changing max length without updating Android**
**改 max length 但没有同步 Android**
The exported graph is static. Runtime arrays must match `[1,128]`.
导出的图是静态 shape,运行时数组必须匹配 `[1,128]`。
## 7. Benchmark / 性能基准
Run:
运行:
```powershell
uv run python -m tools.benchmark_inference --model-dir . --onnx exports/anime_filename_parser.onnx --case-file data/parser_regression_cases.json --repeat 20 --warmup 20 --torch-threads 1 --ort-threads 1 --output reports/benchmark_results.json
```
Local single-thread CPU result, measured on 26 real-world regression cases with
the default thin runtime:
本地 CPU 单线程结果,使用 26 条真实回归 case 和默认薄层运行时:
| Backend / 后端 | Load ms / 加载 ms | Avg ms / 平均 ms | P50 ms | P95 ms | P99 ms | files/s |
| --- | ---: | ---: | ---: | ---: | ---: | ---: |
| PyTorch | 46.35 | 15.36 | 14.25 | 22.27 | 29.75 | 65.1 |
| ONNX Runtime | 50.92 | 12.04 | 11.90 | 13.81 | 15.38 | 83.1 |
The benchmark includes tokenization, model/session forward, constrained BIO
decode, entity aggregation, and thin normalization. It does not include
repeatedly constructing the ONNX Runtime session inside the loop.
该基准包含 tokenizer、模型/session 前向、约束 BIO 解码、实体聚合和薄层规范化;
循环内不会重复创建 ONNX Runtime session。
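The columns in the table can be derived from a list of per-file latencies roughly as follows. The `files/s` values above are consistent with `1000 / avg_ms` (for example, 1000 / 12.04 ≈ 83.1). The nearest-rank percentile used here is an assumption, since the exact method used by `tools.benchmark_inference` is not stated, and `summarize` is a hypothetical name.

```python
import math


def summarize(latencies_ms):
    """Average, nearest-rank percentiles and throughput for per-file latencies."""
    ordered = sorted(latencies_ms)

    def pct(p):
        # Nearest-rank percentile: smallest sample with at least p% of samples <= it.
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]

    avg = sum(ordered) / len(ordered)
    return {
        "avg_ms": avg,
        "p50_ms": pct(50),
        "p95_ms": pct(95),
        "p99_ms": pct(99),
        "files_per_s": 1000.0 / avg,
    }


# Toy latencies, not measured data.
stats = summarize([10.0, 12.0, 11.0, 15.0])
print(stats["avg_ms"], stats["p50_ms"], stats["p95_ms"])  # 12.0 11.0 15.0
```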