# Android Export and Runtime

AniFileBERT is used by MiruPlay as a Git submodule at `tools/anime_parser`.

## Export

From this repository root, export the published root checkpoint:

```powershell
uv sync
uv run python -m tools.export_onnx --model-dir . --max-length 128 --android-assets-dir ../../scraper/src/main/assets/anime_parser
```

The exporter writes the files below. The `exports/` paths are relative to this repository; the `scraper/` paths are relative to the MiruPlay root, which is `../../` from the submodule.

- `exports/anime_filename_parser.onnx`
- `exports/anime_filename_parser.metadata.json`
- `scraper/src/main/assets/anime_parser/anime_filename_parser.onnx`
- `scraper/src/main/assets/anime_parser/vocab.json`
- `scraper/src/main/assets/anime_parser/config.json`

## Static Graph Shape

```text
input_ids       int64[1,128]
attention_mask  int64[1,128]
logits          float32[1,128,15]
```

The export is verified numerically against PyTorch, and the maximum absolute logits difference is recorded in `exports/anime_filename_parser.metadata.json`.

## Local ONNX Smoke Test

```powershell
uv run python -m tools.onnx_inference "[GM-Team][国漫][神印王座][Throne of Seal][2022][200][AVC][GB][1080P].mp4"
```

Expected fields:

```text
title=神印王座, episode=200, group=GM-Team, resolution=1080P, source=GB
```

Special-code example:

```powershell
uv run python -m tools.onnx_inference "[YYDM&VCB-Studio] Shinsekai Yori [NCED02][Ma10p_1080p][x265_flac].mkv"
```

Expected fields:

```text
title=Shinsekai Yori, episode=null, group=YYDM&VCB-Studio, special=NCED02
```

## Runtime Contract

The ONNX graph returns token logits only. Android must implement the same pre- and post-processing around it:

- custom character tokenizer
- token id lookup from `vocab.json`
- padding to the fixed length of 128
- constrained BIO decoding
- field aggregation
- light string and number normalization

The Android runtime implementation lives in MiruPlay:

```text
scraper/src/main/kotlin/com/miruplay/tv/scraper/filename/AnimeFilenameParser.kt
```

The app exposes it through `FilenameMetadataParser` in `core:model`. During a scan, `ScanCoordinator` passes that parser into `VideoDirectoryClassifier`.

## Asset Update Rule

When updating the parser, keep these files in sync:

```text
anime_filename_parser.onnx
vocab.json
config.json
```

Do not update only the ONNX file. Token ids, label ids, and max length are part of the runtime contract.

## More Details

See [`onnx.md`](onnx.md) for a minimal Python ONNX Runtime reference.
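
As an illustration of the input side of the runtime contract (tokenizing, id lookup, and padding to 128), here is a minimal Python sketch. It rests on assumptions that are not confirmed by this document: that `vocab.json` is a flat token-to-id mapping, that the padding and unknown tokens are named `[PAD]` and `[UNK]`, and that the tokenizer emits one token per character with no extra special tokens. The helper names `load_vocab` and `encode` are hypothetical, and the real tokenizer may differ.

```python
import json

MAX_LENGTH = 128  # matches the static graph shape int64[1,128]


def load_vocab(path):
    """Load a token -> id mapping (assumes vocab.json is a flat JSON object)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def encode(text, vocab, max_length=MAX_LENGTH, pad_token="[PAD]", unk_token="[UNK]"):
    """Map each character to an id, then truncate or pad to max_length.

    Returns (input_ids, attention_mask), both plain lists of length max_length.
    The special-token names are assumptions, not taken from the real vocab.
    """
    pad_id = vocab[pad_token]
    unk_id = vocab[unk_token]
    # One token per character; characters missing from the vocab map to unk_id.
    ids = [vocab.get(ch, unk_id) for ch in text][:max_length]
    mask = [1] * len(ids)
    padding = max_length - len(ids)
    return ids + [pad_id] * padding, mask + [0] * padding


# Toy vocabulary for illustration only; the real one comes from vocab.json.
toy_vocab = {"[PAD]": 0, "[UNK]": 1, "[": 2, "]": 3, "A": 4, "B": 5}
input_ids, attention_mask = encode("[AB]x", toy_vocab)
```

Both lists would then be wrapped as `int64` tensors of shape `[1,128]` before being fed to the graph as `input_ids` and `attention_mask`.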
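
The output side of the runtime contract (constrained BIO decoding followed by field aggregation) might look like the sketch below. This is not the logic in `AnimeFilenameParser.kt`: it uses simple greedy left-to-right decoding with one transition rule (an `I-X` label may only follow `B-X` or `I-X`), and keeps the first span found for each field. The five-label set and the function names `allowed`, `decode`, and `aggregate` are hypothetical; the real model has 15 labels whose names are defined by the exported config.

```python
def allowed(prev, label):
    """BIO rule: I-X may only follow B-X or I-X of the same field."""
    if not label.startswith("I-"):
        return True
    return prev in ("B-" + label[2:], "I-" + label[2:])


def decode(logits, labels, length):
    """Greedy constrained decoding over the first `length` (unpadded) positions."""
    tags, prev = [], "O"
    for row in logits[:length]:
        # Keep only labels that are valid after the previous tag.
        candidates = [i for i, lab in enumerate(labels) if allowed(prev, lab)]
        best = max(candidates, key=lambda i: row[i])
        prev = labels[best]
        tags.append(prev)
    return tags


def aggregate(tokens, tags):
    """Join consecutive B-/I- tokens of one field; the first span per field wins."""
    fields, current, buf = {}, None, []
    # The trailing ("", "O") sentinel flushes a span that ends at the last token.
    for tok, tag in list(zip(tokens, tags)) + [("", "O")]:
        if tag.startswith("I-") and current == tag[2:]:
            buf.append(tok)
            continue
        if current is not None:
            fields.setdefault(current, "".join(buf))
        current, buf = (tag[2:], [tok]) if tag.startswith("B-") else (None, [])
    return fields


# Hypothetical label set and hand-written logits for the six characters "[GM]07".
labels = ["O", "B-group", "I-group", "B-episode", "I-episode"]
tokens = list("[GM]07")
logits = [
    [1.0, 0.0, 2.0, 0.0, 0.0],  # raw argmax is I-group, which is invalid at the start
    [0.0, 3.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 0.0, 0.0],
    [3.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 3.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 3.0],
]
tags = decode(logits, labels, length=len(tokens))
fields = aggregate(tokens, tags)
```

In the first row the unconstrained argmax would be `I-group`, but the constraint rejects it because no `group` span is open, so the decoder falls back to `O`. A final normalization step (for example, parsing the `episode` string into a number) would follow aggregation.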