
Android export and runtime

This repository is used by MiruPlay as a Git submodule at tools/anime_parser. It contains the Python training pipeline plus an ONNX export path for Android.

For the full scanner integration notes, file-vs-folder behavior, and device test procedure, see MiruPlay's docs/anime-filename-parser.md.

Export

From tools/anime_parser:

python -m pip install -r requirements.txt
python export_onnx.py --model-dir checkpoints/dmhy-finetune/final --android-assets-dir ../../scraper/src/main/assets/anime_parser

The exporter writes:

  • exports/anime_filename_parser.onnx
  • exports/anime_filename_parser.metadata.json
  • scraper/src/main/assets/anime_parser/anime_filename_parser.onnx
  • scraper/src/main/assets/anime_parser/vocab.json
  • scraper/src/main/assets/anime_parser/config.json

The ONNX graph uses fixed tensor shapes for Android, with two inputs and one output:

  • input_ids (input): int64[1,64]
  • attention_mask (input): int64[1,64]
  • logits (output): float32[1,64,15]
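
Because the sequence length is fixed at 64, every filename has to be truncated or padded before inference. The following is a minimal Python sketch of that step, not the code used by the app. The helper name to_fixed_inputs, the right-padding strategy, and the pad id of 0 are assumptions for illustration; the real pad id would come from the exported vocab.json and config.json.

```python
MAX_LEN = 64  # fixed sequence length taken from the graph's input shape
PAD_ID = 0    # assumption: the real pad id is defined by the exported vocab


def to_fixed_inputs(token_ids, max_len=MAX_LEN, pad_id=PAD_ID):
    """Truncate or right-pad token ids to max_len and build the matching mask.

    Returns (input_ids, attention_mask) as nested lists shaped [1, max_len].
    A runtime would convert both to int64 tensors before running the graph.
    """
    ids = list(token_ids)[:max_len]      # drop anything past the fixed length
    mask = [1] * len(ids)                # 1 marks a real token
    padding = max_len - len(ids)
    ids += [pad_id] * padding            # fill the remainder with the pad id
    mask += [0] * padding                # 0 marks padding
    return [ids], [mask]


input_ids, attention_mask = to_fixed_inputs([5, 17, 42])
```

Only the first three mask positions are set to 1 here, so padded positions can be ignored when the logits are decoded.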

The current export was verified against PyTorch, with a maximum absolute difference in logits of 1.621246337890625e-05.
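
The metric quoted above is the largest element-wise absolute difference between the two logits tensors. A dependency-free Python sketch of that computation is shown below, assuming both outputs have been converted to equally shaped nested lists; max_abs_diff is a hypothetical helper, not a function from export_onnx.py.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two nested lists.

    Both arguments must have the same shape; scalars are compared directly.
    """
    if isinstance(a, (list, tuple)):
        if len(a) != len(b):
            raise ValueError("shape mismatch")
        # Recurse into each pair of sub-lists and keep the largest difference.
        return max((max_abs_diff(x, y) for x, y in zip(a, b)), default=0.0)
    return abs(a - b)


# Toy values standing in for PyTorch and ONNX logits.
reference = [[[0.5, 1.0], [2.0, 3.0]]]
exported = [[[0.5, 1.0], [2.0, 3.5]]]
diff = max_abs_diff(reference, exported)
```

With NumPy arrays the same quantity is np.abs(a - b).max().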

Runtime

Android runs the exported graph through ONNX Runtime Android. Tokenization and BIO postprocessing are implemented in:

scraper/src/main/kotlin/com/miruplay/tv/scraper/filename/AnimeFilenameParser.kt
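
To illustrate what BIO postprocessing involves, here is a minimal sketch in Python (the pipeline's language), not a transcription of AnimeFilenameParser.kt. It assumes each token has already been assigned a label by taking the highest-scoring class from the logits. The label names such as B-TITLE are hypothetical, and dropping an I- tag that does not continue an open span is one possible policy among several.

```python
def decode_bio(tokens, labels):
    """Group tokens into (field, text) spans from per-token BIO labels."""
    spans = []
    current_field, current_tokens = None, []

    def flush():
        # Close the open span, if any, and reset the accumulator.
        nonlocal current_field, current_tokens
        if current_field is not None:
            spans.append((current_field, "".join(current_tokens)))
        current_field, current_tokens = None, []

    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            flush()                                  # a B- tag always starts a new span
            current_field, current_tokens = label[2:], [token]
        elif label.startswith("I-") and current_field == label[2:]:
            current_tokens.append(token)             # continue the open span
        else:
            flush()                                  # "O" or a dangling I- tag ends the span
    flush()
    return spans


# Hypothetical tokens and labels for illustration only.
tokens = ["[", "ANi", "]", "Fri", "eren", "-", "03"]
labels = ["O", "B-GROUP", "O", "B-TITLE", "I-TITLE", "O", "B-EPISODE"]
fields = decode_bio(tokens, labels)
```

Here fields holds one entry per recognized span, in filename order, which a caller could then map onto title, season, episode, and the other metadata fields.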

The app exposes the parser through FilenameMetadataParser in core:model. During a scan, ScanCoordinator passes that parser into VideoDirectoryClassifier. The classifier runs the existing release/folder regexes first and lazily calls the model only when those heuristics leave the title, season, or episode unresolved.
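
The heuristics-first, model-as-fallback flow can be sketched as follows. This is an illustrative Python model of the control flow only; the app's implementation is Kotlin. Parsed, classify, and both callables are hypothetical names, and filling in only the missing fields from the model's prediction is an assumption about how the results are merged.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class Parsed:
    title: Optional[str] = None
    season: Optional[int] = None
    episode: Optional[int] = None


def classify(name: str,
             heuristics: Callable[[str], Parsed],
             model_parser: Callable[[str], Parsed]) -> Parsed:
    """Run the regex heuristics first; consult the model only for missing fields."""
    result = heuristics(name)
    missing = [f for f in ("title", "season", "episode")
               if getattr(result, f) is None]
    if not missing:
        return result  # heuristics were sufficient, so the model is never invoked
    predicted = model_parser(name)
    # Keep what the heuristics found and fill only the gaps from the model.
    return replace(result, **{f: getattr(predicted, f) for f in missing})
```

Passing the model as a callable keeps inference lazy: filenames that the regexes fully resolve never pay the cost of running the graph.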

Example Kotlin usage:

val parsed = animeFilenameParser.parse("[ANi] 葬送的芙莉莲 S2 - 03 [1080P][WEB-DL]")

Expected fields:

title=葬送的芙莉莲, season=2, episode=3, group=ANi, resolution=1080P, source=WEB-DL