How to use ModerRAS/AniFileBERT with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("token-classification", model="ModerRAS/AniFileBERT")

# Load model directly
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("ModerRAS/AniFileBERT")
model = AutoModelForTokenClassification.from_pretrained("ModerRAS/AniFileBERT")
```

# Android export and runtime

This repository is used by MiruPlay as a Git submodule at
`tools/anime_parser`. It contains the Python training pipeline plus an ONNX
export path for Android.

For the full scanner integration notes, file-vs-folder behavior, and device
test procedure, see MiruPlay's `docs/anime-filename-parser.md`.

## Export

From `tools/anime_parser`:

```bash
python -m pip install -r requirements.txt
python export_onnx.py --model-dir checkpoints/dmhy-finetune/final --android-assets-dir ../../scraper/src/main/assets/anime_parser
```

The exporter writes:

- `exports/anime_filename_parser.onnx`
- `exports/anime_filename_parser.metadata.json`
- `scraper/src/main/assets/anime_parser/anime_filename_parser.onnx`
- `scraper/src/main/assets/anime_parser/vocab.json`
- `scraper/src/main/assets/anime_parser/config.json`

The ONNX graph uses fixed tensor shapes for Android (batch size 1, sequence
length 64, 15 label classes).

Inputs:

- `input_ids`: `int64[1,64]`
- `attention_mask`: `int64[1,64]`

Output:

- `logits`: `float32[1,64,15]`
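
Because the shapes are fixed, every filename has to be truncated or padded to exactly 64 token ids before inference. The following is a minimal Python sketch of that step, not the app's Kotlin tokenizer: `to_fixed_inputs` and `PAD_ID = 0` are hypothetical, and right-side padding and truncation are assumptions. The real pad id and any special tokens come from the exported `vocab.json` and `config.json`.

```python
SEQ_LEN = 64  # fixed sequence length from the exported graph's input shapes
PAD_ID = 0    # assumption: the real pad id is defined by the exported vocab/config


def to_fixed_inputs(token_ids, seq_len=SEQ_LEN, pad_id=PAD_ID):
    """Truncate or right-pad token ids to seq_len and build the attention mask.

    Returns (input_ids, attention_mask), each shaped [1, seq_len] as nested lists.
    """
    ids = list(token_ids)[:seq_len]          # drop anything beyond the fixed length
    real = len(ids)                          # number of real (non-padding) tokens
    padding = seq_len - real
    input_ids = ids + [pad_id] * padding     # pad on the right
    attention_mask = [1] * real + [0] * padding  # 1 = real token, 0 = padding
    return [input_ids], [attention_mask]
```

Positions where the mask is 0 carry no information about the filename, so their logits are typically ignored during postprocessing.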

The current export was verified against PyTorch, with a maximum absolute
difference in logits of `1.621246337890625e-05`.
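
The metric itself is simple: take the element-wise difference between the PyTorch logits and the ONNX logits and keep the largest magnitude. The sketch below illustrates it on plain nested lists. `max_abs_diff` is a hypothetical helper, not the function used by `export_onnx.py`, whose actual check is not shown here.

```python
def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two equally shaped nested lists."""
    if isinstance(a, (list, tuple)):
        if len(a) != len(b):
            raise ValueError("shape mismatch")
        # Recurse into each pair of sub-lists and keep the largest difference found.
        return max((max_abs_diff(x, y) for x, y in zip(a, b)), default=0.0)
    return abs(a - b)
```

With NumPy arrays the same quantity is `np.abs(a - b).max()`.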

## Runtime

Android runs the exported graph through ONNX Runtime Android. Tokenization and
BIO postprocessing are implemented in:

`scraper/src/main/kotlin/com/miruplay/tv/scraper/filename/AnimeFilenameParser.kt`

The app exposes the parser through `FilenameMetadataParser` in `core:model`.
During a scan, `ScanCoordinator` passes that parser into
`VideoDirectoryClassifier`. The classifier runs the existing release/folder
regexes first and calls the model lazily, only when those heuristics fail to
produce a title, season, or episode.
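
The regex-first, model-as-fallback flow can be sketched as follows. This is a Python illustration of the control flow only, not the Kotlin code in `VideoDirectoryClassifier`. The `_SXXEYY` pattern, `parse_with_regex`, and `classify` are hypothetical stand-ins, since the app's real regexes are not shown here.

```python
import re

# Hypothetical heuristic standing in for the app's release/folder regexes.
_SXXEYY = re.compile(
    r"^(?P<title>.+?)[ ._-]+S(?P<season>\d{1,2})E(?P<episode>\d{1,3})\b",
    re.IGNORECASE,
)

REQUIRED = ("title", "season", "episode")


def parse_with_regex(name):
    """Cheap first pass: returns None for every field the pattern cannot fill."""
    m = _SXXEYY.search(name)
    if not m:
        return {key: None for key in REQUIRED}
    return {
        "title": m.group("title").replace(".", " ").strip(),
        "season": int(m.group("season")),
        "episode": int(m.group("episode")),
    }


def classify(name, model_parse):
    """Run the regexes first; call model_parse only if a required field is missing."""
    fields = parse_with_regex(name)
    missing = [key for key in REQUIRED if fields.get(key) is None]
    if missing:
        predicted = model_parse(name)  # the model is invoked only on this path
        for key in missing:
            fields[key] = predicted.get(key)
    return fields
```

Passing the model as a callable keeps inference off the hot path: a filename that the regexes fully resolve never triggers it.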

Example Kotlin usage:

```kotlin
val parsed = animeFilenameParser.parse("[ANi] 葬送的芙莉莲 S2 - 03 [1080P][WEB-DL]")
```

Expected fields:

```text
title=葬送的芙莉莲, season=2, episode=3, group=ANi, resolution=1080P, source=WEB-DL
```
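
To show how BIO labels turn into fields like these, here is a minimal Python sketch of BIO span decoding. It is not the Kotlin implementation in `AnimeFilenameParser.kt`. The `decode_bio` helper, the label names (`B-TITLE`, `I-SOURCE`, and so on), and the coarse token split are all assumptions made for illustration. The model's actual label set and tokenization are defined by its exported config and vocabulary.

```python
def decode_bio(tokens, labels):
    """Collect BIO-tagged tokens into {field: text}, keeping the first span per field."""
    fields = {}
    current_type, current_tokens = None, []

    def flush():
        # Store the open span, if any, unless that field was already filled.
        if current_type is not None and current_type not in fields:
            fields[current_type] = "".join(current_tokens)

    for token, label in zip(tokens, labels):
        prefix, _, entity = label.partition("-")
        if prefix == "B":
            flush()
            current_type, current_tokens = entity.lower(), [token]
        elif prefix == "I" and current_type == entity.lower():
            current_tokens.append(token)
        else:  # "O", or an I- tag that does not continue the open span
            flush()
            current_type, current_tokens = None, []
    flush()
    return fields


# Hypothetical tokens and labels for the filename used in the example above.
tokens = ["[", "ANi", "]", "葬送的芙莉莲", "S", "2", "-", "03",
          "[", "1080P", "]", "[", "WEB", "-", "DL", "]"]
labels = ["O", "B-GROUP", "O", "B-TITLE", "O", "B-SEASON", "O", "B-EPISODE",
          "O", "B-RESOLUTION", "O", "O", "B-SOURCE", "I-SOURCE", "I-SOURCE", "O"]
parsed = decode_bio(tokens, labels)
```

This sketch returns raw span text, so the episode comes back as the string `"03"`. Converting season and episode to integers would be a separate normalization step.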