# Android export and runtime

This repository is used by MiruPlay as a Git submodule at
`tools/anime_parser`. It contains the Python training pipeline plus an ONNX
export path for Android.

For the full scanner integration notes, file-vs-folder behavior, and device
test procedure, see MiruPlay's `docs/anime-filename-parser.md`.

## Export

From `tools/anime_parser`:

```bash
python -m pip install -r requirements.txt
python export_onnx.py --model-dir checkpoints/dmhy-finetune/final --android-assets-dir ../../scraper/src/main/assets/anime_parser
```

The exporter writes:

- `exports/anime_filename_parser.onnx`
- `exports/anime_filename_parser.metadata.json`
- `scraper/src/main/assets/anime_parser/anime_filename_parser.onnx`
- `scraper/src/main/assets/anime_parser/vocab.json`
- `scraper/src/main/assets/anime_parser/config.json`

The ONNX graph uses fixed tensor shapes on Android. Inputs:

- `input_ids`: `int64[1,64]`
- `attention_mask`: `int64[1,64]`

Output:

- `logits`: `float32[1,64,15]`
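
Because the sequence length is fixed at 64, any tokenized filename has to be truncated or padded before inference. The following is a minimal sketch of that step, not the project's actual tokenizer code. The helper name `to_fixed_inputs` and the pad id `0` are assumptions for illustration; the real pad id would come from `vocab.json` or `config.json`.

```python
MAX_LEN = 64  # sequence length taken from the documented tensor shapes
PAD_ID = 0    # assumption: the real pad id is defined by the exported vocab/config


def to_fixed_inputs(token_ids, max_len=MAX_LEN, pad_id=PAD_ID):
    """Truncate or right-pad token ids and build the matching attention mask."""
    kept = list(token_ids[:max_len])
    padding = max_len - len(kept)
    input_ids = kept + [pad_id] * padding
    # 1 marks a real token, 0 marks padding.
    attention_mask = [1] * len(kept) + [0] * padding
    # Wrap both in a batch dimension of 1 to match the [1, 64] shape.
    return [input_ids], [attention_mask]


input_ids, attention_mask = to_fixed_inputs([5, 6, 7])
```

Right-padding is also an assumption here; it only matters that the mask marks the same positions the model saw as padding during training.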

The current export was verified against PyTorch, with a maximum absolute
difference in logits of `1.621246337890625e-05`.
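
A parity check of this kind can be sketched as follows. This is an illustration of the metric only, not the code in `export_onnx.py`: the helper `max_abs_diff`, the toy logits, and the `1e-4` tolerance are all assumptions, and a real check would compare the PyTorch and ONNX Runtime outputs for the same input.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two nested lists."""
    if isinstance(a, (list, tuple)):
        if len(a) != len(b):
            raise ValueError("shape mismatch")
        return max((max_abs_diff(x, y) for x, y in zip(a, b)), default=0.0)
    return abs(a - b)


# Toy stand-ins for the two logits tensors (hypothetical values).
reference_logits = [[[0.5, -1.25], [2.0, 0.0]]]
exported_logits = [[[0.5, -1.25], [2.0, 0.5]]]

TOLERANCE = 1e-4  # assumption: choose a threshold that suits the model
diff = max_abs_diff(reference_logits, exported_logits)
within_tolerance = diff <= TOLERANCE
```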

## Runtime

Android runs the exported graph through ONNX Runtime Android. Tokenization and
BIO postprocessing are implemented in:

`scraper/src/main/kotlin/com/miruplay/tv/scraper/filename/AnimeFilenameParser.kt`
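
For readers unfamiliar with BIO tagging, the postprocessing step can be sketched roughly as below. It is written in Python for brevity and is not a port of the Kotlin implementation. The label names (`B-title`, `I-title`, and so on), the rule of keeping only the first span per field, and joining tokens without a separator are all assumptions.

```python
def decode_bio(tokens, tags):
    """Group tokens into named fields using B-/I-/O tags."""
    fields = {}
    current = None  # name of the field currently being extended, if any
    for token, tag in zip(tokens, tags):
        prefix, _, name = tag.partition("-")
        if prefix == "B" and name:
            # Start a new span; this sketch keeps only the first span per field.
            current = name if name not in fields else None
            if current:
                fields[current] = [token]
        elif prefix == "I" and current is not None and name == current:
            # Continue the span that is currently open.
            fields[current].append(token)
        else:
            # "O" tags and stray "I-" tags close any open span.
            current = None
    return {name: "".join(parts) for name, parts in fields.items()}


tokens = ["[", "ANi", "]", "Fri", "eren", "S", "2", "-", "03"]
tags = ["O", "B-group", "O", "B-title", "I-title", "O", "B-season", "O", "B-episode"]
parsed = decode_bio(tokens, tags)
```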

The app exposes it through `FilenameMetadataParser` in `core:model`. During a
scan, `ScanCoordinator` passes that parser into `VideoDirectoryClassifier`; the
classifier keeps the existing release/folder regexes first and lazily calls the
model only when those heuristics are missing title, season, or episode data.
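
The regex-first, model-as-fallback flow described above can be sketched as follows. This is a simplified Python illustration, not the Kotlin `VideoDirectoryClassifier`: the `classify` function, the single `SEASON_EPISODE` pattern, and the `model_parse` callback are hypothetical stand-ins.

```python
import re

REQUIRED = ("title", "season", "episode")

# Hypothetical heuristic; the real release/folder regexes live in MiruPlay.
SEASON_EPISODE = re.compile(
    r"^(?P<title>.+?)\s+S(?P<season>\d+)E(?P<episode>\d+)", re.IGNORECASE
)


def classify(name, model_parse):
    """Run the regex heuristic first; call model_parse only if fields are missing."""
    match = SEASON_EPISODE.search(name)
    fields = dict(match.groupdict()) if match else {}
    missing = [key for key in REQUIRED if not fields.get(key)]
    if missing:
        # The model is only invoked on this path, so names that the
        # heuristics fully resolve never pay the inference cost.
        predicted = model_parse(name)
        for key in missing:
            if predicted.get(key):
                fields[key] = predicted[key]
    return fields
```

Passing the parser in as a callback mirrors the description above, where the classifier receives the parser from its caller, and it makes the fallback easy to stub in tests.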

Example Kotlin usage:

```kotlin
val parsed = animeFilenameParser.parse("[ANi] 葬送的芙莉莲 S2 - 03 [1080P][WEB-DL]")
```

Expected fields:

```text
title=葬送的芙莉莲, season=2, episode=3, group=ANi, resolution=1080P, source=WEB-DL
```