Document Rust DMHY template recipe schema
tools/rust_dmhy_template_apply/README.md
@@ -65,10 +65,12 @@ Optional controls:
 --keep-encoding-noise
 ```
 
-The output
-
-
-`
+The output record schema is `filename`, `tokens`, `labels`, `template_id`, and
+`template`, plus optional `source_filename`, `path_trimmed`, and
+`dropped_title_candidate_positions`. Clustered recipe rows also include
+`title_spans` and `title_boundary_decisions` metadata so downstream synthetic
+augmentation can distinguish one logical title span from repeated/path title
+slots.
 
 For low-frequency templates (`count <= --audit-max-count`, default `50`), apply
 uses a conservative gate: records with `no_title`, `multiple_title_spans`,
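
The record schema documented in this hunk could be checked by a downstream consumer roughly as follows. This is only a sketch: the field names come from the README text above, but the serialization (one JSON object per record), the field types, the parallel `tokens`/`labels` layout, and every value in the example record (including the label names and `template_id`) are assumptions made for illustration. `check_record` and `is_clustered_row` are hypothetical helpers, not part of the tool.

```python
import json

# Field names are taken from the README text; their types are assumptions.
REQUIRED_FIELDS = ("filename", "tokens", "labels", "template_id", "template")
OPTIONAL_FIELDS = ("source_filename", "path_trimmed", "dropped_title_candidate_positions")
# Per the README, these appear only on clustered recipe rows.
CLUSTERED_FIELDS = ("title_spans", "title_boundary_decisions")


def check_record(record: dict) -> list:
    """Return a list of problems found in one output record (empty if none)."""
    problems = [f"missing required field: {name}"
                for name in REQUIRED_FIELDS if name not in record]
    known = set(REQUIRED_FIELDS) | set(OPTIONAL_FIELDS) | set(CLUSTERED_FIELDS)
    problems += [f"unknown field: {name}" for name in sorted(set(record) - known)]
    # Assumption: tokens and labels are parallel lists, one label per token.
    tokens, labels = record.get("tokens"), record.get("labels")
    if isinstance(tokens, list) and isinstance(labels, list) and len(tokens) != len(labels):
        problems.append("tokens and labels differ in length")
    return problems


def is_clustered_row(record: dict) -> bool:
    """True when the record carries both clustered-recipe title metadata fields."""
    return all(name in record for name in CLUSTERED_FIELDS)


# Hypothetical record; the values and label names are made up for illustration.
example_line = json.dumps({
    "filename": "[Group] Title - 01.mkv",
    "tokens": ["[Group]", "Title", "-", "01", ".mkv"],
    "labels": ["B-GROUP", "B-TITLE", "O", "B-EPISODE", "O"],
    "template_id": "t0",
    "template": "[GROUP] TITLE - EPISODE",
})
record = json.loads(example_line)
print(check_record(record))      # []
print(is_clustered_row(record))  # False
```

Treating the clustered-only fields as a separate group lets augmentation code branch on `is_clustered_row` before reading `title_spans`, rather than assuming every record carries that metadata.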