pruizf/distilcamembert-base-ft-AS13_stgdir-100
This model is a fine-tuned version of cmarkea/distilcamembert-base on the dataset described below. It achieves the following results on the evaluation set:
- Loss: 0.5002
- Accuracy: 0.8802
Model description
This model is fine-tuned to classify stage directions in French theater texts, using the dataset at https://nakala.fr/10.34847/nkl.fde37ug3.
The categorization scheme and rationale are described in the following publication:
Schneider, Alexia, & Ruiz Fabo, Pablo (2024). Stage direction classification in French theater: Transfer learning experiments. In Proceedings of the 8th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2024) (pp. 278–286). Association for Computational Linguistics. https://aclanthology.org/2024.latechclfl-1.28/
Intended uses & limitations
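The model is intended for classifying stage directions from French theater texts into the 13 categories of the scheme cited above. On the held-out data it did not correctly predict any aparte example (precision, recall, and F1 of 0.0000 on 14 instances), so it should not be relied on for that category.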
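A minimal inference sketch, assuming the checkpoint loads through the standard Transformers `text-classification` pipeline. The helper name `classify_stage_directions` and the example sentences in the comment are illustrative, not taken from the dataset, and the label strings returned depend on the `id2label` mapping stored with the checkpoint, which this card does not list.

```python
MODEL_ID = "pruizf/distilcamembert-base-ft-AS13_stgdir-100"


def classify_stage_directions(texts, model_id=MODEL_ID):
    """Return one (text, label, score) tuple per input stage direction.

    Hypothetical helper: it only wraps the generic pipeline API.
    """
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    texts = list(texts)
    classifier = pipeline("text-classification", model=model_id)
    predictions = classifier(texts)  # one {"label": ..., "score": ...} dict per text
    return [(t, p["label"], p["score"]) for t, p in zip(texts, predictions)]


# Example call (downloads the model on first use):
# for text, label, score in classify_stage_directions(["Il sort.", "Elle lui donne une lettre."]):
#     print(f"{text!r} -> {label} ({score:.3f})")
```

The pipeline returns only the top-scoring label per input by default; passing `top_k=None` when calling the classifier should return scores for all categories instead.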
Training and evaluation data
Stage direction dataset annotated with 13 categories by Alexia Schneider & Pablo Ruiz.
The categories were derived from those available at FreDraCor (and originally in the Théâtre Classique platform).
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
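The hyperparameters above could be expressed as Transformers `TrainingArguments` roughly as follows. This is a sketch, not the authors' training script: it assumes a single device, so that the reported batch sizes map directly onto the per-device arguments, and `output_dir` is a placeholder name. The fused AdamW implementation may also require a CUDA-capable setup.

```python
# Values copied from the hyperparameter list; keys are TrainingArguments field names.
HYPERPARAMETERS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes one device, no gradient accumulation
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
}


def build_training_arguments(output_dir="stgdir-output"):
    """Build TrainingArguments from the values above (output_dir is hypothetical)."""
    # Imported lazily so the dictionary can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```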
Training results
On held-out data:
| Label | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| action | 0.8636 | 0.8601 | 0.8619 | 486 |
| aggression | 0.7273 | 0.7467 | 0.7368 | 75 |
| aparte | 0.0000 | 0.0000 | 0.0000 | 14 |
| delivery | 0.8227 | 0.8498 | 0.8360 | 213 |
| entrance | 0.8750 | 0.6562 | 0.7500 | 128 |
| exit | 0.8036 | 0.9132 | 0.8549 | 242 |
| interaction | 0.8280 | 0.7549 | 0.7897 | 102 |
| movement | 0.7091 | 0.6555 | 0.6812 | 119 |
| music | 0.9572 | 0.9688 | 0.9630 | 577 |
| narration | 0.7769 | 0.7833 | 0.7801 | 120 |
| object | 0.7892 | 0.8462 | 0.8167 | 208 |
| setting | 0.8624 | 0.8579 | 0.8602 | 190 |
| toward | 0.9756 | 0.9800 | 0.9778 | 449 |
| Accuracy | — | — | 0.8714 | 2923 |
| Macro avg | 0.7685 | 0.7594 | 0.7622 | 2923 |
| Weighted avg | 0.8677 | 0.8714 | 0.8684 | 2923 |
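As a reading aid for the summary rows, the sketch below recomputes the macro and support-weighted averages from the per-label rows of the table. The macro average weights every label equally, which is why the aparte row (all zeros) pulls it well below the weighted average. The function names are illustrative.

```python
# (label, precision, recall, f1, support), copied from the table above
REPORT = [
    ("action", 0.8636, 0.8601, 0.8619, 486),
    ("aggression", 0.7273, 0.7467, 0.7368, 75),
    ("aparte", 0.0000, 0.0000, 0.0000, 14),
    ("delivery", 0.8227, 0.8498, 0.8360, 213),
    ("entrance", 0.8750, 0.6562, 0.7500, 128),
    ("exit", 0.8036, 0.9132, 0.8549, 242),
    ("interaction", 0.8280, 0.7549, 0.7897, 102),
    ("movement", 0.7091, 0.6555, 0.6812, 119),
    ("music", 0.9572, 0.9688, 0.9630, 577),
    ("narration", 0.7769, 0.7833, 0.7801, 120),
    ("object", 0.7892, 0.8462, 0.8167, 208),
    ("setting", 0.8624, 0.8579, 0.8602, 190),
    ("toward", 0.9756, 0.9800, 0.9778, 449),
]
PRECISION, RECALL, F1, SUPPORT = 1, 2, 3, 4


def macro_avg(rows, col):
    """Unweighted mean of one metric column over all labels."""
    return sum(r[col] for r in rows) / len(rows)


def weighted_avg(rows, col):
    """Mean of one metric column, weighted by each label's support."""
    total = sum(r[SUPPORT] for r in rows)
    return sum(r[col] * r[SUPPORT] for r in rows) / total


total_support = sum(r[SUPPORT] for r in REPORT)
print(total_support)                          # 2923
print(round(macro_avg(REPORT, F1), 4))        # 0.7622
print(round(weighted_avg(REPORT, F1), 4))     # 0.8684
print(round(weighted_avg(REPORT, RECALL), 4)) # 0.8714, equal to the accuracy
```

The support-weighted recall coincides with accuracy because each label's recall times its support is the number of correctly classified examples for that label.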
Training details:
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 1.0963 | 1.0 | 585 | 0.5121 | 0.8520 |
| 0.4684 | 2.0 | 1170 | 0.4489 | 0.8772 |
| 0.3481 | 3.0 | 1755 | 0.4511 | 0.8772 |
| 0.2825 | 4.0 | 2340 | 0.4705 | 0.8828 |
| 0.2281 | 5.0 | 2925 | 0.5002 | 0.8802 |
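The log can be scanned for the best epoch under different criteria. In the sketch below, the rows are copied from the table above; the rest is illustrative. The estimate of the training set size assumes a single device and no gradient accumulation, which this card does not state.

```python
from collections import namedtuple

EpochLog = namedtuple("EpochLog", "train_loss epoch step val_loss accuracy")

# Rows copied from the training log table above.
LOG = [
    EpochLog(1.0963, 1, 585, 0.5121, 0.8520),
    EpochLog(0.4684, 2, 1170, 0.4489, 0.8772),
    EpochLog(0.3481, 3, 1755, 0.4511, 0.8772),
    EpochLog(0.2825, 4, 2340, 0.4705, 0.8828),
    EpochLog(0.2281, 5, 2925, 0.5002, 0.8802),
]

best_by_loss = min(LOG, key=lambda row: row.val_loss)
best_by_accuracy = max(LOG, key=lambda row: row.accuracy)

# 585 optimizer steps per epoch at batch size 16 gives an upper bound on the
# number of training examples (assumption: one device, no gradient accumulation).
steps_per_epoch = LOG[0].step
max_train_examples = steps_per_epoch * 16

print(best_by_loss.epoch)      # 2
print(best_by_accuracy.epoch)  # 4
print(max_train_examples)      # 9360
```

Validation loss is lowest at epoch 2 and rises afterwards while training loss keeps falling, whereas validation accuracy peaks at epoch 4. The loss and accuracy reported at the top of this card match the epoch 5 row.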
Framework versions
- Transformers 4.57.2
- Pytorch 2.9.0+cu126
- Datasets 4.0.0
- Tokenizers 0.22.1