|
|
--- |
|
|
library_name: transformers |
|
|
license: cc0-1.0 |
|
|
base_model: google-t5/t5-large |
|
|
tags: |
|
|
- generated_from_trainer |
|
|
metrics: |
|
|
- rouge |
|
|
model-index: |
|
|
- name: PreDA_t5-large |
|
|
results: [] |
|
|
language: |
|
|
- en |
|
|
--- |
|
|
|
|
|
|
|
|
|
|
 |
|
|
|
|
|
# PreDA-large (Prefix-Based Dream Reports Annotation) |
|
|
|
|
|
This model is a fine-tuned version of [google-t5/t5-large](https://huggingface.co/google-t5/t5-large) on the annotated [DreamBank.net](https://dreambank.net/) dataset. Evaluation results are reported in the [Training results](#training-results) table below. |
|
|
|
|
|
## Intended uses & limitations |
|
|
|
|
|
This model is designed for research purposes. See the disclaimer for more details. |
|
|
|
|
|
## Training procedure |
|
|
|
|
|
The overall idea of our approach is to disentangle each dream report from its annotation as a whole and to create an augmented set of (dream report, single-feature annotation) pairs. To ensure that, given the same report, the model produces a specific HVDC feature, we simply prepend to each report a string of the form `"HVDC-Feature:"`, in a manner that closely mimics T5's task-specific prefix fine-tuning. |
|
|
|
|
|
After applying this procedure to the original dataset (~1.8K reports), we obtain approximately 6.6K items. In the present study, we focused on a subset of six HVDC features: |
Characters, Activities, Emotion, Friendliness, Misfortune, and Good Fortune. |
This selection was made to exclude features that represented less than 10% of the total instances. Notably, Good Fortune would have been excluded under this |
criterion, but we intentionally retained it to control against potential |
memorisation effects and to provide a counterbalance to the Misfortune feature. After filtering out instances whose annotated |
feature is not one of the six selected features, we are left with ~5.3K |
dream reports. We then generate a random 80%-20% split for the training (i.e., 4,311 reports) and testing (i.e., 1,078 reports) sets. |
|
|
|
|
|
### Training |
|
|
|
|
|
#### Hyperparameters |
|
|
|
|
|
The following hyperparameters were used during training: |
|
|
- learning_rate: 0.001 |
|
|
- train_batch_size: 8 |
|
|
- eval_batch_size: 8 |
|
|
- seed: 42 |
|
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
|
- lr_scheduler_type: linear |
|
|
- num_epochs: 20 |
|
|
- label_smoothing_factor: 0.1 |
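With 4,311 training examples at batch size 8, each epoch is 539 optimiser steps, giving 10,780 steps over 20 epochs, which matches the step counts in the results table below. Assuming no warmup (the warmup setting is not listed above), the linear scheduler decays the learning rate from 0.001 to 0 over those steps. A minimal sketch of that arithmetic:

```python
import math

learning_rate = 1e-3
train_examples = 4311
batch_size = 8
num_epochs = 20

steps_per_epoch = math.ceil(train_examples / batch_size)  # 539
total_steps = steps_per_epoch * num_epochs                # 10780

def linear_lr(step, warmup=0):
    """Learning rate at a given step under linear decay to zero.
    Assumes zero warmup steps, which is a guess; the card does not state it."""
    if step < warmup:
        return learning_rate * step / max(1, warmup)
    return learning_rate * max(0.0, (total_steps - step) / max(1, total_steps - warmup))
```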
|
|
|
|
|
### Training results |
|
|
|
|
|
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | |
|
|
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:| |
|
|
| 1.9478 | 1.0 | 539 | 1.9524 | 0.3298 | 0.1797 | 0.3121 | 0.3113 | |
|
|
| 1.9141 | 2.0 | 1078 | 1.9039 | 0.3665 | 0.1942 | 0.3495 | 0.3489 | |
|
|
| 1.914 | 3.0 | 1617 | 1.8993 | 0.4076 | 0.2223 | 0.3873 | 0.3870 | |
|
|
| 1.9264 | 4.0 | 2156 | 1.8725 | 0.3454 | 0.1843 | 0.3306 | 0.3302 | |
|
|
| 1.9018 | 5.0 | 2695 | 1.8669 | 0.3494 | 0.1814 | 0.3345 | 0.3347 | |
|
|
| 1.889 | 6.0 | 3234 | 1.8872 | 0.3387 | 0.1609 | 0.3211 | 0.3208 | |
|
|
| 1.8511 | 7.0 | 3773 | 1.8412 | 0.4200 | 0.2403 | 0.4065 | 0.4065 | |
|
|
| 1.8756 | 8.0 | 4312 | 1.8191 | 0.4735 | 0.2705 | 0.4467 | 0.4469 | |
|
|
| 1.8483 | 9.0 | 4851 | 1.7966 | 0.4915 | 0.2996 | 0.4662 | 0.4665 | |
|
|
| 1.8182 | 10.0 | 5390 | 1.7787 | 0.5071 | 0.3169 | 0.4857 | 0.4860 | |
|
|
| 1.7715 | 11.0 | 5929 | 1.7709 | 0.5017 | 0.3182 | 0.4767 | 0.4767 | |
|
|
| 1.7955 | 12.0 | 6468 | 1.7557 | 0.4772 | 0.3015 | 0.4544 | 0.4549 | |
|
|
| 1.7391 | 13.0 | 7007 | 1.7279 | 0.5644 | 0.3693 | 0.5270 | 0.5281 | |
|
|
| 1.7013 | 14.0 | 7546 | 1.7054 | 0.5484 | 0.3694 | 0.5222 | 0.5221 | |
|
|
| 1.7364 | 15.0 | 8085 | 1.6900 | 0.5607 | 0.3778 | 0.5349 | 0.5350 | |
|
|
| 1.6592 | 16.0 | 8624 | 1.6643 | 0.6010 | 0.4191 | 0.5691 | 0.5688 | |
|
|
| 1.645 | 17.0 | 9163 | 1.6448 | 0.6160 | 0.4440 | 0.5854 | 0.5863 | |
|
|
| 1.6245 | 18.0 | 9702 | 1.6264 | 0.6301 | 0.4640 | 0.6015 | 0.6018 | |
|
|
| 1.616 | 19.0 | 10241 | 1.6145 | 0.6578 | 0.4933 | 0.6253 | 0.6251 | |
|
|
| 1.5914 | 20.0 | 10780 | 1.6073 | 0.6587 | 0.4979 | 0.6269 | 0.6270 | |
|
|
|
|
|
|
|
|
### Framework versions |
|
|
|
|
|
- Transformers 4.44.2 |
|
|
- Pytorch 2.1.0+cu118 |
|
|
- Datasets 3.0.1 |
|
|
- Tokenizers 0.19.1 |
|
|
|
|
|
# Usage |
|
|
|
|
|
```python |
|
|
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM |
|
|
|
|
|
model_id = "jrc-ai/PreDA-large" |
|
|
device = "cpu" |
|
|
encoder_max_length = 100 |
|
|
decoder_max_length = 50 |
|
|
|
|
|
tokenizer = AutoTokenizer.from_pretrained(model_id) |
|
|
model = AutoModelForSeq2SeqLM.from_pretrained(model_id).to(device)  # move the model to the same device as the inputs |
|
|
|
|
|
dream = "I was talking with my brother about my birthday dinner. I was feeling sad." |
|
|
prefixes = ["Emotion", "Activities", "Characters"] |
|
|
text_inputs = ["{} : {}".format(p, dream) for p in prefixes] |
|
|
|
|
|
inputs = tokenizer( |
|
|
text_inputs, |
|
|
max_length=encoder_max_length, |
|
|
truncation=True, |
|
|
padding=True, |
|
|
return_tensors="pt" |
|
|
) |
|
|
|
|
|
output = model.generate( |
|
|
**inputs.to(device), |
|
|
do_sample=False, |
|
|
max_length=decoder_max_length, |
|
|
) |
|
|
|
|
|
for decode_dream in output: |
|
|
print(tokenizer.decode(decode_dream, skip_special_tokens=True)) |
|
|
``` |
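Since `generate` returns one sequence per input, the decoded outputs come back in the same order as the prefixes, so they can be paired into a dictionary. This small helper is a convenience sketch, not part of the model's API:

```python
def annotations_from_outputs(prefixes, decoded):
    """Pair each HVDC prefix with the corresponding decoded annotation."""
    return dict(zip(prefixes, decoded))

# With the snippet above, this would be called as:
# decoded = tokenizer.batch_decode(output, skip_special_tokens=True)
# print(annotations_from_outputs(prefixes, decoded))
```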
|
|
|
|
|
# Dual-Use Implication |
|
|
Upon evaluation, we identified no dual-use implications for the present model. The model parameters, including the weights, are available under the CC0 1.0 Public Domain Dedication. |
|
|
|
|
|
# Cite |
|
|
Please note that the paper referring to this model, titled *PreDA: Prefix-Based Dream Reports Annotation |
with Generative Language Models*, has been accepted for publication at the LOD 2025 conference and will appear in the conference proceedings. |