---
library_name: transformers
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- stab-gurevych-essays
metrics:
- accuracy
model-index:
- name: longformer-spans
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: stab-gurevych-essays
      type: stab-gurevych-essays
      config: spans
      split: train[0%:20%]
      args: spans
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9739926865110556
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# longformer-spans

This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the stab-gurevych-essays dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0856
- Accuracy: 0.9740

Per-class metrics, rounded to four decimals:

| Class        | Precision | Recall | F1-score | Support |
|:------------:|:---------:|:------:|:--------:|:-------:|
| B            | 0.8861    | 0.9135 | 0.8996   | 1133    |
| I            | 0.9856    | 0.9787 | 0.9821   | 18277   |
| O            | 0.9631    | 0.9723 | 0.9677   | 9851    |
| Macro avg    | 0.9449    | 0.9548 | 0.9498   | 29261   |
| Weighted avg | 0.9742    | 0.9740 | 0.9741   | 29261   |
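
The B, I and O rows are per-token classes. Assuming they follow the usual BIO convention (B opens a span, I continues it, O lies outside any span), which this card does not state explicitly, token-level predictions could be grouped into spans roughly as sketched below. The function name `bio_to_spans` and the sample tags are illustrative and are not part of the model or its training code.

```python
def bio_to_spans(tags):
    """Group 'B'/'I'/'O' tags into (start, end) token spans, end exclusive.

    Assumption: standard BIO tagging. An 'I' that appears with no open
    span is treated leniently as the start of a new span.
    """
    spans = []
    start = None  # index where the currently open span began, if any
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            # Close any open span before starting a new one.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
        # An 'I' inside an open span simply extends it.
    if start is not None:
        spans.append((start, len(tags)))
    return spans


example_tags = ["O", "B", "I", "I", "O", "B", "I"]
print(bio_to_spans(example_tags))
```

Under this reading, precision and recall on B reflect how well span starts are located, which may explain why B scores lower than I and O.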

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

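The card metadata lists the `spans` configuration of stab-gurevych-essays and the split `train[0%:20%]` for the reported results. More information needed on the training split.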

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 5
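
As a rough worked example, the hyperparameters above can be combined with the Step column of the training results (41 optimizer steps per epoch) to estimate the training-set size and the learning rate at each step. This is a sketch under assumptions the card does not confirm: a single device, no gradient accumulation, no dropped last batch, and a linear schedule with zero warmup steps. The helper `linear_lr` is hypothetical and only mirrors the usual linear-decay formula.

```python
import math

learning_rate = 2e-05
train_batch_size = 8
num_epochs = 5
steps_per_epoch = 41  # read from the Step column of the training results

total_steps = steps_per_epoch * num_epochs

# If ceil(n / batch_size) == steps_per_epoch, then n lies in this range.
min_examples = (steps_per_epoch - 1) * train_batch_size + 1
max_examples = steps_per_epoch * train_batch_size


def linear_lr(step, base_lr=learning_rate, total=total_steps, warmup=0):
    """Linear warmup for `warmup` steps, then linear decay to zero.

    Assumption: warmup=0, since the card lists no warmup setting.
    """
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(total_steps, (min_examples, max_examples))
print(linear_lr(0), linear_lr(164), linear_lr(total_steps))
```

Under these assumptions the training set would hold between 321 and 328 examples, and the learning rate would have decayed to one fifth of its initial value by the start of the last epoch.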

### Training results

Each metric cell lists precision / recall / F1-score, rounded to four decimals. Support is constant across epochs: 1133 (B), 18277 (I), 9851 (O), 29261 tokens in total.

| Training Loss | Epoch | Step | Validation Loss | B (P / R / F1)           | I (P / R / F1)           | O (P / R / F1)           | Accuracy | Macro avg (P / R / F1)   | Weighted avg (P / R / F1) |
|:-------------:|:-----:|:----:|:---------------:|:------------------------:|:------------------------:|:------------------------:|:--------:|:------------------------:|:-------------------------:|
| No log        | 1.0   | 41   | 0.2075          | 0.8258 / 0.7282 / 0.7739 | 0.9306 / 0.9713 / 0.9505 | 0.9388 / 0.8754 / 0.9060 | 0.9296   | 0.8984 / 0.8583 / 0.8768 | 0.9293 / 0.9296 / 0.9287  |
| No log        | 2.0   | 82   | 0.1039          | 0.7818 / 0.9391 / 0.8532 | 0.9751 / 0.9764 / 0.9758 | 0.9661 / 0.9413 / 0.9536 | 0.9632   | 0.9077 / 0.9523 / 0.9275 | 0.9646 / 0.9632 / 0.9635  |
| No log        | 3.0   | 123  | 0.0875          | 0.8751 / 0.9153 / 0.8947 | 0.9870 / 0.9742 / 0.9806 | 0.9562 / 0.9741 / 0.9651 | 0.9719   | 0.9394 / 0.9545 / 0.9468 | 0.9723 / 0.9719 / 0.9720  |
| No log        | 4.0   | 164  | 0.0825          | 0.8817 / 0.9144 / 0.8977 | 0.9845 / 0.9792 / 0.9818 | 0.9639 / 0.9695 / 0.9667 | 0.9734   | 0.9434 / 0.9544 / 0.9488 | 0.9736 / 0.9734 / 0.9735  |
| No log        | 5.0   | 205  | 0.0856          | 0.8861 / 0.9135 / 0.8996 | 0.9856 / 0.9787 / 0.9821 | 0.9631 / 0.9723 / 0.9677 | 0.9740   | 0.9449 / 0.9548 / 0.9498 | 0.9742 / 0.9740 / 0.9741  |
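
To make the relationship between the columns concrete, the sketch below recomputes the macro and weighted averages of the final epoch from its per-class B, I and O values. It assumes the usual definitions (macro: unweighted mean over classes; weighted: mean weighted by support); the helper names `macro_avg` and `weighted_avg` are illustrative and not taken from the training script.

```python
import math

# Per-class metrics of the final epoch (step 205), copied from the table.
final_epoch = {
    "B": {"precision": 0.8861301369863014, "recall": 0.913503971756399,
          "f1": 0.8996088657105606, "support": 1133},
    "I": {"precision": 0.9856182499448976, "recall": 0.978661705969251,
          "f1": 0.9821276595744681, "support": 18277},
    "O": {"precision": 0.963097033685269, "recall": 0.9722870774540656,
          "f1": 0.9676702364113963, "support": 9851},
}


def macro_avg(report, key):
    """Unweighted mean of a metric over all classes."""
    return sum(cls[key] for cls in report.values()) / len(report)


def weighted_avg(report, key):
    """Mean of a metric over classes, weighted by each class's support."""
    total = sum(cls["support"] for cls in report.values())
    return sum(cls[key] * cls["support"] for cls in report.values()) / total


for key in ("precision", "recall", "f1"):
    print(key, round(macro_avg(final_epoch, key), 4),
          round(weighted_avg(final_epoch, key), 4))
```

The support-weighted recall coincides with the reported accuracy, as expected for single-label classification. Because B covers fewer than 4% of the tokens, the macro average is the figure more sensitive to span-boundary errors.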


### Framework versions

- Transformers 4.46.0
- PyTorch 2.5.0+cu124
- Datasets 3.0.2
- Tokenizers 0.20.1