rvv-karma/BASH-Coder-Flan-T5-base

+---
+license: apache-2.0
+base_model: google/flan-t5-base
+tags:
+- generated_from_trainer
+datasets:
+- tldr
+metrics:
+- rouge
+model-index:
+- name: BASH-Coder-Flan-T5-base
+  results:
+  - task:
+      name: Sequence-to-sequence Language Modeling
+      type: text2text-generation
+    dataset:
+      name: tldr
+      type: tldr
+      config: data
+      split: validation
+      args: data
+    metrics:
+    - name: Rouge1
+      type: rouge
+      value: 27.0741
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# BASH-Coder-Flan-T5-base
+This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the tldr dataset.
+It achieves the following results on the evaluation set:
+- Loss: 3.3608
+- Rouge1: 27.0741
+- Rouge2: 9.3824
+- Rougel: 26.133
+- Rougelsum: 26.1559
+- Gen Len: 15.5767
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 8
+- eval_batch_size: 16
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 100
+- num_epochs: 10
+- label_smoothing_factor: 0.1
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
+|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
+| 4.3554        | 1.0   | 802  | 3.5928          | 22.7234 | 6.7951 | 22.0647 | 22.0744   | 15.2363 |
+| 3.5335        | 2.0   | 1604 | 3.4654          | 25.7842 | 8.5847 | 24.8207 | 24.8808   | 15.168  |
+| 3.3341        | 3.0   | 2406 | 3.4078          | 25.5756 | 8.4456 | 24.706  | 24.7207   | 15.6472 |
+| 3.2011        | 4.0   | 3208 | 3.3789          | 26.0638 | 8.6853 | 25.0862 | 25.1223   | 16.2748 |
+| 3.1059        | 5.0   | 4010 | 3.3622          | 26.7254 | 9.1138 | 25.7985 | 25.8521   | 15.7366 |
+| 3.0336        | 6.0   | 4812 | 3.3662          | 26.4655 | 9.1283 | 25.4587 | 25.5112   | 16.548  |
+| 2.9727        | 7.0   | 5614 | 3.3593          | 26.8211 | 9.3045 | 25.8497 | 25.8772   | 15.5431 |
+| 2.9298        | 8.0   | 6416 | 3.3643          | 26.8932 | 9.3537 | 25.9444 | 26.0088   | 15.916  |
+| 2.9005        | 9.0   | 7218 | 3.3606          | 27.1732 | 9.5661 | 26.1198 | 26.1515   | 15.71   |
+| 2.8846        | 10.0  | 8020 | 3.3608          | 27.0741 | 9.3824 | 26.133  | 26.1559   | 15.5767 |
+### Framework versions
+- Transformers 4.37.0.dev0
+- Pytorch 2.1.0+cu121
+- Datasets 2.15.0
+- Tokenizers 0.15.0
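
The hyperparameters above (peak `learning_rate` 5e-05, `linear` scheduler, 100 warmup steps) together with the Step column of the results table (802 steps per epoch, 8020 in total) suggest a learning-rate curve along the following lines. This is a minimal sketch, not the Trainer's own code: it assumes the usual linear-warmup, linear-decay formula and no gradient accumulation, and the names `linear_lr`, `PEAK_LR`, and `TOTAL_STEPS` are introduced here for illustration only.

```python
# Sketch of the learning-rate schedule implied by the listed hyperparameters.
# ASSUMPTION: linear warmup followed by linear decay to zero over all steps.
PEAK_LR = 5e-05          # learning_rate
WARMUP_STEPS = 100       # lr_scheduler_warmup_steps
STEPS_PER_EPOCH = 802    # Step column of the results table, epoch 1.0
NUM_EPOCHS = 10          # num_epochs
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 8020, matches the last table row


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates."""
    if step < WARMUP_STEPS:
        # Warmup: ramp linearly from 0 up to the peak rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay: fall linearly from the peak rate to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    # Print the rate at each epoch boundary listed in the results table.
    for epoch in range(NUM_EPOCHS + 1):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:4d}  lr {linear_lr(step):.3e}")
```

Under the same assumptions (single device, no gradient accumulation), 802 steps per epoch at `train_batch_size` 8 would correspond to roughly 6,400 training examples per epoch.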

generation_config.json ADDED

	@@ -0,0 +1,8 @@

+{
+  "bos_token_id": 2,
+  "decoder_start_token_id": 2,
+  "eos_token_id": 1,
+  "max_length": 256,
+  "pad_token_id": 0,
+  "transformers_version": "4.37.0.dev0"
+}
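
Since the model card leaves the usage sections empty, the following sketch shows one way the checkpoint might be loaded and queried with the Transformers library. It is a hedged example, not documented usage: the repository ID is taken from the page title, the plain-English prompt format is an assumption (the card does not state how inputs were formatted during training), and `generate_command` is a hypothetical helper name.

```python
MODEL_ID = "rvv-karma/BASH-Coder-Flan-T5-base"  # repository name from this page

# Value copied from generation_config.json; generate() should also pick up
# the repository's generation config by default, so this is only explicit.
GENERATION_KWARGS = {"max_length": 256}


def generate_command(description: str, model_id: str = MODEL_ID) -> str:
    """Generate text for `description`.

    ASSUMPTION: the model accepts a bare natural-language description as
    input; the model card does not document the expected prompt format.
    """
    # Imported lazily so this module can be inspected without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize the description into PyTorch tensors and decode the best output.
    inputs = tokenizer(description, return_tensors="pt")
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `generate_command("list all files in the current directory")` downloads the weights on first use. Given the reported ROUGE-1 of about 27, any generated command should be reviewed before it is run.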

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cc04afbd53d6cc05b5cfd4b6dc1c951c62fd13df24a81edfcf70d08a0e78d3ea
+oid sha256:e308dfd3e63c1101832f6d3bb8ecc9df567333608538bb2d95d7f8303104164b
 size 990345064
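
The pointer diff above changes only the `oid` while `size` stays at 990345064 bytes, which is what one would expect when weights are updated without changing the architecture. As a small illustration, the sketch below parses the new pointer and derives a rough parameter count. The 4-bytes-per-parameter figure is an assumption (float32 storage), and `parse_lfs_pointer` is a hypothetical helper written for this example.

```python
# New pointer contents, copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e308dfd3e63c1101832f6d3bb8ecc9df567333608538bb2d95d7f8303104164b
size 990345064
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer = parse_lfs_pointer(POINTER_TEXT)
size_bytes = int(pointer["size"])
hash_algo, _, digest = pointer["oid"].partition(":")

# Rough upper bound on the parameter count, ASSUMING 4-byte float32 weights
# and ignoring the safetensors header and any non-weight metadata.
approx_params = size_bytes // 4

print(f"{hash_algo} digest: {digest[:12]}...")
print(f"size: {size_bytes / 1e6:.1f} MB, "
      f"at most ~{approx_params / 1e6:.0f}M float32 parameters")
```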