Update README.md

README.md +20 -8

@@ -27,14 +27,14 @@ tags:
 # 🚀 Falcon-7b-QueAns
-Falcon-7b-QueAns is a chatbot-like model for Question and Answering. It was built by fine-tuning [Falcon-7B](https://huggingface.co/tiiuae/falcon-7b) on the [SQuAD](https://huggingface.co/datasets/squad) dataset. This repo only includes the QLoRA adapters from fine-tuning with 🤗's [peft](https://github.com/huggingface/peft) package.
 ## Model Summary
 - **Model Type:** Causal decoder-only
 - **Language(s):** English
 - **Base Model:** Falcon-7B (License: Apache 2.0)
-- **Dataset:** [SQuAD](https://huggingface.co/datasets/squad) (License: cc-by-4.0)
 - **License(s):** Apache 2.0 inherited from "Base Model" and "Dataset"
@@ -51,21 +51,33 @@ Falcon-7b-QueAns is a chatbot-like model for Question and Answering. It was buil
 ## Model Details
-The model was fine-tuned in 4-bit precision using 🤗 `peft` adapters, `transformers`, and `bitsandbytes`. Training relied on a method called "Low Rank Adapters" ([LoRA](https://arxiv.org/pdf/2106.09685.pdf)), specifically the [QLoRA](https://arxiv.org/abs/2305.14314) variant. The run took approximately 4 hours and was executed on a workstation with a single T4 NVIDIA GPU with 15 GB of available memory. See attached [Colab Notebook] used to train the model.
 ### Model Date
-July 06, 2023
-Open source falcon 7b large language model fine tuned on SQuAD dataset for question and answering.
 QLoRA technique used for fine tuning the model on consumer grade GPU
 SFTTrainer is also used.
 Dataset used: SQuAD
-Dataset Size: 87278
-Training Steps: 500

 # 🚀 Falcon-7b-QueAns
+Falcon-7b-QueAns is a chatbot-like model for question answering. It was built by fine-tuning [Falcon-7B](https://huggingface.co/tiiuae/falcon-7b) on the [SQuAD](https://huggingface.co/datasets/squad), [Adversarial_qa](https://huggingface.co/datasets/adversarial_qa), and Trimpixel (self-made) datasets. This repo includes only the QLoRA adapters produced by fine-tuning with 🤗's [peft](https://github.com/huggingface/peft) package.
 ## Model Summary
 - **Model Type:** Causal decoder-only
 - **Language(s):** English
 - **Base Model:** Falcon-7B (License: Apache 2.0)
+- **Dataset:** [SQuAD](https://huggingface.co/datasets/squad) (License: cc-by-4.0), [Adversarial_qa](https://huggingface.co/datasets/adversarial_qa) (License: cc-by-sa-4.0), [Falcon-RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) (License: odc-by), Trimpixel (self-made)
 - **License(s):** Apache 2.0 inherited from "Base Model" and "Dataset"
 ## Model Details
+The model was fine-tuned in 4-bit precision using 🤗 `peft` adapters, `transformers`, and `bitsandbytes`. Training relied on a method called "Low Rank Adapters" ([LoRA](https://arxiv.org/pdf/2106.09685.pdf)), specifically the [QLoRA](https://arxiv.org/abs/2305.14314) variant. The run took approximately 12 hours and was executed on a workstation with a single NVIDIA T4 GPU with 25 GB of available memory. See the attached [Colab Notebook] used to train the model.
 ### Model Date
+July 13, 2023
+Open-source Falcon-7B large language model fine-tuned on the SQuAD, Adversarial_qa, and Trimpixel datasets for question answering.
 QLoRA technique used for fine tuning the model on consumer grade GPU
 SFTTrainer is also used.
+## Datasets
+1.
 Dataset used: SQuAD
+Dataset Size: 87599
+Training Steps: 350
+2.
+Dataset used: Adversarial_qa
+Dataset Size: 30000
+Training Steps: 400
+3.
+Dataset used: Trimpixel
+Dataset Size: 1757
+Training Steps: 400
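
The Model Details text in this change describes 4-bit fine-tuning with `peft`, `transformers`, and `bitsandbytes` but gives no hyperparameters. The following is a minimal sketch of how such a QLoRA setup is commonly configured, not the author's actual notebook. Every value marked "assumption" (rank, alpha, dropout, target modules, NF4, double quantization, float16 compute) is hypothetical and does not come from the README.

```python
# Sketch of a QLoRA-style configuration. Values marked "assumption" are
# illustrative only; the README does not state them.

QUANT_SETTINGS = {
    "load_in_4bit": True,               # README: fine-tuned in 4-bit precision
    "bnb_4bit_quant_type": "nf4",       # assumption: NF4 data type from the QLoRA paper
    "bnb_4bit_use_double_quant": True,  # assumption: also quantize the quantization constants
}

LORA_SETTINGS = {
    "r": 16,                                # assumption: adapter rank
    "lora_alpha": 32,                       # assumption: scaling factor
    "lora_dropout": 0.05,                   # assumption
    "bias": "none",                         # assumption: train no bias terms
    "task_type": "CAUSAL_LM",               # README: causal decoder-only model
    "target_modules": ["query_key_value"],  # assumption: Falcon's fused attention projection
}


def build_configs():
    """Build the quantization and adapter config objects.

    Imports are deferred so the settings above can be inspected
    without torch, transformers, or peft installed.
    """
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.float16,  # assumption: float16 compute on a T4
        **QUANT_SETTINGS,
    )
    lora_config = LoraConfig(**LORA_SETTINGS)
    return quant_config, lora_config
```

In a typical setup, `quant_config` would be passed as `quantization_config` when loading `tiiuae/falcon-7b` with `AutoModelForCausalLM.from_pretrained`, and `lora_config` would be passed as `peft_config` to `SFTTrainer`. Whether the training notebook follows this pattern is not confirmed by the README.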