Update README.md
README.md
CHANGED
@@ -27,9 +27,7 @@ datasets:

 ## Overview

-This repository contains a fine-tuned model for generating high-quality product descriptions. The model is based on the `t5-base` and has 223 million parameters. It has been fine-tuned on the Amazon Product Dataset, which contains 10 million examples, with the cleaned version having 0.5 million examples.
-
-Developed by team at https://exnrt.com
+This repository contains a fine-tuned model for generating high-quality product descriptions. The model is based on the `t5-base` and has 223 million parameters. It has been fine-tuned on the Amazon Product Dataset, which contains 10 million examples, with the cleaned version having 0.5 million examples.

 ## Usage

@@ -91,12 +89,7 @@ generate_description(title)
 ## Features

 - **Architecture**: t5-base (223M parameters)
-- **Dataset**:
-- **Original**: 10 million examples
-- **Cleaned**: 0.5 million examples
-- **Training**:
-- **Current Version**: Trained on 0.1 million cleaned examples
-- **Upcoming Update**: Will be trained on 0.5 million cleaned examples
+- **Training Dataset**: Trained on 0.5 million cleaned examples
 - **Training Time**:
 - **Hardware**: Colab T4 GPU
 - **Speed**: 4.91 iterations/second
@@ -110,8 +103,8 @@ generate_description(title)

 ## Data Preparation

-- **Training Data**: First
-- **
+- **Training Data**: First 250,000 examples for `train`
+- **Validation Data**: First 40,000 examples for `validation`
 - **Source Max Token Length**: 50
 - **Target Max Token Length**: 300
 - **Batch Size**: 1