Create readme.md
readme.md
ADDED
@@ -0,0 +1,61 @@
| 1 |
+
Project: Multitask Learning for Agent-Action Identification
|
| 2 |
+
|
| 3 |
+
Project Overview
|
| 4 |
+
This project aims to develop a multitask learning model for identifying agents and actions in text data. The model is trained on a custom dataset of text examples, where each example is annotated with the agents and actions present in the text.
|
| 5 |
+
Project Structure
|
| 6 |
+
The project is organized into the following directories and files:
|
| 7 |
+
dataset/: contains the custom dataset class for loading and processing the text data
|
| 8 |
+
dataset.py: defines the dataset class
|
| 9 |
+
data_collator.py: defines the data collator class
|
| 10 |
+
model/: contains the multitask learning model architecture
|
| 11 |
+
model.py: defines the model architecture
|
| 12 |
+
training/: contains the training loop and evaluation code
|
| 13 |
+
main.py: contains the training loop and evaluation code
|
| 14 |
+
data/: contains the dataset files for training, validation, and testing
|
| 15 |
+
train.csv: training dataset
|
| 16 |
+
val.csv: validation dataset
|
| 17 |
+
test.csv: testing dataset
|
| 18 |
+
requirements.txt: lists the dependencies required to run the project
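To illustrate what the collator in data_collator.py might be responsible for, here is a minimal sketch that pads a batch of tokenized examples to a common length. The field names (`input_ids`, `agent_label`, `action_label`) and the padding ID of 0 are assumptions for illustration; the actual class in this project may differ.

```python
from typing import Dict, List


def collate(batch: List[Dict], pad_id: int = 0) -> Dict[str, List]:
    """Pad every example to the longest sequence in the batch (hypothetical field names)."""
    max_len = max(len(example["input_ids"]) for example in batch)
    input_ids, attention_mask = [], []
    for example in batch:
        ids = example["input_ids"]
        padding = max_len - len(ids)
        input_ids.append(ids + [pad_id] * padding)
        # 1 marks real tokens, 0 marks padding.
        attention_mask.append([1] * len(ids) + [0] * padding)
    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "agent_labels": [example["agent_label"] for example in batch],
        "action_labels": [example["action_label"] for example in batch],
    }


batch = collate([
    {"input_ids": [101, 7, 8, 102], "agent_label": 1, "action_label": 0},
    {"input_ids": [101, 9, 102], "agent_label": 0, "action_label": 2},
])
print(batch["input_ids"])
```

Padding per batch rather than to a fixed global length keeps short batches small, which is the usual reason to use a collator at all.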
|
| 19 |
+
Dataset
|
| 20 |
+
The dataset consists of text examples, each annotated with the agents and actions it contains. It is split into training, validation, and testing sets.
|
| 21 |
+
Training Set: 80% of the dataset (10,000 examples)
|
| 22 |
+
Validation Set: 10% of the dataset (1,250 examples)
|
| 23 |
+
Testing Set: 10% of the dataset (1,250 examples)
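The split sizes above add up to 12,500 examples in total. As a quick sanity check, the sketch below recomputes the three sizes from that total; the helper name `split_sizes` is hypothetical and not part of the project.

```python
def split_sizes(total: int, train_pct: int = 80, val_pct: int = 10):
    """Return (train, val, test) sizes; the test set takes whatever remains."""
    train = total * train_pct // 100
    val = total * val_pct // 100
    test = total - train - val
    return train, val, test


sizes = split_sizes(12_500)
print(sizes)  # (10000, 1250, 1250)
```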
|
| 24 |
+
Model
|
| 25 |
+
The model is a multitask learner built on the BERT architecture and trained to predict agents and actions simultaneously.
|
| 26 |
+
Model Architecture:
|
| 27 |
+
BERT encoder
|
| 28 |
+
Two classification heads for agents and actions
|
| 29 |
+
Model Parameters:
|
| 30 |
+
BERT encoder: 110M parameters
|
| 31 |
+
Classification heads: 10M parameters
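The shared-encoder, two-head layout described above can be sketched as follows. This is not the code in model.py: it assumes one agent label and one action label per example, sums the two cross-entropy losses with equal weight, and uses a tiny stand-in encoder (`TinyEncoder`, hypothetical) in place of BERT so that it runs without downloading weights.

```python
import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Stand-in for the BERT encoder so the sketch runs offline."""

    def __init__(self, vocab_size: int = 100, hidden_size: int = 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        # Mean-pool token embeddings into one vector per example.
        return self.embed(input_ids).mean(dim=1)


class MultitaskClassifier(nn.Module):
    """Shared encoder feeding two linear heads, one per task."""

    def __init__(self, encoder: nn.Module, hidden_size: int, num_agents: int, num_actions: int):
        super().__init__()
        self.encoder = encoder
        self.agent_head = nn.Linear(hidden_size, num_agents)
        self.action_head = nn.Linear(hidden_size, num_actions)
        self.loss_fn = nn.CrossEntropyLoss()

    def forward(self, input_ids, agent_labels=None, action_labels=None):
        pooled = self.encoder(input_ids)
        agent_logits = self.agent_head(pooled)
        action_logits = self.action_head(pooled)
        loss = None
        if agent_labels is not None and action_labels is not None:
            # Assumption: equal-weight sum of the two task losses.
            loss = self.loss_fn(agent_logits, agent_labels) + self.loss_fn(action_logits, action_labels)
        return loss, agent_logits, action_logits


torch.manual_seed(0)
model = MultitaskClassifier(TinyEncoder(), hidden_size=32, num_agents=4, num_actions=6)
input_ids = torch.randint(0, 100, (2, 8))  # batch of 2 sequences, 8 tokens each
loss, agent_logits, action_logits = model(input_ids, torch.tensor([0, 3]), torch.tensor([5, 1]))
print(agent_logits.shape, action_logits.shape)
```

Because both heads read the same pooled representation, gradients from both tasks update the encoder, which is the point of the multitask setup.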
|
| 32 |
+
Training
|
| 33 |
+
The model is trained using the Trainer class from the Hugging Face Transformers library. The training loop is defined in main.py.
|
| 34 |
+
Training Hyperparameters:
|
| 35 |
+
Batch size: 16
|
| 36 |
+
Number of epochs: 3
|
| 37 |
+
Learning rate: 1e-5
|
| 38 |
+
Training Time: approximately 10 hours on a single NVIDIA V100 GPU
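For orientation, the hyperparameters above imply the following number of optimizer steps. This is a back-of-the-envelope sketch that assumes a single device, no gradient accumulation, and that a final partial batch would be kept.

```python
import math

train_examples = 10_000  # training set size from the Dataset section
batch_size = 16
num_epochs = 3

steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * num_epochs
print(steps_per_epoch, total_steps)  # 625 1875
```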
|
| 39 |
+
Evaluation
|
| 40 |
+
The model is evaluated on the validation set during training. The evaluation metric is accuracy.
|
| 41 |
+
Evaluation Metric: accuracy
|
| 42 |
+
Evaluation Frequency: every 500 steps
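Since the model has two heads, accuracy can be reported per task. The sketch below shows one way to do that with plain Python lists; the function names and the per-task breakdown are assumptions, as the README does not say how the two accuracies are combined.

```python
from typing import Dict, List


def argmax(row: List[float]) -> int:
    """Index of the largest logit in a row."""
    return max(range(len(row)), key=row.__getitem__)


def accuracy(logits: List[List[float]], labels: List[int]) -> float:
    """Fraction of examples whose highest-scoring class matches the label."""
    predictions = [argmax(row) for row in logits]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def compute_metrics(agent_logits, agent_labels, action_logits, action_labels) -> Dict[str, float]:
    return {
        "agent_accuracy": accuracy(agent_logits, agent_labels),
        "action_accuracy": accuracy(action_logits, action_labels),
    }


metrics = compute_metrics(
    agent_logits=[[2.0, 0.1], [0.3, 1.5], [0.9, 0.2], [0.1, 0.4]],
    agent_labels=[0, 1, 1, 1],
    action_logits=[[0.1, 0.2, 3.0], [1.0, 0.5, 0.2], [0.3, 2.2, 0.1], [0.6, 0.1, 0.2]],
    action_labels=[2, 0, 1, 0],
)
print(metrics)  # {'agent_accuracy': 0.75, 'action_accuracy': 1.0}
```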
|
| 43 |
+
Requirements
|
| 44 |
+
The project requires the following dependencies:
|
| 45 |
+
Python: 3.8+
|
| 46 |
+
Transformers: 4.20.1+
|
| 47 |
+
Torch: 1.12.0+
|
| 48 |
+
Pandas: 1.4.2+
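One way to check an environment against these minimum versions is sketched below using only the standard library. The helper names are hypothetical, and the version parsing is deliberately simple: it compares only the leading numeric release part and ignores suffixes such as `+cu118`.

```python
import re
from importlib import metadata

# Minimum versions listed above.
MINIMUMS = {"transformers": "4.20.1", "torch": "1.12.0", "pandas": "1.4.2"}


def version_tuple(version: str) -> tuple:
    """Turn the leading dotted-numeric part into a tuple, e.g. '2.1.0+cu118' -> (2, 1, 0)."""
    match = re.match(r"\d+(\.\d+)*", version)
    return tuple(int(part) for part in match.group(0).split(".")) if match else ()


def check(minimums: dict) -> dict:
    """Report 'ok', 'missing', or 'too old' for each required package."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = version_tuple(installed) >= version_tuple(minimum)
        report[name] = "ok" if ok else f"too old ({installed} < {minimum})"
    return report


print(check(MINIMUMS))
```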
|
| 49 |
+
Usage
|
| 50 |
+
To train the model, run the following command:
|
| 51 |
+
Bash
|
| 52 |
+
python main.py
|
| 53 |
+
To evaluate the model, run the following command:
|
| 54 |
+
Bash
|
| 55 |
+
python main.py --mode eval
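The two commands above suggest that main.py accepts a `--mode` flag and trains when it is omitted. A minimal sketch of such argument handling follows; the `train` choice name and the default are assumptions inferred from the commands, not taken from main.py itself.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Train or evaluate the multitask model.")
    parser.add_argument(
        "--mode",
        choices=["train", "eval"],
        default="train",  # assumed default, since `python main.py` trains
        help="whether to run training or evaluation",
    )
    return parser.parse_args(argv)


args = parse_args(["--mode", "eval"])
print(args.mode)  # eval
```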
|
| 56 |
+
License
|
| 57 |
+
This project is licensed under the MIT License.
|
| 58 |
+
Acknowledgments
|
| 59 |
+
This project was inspired by the work of [Dennis Duncan].
|
| 60 |
+
Contributing
|
| 61 |
+
Contributions are welcome! Please open an issue or submit a pull request to contribute to the project.
|