| project: Multitask Learning for Agent-Action Identification | |
| Project Overview | |
| This project aims to develop a multitask learning model for identifying agents and actions in text data. The model is trained on a custom dataset of text examples, where each example is annotated with the agents and actions present in the text. | |
| Project Structure | |
| The project is organized into the following directories and files: | |
| dataset/: contains the custom dataset and data collator classes for loading, processing, and batching the text data | |
| dataset.py: defines the dataset class | |
| data_collator.py: defines the data collator class | |
| model/: contains the multitask learning model architecture | |
| model.py: defines the model architecture | |
| training/: contains the training loop and evaluation code | |
| main.py: entry point that runs the training loop and, with --mode eval, the evaluation code | |
| data/: contains the dataset files for training, validation, and testing | |
| train.csv: training dataset | |
| val.csv: validation dataset | |
| test.csv: testing dataset | |
| requirements.txt: lists the dependencies required to run the project | |
| Dataset | |
| Each example in the dataset is a text annotated with the agents and actions it contains. The 12,500 examples are split into training, validation, and testing sets: | |
| Training Set: 80% of the dataset (10,000 examples) | |
| Validation Set: 10% of the dataset (1,250 examples) | |
| Testing Set: 10% of the dataset (1,250 examples) | |
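The 80/10/10 split above can be reproduced with a seeded shuffle. The following is a minimal standard-library sketch, not the project's actual split procedure (which is not described here); the helper name `split_indices` and the seed value are hypothetical.

```python
import random


def split_indices(n_examples, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle example indices and cut them into train/val/test lists.

    Hypothetical helper: the fractions match the split listed above,
    the seed is an arbitrary choice for reproducibility.
    """
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)  # seeded, so the split is repeatable

    n_train = round(n_examples * train_frac)
    n_val = round(n_examples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(12_500)
print(len(train_idx), len(val_idx), len(test_idx))  # 10000 1250 1250
```

The index lists could then be used to select rows before writing train.csv, val.csv, and test.csv.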
| Model | |
| The model is a multitask learning model built on a BERT encoder. It is trained to predict agents and actions jointly, with both tasks sharing the same encoder. | |
| Model Architecture: | |
| BERT encoder | |
| Two classification heads for agents and actions | |
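As a rough illustration of this architecture, the sketch below wires one shared BERT encoder to two linear heads. It is an assumption-laden example, not the contents of model.py: the class name `MultiTaskClassifier`, the use of the [CLS] token for sequence-level classification, the label counts, and the unweighted sum of the two losses are all hypothetical choices. A tiny randomly initialised encoder is used so the example runs without downloading weights.

```python
import torch
from torch import nn
from transformers import BertConfig, BertModel


class MultiTaskClassifier(nn.Module):
    """Shared BERT encoder with one linear head per task (illustrative sketch)."""

    def __init__(self, encoder, num_agent_labels, num_action_labels):
        super().__init__()
        self.encoder = encoder
        hidden = encoder.config.hidden_size
        self.agent_head = nn.Linear(hidden, num_agent_labels)
        self.action_head = nn.Linear(hidden, num_action_labels)

    def forward(self, input_ids, attention_mask, agent_labels=None, action_labels=None):
        encoded = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = encoded.last_hidden_state[:, 0]  # assumption: classify from the [CLS] token
        output = {
            "agent_logits": self.agent_head(cls),
            "action_logits": self.action_head(cls),
        }
        if agent_labels is not None and action_labels is not None:
            loss_fn = nn.CrossEntropyLoss()
            # assumption: the two task losses are simply added with equal weight
            output["loss"] = (
                loss_fn(output["agent_logits"], agent_labels)
                + loss_fn(output["action_logits"], action_labels)
            )
        return output


# Tiny random encoder for demonstration only; a real run would load pretrained weights.
tiny_config = BertConfig(
    vocab_size=100, hidden_size=32, num_hidden_layers=1,
    num_attention_heads=2, intermediate_size=64,
)
model = MultiTaskClassifier(BertModel(tiny_config), num_agent_labels=3, num_action_labels=5)

input_ids = torch.randint(0, 100, (2, 8))  # batch of 2 sequences, 8 tokens each
attention_mask = torch.ones_like(input_ids)
out = model(
    input_ids, attention_mask,
    agent_labels=torch.tensor([0, 1]),
    action_labels=torch.tensor([2, 4]),
)
print(out["agent_logits"].shape, out["action_logits"].shape)
```

Returning a dictionary with a `loss` key keeps the module usable with the Hugging Face Trainer, which reads the loss from that key when labels are supplied.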
| Model Parameters: | |
| BERT encoder: 110M parameters | |
| Classification heads: 10M parameters | |
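The 110M figure for the encoder can be checked with a short calculation. The sketch below assumes the standard BERT-base configuration (30,522-token vocabulary, hidden size 768, 12 layers, feed-forward size 3,072); the document does not say which BERT checkpoint is used, so these defaults are an assumption, and the function name is hypothetical.

```python
def bert_parameter_count(vocab=30522, hidden=768, layers=12, intermediate=3072,
                         max_positions=512, type_vocab=2):
    """Count the weights and biases of a BERT encoder with the given sizes."""
    layer_norm = 2 * hidden  # scale + bias
    # token, position, and segment embeddings, followed by a LayerNorm
    embeddings = (vocab + max_positions + type_vocab) * hidden + layer_norm
    # query, key, value, and output projections, followed by a LayerNorm
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # two dense layers (expand, then project back), followed by a LayerNorm
    feed_forward = (
        (hidden * intermediate + intermediate)
        + (intermediate * hidden + hidden)
        + layer_norm
    )
    pooler = hidden * hidden + hidden  # dense layer applied to the [CLS] token
    return embeddings + layers * (attention + feed_forward) + pooler


print(bert_parameter_count())  # 109482240, i.e. roughly 110M
```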
| Training | |
| The model is trained using the Trainer class from the Hugging Face Transformers library. The training loop is defined in main.py. | |
| Training Hyperparameters: | |
| Batch size: 16 | |
| Number of epochs: 3 | |
| Learning rate: 1e-5 | |
| Training Time: approximately 10 hours on a single NVIDIA V100 GPU | |
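These hyperparameters fix the number of optimizer steps in a run. The arithmetic below is a sketch that assumes a single device, no gradient accumulation, and that a partial final batch is kept; it also uses the 500-step evaluation interval stated in the Evaluation section.

```python
import math

train_examples = 10_000  # size of the training set
batch_size = 16
epochs = 3
eval_every = 500         # evaluation interval in steps

# assumption: one device, no gradient accumulation, partial last batch kept
steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * epochs
eval_points = list(range(eval_every, total_steps + 1, eval_every))

print(steps_per_epoch)  # 625
print(total_steps)      # 1875
print(eval_points)      # [500, 1000, 1500]
```

Under these assumptions a full run evaluates on the validation set three times before training ends.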
| Evaluation | |
| The model is evaluated on the validation set during training. | |
| Evaluation Metric: accuracy | |
| Evaluation Frequency: every 500 steps | |
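Because the model has two outputs, accuracy can be computed once per task. The sketch below is one possible way to do that in plain Python; how the project actually combines or reports the two accuracies is not stated, and the function and key names are hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of positions where the predicted label equals the true label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def multitask_accuracy(agent_preds, agent_labels, action_preds, action_labels):
    """Report accuracy separately for each task (assumed reporting scheme)."""
    return {
        "agent_accuracy": accuracy(agent_preds, agent_labels),
        "action_accuracy": accuracy(action_preds, action_labels),
    }


metrics = multitask_accuracy(
    agent_preds=[1, 0, 2, 1], agent_labels=[1, 0, 1, 1],
    action_preds=[3, 1, 0, 2], action_labels=[3, 1, 0, 2],
)
print(metrics)  # {'agent_accuracy': 0.75, 'action_accuracy': 1.0}
```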
| Requirements | |
| The project requires the following dependencies: | |
| Python: 3.8+ | |
| Transformers: 4.20.1+ | |
| Torch: 1.12.0+ | |
| Pandas: 1.4.2+ | |
| Usage | |
| To train the model, run the following command: | |
| python main.py | |
| To evaluate the model, run the following command: | |
| python main.py --mode eval | |
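The two commands above imply that main.py accepts a `--mode` flag and trains when it is omitted. A minimal sketch of such argument handling with `argparse` might look like the following; the `train` value, the default, and the function name `parse_mode` are assumptions, since only `--mode eval` appears in this README.

```python
import argparse


def parse_mode(argv):
    """Return the run mode selected on the command line (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Train or evaluate the multitask model.")
    parser.add_argument(
        "--mode",
        choices=["train", "eval"],  # "train" is an assumed name for the default mode
        default="train",
        help="run training (default) or evaluation only",
    )
    return parser.parse_args(argv).mode


print(parse_mode([]))                  # train
print(parse_mode(["--mode", "eval"]))  # eval
```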
| License | |
| This project is licensed under the MIT License. | |
| Acknowledgments | |
| This project was inspired by the work of [Dennis Duncan]. | |
| Contributing | |
| Contributions are welcome! Please open an issue or submit a pull request to contribute to the project. | |