| | --- |
| | {} |
| | --- |
| | language: en |
| | license: cc-by-4.0 |
| | tags: |
| | - text-classification |
| | repo: https://github.com/AAP9002/COMP34812-NLU-NLI |
| |
|
| | --- |
| |
|
| | # Model Card for z72819ap-e91802zc-NLI |
| |
|
| | <!-- Provide a quick summary of what the model is/does. --> |
| |
|
| | This is a classification model that was trained to detect whether a premise and hypothesis entail each other or not, using binary classification. |
| |
|
| |
|
| | ## Model Details |
| |
|
| | ### Model Description |
| |
|
| | <!-- Provide a longer summary of what this model is. --> |
| |
|
| | This model is based upon the Enhanced LSTM for Natural Language Inference architecture using BILSTM instead of LSTM trained on over 24K premise-hypothesis pairs from the shared task dataset for Natural Language Inference (NLI). |
| |
|
| | - **Developed by:** Alan Prophett and Zac Curtis |
| | - **Language(s):** English |
| | - **Model type:** Supervised |
| | - **Model architecture:** BILSTM |
| | - **Finetuned from model [optional]:** None |
| |
|
| | ### Model Resources |
| |
|
| | <!-- Provide links where applicable. --> |
| |
|
| | - **Repository:** None |
| | - **Paper or documentation:** None |
| |
|
| | ## Training Details |
| |
|
| | ### Training Data |
| |
|
| | <!-- This is a short stub of information on the training data that was used, and documentation related to data pre-processing or additional filtering (if applicable). --> |
| |
|
| | 24K+ premise-hypothesis pairs from the shared task dataset provided for Natural Language Inference (NLI). |
| |
|
| | ### Training Procedure |
| |
|
| | <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. --> |
| |
|
| | #### Training Hyperparameters |
| |
|
| | <!-- This is a summary of the values of hyperparameters used in training the model. --> |
| |
|
| |
|
| | - seed: 42 |
| | - learning_rate: 1e-04 |
| | - train_batch_size: 64 |
| | - eval_batch_size: 64 |
| | - num_epochs: 20 |
| | |
| | |
| | #### Speeds, Sizes, Times |
| |
|
| | <!-- This section provides information about how roughly how long it takes to train the model and the size of the resulting model. --> |
| |
|
| |
|
| | - overall training time: 3 minutes 4 seconds |
| | - duration per training epoch: 34 seconds |
| | - model size: 30.7 MB |
| |
|
| | ## Evaluation |
| |
|
| | <!-- This section describes the evaluation protocols and provides the results. --> |
| |
|
| | ### Testing Data & Metrics |
| |
|
| | #### Testing Data |
| |
|
| | <!-- This should describe any evaluation data used (e.g., the development/validation set provided). --> |
| |
|
| | A subset of the development set provided, amounting to 6K+ pairs. |
| |
|
| | #### Metrics |
| |
|
| | <!-- These are the evaluation metrics being used. --> |
| |
|
| |
|
| | - Recall |
| | - F1-score |
| | - Accuracy |
| |
|
| | ### Results |
| |
|
| | The BILSTM RNN Model obtained an F1-score of 70% and an accuracy of 70%. |
| |
|
| | ## Technical Specifications |
| |
|
| | ### Hardware |
| |
|
| |
|
| | - RAM: at least 25 GB |
| | - Storage: at least 38.1 GB, |
| | - GPU: a100 40GB |
| |
|
| | ### Software |
| |
|
| |
|
| | - Tensorflow 2.18.0+cu12.4 |
| | - Pandas 2.2.2 |
| | - NumPy 2.0.2 |
| | - Seaborn 0.13.2 |
| | - Matplotlib 3.10.0 |
| | - Scikit-learn 1.6.1 |
| |
|
| | ## Bias, Risks, and Limitations |
| |
|
| | <!-- This section is meant to convey both technical and sociotechnical limitations. --> |
| |
|
| | Any inputs (concatenation of two sequences) longer than |
| | 512 subwords will be truncated by the model. |
| | |
| | ## Additional Information |
| |
|
| | <!-- Any other information that would be useful for other people to know. --> |
| |
|
| | The hyperparameters were determined by experimentation |
| | with different values. |