---
language: en
license: cc-by-4.0
tags:
- text-classification
repo: https://github.com/AAP9002/COMP34812-NLU-NLI
---
# Model Card for z72819ap-e91802zc-NLI
<!-- Provide a quick summary of what the model is/does. -->
This is a binary classification model trained to detect whether a premise entails a hypothesis.
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
This model is based on the Enhanced LSTM for Natural Language Inference (ESIM) architecture, using BILSTM layers in place of LSTM layers. It was trained on over 24K premise-hypothesis pairs from the shared task dataset for Natural Language Inference (NLI).
- **Developed by:** Alan Prophett and Zac Curtis
- **Language(s):** English
- **Model type:** Supervised
- **Model architecture:** BILSTM
- **Finetuned from model:** None
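As an illustration of the architecture named above, the following is a minimal sketch of the soft-alignment and enhancement step described in the Enhanced LSTM for Natural Language Inference paper. It uses plain Python lists in place of BILSTM outputs, and the function names (`soft_align`, `enhance`) are hypothetical. This card does not state which ESIM components the model reproduces, so treat this as background rather than the model's actual code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def soft_align(premise, hypothesis):
    """For each premise vector, return the attention-weighted sum of hypothesis vectors."""
    dim = len(hypothesis[0])
    aligned = []
    for a in premise:
        # Attention weights come from dot-product similarity with every hypothesis vector.
        weights = softmax([dot(a, b) for b in hypothesis])
        aligned.append(
            [sum(w * b[k] for w, b in zip(weights, hypothesis)) for k in range(dim)]
        )
    return aligned


def enhance(a, a_tilde):
    """Concatenate [a; a_tilde; a - a_tilde; a * a_tilde] as in the ESIM paper."""
    diff = [x - y for x, y in zip(a, a_tilde)]
    prod = [x * y for x, y in zip(a, a_tilde)]
    return a + a_tilde + diff + prod


# Toy encoded sequences (stand-ins for BILSTM hidden states).
premise = [[1.0, 0.0], [0.0, 1.0]]
hypothesis = [[1.0, 0.0], [1.0, 0.0]]
aligned = soft_align(premise, hypothesis)
features = [enhance(a, a_t) for a, a_t in zip(premise, aligned)]
```

In the paper, the same alignment is also computed in the other direction (hypothesis over premise), and the enhanced features are passed to a second recurrent layer before pooling and classification.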
### Model Resources
<!-- Provide links where applicable. -->
- **Repository:** https://github.com/AAP9002/COMP34812-NLU-NLI
- **Paper or documentation:** None
## Training Details
### Training Data
<!-- This is a short stub of information on the training data that was used, and documentation related to data pre-processing or additional filtering (if applicable). -->
24K+ premise-hypothesis pairs from the shared task dataset provided for Natural Language Inference (NLI).
### Training Procedure
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
#### Training Hyperparameters
<!-- This is a summary of the values of hyperparameters used in training the model. -->
- seed: 42
- learning_rate: 1e-04
- train_batch_size: 64
- eval_batch_size: 64
- num_epochs: 20
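The values above could be collected in a single configuration object, as in this minimal sketch. The `TrainConfig` class and `steps_per_epoch` helper are hypothetical names, not part of the original training code, and the example size of 24,000 pairs is an assumption standing in for the "24K+" figure.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as listed in this model card."""
    seed: int = 42
    learning_rate: float = 1e-4
    train_batch_size: int = 64
    eval_batch_size: int = 64
    num_epochs: int = 20


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of batches needed to cover the dataset once (last batch may be partial)."""
    return math.ceil(num_examples / batch_size)


config = TrainConfig()
# 24_000 is an assumed round number; the card only states "24K+" pairs.
batches = steps_per_epoch(24_000, config.train_batch_size)
```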
#### Speeds, Sizes, Times
<!-- This section provides information about roughly how long it takes to train the model and the size of the resulting model. -->
- overall training time: 3 minutes 4 seconds
- duration per training epoch: 34 seconds
- model size: 30.7 MB
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
### Testing Data & Metrics
#### Testing Data
<!-- This should describe any evaluation data used (e.g., the development/validation set provided). -->
A subset of the development set provided, amounting to 6K+ pairs.
#### Metrics
<!-- These are the evaluation metrics being used. -->
- Recall
- F1-score
- Accuracy
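For reference, the sketch below shows how these three metrics are commonly computed for binary labels. The function name `binary_metrics` is hypothetical, and it reports recall and F1 for the positive class only. This card does not state which averaging scheme (binary, macro, or weighted) was used for the reported scores, so that choice is an assumption.

```python
def binary_metrics(y_true, y_pred):
    """Compute recall, F1-score, and accuracy for 0/1 labels (positive class = 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    # Guard against division by zero when a class is absent.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"recall": recall, "f1": f1, "accuracy": accuracy}


# Toy example: one true positive, one false negative, one true negative, one false positive.
scores = binary_metrics([1, 1, 0, 0], [1, 0, 0, 1])
```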
### Results
The BILSTM RNN model obtained an F1-score of 70% and an accuracy of 70% on the testing data described above.
## Technical Specifications
### Hardware
- RAM: at least 25 GB
- Storage: at least 38.1 GB
- GPU: NVIDIA A100 (40 GB)
### Software
- TensorFlow 2.18.0+cu12.4
- Pandas 2.2.2
- NumPy 2.0.2
- Seaborn 0.13.2
- Matplotlib 3.10.0
- Scikit-learn 1.6.1
## Bias, Risks, and Limitations
<!-- This section is meant to convey both technical and sociotechnical limitations. -->
Any inputs (concatenation of two sequences) longer than 512 subwords will be truncated by the model.
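The effect of this limit can be illustrated with a minimal sketch. It assumes that the premise and hypothesis tokens are simply concatenated and that tokens past position 512 are dropped from the end. The card does not specify the tokenizer, any separator tokens, or which end is cut, so `build_input` and its behaviour are hypothetical.

```python
MAX_LEN = 512  # limit stated in this model card, in subword tokens


def build_input(premise_tokens, hypothesis_tokens, max_len=MAX_LEN):
    """Concatenate two token lists and cut the result to max_len tokens.

    Returns the (possibly shortened) token list and a flag that is True
    when tokens were dropped.
    """
    combined = list(premise_tokens) + list(hypothesis_tokens)
    was_truncated = len(combined) > max_len
    return combined[:max_len], was_truncated


# A long premise leaves little room for the hypothesis under this scheme.
premise = ["p"] * 510
hypothesis = ["h"] * 10
tokens, was_truncated = build_input(premise, hypothesis)
kept_hypothesis_tokens = tokens.count("h")
```

Under this assumed scheme, a long premise crowds out the hypothesis, so predictions on very long pairs may rest on an incomplete hypothesis.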
## Additional Information
<!-- Any other information that would be useful for other people to know. -->
The hyperparameters were determined by experimentation with different values.