language: en
license: mit
model_id: Statement_Equivalence
developers: Matt Stammers
model_type: BERT-Base-Uncased
model_summary: >-
  This model compares the semantic similarity of two text statements. It is
  the first BERT model I have fine-tuned, so there may be bugs. The model
  labels should read equivalent/not-equivalent, but despite mapping the
  id2label variables they still display as LABEL_0/LABEL_1 in the inference
  widget. I may come back and fix this at a later date.
shared_by: Matt Stammers
finetuned_from: bert-base-uncased
repo: >-
https://huggingface.co/MattStammers/Statement_Equivalence?text=I+like+you.+I+love+you
paper: N/A
demo: N/A
direct_use: Test it out via the hosted inference widget on the repository page
downstream_use: 'None intended: this is a standalone application'
out_of_scope_use: >-
  The model will not work with very complex sentences, nor can it compare
  more than three statements
bias_risks_limitations: Biases inherent in the GLUE dataset also apply here
bias_recommendations: Unexpected results are possible; treat outputs with caution
get_started_code: |2-
``` python
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="MattStammers/Statement_Equivalence")
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("MattStammers/Statement_Equivalence")
model = AutoModelForSequenceClassification.from_pretrained("MattStammers/Statement_Equivalence")
```
training_data: 'See the GLUE dataset: https://huggingface.co/datasets/glue'
preprocessing: Sentence pairs are tokenized and analysed for similarity
training_regime: User Defined
speeds_sizes_times: Not Relevant
testing_data: 'MRPC (Microsoft Research Paraphrase Corpus). Link: https://huggingface.co/datasets/SetFit/mrpc'
testing_factors: N/A
testing_metrics: N/A
results: 'Accuracy: 0.848, F1: 0.895 on MRPC (see model-index below)'
results_summary: The model achieves 84.8% accuracy and an F1 score of 0.895 on MRPC.
model_examination: Model should be interpreted with user discretion.
model_specs: BERT-Base-Uncased, fine-tuned for sequence classification
compute_infrastructure: Requires less than 4 GB of GPU memory to run quickly
hardware: T600
hours_used: '0.1'
cloud_provider: N/A
cloud_region: N/A
co2_emitted: <1
software: Python, PyTorch, and Hugging Face Transformers
citation_bibtex: N/A
citation_apa: N/A
glossary: N/A
more_information: Can be made available on request
model_card_authors: Matt Stammers
model_card_contact: Matt Stammers
model-index:
- name: statement
results:
- task:
type: text-classification
dataset:
name: MRPC
type: mrpc
metrics:
- type: accuracy
value: 0.8480392156862745
- type: f1
value: 0.8945578231292517
Model Card for Statement_Equivalence
This model compares the semantic similarity of two text statements. It is the first BERT model I have fine-tuned, so there may be bugs. The model labels should read equivalent/not-equivalent, but despite mapping the id2label variables they still display as LABEL_0/LABEL_1 in the inference widget. I may come back and fix this at a later date.
Model Details
Model Description
- Developed by: Matt Stammers
- Shared by [optional]: Matt Stammers
- Model type: BERT-Base-Uncased
- Language(s) (NLP): en
- License: mit
- Finetuned from model [optional]: bert-base-uncased
Model Sources [optional]
- Repository: https://huggingface.co/MattStammers/Statement_Equivalence?text=I+like+you.+I+love+you
- Paper [optional]: N/A
- Demo [optional]: N/A
Uses
Direct Use
Test it out via the hosted inference widget on the repository page
Downstream Use [optional]
None intended: this is a standalone application
Out-of-Scope Use
The model will not work with very complex sentences, nor can it compare more than three statements
Bias, Risks, and Limitations
Biases inherent in the GLUE dataset also apply here
Recommendations
Unexpected results are possible; treat outputs with caution
How to Get Started with the Model
Use the code below to get started with the model.
``` python
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="MattStammers/Statement_Equivalence")
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("MattStammers/Statement_Equivalence")
model = AutoModelForSequenceClassification.from_pretrained("MattStammers/Statement_Equivalence")
```
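Since the model classifies sentence pairs, the pipeline input should supply both sentences, and because predictions currently come back as LABEL_0/LABEL_1 (as noted above), a small post-processing step can restore readable labels. A minimal sketch follows; the 0 = not_equivalent / 1 = equivalent ordering assumes the GLUE MRPC convention:

``` python
# Sketch: sentence-pair input format for the text-classification pipeline,
# plus a post-processing step to map LABEL_0/LABEL_1 to readable names.
# The label ordering (0 = not_equivalent, 1 = equivalent) is an assumption
# based on the GLUE MRPC convention.

LABEL_MAP = {"LABEL_0": "not_equivalent", "LABEL_1": "equivalent"}

def make_pair_input(sentence1, sentence2):
    # transformers text-classification pipelines accept a dict with
    # "text" and "text_pair" keys for sentence-pair tasks
    return {"text": sentence1, "text_pair": sentence2}

def relabel(prediction):
    # Replace the raw LABEL_N tag with a human-readable label
    return {**prediction, "label": LABEL_MAP.get(prediction["label"], prediction["label"])}

pair = make_pair_input("I like you.", "I love you.")
print(pair)  # {'text': 'I like you.', 'text_pair': 'I love you.'}
print(relabel({"label": "LABEL_1", "score": 0.97}))
```

The pair dict can be passed straight to `pipe(...)`, and `relabel` applied to each returned prediction.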
Training Details
Training Data
See the GLUE dataset: https://huggingface.co/datasets/glue
Training Procedure
Preprocessing [optional]
Sentence pairs are tokenized and analysed for similarity
Training Hyperparameters
- Training regime: User Defined
Speeds, Sizes, Times [optional]
Not Relevant
Evaluation
Testing Data, Factors & Metrics
Testing Data
MRPC (Microsoft Research Paraphrase Corpus). Link: https://huggingface.co/datasets/SetFit/mrpc
Factors
N/A
Metrics
N/A
Results
The model achieves an accuracy of 0.848 and an F1 score of 0.895 on MRPC.
Summary
The model performs reasonably well on MRPC paraphrase detection (84.8% accuracy, 0.895 F1), though the limitations noted above apply.
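For reference, the reported metrics follow the standard binary-classification definitions. A minimal sketch (with illustrative confusion-matrix counts, not the actual MRPC evaluation):

``` python
# Sketch: how accuracy and F1 are computed from confusion-matrix counts.
# The counts below are illustrative only, not the actual MRPC evaluation.

def accuracy(tp, fp, fn, tn):
    # fraction of all predictions that were correct
    return (tp + tn) / (tp + fp + fn + tn)

def f1(tp, fp, fn):
    # harmonic mean of precision and recall
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

print(accuracy(80, 10, 5, 5))       # 0.85
print(round(f1(80, 10, 5), 3))      # 0.914
```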
Model Examination [optional]
Model should be interpreted with user discretion.
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: T600
- Hours used: 0.1
- Cloud Provider: N/A
- Compute Region: N/A
- Carbon Emitted: <1
Technical Specifications [optional]
Model Architecture and Objective
BERT-Base-Uncased, fine-tuned for sequence classification
Compute Infrastructure
Requires less than 4 GB of GPU memory to run quickly
Hardware
T600
Software
Python, PyTorch, and Hugging Face Transformers
Citation [optional]
BibTeX:
N/A
APA:
N/A
Glossary [optional]
N/A
More Information [optional]
Can be made available on request
Model Card Authors [optional]
Matt Stammers
Model Card Contact
Matt Stammers