Model Card for Statement_Equivalence
This model compares two text statements and predicts whether they are semantically equivalent. It is the first BERT model I have fine-tuned, so there may be bugs. The model labels should read equivalent/not-equivalent, but despite mapping the id2label variables they are presently still displayed as LABEL_0/LABEL_1 in the inference widget. I may come back and fix this at a later date.
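For reference, the usual way to get readable names into the widget is to set `id2label`/`label2id` on the model config before saving or pushing. A minimal sketch follows; which index maps to "equivalent" is an assumption here, not confirmed for this checkpoint:

```python
from transformers import BertConfig

# Build a 2-label config; the index-to-name assignment below is an assumption
config = BertConfig(num_labels=2)
config.id2label = {0: "not_equivalent", 1: "equivalent"}
config.label2id = {"not_equivalent": 0, "equivalent": 1}

# With a loaded model you would set the same mappings on model.config and
# then call model.save_pretrained(...) or model.push_to_hub(...)
print(config.id2label[1])  # equivalent
```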
Model Details
Model Description
- Developed by: Matt Stammers
- Shared by [optional]: Matt Stammers
- Model type: BERT-Base-Uncased
- Language(s) (NLP): en
- License: mit
- Finetuned from model [optional]: bert-base-uncased (fine-tuned on GLUE)
Model Sources [optional]
- Repository: https://huggingface.co/MattStammers/Statement_Equivalence
- Paper [optional]: N/A
- Demo [optional]: N/A
Uses
Direct Use
Try the model directly in the hosted inference widget on the model page.
Downstream Use [optional]
None: this is a standalone model with no downstream integration provided.
Out-of-Scope Use
The model will not work well on very complex sentences, and it cannot compare more than three statements.
Bias, Risks, and Limitations
Biases inherent in the GLUE dataset also apply here.
Recommendations
Unusual results may be obtained; treat outputs with caution and validate them before relying on them.
How to Get Started with the Model
Use the code below to get started with the model.
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="MattStammers/Statement_Equivalence")
# Sentence pairs are passed as a dict with "text" and "text_pair"
result = pipe({"text": "I like you.", "text_pair": "I love you."})

# Or load the tokenizer and model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("MattStammers/Statement_Equivalence")
model = AutoModelForSequenceClassification.from_pretrained("MattStammers/Statement_Equivalence")
```
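Once the tokenizer and model are loaded, scoring a pair follows the standard sequence-classification pattern. The sketch below uses a small randomly initialized BERT and hand-written dummy inputs so it runs without downloading the checkpoint; swap in the real tokenizer and model for meaningful scores:

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

# Tiny random model standing in for the real checkpoint (2 labels assumed)
config = BertConfig(vocab_size=30522, hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64, num_labels=2)
model = BertForSequenceClassification(config)
model.eval()

# Dummy encoded sentence pair; with the real tokenizer you would use
# tokenizer("sentence one", "sentence two", return_tensors="pt")
inputs = {
    "input_ids": torch.tensor([[101, 2023, 102, 2008, 102]]),
    "token_type_ids": torch.tensor([[0, 0, 0, 1, 1]]),
    "attention_mask": torch.tensor([[1, 1, 1, 1, 1]]),
}
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)  # one probability per label
print(probs.shape)  # torch.Size([1, 2])
```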
Training Details
Training Data
See Glue Dataset: https://huggingface.co/datasets/glue
Training Procedure
Preprocessing [optional]
Sentence pairs were encoded jointly so the model can assess their similarity.
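As a sketch of the pair encoding BERT uses, the tokenizer joins both sentences with [SEP] and distinguishes them via token_type_ids. A toy vocabulary is used here purely so the example runs offline; the real model uses the bert-base-uncased vocabulary:

```python
import os
import tempfile
from transformers import BertTokenizer

# Toy vocabulary for illustration only
vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]", "the", "cat", "sat", "dog"]
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("\n".join(vocab))
    vocab_file = f.name

tok = BertTokenizer(vocab_file)
enc = tok("the cat", "the dog")
# [CLS] the cat [SEP] the dog [SEP] -> segment 0, then segment 1
print(enc["token_type_ids"])  # [0, 0, 0, 0, 1, 1, 1]
os.remove(vocab_file)
```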
Training Hyperparameters
- Training regime: User Defined
Speeds, Sizes, Times [optional]
Not Relevant
Evaluation
Testing Data, Factors & Metrics
Testing Data
MRPC (Microsoft Research Paraphrase Corpus): https://huggingface.co/datasets/SetFit/mrpc
Factors
N/A
Metrics
N/A
Results
See evaluation results.
Summary
See the evaluation results above.
Model Examination [optional]
Model outputs should be interpreted with user discretion.
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: NVIDIA T600
- Hours used: 0.1
- Cloud Provider: N/A
- Compute Region: N/A
- Carbon Emitted: <1 kg CO2eq (estimated)
Technical Specifications [optional]
Model Architecture and Objective
BERT (bert-base-uncased) fine-tuned for sentence-pair sequence classification.
Compute Infrastructure
Requires less than 4 GB of GPU memory to run quickly.
Hardware
T600
Software
Python and PyTorch, with the Hugging Face Transformers library.
Citation [optional]
BibTeX:
N/A
APA:
N/A
Glossary [optional]
N/A
More Information [optional]
Can be made available on request
Model Card Authors [optional]
Matt Stammers
Model Card Contact
Matt Stammers
Evaluation results
- Accuracy on MRPC (self-reported): 0.848
- F1 on MRPC (self-reported): 0.895