---
language: en
license: mit
model_id: Statement_Equivalence
developers: Matt Stammers
model_type: BERT-Base-Uncased
model_summary: This model compares two statements and classifies them as equivalent
  or not equivalent. It is the first BERT model I have fine-tuned, so there may be bugs.
  The model labels should read equivalent/not-equivalent, but despite mapping the id2label
  variables they are presently still displaying as label0/1 in the inference module.
  I may come back and fix this at a later date.
shared_by: Matt Stammers
finetuned_from: BERT-Base-Uncased (fine-tuned on GLUE data)
repo: https://huggingface.co/MattStammers/Statement_Equivalence?text=I+like+you.+I+love+you
paper: N/A
demo: N/A
direct_use: Test it out using the hosted inference widget on the model page
downstream_use: This model is intended to be used as a standalone app
out_of_scope_use: The model is not expected to work with very complex sentences or
  to compare more than 3 statements
bias_risks_limitations: Biases inherent in the GLUE dataset also apply here
bias_recommendations: Expect occasional unusual results and treat predictions with caution
get_started_code: "\n    ``` python \n    # Use a pipeline as a high-level helper\n\
  \        from transformers import pipeline\n\n        pipe = pipeline(\"text-classification\"\
  , model=\"MattStammers/Statement_Equivalence\")\n    # Load model directly\n   \
  \     from transformers import AutoTokenizer, AutoModelForSequenceClassification\n\
  \n        tokenizer = AutoTokenizer.from_pretrained(\"MattStammers/Statement_Equivalence\"\
  )\n        model = AutoModelForSequenceClassification.from_pretrained(\"MattStammers/Statement_Equivalence\"\
  )\n    ```\n                        "
training_data: 'See the GLUE dataset: https://huggingface.co/datasets/glue'
preprocessing: Inputs are sentence pairs, which the model analyses for similarity
training_regime: User Defined
speeds_sizes_times: Not Relevant
testing_data: 'MRPC (Microsoft Research Paraphrase Corpus). Link: https://huggingface.co/datasets/SetFit/mrpc'
testing_factors: N/A
testing_metrics: N/A
results: See evaluation results.
results_summary: See Over
model_examination: Model should be interpreted with user discretion.
model_specs: BERT-Base-Uncased, fine-tuned for sequence classification of sentence pairs
compute_infrastructure: Requires less than 4 GB of GPU memory to run quickly
hardware: T600
hours_used: '0.1'
cloud_provider: N/A
cloud_region: N/A
co2_emitted: <1
software: Python, PyTorch with Transformers
citation_bibtex: N/A
citation_apa: N/A
glossary: N/A
more_information: Can be made available on request
model_card_authors: Matt Stammers
model_card_contact: Matt Stammers
model-index:
- name: statement
  results:
  - task:
      type: text-classification
    dataset:
      name: MRPC
      type: mrpc
    metrics:
    - type: accuracy
      value: 0.8480392156862745
    - type: F1-Score
      value: 0.8945578231292517
---

# Model Card for Statement_Equivalence

<!-- Provide a quick summary of what the model is/does. -->

This model compares two statements and classifies them as equivalent or not equivalent. It is the first BERT model I have fine-tuned, so there may be bugs. The model labels should read equivalent/not-equivalent, but despite mapping the id2label variables they are presently still displaying as label0/1 in the inference module. I may come back and fix this at a later date.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->



- **Developed by:** Matt Stammers
- **Shared by [optional]:** Matt Stammers
- **Model type:** BERT-Base-Uncased
- **Language(s) (NLP):** en
- **License:** mit
- **Finetuned from model [optional]:** BERT-Base-Uncased (fine-tuned on GLUE data)

### Model Sources [optional]

<!-- Provide the basic links for the model. -->

- **Repository:** https://huggingface.co/MattStammers/Statement_Equivalence?text=I+like+you.+I+love+you
- **Paper [optional]:** N/A
- **Demo [optional]:** N/A

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

Test it out using the hosted inference widget on the model page.

### Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

This model is intended to be used as a standalone app.

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

The model is not expected to work with very complex sentences or to compare more than 3 statements.

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

Biases inherent in the GLUE dataset also apply here.

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

Expect occasional unusual results and treat predictions with caution.

## How to Get Started with the Model

Use the code below to get started with the model.


```python
# Option 1: use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="MattStammers/Statement_Equivalence")

# Option 2: load the tokenizer and model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("MattStammers/Statement_Equivalence")
model = AutoModelForSequenceClassification.from_pretrained("MattStammers/Statement_Equivalence")
```
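
Because the labels currently display as label0/1, a small wrapper can translate them into readable names when comparing a sentence pair. The following is a minimal sketch, not part of the released model: `compare_statements` and `READABLE_LABELS` are hypothetical names, the raw label strings are assumed to be `LABEL_0`/`LABEL_1`, and the mapping assumes the GLUE MRPC convention (0 = not-equivalent, 1 = equivalent), which should be checked against the model's actual behaviour.

```python
from typing import Any, Callable, Dict

# Assumption: index 0 means not-equivalent and index 1 means equivalent,
# following the GLUE MRPC convention. Verify this against real predictions.
READABLE_LABELS: Dict[str, str] = {
    "LABEL_0": "not-equivalent",
    "LABEL_1": "equivalent",
}


def compare_statements(classifier: Callable[[Any], Any], first: str, second: str) -> Dict[str, Any]:
    """Classify a sentence pair and return a readable label with its score."""
    # Text-classification pipelines accept a sentence pair as a text/text_pair dict.
    result = classifier({"text": first, "text_pair": second})
    # The result may be a single dict or a one-item list; handle both shapes.
    if isinstance(result, list):
        result = result[0]
    raw_label = result["label"]
    return {
        # Unknown labels are passed through unchanged.
        "label": READABLE_LABELS.get(raw_label, raw_label),
        "score": float(result["score"]),
    }


# Usage with the real model (downloads weights, so it is left as a comment):
# from transformers import pipeline
# pipe = pipeline("text-classification", model="MattStammers/Statement_Equivalence")
# print(compare_statements(pipe, "I like you.", "I love you."))
```

Passing the classifier in as an argument keeps the helper independent of how the model is loaded, so it can be tried with a stub before downloading any weights.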
                        

## Training Details

### Training Data

<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

See the GLUE dataset: https://huggingface.co/datasets/glue

### Training Procedure 

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

#### Preprocessing [optional]

Inputs are sentence pairs, which the model analyses for similarity.


#### Training Hyperparameters

- **Training regime:** User Defined <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

#### Speeds, Sizes, Times [optional]

<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

Not Relevant

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data, Factors & Metrics

#### Testing Data

<!-- This should link to a Data Card if possible. -->

MRPC (Microsoft Research Paraphrase Corpus). Link: https://huggingface.co/datasets/SetFit/mrpc

#### Factors

<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

N/A

#### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

Accuracy and F1 score, as reported in the model-index metadata of this card.
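
The metadata of this card reports accuracy and F1 for the model. As a reminder of how these two metrics are derived from binary predictions, here is a minimal sketch in plain Python. The function name `accuracy_and_f1` and the toy labels are illustrative only and are not taken from the actual evaluation run.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> Tuple[float, float]:
    """Return (accuracy, F1) for binary labels, treating `positive` as the positive class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Accuracy is the share of predictions that match the reference labels.
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN).
    denominator = 2 * tp + fp + fn
    f1 = (2 * tp / denominator) if denominator else 0.0
    return accuracy, f1


# Toy example with made-up labels (1 = equivalent, 0 = not-equivalent).
acc, f1 = accuracy_and_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(f"accuracy={acc:.3f}, f1={f1:.3f}")
```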

### Results

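On MRPC the model reaches an accuracy of 0.848 and an F1 score of 0.895 (values taken from the model-index metadata of this card).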

#### Summary

See the results above.

## Model Examination [optional]

<!-- Relevant interpretability work for the model goes here -->

Model outputs should be interpreted with user discretion.

## Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** T600
- **Hours used:** 0.1
- **Cloud Provider:** N/A
- **Compute Region:** N/A
- **Carbon Emitted:** <1
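
A rough back-of-the-envelope version of such an estimate multiplies power draw, runtime and grid carbon intensity. The sketch below is a simplification and not the calculator's exact method. Only the 0.1 hours comes from this card; the 40 W power draw and the 200 gCO2eq/kWh grid intensity are placeholder assumptions, and `estimate_emissions_g` is a hypothetical helper name.

```python
def estimate_emissions_g(power_watts: float, hours: float, grid_intensity_g_per_kwh: float) -> float:
    """Estimate emissions in grams of CO2eq from power draw, runtime and grid intensity."""
    # Energy in kWh = power in kW multiplied by hours of use.
    energy_kwh = (power_watts / 1000.0) * hours
    # Emissions = energy multiplied by the carbon intensity of the electricity used.
    return energy_kwh * grid_intensity_g_per_kwh


# Hours used is taken from this card; the other two values are assumed placeholders.
estimate = estimate_emissions_g(power_watts=40.0, hours=0.1, grid_intensity_g_per_kwh=200.0)
print(f"Estimated emissions: {estimate:.2f} g CO2eq")
```

Substituting the measured power draw and the carbon intensity of the local grid gives a more meaningful figure.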

## Technical Specifications [optional]

### Model Architecture and Objective

BERT-Base-Uncased, fine-tuned for sequence classification of sentence pairs.

### Compute Infrastructure

Requires less than 4 GB of GPU memory to run quickly.

#### Hardware

T600

#### Software

Python, PyTorch with Transformers

## Citation [optional]

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

**BibTeX:**

N/A

**APA:**

N/A

## Glossary [optional]

<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

N/A

## More Information [optional]

Can be made available on request

## Model Card Authors [optional]

Matt Stammers

## Model Card Contact

Matt Stammers