---
language: en
license: mit
model_id: Statement_Equivalence
developers: Matt Stammers
model_type: BERT-Base-Uncased
model_summary: >-
  This model compares the similarity of two text statements. It is the first
  BERT model I have fine-tuned, so there may be bugs. The labels should read
  equivalent/not_equivalent, but despite mapping the id2label variables they
  still display as LABEL_0/LABEL_1 in the inference widget. I may come back
  and fix this at a later date.
shared_by: Matt Stammers
finetuned_from: bert-base-uncased
repo: >-
  https://huggingface.co/MattStammers/Statement_Equivalence?text=I+like+you.+I+love+you
paper: N/A
demo: N/A
direct_use: Test it out via the hosted inference widget on the model page
downstream_use: This is intended as a standalone model; no downstream use is envisaged
out_of_scope_use: >-
  The model will not work with very complex sentences and cannot compare more
  than three statements.
bias_risks_limitations: Biases inherent in the GLUE dataset also apply here
bias_recommendations: Do not be surprised if unusual results are obtained
get_started_code: |2-

      ``` python
      # Use a pipeline as a high-level helper
      from transformers import pipeline

      pipe = pipeline("text-classification", model="MattStammers/Statement_Equivalence")

      # Or load the model and tokenizer directly
      from transformers import AutoTokenizer, AutoModelForSequenceClassification

      tokenizer = AutoTokenizer.from_pretrained("MattStammers/Statement_Equivalence")
      model = AutoModelForSequenceClassification.from_pretrained("MattStammers/Statement_Equivalence")
      ```
training_data: 'See Glue Dataset: https://huggingface.co/datasets/glue'
preprocessing: Sentence pairs prepared for similarity comparison
training_regime: User Defined
speeds_sizes_times: Not Relevant
testing_data: 'MRPC. Link: https://huggingface.co/datasets/SetFit/mrpc'
testing_factors: N/A
testing_metrics: N/A
results: See evaluation results.
results_summary: See the evaluation results below
model_examination: Model outputs should be interpreted with user discretion.
model_specs: BERT-base-uncased, fine-tuned for sequence classification
compute_infrastructure: Requires less than 4 GB of GPU memory to run quickly
hardware: T600
hours_used: '0.1'
cloud_provider: N/A
cloud_region: N/A
co2_emitted: <1
software: Python, PyTorch with Transformers
citation_bibtex: N/A
citation_apa: N/A
glossary: N/A
more_information: Can be made available on request
model_card_authors: Matt Stammers
model_card_contact: Matt Stammers
model-index:
  - name: statement
    results:
      - task:
          type: text-classification
        dataset:
          name: MRPC
          type: mrpc
        metrics:
          - type: accuracy
            value: 0.8480392156862745
          - type: f1
            value: 0.8945578231292517
---

Model Card for Statement_Equivalence

This model compares the similarity of two text statements. It is the first BERT model I have fine-tuned, so there may be bugs. The labels should read equivalent/not_equivalent, but despite mapping the id2label variables they still display as LABEL_0/LABEL_1 in the inference widget. I may come back and fix this at a later date.
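The label-display issue described above is usually fixed by writing explicit `id2label`/`label2id` mappings into the model's `config.json`. Below is a minimal sketch of that mapping — the label names `not_equivalent`/`equivalent` are assumed from the summary; the exact spelling used in this repository's config is not confirmed:

```python
import json

# Assumed label names for this paraphrase/equivalence task
id2label = {0: "not_equivalent", 1: "equivalent"}
label2id = {name: idx for idx, name in id2label.items()}

# config.json stores id2label with string keys
config_patch = {
    "id2label": {str(i): name for i, name in id2label.items()},
    "label2id": label2id,
}
print(json.dumps(config_patch, indent=2))
```

Setting these fields on the loaded model's config (e.g. `model.config.id2label = id2label`) and re-saving the model should make the inference widget show the named labels instead of LABEL_0/LABEL_1.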

Model Details

Model Description

  • Developed by: Matt Stammers
  • Shared by [optional]: Matt Stammers
  • Model type: BERT-Base-Uncased
  • Language(s) (NLP): en
  • License: mit
  • Finetuned from model [optional]: bert-base-uncased

Model Sources [optional]

  • Repository: https://huggingface.co/MattStammers/Statement_Equivalence

Uses

Direct Use

Test it out via the hosted inference widget on the model page.

Downstream Use [optional]

This is intended as a standalone model; no downstream use is envisaged.

Out-of-Scope Use

The model will not work with very complex sentences and cannot compare more than three statements.

Bias, Risks, and Limitations

Biases inherent in the GLUE dataset also apply here.

Recommendations

Do not be surprised if unusual results are obtained

How to Get Started with the Model

Use the code below to get started with the model.

``` python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="MattStammers/Statement_Equivalence")

# Or load the model and tokenizer directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("MattStammers/Statement_Equivalence")
model = AutoModelForSequenceClassification.from_pretrained("MattStammers/Statement_Equivalence")
```

Training Details

Training Data

See Glue Dataset: https://huggingface.co/datasets/glue

Training Procedure

Preprocessing [optional]

Sentence pairs prepared for similarity comparison.
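For a BERT-style model, preprocessing joins each sentence pair into a single sequence separated by special tokens before tokenization. The sketch below shows only the layout of that sequence; a real tokenizer emits subword ids, not this string:

```python
def format_pair(sentence_a: str, sentence_b: str) -> str:
    # BERT encodes a sentence pair as [CLS] A [SEP] B [SEP]
    return f"[CLS] {sentence_a} [SEP] {sentence_b} [SEP]"

print(format_pair("I like you.", "I love you."))
# → [CLS] I like you. [SEP] I love you. [SEP]
```

In practice `tokenizer(sentence_a, sentence_b)` produces this pairing automatically, along with token-type ids that tell the model which segment each token belongs to.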

Training Hyperparameters

  • Training regime: User Defined

Speeds, Sizes, Times [optional]

Not Relevant

Evaluation

Testing Data, Factors & Metrics

Testing Data

MRPC. Link: https://huggingface.co/datasets/SetFit/mrpc

Factors

N/A

Metrics

Accuracy and F1 score (see Results below).

Results

On MRPC: accuracy 0.848, F1 0.895 (see the model-index metadata above).

Summary

See the Results section above.

Model Examination [optional]

Model outputs should be interpreted with user discretion.

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: T600
  • Hours used: 0.1
  • Cloud Provider: N/A
  • Compute Region: N/A
  • Carbon Emitted: <1

Technical Specifications [optional]

Model Architecture and Objective

BERT-base-uncased, fine-tuned for sequence (sentence-pair) classification.

Compute Infrastructure

Requires less than 4 GB of GPU memory to run quickly.

Hardware

T600

Software

Python, PyTorch with Transformers.

Citation [optional]

BibTeX:

N/A

APA:

N/A

Glossary [optional]

N/A

More Information [optional]

Can be made available on request

Model Card Authors [optional]

Matt Stammers

Model Card Contact

Matt Stammers