BalaRajesh1/mmbert-small-nli

@@ -1,199 +1,133 @@
 ---
 library_name: transformers
-tags: []
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

 ---
 library_name: transformers
+license: mit
+base_model: jhu-clsp/mmBERT-small
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: mmbert-small-nli
+  results: []
 ---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# mmbert-small-nli
+This model is a fine-tuned version of [jhu-clsp/mmBERT-small](https://huggingface.co/jhu-clsp/mmBERT-small) on an unspecified dataset (the Trainer recorded the dataset name as `None`).
+It achieves the following results on the evaluation set:
+- Loss: 0.5527
+- Accuracy: 0.7772
+- F1 Macro: 0.7771
+- F1 Entailment: 0.7752
+- F1 Neutral: 0.7431
+- F1 Contradiction: 0.8129
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 32
+- eval_batch_size: 64
+- seed: 42
+- optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 0.06
+- num_epochs: 3
+- mixed_precision_training: Native AMP
+### Training results
+| Training Loss | Epoch  | Step   | Validation Loss | Accuracy | F1 Macro | F1 Entailment | F1 Neutral | F1 Contradiction |
+|:-------------:|:------:|:------:|:---------------:|:--------:|:--------:|:-------------:|:----------:|:----------------:|
+| 1.0734        | 0.0087 | 2000   | 1.1464          | 0.402    | 0.3901   | 0.4706        | 0.4137     | 0.2862           |
+| 0.8248        | 0.0174 | 4000   | 0.8942          | 0.5951   | 0.5953   | 0.6249        | 0.5604     | 0.6006           |
+| 0.7294        | 0.0261 | 6000   | 0.8418          | 0.6394   | 0.6375   | 0.6719        | 0.5932     | 0.6475           |
+| 0.6950        | 0.0348 | 8000   | 0.7324          | 0.6886   | 0.6886   | 0.7207        | 0.6389     | 0.7063           |
+| 0.6517        | 0.0435 | 10000  | 0.7094          | 0.7052   | 0.7034   | 0.7439        | 0.6444     | 0.7219           |
+| 0.6550        | 0.0522 | 12000  | 0.7001          | 0.7037   | 0.7039   | 0.7306        | 0.6535     | 0.7277           |
+| 0.6181        | 0.0609 | 14000  | 0.6918          | 0.7205   | 0.7198   | 0.7564        | 0.672      | 0.7309           |
+| 0.6304        | 0.0696 | 16000  | 0.6628          | 0.7269   | 0.7254   | 0.7649        | 0.672      | 0.7392           |
+| 0.6088        | 0.0783 | 18000  | 0.6486          | 0.7277   | 0.7285   | 0.7499        | 0.684      | 0.7517           |
+| 0.6096        | 0.0871 | 20000  | 0.6527          | 0.7342   | 0.7345   | 0.7684        | 0.6945     | 0.7408           |
+| 0.5949        | 0.0958 | 22000  | 0.6820          | 0.7261   | 0.7274   | 0.7446        | 0.6856     | 0.7522           |
+| 0.6165        | 0.1045 | 24000  | 0.6378          | 0.7347   | 0.7353   | 0.7579        | 0.6894     | 0.7584           |
+| 0.6145        | 0.1132 | 26000  | 0.6274          | 0.7415   | 0.7422   | 0.7627        | 0.6994     | 0.7645           |
+| 0.6049        | 0.1219 | 28000  | 0.6515          | 0.7436   | 0.7437   | 0.7709        | 0.7019     | 0.7581           |
+| 0.5834        | 0.1306 | 30000  | 0.6514          | 0.7427   | 0.7435   | 0.7704        | 0.7041     | 0.756            |
+| 0.6031        | 0.1393 | 32000  | 0.6432          | 0.7494   | 0.7491   | 0.7797        | 0.706      | 0.7617           |
+| 0.5783        | 0.1480 | 34000  | 0.6438          | 0.7399   | 0.7419   | 0.7618        | 0.7087     | 0.7553           |
+| 0.5933        | 0.1567 | 36000  | 0.6420          | 0.7444   | 0.7434   | 0.7721        | 0.6929     | 0.765            |
+| 0.5766        | 0.1654 | 38000  | 0.6495          | 0.7318   | 0.7342   | 0.7374        | 0.7032     | 0.7621           |
+| 0.5698        | 0.1741 | 40000  | 0.6150          | 0.7525   | 0.7525   | 0.7833        | 0.7072     | 0.767            |
+| 0.5783        | 0.1828 | 42000  | 0.6490          | 0.7364   | 0.7385   | 0.7473        | 0.7087     | 0.7593           |
+| 0.5710        | 0.1915 | 44000  | 0.6284          | 0.7483   | 0.7467   | 0.7784        | 0.6938     | 0.768            |
+| 0.5647        | 0.2002 | 46000  | 0.6516          | 0.7439   | 0.7453   | 0.7653        | 0.7056     | 0.7649           |
+| 0.5625        | 0.2089 | 48000  | 0.6303          | 0.7529   | 0.7541   | 0.7776        | 0.7136     | 0.771            |
+| 0.5542        | 0.2176 | 50000  | 0.6285          | 0.7497   | 0.7507   | 0.7715        | 0.7107     | 0.7698           |
+| 0.5787        | 0.2263 | 52000  | 0.6306          | 0.7482   | 0.7482   | 0.7742        | 0.7007     | 0.7697           |
+| 0.5632        | 0.2350 | 54000  | 0.6289          | 0.7493   | 0.7496   | 0.7699        | 0.712      | 0.767            |
+| 0.5453        | 0.2438 | 56000  | 0.6133          | 0.7522   | 0.7539   | 0.7777        | 0.7145     | 0.7695           |
+| 0.5488        | 0.2525 | 58000  | 0.6306          | 0.7528   | 0.7543   | 0.7728        | 0.7163     | 0.7737           |
+| 0.5558        | 0.2612 | 60000  | 0.6306          | 0.7502   | 0.7477   | 0.7817        | 0.6851     | 0.7763           |
+| 0.5452        | 0.2699 | 62000  | 0.6250          | 0.7558   | 0.7576   | 0.7745        | 0.7226     | 0.7757           |
+| 0.5516        | 0.2786 | 64000  | 0.6121          | 0.7581   | 0.7592   | 0.7803        | 0.7194     | 0.7777           |
+| 0.5295        | 0.2873 | 66000  | 0.6206          | 0.7587   | 0.7597   | 0.7792        | 0.7205     | 0.7795           |
+| 0.5242        | 0.2960 | 68000  | 0.6028          | 0.7593   | 0.7607   | 0.7825        | 0.7252     | 0.7744           |
+| 0.5341        | 0.3047 | 70000  | 0.6173          | 0.7597   | 0.7582   | 0.7907        | 0.7023     | 0.7816           |
+| 0.5346        | 0.3134 | 72000  | 0.6258          | 0.7583   | 0.759    | 0.7812        | 0.7172     | 0.7785           |
+| 0.5194        | 0.3221 | 74000  | 0.6266          | 0.7622   | 0.7622   | 0.7891        | 0.7161     | 0.7815           |
+| 0.5392        | 0.3308 | 76000  | 0.6441          | 0.7531   | 0.7549   | 0.7749        | 0.7232     | 0.7667           |
+| 0.5208        | 0.3395 | 78000  | 0.6283          | 0.7556   | 0.7569   | 0.7695        | 0.7189     | 0.7824           |
+| 0.5306        | 0.3482 | 80000  | 0.6062          | 0.7656   | 0.7667   | 0.7843        | 0.7259     | 0.7899           |
+| 0.5271        | 0.3569 | 82000  | 0.6332          | 0.7644   | 0.7638   | 0.7929        | 0.7115     | 0.7871           |
+| 0.5088        | 0.3656 | 84000  | 0.6253          | 0.7612   | 0.761    | 0.7863        | 0.7131     | 0.7836           |
+| 0.5227        | 0.3743 | 86000  | 0.6285          | 0.7552   | 0.7571   | 0.7671        | 0.7205     | 0.7836           |
+| 0.5147        | 0.3830 | 88000  | 0.6199          | 0.7646   | 0.7631   | 0.7926        | 0.7073     | 0.7894           |
+| 0.5091        | 0.3917 | 90000  | 0.6220          | 0.7644   | 0.7655   | 0.7855        | 0.7262     | 0.7848           |
+| 0.5026        | 0.4005 | 92000  | 0.6216          | 0.766    | 0.7651   | 0.7936        | 0.7104     | 0.7913           |
+| 0.5221        | 0.4092 | 94000  | 0.6211          | 0.7653   | 0.7665   | 0.7869        | 0.7261     | 0.7866           |
+| 0.5081        | 0.4179 | 96000  | 0.6238          | 0.7622   | 0.7635   | 0.7877        | 0.7261     | 0.7768           |
+| 0.5163        | 0.4266 | 98000  | 0.6352          | 0.7702   | 0.7702   | 0.7974        | 0.7215     | 0.7916           |
+| 0.5063        | 0.4353 | 100000 | 0.6075          | 0.7652   | 0.7664   | 0.7874        | 0.7226     | 0.7891           |
+| 0.5023        | 0.4440 | 102000 | 0.6153          | 0.7674   | 0.7681   | 0.7941        | 0.7262     | 0.784            |
+| 0.4876        | 0.4527 | 104000 | 0.6140          | 0.7639   | 0.7645   | 0.7898        | 0.7163     | 0.7872           |
+| 0.5104        | 0.4614 | 106000 | 0.6174          | 0.7638   | 0.7655   | 0.7809        | 0.725      | 0.7906           |
+| 0.5122        | 0.4701 | 108000 | 0.6174          | 0.7634   | 0.7636   | 0.786         | 0.7149     | 0.7898           |
+| 0.4944        | 0.4788 | 110000 | 0.6240          | 0.7717   | 0.7721   | 0.7946        | 0.729      | 0.7929           |
+| 0.4873        | 0.4875 | 112000 | 0.6033          | 0.7682   | 0.7687   | 0.7917        | 0.7236     | 0.7907           |
+| 0.4871        | 0.4962 | 114000 | 0.5942          | 0.7719   | 0.7722   | 0.7955        | 0.7271     | 0.7941           |
+| 0.4954        | 0.5049 | 116000 | 0.5927          | 0.7707   | 0.7717   | 0.7925        | 0.7298     | 0.7927           |
+| 0.4852        | 0.5136 | 118000 | 0.6312          | 0.7701   | 0.7713   | 0.7888        | 0.7285     | 0.7965           |
+| 0.4782        | 0.5223 | 120000 | 0.6233          | 0.7682   | 0.7685   | 0.7912        | 0.7245     | 0.7898           |
+| 0.4915        | 0.5310 | 122000 | 0.6213          | 0.7672   | 0.7676   | 0.7874        | 0.7257     | 0.7898           |
+| 0.4776        | 0.5397 | 124000 | 0.6188          | 0.7714   | 0.7721   | 0.7934        | 0.7286     | 0.7944           |
+| 0.4658        | 0.5484 | 126000 | 0.6559          | 0.7702   | 0.7712   | 0.7937        | 0.7283     | 0.7916           |
+| 0.4830        | 0.5572 | 128000 | 0.6215          | 0.7689   | 0.7699   | 0.7917        | 0.7286     | 0.7896           |
+| 0.4777        | 0.5659 | 130000 | 0.6626          | 0.7677   | 0.7692   | 0.7874        | 0.7319     | 0.7882           |
+| 0.4645        | 0.5746 | 132000 | 0.6406          | 0.7703   | 0.7718   | 0.7947        | 0.7349     | 0.7857           |
+| 0.4887        | 0.5833 | 134000 | 0.6173          | 0.7684   | 0.7688   | 0.7934        | 0.7229     | 0.7901           |
+### Framework versions
+- Transformers 5.2.0
+- Pytorch 2.10.0+cu128
+- Datasets 4.6.1
+- Tokenizers 0.22.2
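
As a quick consistency check on the evaluation summary in the new card: macro F1 is the unweighted mean of the per-class F1 scores. The sketch below recomputes it from the three per-class values the card reports. The variable names are illustrative and are not taken from the training code.

```python
# Per-class F1 scores as reported in the card's evaluation summary.
per_class_f1 = {
    "entailment": 0.7752,
    "neutral": 0.7431,
    "contradiction": 0.8129,
}

# Macro F1 is the unweighted mean over the classes.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# The card reports F1 Macro: 0.7771.
print(f"macro F1 = {macro_f1:.4f}")
```

The recomputed value agrees with the reported F1 Macro to four decimal places, so the per-class and aggregate figures in the summary are mutually consistent.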

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d723ecb711bac15abc4ac116d1d1252d26e38fb787e2a31d268e26345057f4f4
 size 562584932

 version https://git-lfs.github.com/spec/v1
+oid sha256:994ffeb33c6e6831f62ddbb4e6e353d2972f97633fddec0a1f2d590bccadcdab
 size 562584932
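
The LFS pointer records only a hash and a byte count, and the byte count is identical before and after the change. As a rough sketch, the size can be turned into an approximate parameter count. This assumes the weights are stored as 32-bit floats and that the safetensors header is negligible; neither assumption is stated in the diff.

```python
# File size in bytes, taken from the Git LFS pointer above.
size_bytes = 562_584_932

# Assumption (not stated in the diff): weights are float32, i.e. 4 bytes per
# parameter, and the safetensors JSON header is small enough to ignore.
BYTES_PER_PARAM = 4

approx_params = size_bytes // BYTES_PER_PARAM
size_mib = size_bytes / 2**20

print(f"~{approx_params / 1e6:.1f}M parameters, {size_mib:.1f} MiB on disk")
```

If the checkpoint were stored in half precision instead, the same file size would correspond to roughly twice as many parameters, so the estimate is only as good as the dtype assumption.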