Delete cross_encoder_model

Files changed (8)

cross_encoder_model/README.md +0 -147
cross_encoder_model/config.json +0 -36
cross_encoder_model/config_sentence_transformers.json +0 -11
cross_encoder_model/model.safetensors +0 -3
cross_encoder_model/modules.json +0 -8
cross_encoder_model/sentence_bert_config.json +0 -10
cross_encoder_model/tokenizer.json +0 -0
cross_encoder_model/tokenizer_config.json +0 -18

cross_encoder_model/README.md DELETED

@@ -1,147 +0,0 @@
----
-tags:
-- sentence-transformers
-- cross-encoder
-- reranker
-base_model: cross-encoder/ms-marco-MiniLM-L12-v2
-pipeline_tag: text-ranking
-library_name: sentence-transformers
----
-# CrossEncoder based on cross-encoder/ms-marco-MiniLM-L12-v2
-This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [cross-encoder/ms-marco-MiniLM-L12-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L12-v2) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
-## Model Details
-### Model Description
-- **Model Type:** Cross Encoder
-- **Base model:** [cross-encoder/ms-marco-MiniLM-L12-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L12-v2) <!-- at revision 7b0235231ca2674cb8ca8f022859a6eba2b1c968 -->
-- **Maximum Sequence Length:** 512 tokens
-- **Number of Output Labels:** 1 label
-- **Supported Modality:** Text
-<!-- - **Training Dataset:** Unknown -->
-<!-- - **Language:** Unknown -->
-<!-- - **License:** Unknown -->
-### Model Sources
-- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
-- **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
-- **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
-- **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)
-### Full Model Architecture
-```
-CrossEncoder(
-  (0): Transformer({'transformer_task': 'sequence-classification', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'logits'}}, 'module_output_name': 'scores', 'architecture': 'BertForSequenceClassification'})
-)
-```
-## Usage
-### Direct Usage (Sentence Transformers)
-First install the Sentence Transformers library:
-```bash
-pip install -U sentence-transformers
-```
-Then you can load this model and run inference.
-```python
-from sentence_transformers import CrossEncoder
-# Download from the 🤗 Hub
-model = CrossEncoder("cross_encoder_model_id")
-# Get scores for pairs of inputs
-pairs = [
-    ['How many calories in an egg', 'There are on average between 55 and 80 calories in an egg depending on its size.'],
-    ['How many calories in an egg', 'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.'],
-    ['How many calories in an egg', 'Most of the calories in an egg come from the yellow yolk in the center.'],
-]
-scores = model.predict(pairs)
-print(scores)
-# [ 9.6793 -2.1906  1.9515]
-# Or rank different texts based on similarity to a single text
-ranks = model.rank(
-    'How many calories in an egg',
-    [
-        'There are on average between 55 and 80 calories in an egg depending on its size.',
-        'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.',
-        'Most of the calories in an egg come from the yellow yolk in the center.',
-    ]
-)
-# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
-```
-<!--
-### Direct Usage (Transformers)
-<details><summary>Click to see the direct usage in Transformers</summary>
-</details>
--->
-<!--
-### Downstream Usage (Sentence Transformers)
-You can finetune this model on your own dataset.
-<details><summary>Click to expand</summary>
-</details>
--->
-<!--
-### Out-of-Scope Use
-*List how the model may foreseeably be misused and address what users ought not to do with the model.*
--->
-<!--
-## Bias, Risks and Limitations
-*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
--->
-<!--
-### Recommendations
-*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
--->
-## Training Details
-### Framework Versions
-- Python: 3.12.13
-- Sentence Transformers: 5.4.1
-- Transformers: 5.0.0
-- PyTorch: 2.10.0+cu128
-- Accelerate: 1.13.0
-- Datasets: 4.0.0
-- Tokenizers: 0.22.2
-## Citation
-### BibTeX
-<!--
-## Glossary
-*Clearly define terms in order to be accessible across audiences.*
--->
-<!--
-## Model Card Authors
-*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
--->
-<!--
-## Model Card Contact
-*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
--->

cross_encoder_model/config.json DELETED

@@ -1,36 +0,0 @@
-{
-  "add_cross_attention": false,
-  "architectures": [
-    "BertForSequenceClassification"
-  ],
-  "attention_probs_dropout_prob": 0.1,
-  "bos_token_id": null,
-  "classifier_dropout": null,
-  "dtype": "float32",
-  "eos_token_id": null,
-  "gradient_checkpointing": false,
-  "hidden_act": "gelu",
-  "hidden_dropout_prob": 0.1,
-  "hidden_size": 384,
-  "id2label": {
-    "0": "LABEL_0"
-  },
-  "initializer_range": 0.02,
-  "intermediate_size": 1536,
-  "is_decoder": false,
-  "label2id": {
-    "LABEL_0": 0
-  },
-  "layer_norm_eps": 1e-12,
-  "max_position_embeddings": 512,
-  "model_type": "bert",
-  "num_attention_heads": 12,
-  "num_hidden_layers": 12,
-  "pad_token_id": 0,
-  "position_embedding_type": "absolute",
-  "tie_word_embeddings": true,
-  "transformers_version": "5.0.0",
-  "type_vocab_size": 2,
-  "use_cache": true,
-  "vocab_size": 30522
-}
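
Note on the deleted config: the dimensions above are enough to estimate how many parameters the removed checkpoint held. The sketch below assumes the standard `BertForSequenceClassification` layout (embeddings, 12 encoder layers, a pooler, and a one-label classifier head); the helper names are hypothetical and only the numeric values come from the diff.

```python
# Values copied from the deleted config.json; "num_labels" is inferred
# from the single entry in id2label.
cfg = {
    "hidden_size": 384,
    "intermediate_size": 1536,
    "num_hidden_layers": 12,
    "max_position_embeddings": 512,
    "type_vocab_size": 2,
    "vocab_size": 30522,
    "num_labels": 1,
}


def linear(n_in, n_out):
    """Parameter count of a dense layer: weights plus bias."""
    return n_in * n_out + n_out


def bert_classifier_params(c):
    """Estimate parameters, assuming the standard BERT classifier layout."""
    h, i = c["hidden_size"], c["intermediate_size"]
    layer_norm = 2 * h  # scale and shift vectors
    embeddings = (
        c["vocab_size"] + c["max_position_embeddings"] + c["type_vocab_size"]
    ) * h + layer_norm
    encoder_layer = (
        4 * linear(h, h)  # query, key, value, and attention output projections
        + layer_norm
        + linear(h, i)    # feed-forward expansion
        + linear(i, h)    # feed-forward projection
        + layer_norm
    )
    pooler = linear(h, h)
    classifier = linear(h, c["num_labels"])
    return embeddings + c["num_hidden_layers"] * encoder_layer + pooler + classifier


total_params = bert_classifier_params(cfg)
float32_bytes = total_params * 4  # "dtype": "float32" means 4 bytes per parameter
print(total_params, float32_bytes)
```

Under these assumptions the float32 footprint comes out slightly below the 133464812 bytes recorded in the `model.safetensors` pointer in this same commit. The small remainder would be consistent with file header metadata, though that is not verified here.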

cross_encoder_model/config_sentence_transformers.json DELETED

@@ -1,11 +0,0 @@
-{
-  "__version__": {
-    "pytorch": "2.10.0+cu128",
-    "sentence_transformers": "5.4.1",
-    "transformers": "5.0.0"
-  },
-  "activation_fn": "torch.nn.modules.linear.Identity",
-  "default_prompt_name": null,
-  "model_type": "CrossEncoder",
-  "prompts": {}
-}
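
Note on `activation_fn`: with an identity activation, the scores the deleted README prints (for example `9.6793` and `-2.1906`) appear to be raw, unbounded logits. If callers of the removed model expected values in (0, 1), one common option is to apply a sigmoid afterwards. The sketch below is a minimal illustration using only the standard library and the three example scores from the README; it is not taken from this repository.

```python
import math


def sigmoid(x: float) -> float:
    """Map an unbounded logit to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Example scores quoted in the deleted README's usage snippet.
raw_scores = [9.6793, -2.1906, 1.9515]
probs = [sigmoid(s) for s in raw_scores]

# Sigmoid is monotonic, so ranking by raw score or by probability
# yields the same order.
order_by_raw = sorted(range(len(raw_scores)), key=lambda i: raw_scores[i], reverse=True)
order_by_prob = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
print(probs, order_by_raw)
```

For pure reranking the transformation is optional, since the order does not change; it matters mainly when a fixed score threshold is applied.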

cross_encoder_model/model.safetensors DELETED

@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:c6f930d12f0fead9acd03891e24e395903d80c1f7e505c10c6db2d5fb6a79b3b
-size 133464812
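
Note on the pointer file: the three deleted lines are a Git LFS pointer, not the weights themselves. A minimal sketch for reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper, and the pointer text is copied from the hunk above.

```python
# Pointer contents as shown in the deleted model.safetensors hunk.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c6f930d12f0fead9acd03891e24e395903d80c1f7e505c10c6db2d5fb6a79b3b
size 133464812
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line and break the oid into algorithm and digest."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real object in bytes
    }


pointer = parse_lfs_pointer(pointer_text)
print(f"{pointer['hash_algo']} object, {pointer['size'] / 1e6:.1f} MB")
```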

cross_encoder_model/modules.json DELETED

@@ -1,8 +0,0 @@
-[
-  {
-    "idx": 0,
-    "name": "0",
-    "path": "",
-    "type": "sentence_transformers.base.modules.transformer.Transformer"
-  }
-]

cross_encoder_model/sentence_bert_config.json DELETED

@@ -1,10 +0,0 @@
-{
-    "transformer_task": "sequence-classification",
-    "modality_config": {
-        "text": {
-            "method": "forward",
-            "method_output_name": "logits"
-        }
-    },
-    "module_output_name": "scores"
-}

cross_encoder_model/tokenizer.json DELETED

The diff for this file is too large to render.

cross_encoder_model/tokenizer_config.json DELETED

@@ -1,18 +0,0 @@
-{
-  "backend": "tokenizers",
-  "clean_up_tokenization_spaces": true,
-  "cls_token": "[CLS]",
-  "do_basic_tokenize": true,
-  "do_lower_case": true,
-  "is_local": false,
-  "mask_token": "[MASK]",
-  "model_max_length": 512,
-  "model_specific_special_tokens": {},
-  "never_split": null,
-  "pad_token": "[PAD]",
-  "sep_token": "[SEP]",
-  "strip_accents": null,
-  "tokenize_chinese_chars": true,
-  "tokenizer_class": "BertTokenizer",
-  "unk_token": "[UNK]"
-}
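
Note on the tokenizer settings: the special tokens, `do_lower_case`, and `model_max_length` above describe how a query and passage pair was packed into a single input for the removed model. The sketch below illustrates the usual BERT pair layout, `[CLS] query [SEP] passage [SEP]`. It uses whitespace splitting as a stand-in for WordPiece and truncates only the passage; both are simplifying assumptions, and `build_pair` is a hypothetical helper, not the library's tokenizer.

```python
CLS, SEP = "[CLS]", "[SEP]"
MAX_LEN = 512  # "model_max_length" in the deleted tokenizer_config.json


def build_pair(query, passage, max_len=MAX_LEN):
    """Pack a (query, passage) pair into one token sequence with segment ids."""
    # "do_lower_case": true -> lowercase before splitting.
    q = query.lower().split()
    p = passage.lower().split()
    # Room left for the passage after [CLS] and the two [SEP] tokens.
    # A query longer than the limit is not handled in this sketch.
    budget = max_len - len(q) - 3
    p = p[: max(budget, 0)]
    tokens = [CLS] + q + [SEP] + p + [SEP]
    # Segment 0 covers [CLS] + query + [SEP]; segment 1 covers passage + [SEP].
    # Two segment ids match "type_vocab_size": 2 in config.json.
    segment_ids = [0] * (len(q) + 2) + [1] * (len(p) + 1)
    return tokens, segment_ids


tokens, segment_ids = build_pair(
    "How many calories in an egg",
    "Most of the calories in an egg come from the yellow yolk in the center.",
    max_len=12,
)
print(tokens)
print(segment_ids)
```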