diff --git "a/README.md" "b/README.md"
--- "a/README.md"
+++ "b/README.md"
@@ -19,6 +19,112 @@ datasets:
 - adbaral/langcache-sentencepairs-v3
 pipeline_tag: text-ranking
 library_name: sentence-transformers
+metrics:
+- accuracy
+- accuracy_threshold
+- f1
+- f1_threshold
+- precision
+- recall
+- average_precision
+- map
+- mrr@1
+- ndcg@1
+model-index:
+- name: Redis fine-tuned CrossEncoder model for semantic caching on LangCache
+  results:
+  - task:
+      type: cross-encoder-classification
+      name: Cross Encoder Classification
+    dataset:
+      name: test cls
+      type: test_cls
+    metrics:
+    - type: accuracy
+      value: 0.5661078569985861
+      name: Accuracy
+    - type: accuracy_threshold
+      value: -1.8359375
+      name: Accuracy Threshold
+    - type: f1
+      value: 0.6381983914209115
+      name: F1
+    - type: f1_threshold
+      value: -3.0
+      name: F1 Threshold
+    - type: precision
+      value: 0.49571026371466176
+      name: Precision
+    - type: recall
+      value: 0.895644583571622
+      name: Recall
+    - type: average_precision
+      value: 0.5123490100661972
+      name: Average Precision
+  - task:
+      type: cross-encoder-reranking
+      name: Cross Encoder Reranking
+    dataset:
+      name: NanoQuoraRetrieval R25
+      type: NanoQuoraRetrieval_R25
+    metrics:
+    - type: map
+      value: 0.2548
+      name: Map
+    - type: mrr@1
+      value: 0.12
+      name: Mrr@1
+    - type: ndcg@1
+      value: 0.12
+      name: Ndcg@1
+  - task:
+      type: cross-encoder-reranking
+      name: Cross Encoder Reranking
+    dataset:
+      name: NanoMSMARCO R25
+      type: NanoMSMARCO_R25
+    metrics:
+    - type: map
+      value: 0.187
+      name: Map
+    - type: mrr@1
+      value: 0.06
+      name: Mrr@1
+    - type: ndcg@1
+      value: 0.06
+      name: Ndcg@1
+  - task:
+      type: cross-encoder-reranking
+      name: Cross Encoder Reranking
+    dataset:
+      name: NanoNQ R25
+      type: NanoNQ_R25
+    metrics:
+    - type: map
+      value: 0.1356
+      name: Map
+    - type: mrr@1
+      value: 0.04
+      name: Mrr@1
+    - type: ndcg@1
+      value: 0.04
+      name: Ndcg@1
+  - task:
+      type: cross-encoder-nano-beir
+      name: Cross Encoder Nano BEIR
+    dataset:
+      name: NanoBEIR R25 mean
+      type: NanoBEIR_R25_mean
+    metrics:
+    - type: map
+      value: 0.1925
+      name: Map
+    - type: mrr@1
+      value: 0.0733
+      name: Mrr@1
+    - type: ndcg@1
+      value: 0.0733
+      name: Ndcg@1
 ---
 
 # Redis fine-tuned CrossEncoder model for semantic caching on LangCache
@@ -110,6 +216,65 @@ You can finetune this model on your own dataset.
 *List how the model may foreseeably be misused and address what users ought not to do with the model.*
 -->
 
+## Evaluation
+
+### Metrics
+
+#### Cross Encoder Classification
+
+* Dataset: `test_cls`
+* Evaluated with [CrossEncoderClassificationEvaluator](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderClassificationEvaluator)
+
+| Metric                | Value      |
+|:----------------------|:-----------|
+| accuracy              | 0.5661     |
+| accuracy_threshold    | -1.8359    |
+| f1                    | 0.6382     |
+| f1_threshold          | -3.0       |
+| precision             | 0.4957     |
+| recall                | 0.8956     |
+| **average_precision** | **0.5123** |
+
+#### Cross Encoder Reranking
+
+* Datasets: `NanoQuoraRetrieval_R25`, `NanoMSMARCO_R25` and `NanoNQ_R25`
+* Evaluated with [CrossEncoderRerankingEvaluator](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderRerankingEvaluator) with these parameters:
+  ```json
+  {
+      "at_k": 1,
+      "always_rerank_positives": true
+  }
+  ```
+
+| Metric     | NanoQuoraRetrieval_R25 | NanoMSMARCO_R25      | NanoNQ_R25           |
+|:-----------|:-----------------------|:---------------------|:---------------------|
+| map        | 0.2548 (-0.5756)       | 0.1870 (-0.3007)     | 0.1356 (-0.2844)     |
+| mrr@1      | 0.1200 (-0.6800)       | 0.0600 (-0.2800)     | 0.0400 (-0.2000)     |
+| **ndcg@1** | **0.1200 (-0.6800)**   | **0.0600 (-0.2800)** | **0.0400 (-0.2000)** |
+
+#### Cross Encoder Nano BEIR
+
+* Dataset: `NanoBEIR_R25_mean`
+* Evaluated with [CrossEncoderNanoBEIREvaluator](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderNanoBEIREvaluator) with these parameters:
  ```json
  {
      "dataset_names": [
          "QuoraRetrieval",
          "MSMARCO",
          "NQ"
      ],
      "rerank_k": 25,
      "at_k": 1,
      "always_rerank_positives": true
  }
  ```
+
+| Metric     | Value                |
+|:-----------|:---------------------|
+| map        | 0.1925 (-0.3869)     |
+| mrr@1      | 0.0733 (-0.3867)     |
+| **ndcg@1** | **0.0733 (-0.3867)** |
+
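The negative `accuracy_threshold` and `f1_threshold` values in the classification metrics are decision cutoffs on the model's raw logits, not probabilities: for each metric the evaluator reports the cutoff that maximizes it on the evaluation set. A minimal, standalone sketch of that idea with made-up scores and labels (not the actual `CrossEncoderClassificationEvaluator` implementation or the real test data):

```python
# Sketch: find the score threshold that maximizes F1, as the *_threshold
# metrics do. Raw cross-encoder logits can be negative, which is why the
# reported cutoffs (e.g. -1.8359) are below zero. All numbers are toy data.

def f1_at(scores, labels, threshold):
    # Predict "duplicate" (1) when the raw score reaches the cutoff.
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and l == 1 for p, l in zip(preds, labels))
    fp = sum(p == 1 and l == 0 for p, l in zip(preds, labels))
    fn = sum(p == 0 and l == 1 for p, l in zip(preds, labels))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_f1_threshold(scores, labels):
    # Sweep every observed score as a candidate cutoff; keep the best F1.
    return max((f1_at(scores, labels, t), t) for t in scores)

scores = [-4.2, -3.0, -1.5, 0.8, 2.1]  # hypothetical raw logits
labels = [0, 0, 1, 1, 1]               # 1 = semantically equivalent pair
f1, threshold = best_f1_threshold(scores, labels)
```

With these toy inputs the sweep lands on the cutoff -1.5, which separates the two classes perfectly; on real data the best achievable F1 is lower, as the table above shows.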