# Sentence-BERT Quantized Model for Text Similarity & Paraphrase Detection
This repository hosts a quantized version of the Sentence-BERT (SBERT) model, fine-tuned on the Quora Question Pairs dataset for text similarity and paraphrase detection. The model computes semantic similarity between two input sentences and has been optimized for efficient deployment using ONNX quantization.
## Model Details
- **Model Architecture:** Sentence-BERT (`all-MiniLM-L6-v2`)
- **Task:** Text Similarity & Paraphrase Detection
- **Dataset:** Quora Question Pairs (QQP)
- **Quantization:** ONNX (Dynamic Quantization)
- **Fine-tuning Framework:** Sentence-Transformers (Hugging Face)
## Usage
### Installation
```sh
pip install sentence-transformers onnxruntime transformers
```
### Loading the Model
#### Original Fine-tuned Model
```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Load the fine-tuned model
model = SentenceTransformer("fine-tuned-model")

# Encode two sentences
sentence1 = "How can I learn Python?"
sentence2 = "What is the best way to study Python?"
emb1 = model.encode(sentence1)
emb2 = model.encode(sentence2)

# Cosine similarity
score = np.dot(emb1, emb2) / (np.linalg.norm(emb1) * np.linalg.norm(emb2))
print("Similarity Score:", score)

# Threshold to classify as paraphrase
print("Paraphrase" if score > 0.75 else "Not Paraphrase")
```
#### Quantized ONNX Model
```python
from onnxruntime import InferenceSession
from transformers import AutoTokenizer
import numpy as np
# Load tokenizer and ONNX session
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
session = InferenceSession("sbert_onnx/model.onnx")
def encode_onnx(session, tokenizer, sentence):
    inputs = tokenizer(sentence, return_tensors="np", padding=True, truncation=True)
    outputs = session.run(None, dict(inputs))
    # The first output is assumed to be token-level hidden states of shape
    # (batch, seq_len, hidden); mean-pool over non-padding tokens to obtain
    # a single sentence embedding.
    token_embeddings = outputs[0][0]              # (seq_len, hidden)
    mask = inputs["attention_mask"][0][:, None]   # (seq_len, 1)
    return (token_embeddings * mask).sum(axis=0) / mask.sum()
# Encode two sentences and compute similarity
sentence1 = "How can I learn Python?"
sentence2 = "What is the best way to study Python?"
emb1 = encode_onnx(session, tokenizer, sentence1)
emb2 = encode_onnx(session, tokenizer, sentence2)
score = np.dot(emb1, emb2) / (np.linalg.norm(emb1) * np.linalg.norm(emb2))
print("Quantized Similarity Score:", score)
print("Paraphrase" if score > 0.75 else "Not Paraphrase")
```
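Depending on how the model was exported, the first ONNX output may be token-level hidden states rather than an already-pooled sentence vector. SBERT-style sentence embeddings are obtained by mean pooling the token embeddings over the non-padding positions. A minimal NumPy sketch of that pooling step, using toy inputs rather than real model outputs:

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token embeddings, ignoring padding positions.

    token_embeddings: (seq_len, dim) array of last hidden states
    attention_mask:   (seq_len,) array of 0/1 padding flags
    """
    mask = attention_mask[:, None].astype(np.float32)  # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)     # (dim,)
    counts = max(float(mask.sum()), 1e-9)              # avoid division by zero
    return summed / counts

# Toy example: two real tokens, one padding token.
tokens = np.array([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]])
mask = np.array([1, 1, 0])
print(mean_pool(tokens, mask))  # β†’ [2. 3.]
```

Padding tokens must be masked out before averaging; otherwise batch padding length would change the embedding.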
## Performance Metrics
- **Accuracy:** ~0.87
- **F1 Score:** ~0.85
- **Threshold for classification:** 0.75 cosine similarity
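Metrics like these can be computed with a simple threshold rule over labeled pairs; below is a minimal sketch in which `scores` and `labels` are illustrative placeholders, not the actual QQP evaluation data:

```python
import numpy as np

def evaluate_at_threshold(scores, labels, threshold=0.75):
    """Accuracy and F1 for binary paraphrase predictions at a cosine threshold."""
    preds = (np.asarray(scores) > threshold).astype(int)
    labels = np.asarray(labels)
    tp = int(((preds == 1) & (labels == 1)).sum())
    fp = int(((preds == 1) & (labels == 0)).sum())
    fn = int(((preds == 0) & (labels == 1)).sum())
    accuracy = float((preds == labels).mean())
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, f1

# Toy scores/labels, not real QQP results:
acc, f1 = evaluate_at_threshold([0.9, 0.8, 0.3, 0.6], [1, 1, 0, 1])
print(round(acc, 2), round(f1, 2))  # β†’ 0.75 0.8
```

Sweeping `threshold` over a validation set with this function is also how the 0.75 cutoff can be re-tuned for a new domain.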
## Fine-Tuning Details
### Dataset
- **Source:** Quora Question Pairs (Kaggle)
- **Size:** 400K+ question pairs, each labeled as duplicate (paraphrase) or not
### Training Configuration
- **Epochs:** 3
- **Batch Size:** 16
- **Evaluation Steps:** 1000
- **Warmup Steps:** 1000
- **Loss Function:** CosineSimilarityLoss
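The configuration above corresponds roughly to a sentence-transformers training call along these lines (a sketch of the setup, not the actual training script; QQP loading and the evaluator are elided, and the single example pair is illustrative):

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Each QQP pair becomes an InputExample whose label is the 0/1 duplicate flag.
train_examples = [
    InputExample(texts=["How can I learn Python?",
                        "What is the best way to study Python?"], label=1.0),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.CosineSimilarityLoss(model)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=3,
    warmup_steps=1000,
    evaluation_steps=1000,
)
```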
### Quantization
- **Method:** ONNX dynamic quantization
- **Tool:** Hugging Face Optimum + ONNX Runtime
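A dynamic quantization pass with ONNX Runtime looks roughly like this (a sketch; the file paths are assumptions matching the repository layout rather than verified outputs):

```python
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    model_input="sbert_onnx/model.onnx",         # full-precision ONNX export
    model_output="sbert_onnx/model_quant.onnx",  # model with int8 weights
    weight_type=QuantType.QInt8,                 # quantize weights to 8-bit ints
)
```

Dynamic quantization converts weights to int8 ahead of time while computing activation scales at runtime, so no calibration dataset is needed.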
## Repository Structure
```
.
β”œβ”€β”€ fine-tuned-model/ # Fine-tuned SBERT model directory
β”œβ”€β”€ sbert_onnx/ # Quantized ONNX model directory
β”œβ”€β”€ test_functions.py # Code for evaluation and testing
β”œβ”€β”€ README.md # Project documentation
```
## Limitations
- The cosine similarity threshold (0.75) may need tuning for different domains.
- ONNX quantization may introduce slight performance degradation compared to full-precision models.
- The model produces embeddings and similarity scores rather than classification logits; paraphrase decisions depend entirely on the chosen threshold.
## Contributing
Contributions are welcome! Please open an issue or submit a pull request for bug fixes or improvements.