---
pipeline_tag: zero-shot-classification
library_name: transformers
datasets:
- Xerv-AI/RainDrop-DTS
---

# RainDrop - Zero-Shot Text Classification Model

<img src='https://raw.githubusercontent.com/Electroiscoding/super-waddle/refs/heads/main/Screenshot_20250802-174037.png' alt='RainDrop Logo'>

RainDrop is an experimental zero-shot text classification model designed to explore transformer-based dual-encoder architectures with contrastive learning. Currently trained on a small, test-scale dataset, it provides a foundation for future improvements toward robust zero-shot classification.

At this stage, **RainDrop is not multilingual and shows modest accuracy metrics.** It is intended primarily for research, experimentation, and development purposes rather than production use.

**Features**

- Trained on a small, test-scale dataset
- Contrastive learning with a shared transformer encoder
- Compatible with the Hugging Face Transformers library

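
To make the contrastive-learning idea concrete, here is a minimal, dependency-free sketch of an InfoNCE-style loss over matched (text, label) embedding pairs. It is a generic illustration, not RainDrop's confirmed training objective: the function names, the toy 2-dimensional embeddings, and the `temperature` value are all assumptions made for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(text_embeddings, label_embeddings, temperature=0.1):
    """Mean cross-entropy of matching text i to label i within the batch.

    temperature is a hypothetical hyperparameter, not a value from RainDrop.
    """
    losses = []
    for i, text_vec in enumerate(text_embeddings):
        # Similarity of text i to every label in the batch, scaled by temperature
        scores = [cosine_similarity(text_vec, lab) / temperature
                  for lab in label_embeddings]
        # Numerically stable log-sum-exp over the batch
        max_score = max(scores)
        log_sum_exp = max_score + math.log(
            sum(math.exp(s - max_score) for s in scores)
        )
        # Negative log-probability assigned to the true pair (i, i)
        losses.append(log_sum_exp - scores[i])
    return sum(losses) / len(losses)


# Toy embeddings: row i of each list is a matching (text, label) pair
texts = [[1.0, 0.0], [0.0, 1.0]]
labels_aligned = [[0.9, 0.1], [0.1, 0.9]]
labels_swapped = [[0.1, 0.9], [0.9, 0.1]]

print(info_nce_loss(texts, labels_aligned))  # small: true pairs are most similar
print(info_nce_loss(texts, labels_swapped))  # large: true pairs are least similar
```

In a dual-encoder setup, the same encoder would typically produce both lists of embeddings, and minimizing a loss of this kind pulls each text toward its own label and away from the other labels in the batch.
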
**Benchmarking Snapshots**

<img src='https://raw.githubusercontent.com/Electroiscoding/super-waddle/refs/heads/main/download%20(1).png' alt='Benchmark 1'>
<img src='https://raw.githubusercontent.com/Electroiscoding/super-waddle/refs/heads/main/download%20(2).png' alt='Benchmark 2'>
<img src='https://raw.githubusercontent.com/Electroiscoding/super-waddle/refs/heads/main/download.png' alt='Benchmark 3'>

**How to Use**

```python
import os

import torch
from huggingface_hub import hf_hub_download
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = 'Xerv-AI/RainDrop-Demo-1'
model_dir = './downloaded_raindrop_model'
os.makedirs(model_dir, exist_ok=True)

# Files fetched from the Hub repository into model_dir
files_to_download = [
    'config.json',
    'model.safetensors',
    'tokenizer.json',
    'tokenizer_config.json',
    'vocab.txt',
    'special_tokens_map.json',
    'artifacts.pkl',
    'training_args.bin',
]

for file_name in files_to_download:
    try:
        hf_hub_download(repo_id=repo_id, filename=file_name, local_dir=model_dir)
    except Exception as e:
        # Report the failure and continue with the remaining files
        print(f"Could not download {file_name}: {e}")


class RainDropModel:
    """Thin wrapper that pairs the model with its tokenizer."""

    def __init__(self, model, tokenizer):
        self.model = model
        self.tokenizer = tokenizer

    def predict_single(self, text):
        """Return (predicted_label, confidence) for a single input string."""
        inputs = self.tokenizer(text, return_tensors="pt", truncation=True)
        with torch.no_grad():  # inference only, so skip gradient tracking
            outputs = self.model(**inputs)
        logits = outputs.logits
        predicted_class_id = logits.argmax(dim=-1).item()
        predicted_label = self.model.config.id2label[predicted_class_id]
        # Softmax probability of the predicted class
        confidence = logits.softmax(dim=-1)[0][predicted_class_id].item()
        return predicted_label, confidence


# Function to load a RainDrop model
def load_raindrop_model(model_path):
    """
    Loads a RainDrop model for sequence classification.
    """
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSequenceClassification.from_pretrained(model_path)
    return RainDropModel(model, tokenizer)


try:
    loaded_model_from_hub = load_raindrop_model(model_dir)

    test_text = "Physics is cool."
    predicted_label, confidence = loaded_model_from_hub.predict_single(test_text)

    print(f"Text: '{test_text}'")
    print(f"Predicted Label: {predicted_label}")
    print(f"Confidence: {confidence:.4f}")

except Exception as e:
    print(f"Error loading or evaluating the model from the hub: {e}")
```
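
The snippet above reports only the single most likely label. When a ranked shortlist is more useful, the same softmax step can be applied to the full logit vector. The sketch below is a dependency-free illustration: `top_k_labels` is a hypothetical helper, and the logits and label names are made up rather than taken from RainDrop.

```python
import math


def top_k_labels(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    # Softmax with the max subtracted for numerical stability
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort class ids by probability, highest first
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Made-up logits and label names, for illustration only
example_logits = [2.0, 0.5, -1.0]
example_id2label = {0: "physics", 1: "biology", 2: "history"}

for label, prob in top_k_labels(example_logits, example_id2label, k=2):
    print(f"{label}: {prob:.4f}")
```

With a loaded model, the inputs would presumably come from `outputs.logits[0].tolist()` and `model.config.id2label`, assuming the config maps integer class ids to label names.
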

**About**

RainDrop is developed by Xerv AI as an early-stage research project to push the boundaries of zero-shot classification. It is a stepping stone towards a more accurate, multilingual model in the future.