---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- text-classification
- sentiment-analysis
- distilbert
- imdb
- pytorch
pipeline_tag: text-classification
datasets:
- imdb
metrics:
- accuracy
- f1
model-index:
- name: ohanvi-sentiment-analysis
  results:
  - task:
      type: text-classification
      name: Sentiment Analysis
    dataset:
      name: IMDb
      type: imdb
      split: test
    metrics:
    - type: accuracy
      value: 0.932
      name: Accuracy
    - type: f1
      value: 0.931
      name: F1
---
# 🎬 Ohanvi Sentiment Analysis
A fine-tuned **DistilBERT** model for binary sentiment analysis on movie reviews.
Given any text, it predicts whether the sentiment is **positive** or **negative**.
## Model Details
| Attribute | Value |
|-----------|-------|
| **Base model** | `distilbert-base-uncased` |
| **Fine-tuned on** | [IMDb Movie Reviews](https://huggingface.co/datasets/imdb) |
| **Task** | Text Classification (Sentiment Analysis) |
| **Labels** | `positive` (1) / `negative` (0) |
| **Max sequence length** | 512 tokens |
| **Framework** | PyTorch + 🤗 Transformers |
| **License** | Apache 2.0 |
## Performance
Evaluated on the IMDb test split (25 000 samples):
| Metric | Score |
|--------|-------|
| Accuracy | ~93.2% |
| F1 (binary) | ~93.1% |
## Quick Start
```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="ohanvi/ohanvi-sentiment-analysis",
)

result = classifier("This movie was absolutely fantastic!")
# → [{'label': 'positive', 'score': 0.9978}]

result = classifier("Terrible film, complete waste of time.")
# → [{'label': 'negative', 'score': 0.9965}]
```
## Training Details
### Hyperparameters
| Parameter | Value |
|-----------|-------|
| Epochs | 3 |
| Batch size (train) | 16 |
| Learning rate | 2e-5 |
| Weight decay | 0.01 |
| Warmup ratio | 10% |
| Optimiser | AdamW |
| LR scheduler | Linear with warmup |
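The "linear with warmup" schedule above is simple arithmetic: the learning rate climbs linearly from 0 to the base rate over the first 10% of steps, then decays linearly back to 0. A minimal dependency-free sketch (the step counts are hypothetical, not taken from the actual training run):

```python
def lr_at_step(step, total_steps, base_lr=2e-5, warmup_ratio=0.10):
    """Linear warmup to base_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    # Linear decay over the remaining steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 1000  # hypothetical total optimizer steps
print(lr_at_step(50, total))   # mid-warmup: half of base_lr → 1e-05
print(lr_at_step(100, total))  # warmup ends: full base_lr → 2e-05
print(lr_at_step(550, total))  # halfway through decay → 1e-05
```

In 🤗 Transformers this is what `get_linear_schedule_with_warmup` produces when wired into the `Trainer`.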
### Training Data
The model was fine-tuned on the full [IMDb](https://huggingface.co/datasets/imdb) dataset:
- **Train**: 25 000 reviews (12 500 positive, 12 500 negative)
- **Test**: 25 000 reviews (12 500 positive, 12 500 negative)
### Training Environment
- Hardware: GPU (NVIDIA / Apple Silicon MPS)
- Mixed precision: fp16 (when CUDA available)
- Early stopping: patience = 2 epochs
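"Patience = 2 epochs" means training halts once the validation metric has failed to improve for two consecutive epochs, keeping the best checkpoint. A minimal sketch of that logic with made-up validation scores (not the real training curve):

```python
def stopping_point(epoch_metrics, patience=2):
    """Return (epoch training stops after, epoch of the best score),
    given per-epoch validation scores where higher is better."""
    best, best_epoch, stale = float("-inf"), -1, 0
    for epoch, score in enumerate(epoch_metrics):
        if score > best:
            best, best_epoch, stale = score, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch, best_epoch  # patience exhausted
    return len(epoch_metrics) - 1, best_epoch  # ran to completion

# Accuracy stalls after epoch 1 → stop after epoch 3, keep epoch 1's weights
print(stopping_point([0.90, 0.93, 0.928, 0.925]))  # → (3, 1)
```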
## How to Use (Advanced)
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "ohanvi/ohanvi-sentiment-analysis"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

text = "An outstanding film with incredible performances."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1)
label_id = probs.argmax().item()
label = model.config.id2label[label_id]
confidence = probs[0][label_id].item()
print(f"Label: {label} ({confidence:.1%})")
```
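The softmax step above is plain arithmetic and easy to inspect without PyTorch. A dependency-free sketch with made-up logits (the values are illustrative, not real model outputs):

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (mirrors torch.softmax)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for [negative, positive]
probs = softmax([-1.2, 3.4])
print(probs.index(max(probs)))  # → 1, i.e. id2label[1] == 'positive'
```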
## Limitations
- Trained exclusively on **English** movie reviews; performance on other languages or domains may be lower.
- Very short texts (< 5 words) may produce less reliable results.
- The model inherits any biases present in the IMDb dataset.
## Citation
If you use this model, please cite:
```bibtex
@misc{ohanvi-sentiment-2026,
  title  = {Ohanvi Sentiment Analysis},
  author = {Gourav Bansal},
  year   = {2026},
  url    = {https://huggingface.co/ohanvi/ohanvi-sentiment-analysis},
}
```
## Acknowledgements
Built with 🤗 [Transformers](https://github.com/huggingface/transformers),
🤗 [Datasets](https://github.com/huggingface/datasets), and
[Gradio](https://gradio.app/).