---
library_name: transformers
license: apache-2.0
base_model: distilbert-base-uncased
tags:
- generated_from_trainer
metrics:
- accuracy
- f1
model-index:
- name: my-test-model
results: []
datasets:
- stanfordnlp/imdb
---
# my-test-model
This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the [IMDB](https://huggingface.co/datasets/stanfordnlp/imdb) dataset of movie reviews.
## Model description
This model is a fine-tuned version of DistilBERT-base-uncased for binary sentiment analysis on movie reviews. Key specifications:

- Task: sentiment classification (positive/negative)
- Base architecture: 6-layer distilled Transformer
- Parameters: ~66 million (standard DistilBERT configuration)
- Output labels:
  - 0 → "NEGATIVE"
  - 1 → "POSITIVE"
## Intended uses & limitations
**Acceptable use cases** ✅

- Sentiment analysis of English movie reviews
- Educational/research purposes for text classification
- Baseline model for entertainment-industry applications
- Integration into sentiment analysis pipelines

**Limitations** ⚠️

- Language restriction: supports English text only
- Domain specificity: optimized for movie reviews; performance degrades on other text types
- Bias risks: may reflect demographic/cultural biases in the training data
- Length constraint: maximum input length of 256 tokens; longer texts are truncated (see the sketch after this list)

**Not suitable for:**

- Multilingual text analysis
- Sarcasm/irony detection
- Fine-grained sentiment analysis (e.g., detecting anger or excitement)
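
Because of the 256-token limit, inputs should be truncated at tokenization time. A minimal sketch of the length constraint, assuming the standard `AutoTokenizer` API (the review text is made up):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

review = "An overlong movie review. " * 200  # made-up text, well over 256 tokens
encoded = tokenizer(review, truncation=True, max_length=256)

print(len(encoded["input_ids"]))  # 256: everything past the limit is dropped
```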
## Training and evaluation data
**Training data**

- Dataset: [IMDB Movie Reviews](https://huggingface.co/datasets/stanfordnlp/imdb)
- Size: 25,000 labeled examples
- Class distribution:
  - Positive: 12,500 (50%)
  - Negative: 12,500 (50%)
- Preprocessing:
  - Lowercasing
  - DistilBERT tokenization (WordPiece)
  - Dynamic padding

**Evaluation data**

- Test set: official IMDB test split (25,000 examples)
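
The data pipeline can be reproduced along these lines. This is a sketch, not the released preprocessing code; it assumes the `stanfordnlp/imdb` dataset id from the metadata and the standard `datasets`/`transformers` APIs:

```python
from datasets import load_dataset
from transformers import AutoTokenizer, DataCollatorWithPadding

imdb = load_dataset("stanfordnlp/imdb")  # 25,000 train / 25,000 test, balanced

# The uncased checkpoint lowercases input as part of WordPiece tokenization
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Truncate to the model's 256-token limit
    return tokenizer(batch["text"], truncation=True, max_length=256)

tokenized = imdb.map(tokenize, batched=True)

# Dynamic padding: each batch is padded to its own longest sequence
collator = DataCollatorWithPadding(tokenizer=tokenizer)
```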
## Training procedure
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="my-test-model",        # assumed output path
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    learning_rate=2e-5,
    weight_decay=0.01,
    eval_strategy="epoch",             # called evaluation_strategy before Transformers 4.46
    save_strategy="epoch",
    load_best_model_at_end=True,       # needed for metric_for_best_model to take effect
    metric_for_best_model="accuracy",
)
```
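
The card does not include the metric computation or the `Trainer` wiring; the following sketch, using the `evaluate` library, shows one plausible way the accuracy and F1 numbers below could be produced (`training_args`, `tokenized`, and `collator` come from the sketches above):

```python
import numpy as np
import evaluate
from transformers import AutoModelForSequenceClassification, Trainer

accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

def compute_metrics(eval_pred):
    # eval_pred is a (logits, labels) pair supplied by Trainer
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy.compute(predictions=predictions, references=labels)["accuracy"],
        "f1": f1.compute(predictions=predictions, references=labels)["f1"],
    }

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=collator,
    compute_metrics=compute_metrics,
)
trainer.train()
```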
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 3
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
| 0.2497 | 1.0 | 1563 | 0.2486 | 0.9026 | 0.9024 |
| 0.1496 | 2.0 | 3126 | 0.2896 | 0.9135 | 0.9135 |
| 0.1222 | 3.0 | 4689 | 0.3448 | 0.9130 | 0.9130 |
### Framework versions
- Transformers 4.52.3
- PyTorch 2.7.0+cu128
- Datasets 3.6.0
- Tokenizers 0.21.1