wakaflocka17 committed on
Commit 194ad92 · verified · 1 Parent(s): c169ea8

Update README.md

Files changed (1)
  1. README.md +116 -16
README.md CHANGED
@@ -1,16 +1,116 @@
- ---
- datasets:
- - stanfordnlp/imdb
- language:
- - en
- metrics:
- - accuracy
- - precision
- - recall
- - f1
- base_model:
- - facebook/bart-base
- - google-bert/bert-base-uncased
- - EleutherAI/gpt-neo-2.7B
- pipeline_tag: text-classification
- ---
+ ---
+ datasets:
+ - stanfordnlp/imdb
+ language:
+ - en
+ metrics:
+ - accuracy
+ - precision
+ - recall
+ - f1
+ base_model:
+ - facebook/bart-base
+ - google-bert/bert-base-uncased
+ - EleutherAI/gpt-neo-2.7B
+ pipeline_tag: text-classification
+ license: apache-2.0
+ ---
+
+ # 📝 Model Card: ensemble-majority-voting-imdb
+
+ ## 🔍 Introduction
+ The `wakaflocka17/ensemble-majority-voting-imdb` model is a majority-voting ensemble of three sentiment classifiers fine-tuned on the IMDb dataset: `bert-imdb-finetuned`, `bart-imdb-finetuned`, and `gptneo-imdb-finetuned`. Each model votes on the sentiment label, and the ensemble returns the label with the most votes. With three voters and two labels, a tie cannot occur. The goal is to improve overall accuracy compared with relying on a single model.
+
+ ## 📊 Evaluation Metrics
+ | Metric    | Value   |
+ |-----------|---------|
+ | Accuracy  | 0.93296 |
+ | Precision | 0.9559  |
+ | Recall    | 0.9078  |
+ | F1-score  | 0.9312  |
+
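As a quick consistency check on the table above, the F1-score should equal the harmonic mean of precision and recall. The sketch below recomputes it from the reported precision and recall; the variable names are illustrative and not taken from the repository.

```python
# Values copied from the Evaluation Metrics table
precision = 0.9559
recall = 0.9078

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))
```

The result rounds to the reported 0.9312, so the three figures are mutually consistent.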
+ ## ⚙️ Training Parameters
+ | Parameter             | Values                                           |
+ |-----------------------|--------------------------------------------------|
+ | Models in ensemble    | `bert_base_uncased`, `bart_base`, `gpt_neo_2_7b` |
+ | Repo for ensemble     | `models/ensemble_majority_voting`                |
+ | Batch size (eval)     | 64                                               |
+
+ ## 🚀 Example of use in Colab
+
+ #### Installing dependencies
+ ```bash
+ !pip install --upgrade transformers huggingface_hub
+ ```
+ #### (Optional) Authentication for private models
+ ```python
+ from huggingface_hub import login
+ login(token="hf_yourhftoken")
+ ```
+ #### Loading models and creating ensemble pipeline
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification, TextClassificationPipeline
+ from collections import Counter
+
+ # List of fine-tuned model repo IDs
+ model_ids = [
+     "wakaflocka17/bert-imdb-finetuned",
+     "wakaflocka17/bart-imdb-finetuned",
+     "wakaflocka17/gptneo-imdb-finetuned"
+ ]
+ ```
+ #### Load pipelines
+ ```python
+ pipelines = []
+ for repo_id in model_ids:
+     tokenizer = AutoTokenizer.from_pretrained(repo_id)
+     model = AutoModelForSequenceClassification.from_pretrained(repo_id)
+     # Use the same label names for all three models so their votes are comparable
+     model.config.id2label = {0: 'NEGATIVE', 1: 'POSITIVE'}
+     pipelines.append(TextClassificationPipeline(model=model, tokenizer=tokenizer, return_all_scores=False))
+ ```
+ #### Ensemble prediction function
+ ```python
+ def ensemble_predict(text):
+     votes = []
+     # Collect each model's vote along with its name
+     for model_id, pipe in zip(model_ids, pipelines):
+         label = pipe(text)[0]['label']
+         votes.append({
+             "model": model_id,  # or model_id.split("/")[-1] for just the short name
+             "label": label
+         })
+     # Determine majority label
+     majority_label = Counter([v["label"] for v in votes]).most_common(1)[0][0]
+     return {
+         "ensemble_label": majority_label,
+         "individual_votes": votes
+     }
+ ```
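The voting step can be exercised without downloading the three checkpoints. The sketch below is a minimal offline version of the same idea, not the repository's implementation: `majority_vote` and the stub classifiers are hypothetical names introduced here for illustration, and each stub mimics the `[{'label': ...}]` shape that the prediction function above reads from its pipelines.

```python
from collections import Counter

def majority_vote(text, classifiers):
    """Return the most common label among the classifiers' votes.

    `classifiers` maps a name to a callable returning [{'label': ...}],
    the output shape assumed for the real pipelines.
    """
    votes = {name: clf(text)[0]["label"] for name, clf in classifiers.items()}
    label, count = Counter(votes.values()).most_common(1)[0]
    return {
        "ensemble_label": label,
        "votes_for_label": count,
        "individual_votes": votes,
    }

# Hypothetical stand-ins for the three fine-tuned models
stubs = {
    "always_positive": lambda text: [{"label": "POSITIVE"}],
    "always_negative": lambda text: [{"label": "NEGATIVE"}],
    "keyword": lambda text: [
        {"label": "POSITIVE" if "fantastic" in text.lower() else "NEGATIVE"}
    ],
}

print(majority_vote("A fantastic film.", stubs))
print(majority_vote("A dull film.", stubs))
```

Because two of the stubs always disagree, the third one decides every vote, which makes it easy to see that the ensemble label follows the 2-to-1 majority.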
+ #### Inference on a text example
+ ```python
+ testo = "This movie was absolutely fantastic—wonderful performances and a gripping story!"
+ result = ensemble_predict(testo)
+ print(result)
+ # Example output:
+ # {
+ #   'ensemble_label': 'POSITIVE',
+ #   'individual_votes': [
+ #     {'model': 'wakaflocka17/bert-imdb-finetuned', 'label': 'POSITIVE'},
+ #     {'model': 'wakaflocka17/bart-imdb-finetuned', 'label': 'NEGATIVE'},
+ #     {'model': 'wakaflocka17/gptneo-imdb-finetuned', 'label': 'POSITIVE'}
+ #   ]
+ # }
+ ```
+ ## 📖 How to cite
+ If you use this model in your work, you can cite it as:
+ ```bibtex
+ @misc{Sentiment-Project,
+   author = {Francesco Congiu},
+   title = {Sentiment Analysis with Pretrained, Fine-tuned and Ensemble Transformer Models},
+   howpublished = {\url{https://github.com/wakaflocka17/DLA_LLMSANALYSIS}},
+   year = {2025}
+ }
+ ```
+ ## 🔗 Reference Repository
+ > The full file structure and example scripts can be found at:
+ > https://github.com/wakaflocka17/DLA_LLMSANALYSIS/tree/main