# roberta-large-fallacy-classification

This model is a fine-tuned version of `roberta-large`, trained for logical fallacy detection on the [Logical Fallacy Dataset](https://huggingface.co/datasets/tasksource/logical-fallacy). It classifies 13 types of logical fallacies in text.

## Model Details

- **Base Model**: `roberta-large`
- **Dataset**: Logical Fallacy Dataset
- **Number of Classes**: 13
- **Training Parameters**:
  - **Learning Rate**: 5e-6 with a cosine decay scheduler
  - **Batch Size**: 8 (with gradient accumulation for an effective batch size of 16)
  - **Weight Decay**: 0.3
  - **Label Smoothing**: 0.1
  - **Mixed Precision (FP16)**: Enabled
  - **Early Stopping**: Patience of 2 epochs
  - **Epochs**: Approximately 10 (with early stopping)

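The effective batch size above comes from gradient accumulation: gradients from several small batches are summed before each optimizer step. A minimal sketch of the arithmetic (the dataset size is illustrative, not from the actual training run):

```python
# Gradient accumulation: the optimizer steps once every
# `gradient_accumulation_steps` micro-batches, so the effective
# batch size is their product.
per_device_batch_size = 8
gradient_accumulation_steps = 2

effective_batch_size = per_device_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 16

# Optimizer steps per epoch for a hypothetical dataset size
dataset_size = 2000  # illustrative, not the real dataset size
micro_batches = dataset_size // per_device_batch_size
optimizer_steps = micro_batches // gradient_accumulation_steps
print(optimizer_steps)  # 125
```

This trades a little extra compute time for the optimization behavior of a larger batch without the memory cost.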
## Example Pipeline

To use the model for quick classification with a text pipeline:

```python
from transformers import pipeline

# Hugging Face model path
model_path = "MidhunKanadan/roberta-large-fallacy-classification"

# Initialize the text-classification pipeline
# device=0 uses the first GPU; use device=-1 to run on CPU
pipe = pipeline("text-classification", model=model_path, tokenizer=model_path, device=0)

# Sample text containing a fallacy
text = "The rooster crows always before the sun rises, therefore the crowing rooster causes the sun to rise."

# Make a prediction
result = pipe(text)

# Print the predicted label and score
print(f"Predicted Label: {result[0]['label']}")
print(f"Score: {result[0]['score']:.4f}")
```

Expected output:

```
Predicted Label: false causality
Score: 0.8938
```

## Full Classification Example

For more control, load the model and tokenizer directly and perform classification:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the model and tokenizer from the Hugging Face Hub
model_path = "MidhunKanadan/roberta-large-fallacy-classification"
model = AutoModelForSequenceClassification.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Move the model to GPU if one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Sample text containing a fallacy
text = "The rooster crows always before the sun rises, therefore the crowing rooster causes the sun to rise."

# Tokenize the input text
inputs = tokenizer(text, return_tensors="pt").to(device)

# Run the model and get logits
with torch.no_grad():
    logits = model(**inputs).logits

# Apply softmax to convert logits into probabilities
probabilities = torch.nn.functional.softmax(logits, dim=1)[0]

# Print each label with its probability
for label, score in zip(model.config.id2label.values(), probabilities):
    print(f"{label}: {score.item():.4f}")
```

Expected output:

```
ad hominem: 0.0025
appeal to emotion: 0.0037
false dilemma: 0.0053
false causality: 0.8938
fallacy of relevance: 0.0059
ad populum: 0.0053
faulty generalization: 0.0104
fallacy of credibility: 0.0040
fallacy of extension: 0.0042
intentional: 0.0036
circular reasoning: 0.0127
fallacy of logic: 0.0366
equivocation: 0.0121
```

## Training Details

The model was trained with the following settings:

- **Optimizer**: AdamW
- **Learning Rate**: 5e-6 with a cosine decay scheduler
- **Batch Size**: 8 (with gradient accumulation for an effective batch size of 16)
- **Weight Decay**: 0.3
- **Label Smoothing Factor**: 0.1
- **Early Stopping**: Enabled (patience = 2)
- **Mixed Precision**: Enabled (FP16)
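
The label smoothing factor of 0.1 softens the one-hot targets: the true class keeps most of the probability mass and the rest is spread uniformly over all classes, which discourages overconfident predictions. A sketch of one common formulation (not necessarily the exact variant used by the Trainer):

```python
def smoothed_targets(true_idx: int, num_classes: int, smoothing: float) -> list:
    # Spread `smoothing` uniformly over all classes, then put the
    # remaining 1 - smoothing on the true class.
    base = smoothing / num_classes
    targets = [base] * num_classes
    targets[true_idx] += 1.0 - smoothing
    return targets

targets = smoothed_targets(true_idx=3, num_classes=13, smoothing=0.1)
print(round(targets[3], 4))    # 0.9077  (= 1 - 0.1 + 0.1/13)
print(round(sum(targets), 4))  # 1.0
```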
109
+
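Taken together, the hyperparameters above roughly correspond to a `transformers.TrainingArguments` configuration like the following. This is a hedged reconstruction, not the actual training script: `output_dir` and `metric_for_best_model` are placeholders, and the argument names assume a recent `transformers` release.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the training configuration from the
# hyperparameters listed above; `output_dir` is a placeholder.
training_args = TrainingArguments(
    output_dir="roberta-large-fallacy-classification",
    learning_rate=5e-6,
    lr_scheduler_type="cosine",
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,   # effective batch size 16
    weight_decay=0.3,
    label_smoothing_factor=0.1,
    num_train_epochs=10,
    fp16=True,
    eval_strategy="epoch",           # `evaluation_strategy` in older releases
    save_strategy="epoch",
    load_best_model_at_end=True,     # required for early stopping
    metric_for_best_model="f1",      # assumption: F1 used for model selection
)
```

Early stopping itself is supplied separately, by passing `EarlyStoppingCallback(early_stopping_patience=2)` in the `callbacks` list of the `Trainer`.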
## Dataset

- **Dataset Name**: Logical Fallacy Dataset
- **Source**: [Hugging Face Datasets](https://huggingface.co/datasets/tasksource/logical-fallacy)
- **Number of Classes**: 13 fallacies (e.g., ad hominem, appeal to emotion, faulty generalization)

## Limitations

This model may not generalize well to all types of logical fallacies, due to the limited size of the dataset and potential class imbalance. It may require additional fine-tuning or data augmentation to perform effectively in production.

## Evaluation

The model achieved the following evaluation metrics:

- **Accuracy**: Varies by dataset split; see the training logs for details.
- **F1 Score**: Varies by dataset split; see the training logs for details.