# Model Details

**Model Name:** Fine-Tuned BART for Customer Support Resolution Generation

**Base Model:** facebook/bart-base

**Dataset:** bitext/Bitext-customer-support-llm-chatbot-training-dataset

**Quantization:** Applied FP16 for optimized inference

**Training Device:** CUDA (GPU)

---

# Dataset Information

**Dataset Structure:**

    DatasetDict({
        train: Dataset({
            features: ['input_text', 'target_text'],
            num_rows: 24184
        })
        validation: Dataset({
            features: ['input_text', 'target_text'],
            num_rows: 2688
        })
    })

**Available Splits:**

- **Train:** 24,184 examples
- **Validation:** 2,688 examples

**Feature Representation:**

- **input_text:** Customer issue text (e.g., "Customer: How do I cancel my order?")
- **target_text:** Resolution text (e.g., "Log into the portal and cancel it there.")
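
The `input_text`/`target_text` pairs can be rebuilt along the lines of the sketch below. This is a minimal sketch, assuming the source dataset exposes `instruction` and `response` columns and that the 24,184/2,688 counts come from a roughly 90/10 train/validation split; the heuristic resolution extraction mentioned under "Limitations" is not reproduced here.

```python
# Minimal sketch: build input_text/target_text pairs from the Bitext dataset.
# Assumed: source columns are named "instruction" and "response"; split is ~90/10.
from datasets import load_dataset

raw = load_dataset(
    "bitext/Bitext-customer-support-llm-chatbot-training-dataset", split="train"
)

def to_pair(example):
    return {
        "input_text": f"Customer: {example['instruction']}",  # customer issue
        "target_text": example["response"],                    # resolution (unextracted)
    }

pairs = raw.map(to_pair, remove_columns=raw.column_names)
dataset = pairs.train_test_split(test_size=0.1, seed=42)  # DatasetDict: "train"/"test"
dataset["validation"] = dataset.pop("test")
```
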
---

# Training Details

**Training Process:**

- Fine-tuned for 3 epochs
- Loss decreased progressively across epochs

**Hyperparameters:**

- Epochs: 3
- Learning Rate: 2e-5
- Batch Size: 8
- Weight Decay: 0.01
- Mixed Precision: FP16
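
For reference, a fine-tuning run with the hyperparameters above could be set up as in the following sketch, using the Hugging Face `Seq2SeqTrainer`. The tokenization lengths, the output directory, and the `dataset` variable (the train/validation splits from the dataset sketch earlier) are assumptions, not the original training script.

```python
# Minimal fine-tuning sketch with the hyperparameters listed above.
# Assumed: `dataset` holds "train"/"validation" splits with input_text/target_text.
from transformers import (
    BartForConditionalGeneration,
    BartTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base = "facebook/bart-base"
tokenizer = BartTokenizer.from_pretrained(base)
model = BartForConditionalGeneration.from_pretrained(base)

def tokenize(batch):
    enc = tokenizer(batch["input_text"], max_length=512, truncation=True)
    enc["labels"] = tokenizer(
        text_target=batch["target_text"], max_length=128, truncation=True
    )["input_ids"]
    return enc

tokenized = dataset.map(tokenize, batched=True, remove_columns=["input_text", "target_text"])

args = Seq2SeqTrainingArguments(
    output_dir="bart-resolution-summarizer",  # placeholder
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    weight_decay=0.01,
    fp16=True,  # mixed precision on CUDA
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
print(trainer.evaluate())  # reports the validation loss
```
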
**Performance Metrics:**

- Final Training Loss: ~0.0140
- Final Validation Loss: ~0.0121

---

# Inference Example

```python
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

def load_model(model_path):
    tokenizer = BartTokenizer.from_pretrained(model_path)
    model = BartForConditionalGeneration.from_pretrained(model_path).half()  # FP16
    model.eval()
    return model, tokenizer

def generate_resolution(issue, model, tokenizer, device="cuda"):
    input_text = f"Customer: {issue}"
    inputs = tokenizer(
        input_text,
        max_length=512,
        padding="max_length",
        truncation=True,
        return_tensors="pt"
    ).to(device)
    outputs = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=128,
        num_beams=4,
        early_stopping=True
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
if __name__ == "__main__":
    model_path = "your-username/bart-resolution-summarizer-fp16"  # Replace with your HF repo
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model, tokenizer = load_model(model_path)
    model.to(device)

    issue = "How do I cancel my order?"
    resolution = generate_resolution(issue, model, tokenizer, device)
    print(f"Issue: {issue}")
    print(f"Resolution: {resolution}")
```

**Expected Output:**

    Issue: How do I cancel my order?
    Resolution: Log into the portal and cancel it there.

# Quantization & Optimization

- **Quantization:** FP16 applied post-training with PyTorch's `.half()` for faster inference and a smaller model (~279 MB, down from ~558 MB).
- **Optimization:** Trained with mixed precision (FP16) on CUDA, then quantized for deployment efficiency.
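
A minimal sketch of that post-training step is shown below; the paths are placeholders rather than the actual repository layout.

```python
# Minimal sketch: cast the fine-tuned checkpoint to FP16 and save the smaller copy.
from transformers import BartForConditionalGeneration

model = BartForConditionalGeneration.from_pretrained("bart-resolution-summarizer")  # placeholder path
model = model.half()  # cast weights to FP16, roughly halving on-disk size
model.save_pretrained("bart-resolution-summarizer-fp16")
```

FP16 weights are meant for GPU inference; for CPU-only use, casting back with `.float()` is the safer option.
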
# Usage

- **Input:** Text representing a customer support issue (e.g., "Customer: My payment isn't going through, help!")
- **Output:** Text providing an actionable resolution (e.g., "Check your card details and try again.")

# Limitations

- The model may struggle with issues requiring specific resolutions that are not well represented in the training data (e.g., time-related queries like "When can I call support?").
- Resolution extraction relied on heuristics, so nuanced answers in verbose responses may have been missed.

# Future Improvements

- Refine resolution extraction with more advanced NLP techniques or manual curation.
- Fine-tune on additional customer support datasets for broader coverage.
- Evaluate with formal metrics (e.g., ROUGE) for quantitative performance (see the sketch below).
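
For the last item, a minimal ROUGE sketch using the Hugging Face `evaluate` library is given below. It reuses the `generate_resolution` helper, model, tokenizer, and device from the Inference Example; the single issue/reference pair is a stand-in for the real validation set.

```python
# Minimal ROUGE sketch; requires `pip install evaluate rouge_score`.
import evaluate

rouge = evaluate.load("rouge")

issues = ["How do I cancel my order?"]                      # stand-in for validation input_text
references = ["Log into the portal and cancel it there."]   # stand-in for validation target_text

predictions = [generate_resolution(i, model, tokenizer, device) for i in issues]
print(rouge.compute(predictions=predictions, references=references))
# -> {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```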