AventIQ-AI/bart_customer_ticket_raiser

YashikaNagpal committed on Feb 26, 2025

Commit aaf57ee (verified) · 1 Parent(s): c8811b6

Create README.md

README.md +125 -0

	@@ -0,0 +1,125 @@

+# Model Details
+- **Model Name:** Fine-Tuned BART for Customer Support Resolution Generation
+- **Base Model:** facebook/bart-base
+- **Dataset:** bitext/Bitext-customer-support-llm-chatbot-training-dataset
+- **Quantization:** Weights converted to FP16 (half precision) for faster inference
+- **Training Device:** CUDA (GPU)
+---
+# Dataset Information
+**Dataset Structure:**
+
+    DatasetDict({
+        train: Dataset({
+            features: ['input_text', 'target_text'],
+            num_rows: 24184
+        })
+        validation: Dataset({
+            features: ['input_text', 'target_text'],
+            num_rows: 2688
+        })
+    })
+
+**Available Splits:**
+- **Train:** 24,184 examples
+- **Validation:** 2,688 examples
+**Feature Representation:**
+- **input_text:** Customer issue text (e.g., "Customer: How do I cancel my order?")
+- **target_text:** Resolution text (e.g., "Log into the portal and cancel it there.")
+---
+# Training Details
+**Training Process:**
+- Fine-tuned for 3 epochs
+- Loss decreased steadily from epoch to epoch
+**Hyperparameters:**
+- Epochs: 3
+- Learning Rate: 2e-5
+- Batch Size: 8
+- Weight Decay: 0.01
+- Mixed Precision: FP16
+**Performance Metrics:**
+- Final Training Loss: ~0.0140
+- Final Validation Loss: ~0.0121
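
To make the training schedule concrete, the sketch below works out how many optimizer steps the hyperparameters above imply. It uses only the numbers stated in this card; it assumes a single GPU, no gradient accumulation, and that the last partial batch is kept, none of which the card confirms. The function name `steps_per_epoch` is illustrative.

```python
import math

# Values taken from the card above.
TRAIN_EXAMPLES = 24_184
VAL_EXAMPLES = 2_688
BATCH_SIZE = 8
EPOCHS = 3


def steps_per_epoch(num_examples, batch_size):
    """Number of batches per epoch (assumes the last partial batch is kept)."""
    return math.ceil(num_examples / batch_size)


# Assumption: one device and no gradient accumulation, so one batch = one step.
train_steps = steps_per_epoch(TRAIN_EXAMPLES, BATCH_SIZE)
total_steps = train_steps * EPOCHS
val_fraction = VAL_EXAMPLES / (TRAIN_EXAMPLES + VAL_EXAMPLES)

print(f"Steps per epoch: {train_steps}")
print(f"Total steps over {EPOCHS} epochs: {total_steps}")
print(f"Validation share of all examples: {val_fraction:.1%}")
```

Under those assumptions the run is 3,023 steps per epoch and 9,069 steps in total, with roughly a 90/10 train/validation split.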
+---
+# Inference Example
+```python
+import torch
+from transformers import BartForConditionalGeneration, BartTokenizer
+
+
+def load_model(model_path, device):
+    """Load the tokenizer and model, using FP16 weights only on GPU."""
+    tokenizer = BartTokenizer.from_pretrained(model_path)
+    model = BartForConditionalGeneration.from_pretrained(model_path)
+    if device.type == "cuda":
+        model = model.half()  # FP16; half precision is poorly supported on CPU
+    model.to(device)
+    model.eval()
+    return model, tokenizer
+
+
+def generate_resolution(issue, model, tokenizer, device="cuda"):
+    """Generate a resolution for a single customer issue."""
+    input_text = f"Customer: {issue}"  # same prefix as the training inputs
+    inputs = tokenizer(
+        input_text,
+        max_length=512,
+        truncation=True,  # a single input needs no padding
+        return_tensors="pt",
+    ).to(device)
+    outputs = model.generate(
+        inputs["input_ids"],
+        attention_mask=inputs["attention_mask"],
+        max_length=128,
+        num_beams=4,
+        early_stopping=True,
+    )
+    return tokenizer.decode(outputs[0], skip_special_tokens=True)
+
+
+# Example usage
+if __name__ == "__main__":
+    model_path = "your-username/bart-resolution-summarizer-fp16"  # Replace with your HF repo
+    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+    model, tokenizer = load_model(model_path, device)
+    issue = "How do I cancel my order?"
+    resolution = generate_resolution(issue, model, tokenizer, device)
+    print(f"Issue: {issue}")
+    print(f"Resolution: {resolution}")
+
+# Expected output:
+# Issue: How do I cancel my order?
+# Resolution: Log into the portal and cancel it there.
+```
+# Quantization & Optimization
+- **Quantization:** Weights were converted to FP16 after training with PyTorch’s `.half()`, giving faster inference and halving the model size (~558MB to ~279MB).
+- **Optimization:** Training used mixed precision (FP16) on CUDA; the FP16 conversion above was then applied for deployment.
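
The size reduction quoted above follows from bytes per parameter: FP32 stores each weight in 4 bytes and FP16 in 2. The sketch below reproduces the arithmetic from the card's own ~558MB figure. It assumes 1 MB = 10^6 bytes and ignores file metadata and any non-float tensors, so the parameter count is only an estimate; the helper names are illustrative.

```python
BYTES_PER_PARAM_FP32 = 4
BYTES_PER_PARAM_FP16 = 2
FP32_SIZE_MB = 558  # approximate FP32 checkpoint size quoted in the card


def estimate_param_count(size_mb, bytes_per_param):
    """Rough parameter count implied by a checkpoint size (1 MB = 10**6 bytes)."""
    return size_mb * 1_000_000 / bytes_per_param


def estimate_size_mb(param_count, bytes_per_param):
    """Rough checkpoint size in MB for a given parameter count and precision."""
    return param_count * bytes_per_param / 1_000_000


params = estimate_param_count(FP32_SIZE_MB, BYTES_PER_PARAM_FP32)
fp16_size_mb = estimate_size_mb(params, BYTES_PER_PARAM_FP16)

print(f"Estimated parameters: {params / 1e6:.1f}M")
print(f"Estimated FP16 size: {fp16_size_mb:.0f}MB")
```

Halving the bytes per weight halves the file, which matches the ~279MB figure in the card.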
+# Usage
+- **Input:** Text representing a customer support issue (e.g., "Customer: My payment isn’t going through, help!")
+- **Output:** Text providing an actionable resolution (e.g., "Check your card details and try again.")
+# Limitations
+- The model may struggle with issues whose resolutions are poorly represented in the training data (e.g., time-related queries like "When can I call support?").
+- Resolution extraction relied on heuristics, which may miss nuanced answers in verbose responses.
+# Future Improvements
+- Refine resolution extraction with more advanced NLP techniques or manual curation.
+- Fine-tune on additional customer support datasets for broader coverage.
+- Evaluate with formal metrics (e.g., ROUGE) for quantitative performance.
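
As an illustration of the last point, here is a minimal sketch of a ROUGE-1 style F1 score (unigram overlap between a generated resolution and its reference). It is a simplified stand-in, not the official ROUGE implementation: it lowercases and splits on whitespace only, with no stemming or punctuation handling, and the function name `rouge1_f1` is illustrative. A real evaluation would more likely use an established ROUGE package.

```python
from collections import Counter


def rouge1_f1(prediction, reference):
    """Simplified ROUGE-1 F1: clipped unigram overlap on whitespace tokens."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Counter intersection clips each token's count to the smaller of the two.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "Log into the portal and cancel it there."
    prediction = "Log into the portal"
    print(f"ROUGE-1 F1: {rouge1_f1(prediction, reference):.3f}")
```

Averaging this score over the 2,688 validation examples would give a single number to track alongside the validation loss.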