Commit d7abd5f by YashikaNagpal (verified, parent c5301b5): Create README.md
# Model Card: T5 Email Response Generator

# Model Details
- **Model Name:** T5 Email Response Generator
- **Model Version:** 1.0
- **Base Model:** t5-base (Hugging Face Transformers)
- **Task:** Text generation for email response automation

# Model Description
This model is a fine-tuned version of the T5 (Text-to-Text Transfer Transformer) t5-base model, designed to generate concise and contextually appropriate email responses. It was trained on a custom dataset (email.csv) containing input prompts and corresponding email responses. The model is available in both FP32 and FP16 precision, with the latter optimized for reduced memory usage on GPUs.

# Intended Use
- **Primary Use Case:** Automating email response generation for common queries (e.g., scheduling, confirmations, updates).
- **Target Users:** Individuals or organizations looking to streamline email communication.
- **Out of Scope:** Generating long-form content, or handling highly sensitive or complex email threads that require human judgment.

# Model Architecture
- **Base Model:** T5 (t5-base)
- **Parameters:** ~220M
- **Layers:** 12 encoder layers, 12 decoder layers
- **Hidden Size:** 768
- **Precision:** Available in FP32 (full precision) and FP16 (mixed precision)
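As a rough back-of-the-envelope check (using the ~220M parameter count above and the standard 4-byte/2-byte float widths; this covers weights only, not activations or optimizer state), the two precisions need roughly:

```python
# Approximate weight memory for ~220M parameters (weights only;
# excludes activations, optimizer state, and framework overhead).
params = 220_000_000

fp32_gb = params * 4 / 1e9  # 4 bytes per float32 weight
fp16_gb = params * 2 / 1e9  # 2 bytes per float16 weight

print(f"FP32: ~{fp32_gb:.2f} GB, FP16: ~{fp16_gb:.2f} GB")
# FP32: ~0.88 GB, FP16: ~0.44 GB
```

This is why the FP16 checkpoint is the practical choice on memory-constrained GPUs.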
# Training Details
- **Dataset**
  - **Source:** Custom dataset (email.csv)
  - **Format:** CSV with columns input (prompt) and output (response)

# Preprocessing
- Added the prefix "generate response: " to all inputs.
- Filtered out examples with None values, lengths > 100 characters, or "dataset" appearing in the input.
- **Split:** 90% training, 10% validation.
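The preprocessing steps above can be sketched with pandas (a sketch, not the original training script: the `input`/`output` column names come from the dataset description, the length filter is assumed to apply to the input column, and the 90/10 split is done with `DataFrame.sample`):

```python
import pandas as pd

def preprocess(df: pd.DataFrame, seed: int = 42):
    """Apply the filtering/prefixing described above, then split 90/10."""
    df = df.dropna(subset=["input", "output"])     # drop rows with None values
    df = df[df["input"].str.len() <= 100]          # drop overlong examples
    df = df[~df["input"].str.contains("dataset")]  # drop prompts mentioning "dataset"
    df = df.assign(input="generate response: " + df["input"])
    train = df.sample(frac=0.9, random_state=seed) # 90% training split
    val = df.drop(train.index)                     # remaining 10% validation
    return train, val

# Tiny illustrative frame (not the real email.csv)
demo = pd.DataFrame({
    "input": ["Can you send me the report?", "Tell me about the dataset", None],
    "output": ["Sure, sending it now.", "It is private.", "n/a"],
})
train, val = preprocess(demo)
```

On the demo frame, only the first row survives the filters and receives the "generate response: " prefix.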
# Training Procedure
- **Framework:** Hugging Face Transformers
- **Hardware:** GPU (e.g., NVIDIA with 12 GB memory)

# Training Arguments
- **Epochs:** 30
- **Batch Size:** 4 (effective 8 with gradient accumulation)
- **Learning Rate:** 3e-4
- **Warmup Steps:** 10
- **Weight Decay:** 0.01
- **Optimizer:** AdamW
- **Mixed Precision:** FP16 enabled
- **Evaluation:** Performed at the end of each epoch, using validation loss as the metric for the best model.
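The hyperparameters above map onto Hugging Face `Seq2SeqTrainingArguments` roughly as follows (a config sketch, assuming the standard `transformers` Trainer API, since the original training script is not in this repo; the `output_dir` path is hypothetical, and older `transformers` releases spell `eval_strategy` as `evaluation_strategy`):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./t5_email_finetuned",  # hypothetical output path
    num_train_epochs=30,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,      # effective batch size of 8
    learning_rate=3e-4,
    warmup_steps=10,
    weight_decay=0.01,                  # AdamW is the Trainer's default optimizer
    fp16=True,                          # mixed-precision training
    eval_strategy="epoch",              # evaluate at the end of each epoch
    save_strategy="epoch",
    load_best_model_at_end=True,        # best model = lowest validation loss
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
```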
# Tokenization
- **Tokenizer:** T5Tokenizer from t5-base
- **Max Length:** 128 tokens (input and output)
- **Padding:** Applied with max_length
- **Truncation:** Enabled for longer sequences
# Performance
- **Metrics:** Validation loss (best model selected based on lowest loss)

# Sample Outputs
- **Input:** "Can you send me the report?"
- **Output:** "I’ll send the report over this afternoon!"
- **Input:** "Write a follow-up email for our last discussion."
- **Output:** "I’ll send a follow-up for you shortly."
- **Limitations:** Performance depends on the quality and diversity of email.csv. The model may struggle with prompts outside the training distribution.
# Installation

```shell
pip install transformers datasets torch pandas accelerate -q
```

# Loading the Model

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the FP16 fine-tuned model and tokenizer
model = T5ForConditionalGeneration.from_pretrained("./t5_email_finetuned_fp16").to(device)
tokenizer = T5Tokenizer.from_pretrained("./t5_email_finetuned_fp16")

# Generate a response for a given prompt
def generate_response(prompt, max_length=128):
    input_text = f"generate response: {prompt}"
    inputs = tokenizer(input_text, max_length=128, truncation=True, padding="max_length", return_tensors="pt").to(device)
    outputs = model.generate(input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"], max_length=max_length, num_beams=4, early_stopping=True)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example
print(generate_response("Can you send me the report?"))
```