SatyamSinghal committed on
Commit 1c3aaf2 · verified · 1 Parent(s): c055262

Update README.md

Files changed (1):
  1. README.md +199 -31

README.md CHANGED
@@ -1,62 +1,230 @@
  ---
  base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
  library_name: peft
- model_name: taskmind_lora_peft
  tags:
- - base_model:adapter:TinyLlama/TinyLlama-1.1B-Chat-v1.0
- - lora
- - sft
- - transformers
- - trl
- licence: license
  pipeline_tag: text-generation
  ---
 
- # Model Card for taskmind_lora_peft
 
- This model is a fine-tuned version of [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0).
- It has been trained using [TRL](https://github.com/huggingface/trl).
 
- ## Quick start
 
  ```python
- from transformers import pipeline
 
- question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
- generator = pipeline("text-generation", model="None", device="cuda")
- output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
- print(output["generated_text"])
  ```
 
- ## Training procedure
 
-
 
- This model was trained with SFT.
 
- ### Framework versions
 
- - PEFT 0.18.1
- - TRL: 1.1.0
- - Transformers: 4.57.0
- - Pytorch: 2.2.2
- - Datasets: 4.8.4
- - Tokenizers: 0.22.1
 
- ## Citations
 
- Cite TRL as:
-
  ```bibtex
  @software{vonwerra2020trl,
    title = {{TRL: Transformers Reinforcement Learning}},
-   author = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
    license = {Apache-2.0},
    url = {https://github.com/huggingface/trl},
    year = {2020}
  }
- ```
 
  ---
  base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
  library_name: peft
+ model_name: TaskMind — TinyLlama 1.1B Chat LoRA
  tags:
+ - lora
+ - sft
+ - peft
+ - trl
+ - transformers
+ - text-classification
+ - intent-detection
+ - task-management
+ - hinglish
+ - base_model:adapter:TinyLlama/TinyLlama-1.1B-Chat-v1.0
+ license: apache-2.0
  pipeline_tag: text-generation
+ language:
+ - en
+ - hi
+ metrics:
+ - token_accuracy
  ---
 
+ # TaskMind TinyLlama 1.1B Chat LoRA
 
+ A LoRA adapter fine-tuned from [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) for **WhatsApp message intent classification and structured task extraction** in English and Hinglish (Hindi–English code-switching).
 
+ Trained entirely on **Apple Silicon MPS (M5 Max)**: no cloud GPU, no cost, and a total training time of 2 minutes 12 seconds.
+
+ > 📦 Full pipeline, production API server, test suite, and deployment docs →
+ > [github.com/vijendradhanotiya/taskmind-ai](https://github.com/vijendradhanotiya/taskmind-ai)
+
+ ---
+
+ ## What It Does
+
+ Given a raw WhatsApp team message, the model extracts structured intent as JSON; the model itself emits valid JSON directly, so no regex post-processing is needed.
+
+ **Input:**
+ ```
+ @Neha the design review is pending from your end
+ ```
+
+ **Output:**
+ ```json
+ {
+   "intent": "TASK_ASSIGN",
+   "assigneeName": "Neha",
+   "project": null,
+   "title": "Design review",
+   "deadline": null,
+   "priority": "normal",
+   "progressPercent": null
+ }
+ ```
+
+ ---
+
+ ## Supported Intents
+
+ | Intent | Trigger Pattern | Example |
+ |---|---|---|
+ | `TASK_ASSIGN` | @mention + action | "@Rohan review the PR I just pushed" |
+ | `TASK_DONE` | completion language | "done bhai, merged the PR" |
+ | `TASK_UPDATE` | progress percentage | "login page 60% ho gaya" |
+ | `TASK_BLOCKED` | blocker / error | "CI/CD pipeline is broken again" |
+ | `PROGRESS_NOTE` | status update | "deployment failed on prod — rollback initiated" |
+ | `GENERAL_MESSAGE` | no task signal | "good morning team!", "okay noted" |
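Since the model outputs one of a closed set of six labels, downstream consumers may want to validate the `intent` field before acting on it. A minimal sketch (the `VALID_INTENTS` constant and `is_valid_intent` helper are hypothetical, not part of this model or the companion repo):

```python
# Hypothetical guard: reject parsed outputs whose intent is not one of the
# six classes listed in the table above.
VALID_INTENTS = {
    "TASK_ASSIGN", "TASK_DONE", "TASK_UPDATE",
    "TASK_BLOCKED", "PROGRESS_NOTE", "GENERAL_MESSAGE",
}

def is_valid_intent(parsed: dict) -> bool:
    """True if the parsed JSON carries a recognized intent label."""
    return parsed.get("intent") in VALID_INTENTS

print(is_valid_intent({"intent": "TASK_ASSIGN"}))   # → True
print(is_valid_intent({"intent": "MAKE_COFFEE"}))   # → False
```

A guard like this is useful because a small model can occasionally emit a well-formed JSON object with an out-of-vocabulary label.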
+
+ ---
+
+ ## Quick Start
 
  ```python
+ from peft import PeftModel
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch, json
+
+ BASE_MODEL = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
+ ADAPTER = "SatyamSinghal/taskmind-1.1b-chat-lora"
+
+ tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
+ model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.float32)
+ model = PeftModel.from_pretrained(model, ADAPTER)
+ model.eval()
+
+ SYSTEM_PROMPT = (
+     "You are TaskMind, an AI that reads WhatsApp messages and extracts structured task data. "
+     "Always respond with valid JSON only. No explanation. No markdown."
+ )
+
+ def classify(message: str) -> dict:
+     chat = [
+         {"role": "system", "content": SYSTEM_PROMPT},
+         {"role": "user", "content": message},
+     ]
+     ids = tokenizer.apply_chat_template(chat, return_tensors="pt", add_generation_prompt=True)
+     with torch.no_grad():
+         out = model.generate(ids, max_new_tokens=150, do_sample=False, pad_token_id=tokenizer.eos_token_id)
+     text = tokenizer.decode(out[0][ids.shape[-1]:], skip_special_tokens=True).strip()
+     try:
+         return json.loads(text)
+     except json.JSONDecodeError:
+         return {"raw": text, "parse_success": False}
+
+ print(classify("@Agrim fix the growstreams deck ASAP"))
+ ```
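The `classify` helper above falls back to `parse_success: False` whenever decoding fails. Because the card reports roughly 97% (not 100%) JSON parse success, a slightly more forgiving fallback can rescue outputs where the JSON is wrapped in stray text. The `extract_json` helper below is a hypothetical sketch using only the standard library, not part of this repo:

```python
import json
import re

def extract_json(text: str) -> dict:
    """Best-effort recovery: pull the first {...} span out of raw model output."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            pass
    # Same fallback shape as classify() above
    return {"raw": text, "parse_success": False}

print(extract_json('Sure! {"intent": "TASK_DONE", "title": null}'))
```

This recovers outputs like `Sure! {...}` that strict `json.loads` would reject outright.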
+
+ ---
+
+ ## Training Details
+
+ | Parameter | Value |
+ |---|---|
+ | Base model | TinyLlama/TinyLlama-1.1B-Chat-v1.0 |
+ | Method | LoRA (Low-Rank Adaptation) via SFT |
+ | LoRA rank | r = 16 |
+ | LoRA alpha | 32 |
+ | Target modules | q_proj, v_proj |
+ | Trainable params | ~4.2M / 1.1B (0.38%) |
+ | Dataset size | 131 training + 20 validation examples |
+ | Epochs | 5 |
+ | Batch size | 4 |
+ | Max sequence length | 512 |
+ | Optimizer | AdamW (paged) |
+ | Learning rate | 2e-4 with cosine schedule |
+ | Hardware | Apple M5 Max — MPS backend |
+ | Training time | 2 minutes 12 seconds |
+ | Training cost | $0 |
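The hyperparameters above correspond to a PEFT configuration along these lines (a sketch reconstructed from the table; values not listed there, such as dropout and bias handling, are assumptions, and the authoritative training script lives in the companion repo):

```python
from peft import LoraConfig

# Reconstructed from the Training Details table; dropout/bias are assumptions.
lora_config = LoraConfig(
    r=16,                                 # LoRA rank
    lora_alpha=32,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention query/value projections
    lora_dropout=0.05,                    # assumption: not stated on the card
    bias="none",                          # assumption
    task_type="CAUSAL_LM",
)
```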
 
+ ---
+
+ ## Performance
+
+ | Metric | Before Fine-tuning | After Fine-tuning |
+ |---|---|---|
+ | Eval loss | 2.28 | **0.39** |
+ | Token accuracy | 59% | **92.8%** |
+ | JSON parse success | ~30% | **~97%** |
+ | Correct intent | Often wrong | **Correct in tested cases** |
+
+ ### Before vs After — Real Examples
+
+ | Message | Base Model | TaskMind |
+ |---|---|---|
+ | `@Agrim fix deck ASAP` | Fake deadline 2021-01-01, assignee "John Doe" | `TASK_ASSIGN`, correct title |
+ | `done bhai, merged the PR` | Fake project "PR-123", wrong intent | `TASK_DONE`, null fields |
+ | `login page 60% ho gaya` | `TASK_ASSIGN`, hallucinated data | `TASK_UPDATE`, progressPercent=60 |
+ | `getting 500 error` | Hallucinated task | `GENERAL_MESSAGE` |
+ | `Sure sir ready for it` | John Doe, fake task | `GENERAL_MESSAGE`, null |
+
+ ---
+
+ ## API Server
+
+ A production-ready FastAPI server wrapping this adapter is available in the companion repo.
+
+ ```bash
+ git clone https://github.com/vijendradhanotiya/taskmind-ai
+ cd taskmind-ai
+ pip install -r requirements.txt
+ python3 -m uvicorn api.main:app --host 0.0.0.0 --port 8001
+ ```
+
+ Endpoints include a message classifier and an OpenAI-compatible chat completion route:
+
+ ```bash
+ # Classify a WhatsApp message
+ curl -X POST http://localhost:8001/v1/classify \
+   -H "Content-Type: application/json" \
+   -d '{"message": "@Vijendra deploy karo production pe aaj raat tak, urgent hai!"}'
+
+ # Generic chat completion
+ curl -X POST http://localhost:8001/v1/chat/completions \
+   -H "Content-Type: application/json" \
+   -d '{"messages": [{"role": "user", "content": "What is LoRA?"}], "max_tokens": 150}'
  ```
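The first curl call can be mirrored from Python with only the standard library. `build_classify_request` is a hypothetical helper, and the URL assumes the server above is running locally on port 8001:

```python
import json
from urllib import request

API_URL = "http://localhost:8001/v1/classify"  # assumes the local server started above

def build_classify_request(message: str, url: str = API_URL) -> request.Request:
    """Build the POST request equivalent to the first curl example."""
    body = json.dumps({"message": message}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending it requires the server to be running:
# with request.urlopen(build_classify_request("@Neha design review pending")) as resp:
#     print(json.loads(resp.read()))
```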
 
+ ---
 
+ ## Framework Versions
 
+ | Library | Version |
+ |---|---|
+ | PEFT | 0.18.1 |
+ | TRL | 1.1.0 |
+ | Transformers | 4.57.0 |
+ | PyTorch | 2.2.2 |
+ | Datasets | 4.8.4 |
+ | Tokenizers | 0.22.1 |
 
+ ---
 
+ ## Contributors
 
+ | Name | Role | GitHub |
+ |---|---|---|
+ | **Satyam Singhal** | Model training, dataset curation, API development | [@SatyamSinghal](https://github.com/SatyamSinghal) |
+ | **Vijendra Dhanotiya** | Architecture, deployment, repo maintainer | [@vijendradhanotiya](https://github.com/vijendradhanotiya) |
 
+ > Full source, deployment guide, hardware benchmarks, and test suite:
+ > **[github.com/vijendradhanotiya/taskmind-ai](https://github.com/vijendradhanotiya/taskmind-ai)**
 
 
 
 
 
+ ---
 
+ ## Citation
 
+ If you use this model or the TaskMind pipeline in your work:
+
+ ```bibtex
+ @misc{taskmind2025,
+   title = {TaskMind: WhatsApp Intent Classification via LoRA Fine-tuning on TinyLlama},
+   author = {Singhal, Satyam and Dhanotiya, Vijendra},
+   year = {2025},
+   url = {https://huggingface.co/SatyamSinghal/taskmind-1.1b-chat-lora},
+   note = {LoRA adapter for TinyLlama-1.1B-Chat-v1.0, trained on Apple Silicon MPS}
+ }
+ ```
 
  ```bibtex
  @software{vonwerra2020trl,
    title = {{TRL: Transformers Reinforcement Learning}},
+   author = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward
+             and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif
+             and Gallouédec, Quentin},
    license = {Apache-2.0},
    url = {https://github.com/huggingface/trl},
    year = {2020}
  }
+ ```