TaskMind — TinyLlama 1.1B Chat LoRA

A LoRA adapter for TinyLlama/TinyLlama-1.1B-Chat-v1.0, fine-tuned for WhatsApp message intent classification and structured task extraction in English and Hinglish (Hindi–English code-switching).

Trained entirely on Apple Silicon (M5 Max, MPS backend): no cloud GPU, no compute cost, 2 minutes 12 seconds of training.

📦 Full pipeline, production API server, test suite, and deployment docs → github.com/vijendradhanotiya/taskmind-ai

What It Does

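Given a raw WhatsApp team message, the model returns the extracted intent and task fields as a JSON object. The JSON is generated directly by the model, so no regex post-processing is needed when the output parses (about 97% of the time, per the Performance table below).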

Input:

@Neha the design review is pending from your end

Output:

{
  "intent": "TASK_ASSIGN",
  "assigneeName": "Neha",
  "project": null,
  "title": "Design review",
  "deadline": null,
  "priority": "normal",
  "progressPercent": null
}

Supported Intents

Intent	Trigger Pattern	Example
`TASK_ASSIGN`	@mention + action	"@Rohan review the PR I just pushed"
`TASK_DONE`	completion language	"done bhai, merged the PR"
`TASK_UPDATE`	progress percentage	"login page 60% ho gaya"
`TASK_BLOCKED`	blocker / error	"CI/CD pipeline is broken again"
`PROGRESS_NOTE`	status update	"deployment failed on prod — rollback initiated"
`GENERAL_MESSAGE`	no task signal	"good morning team!", "okay noted"
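
The intent labels and output fields above lend themselves to a simple downstream check. The following is a minimal sketch, not part of the TaskMind codebase: `validate_prediction` is a hypothetical helper, and the field set is inferred from the example output in this card (an assumption, since no formal schema is published here).

```python
# Hypothetical validator; the schema is inferred from the example output in
# this card, not from a published specification.
SUPPORTED_INTENTS = {
    "TASK_ASSIGN", "TASK_DONE", "TASK_UPDATE",
    "TASK_BLOCKED", "PROGRESS_NOTE", "GENERAL_MESSAGE",
}
EXPECTED_FIELDS = {
    "intent", "assigneeName", "project", "title",
    "deadline", "priority", "progressPercent",
}


def validate_prediction(pred: dict) -> list[str]:
    """Return a list of problems found in a parsed model output (empty if none)."""
    problems = []
    missing = EXPECTED_FIELDS - set(pred)
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if pred.get("intent") not in SUPPORTED_INTENTS:
        problems.append(f"unknown intent: {pred.get('intent')!r}")
    pct = pred.get("progressPercent")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if pct is not None and (
        isinstance(pct, bool)
        or not isinstance(pct, (int, float))
        or not 0 <= pct <= 100
    ):
        problems.append(f"progressPercent out of range: {pct!r}")
    return problems


example = {
    "intent": "TASK_ASSIGN", "assigneeName": "Neha", "project": None,
    "title": "Design review", "deadline": None, "priority": "normal",
    "progressPercent": None,
}
print(validate_prediction(example))
```

Returning a list of problems rather than raising lets a caller log every issue in one pass and decide separately whether to drop or repair the prediction.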

Quick Start

import json

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_MODEL = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
ADAPTER    = "SatyamSinghal/taskmind-1.1b-chat-lora"

# Load the base model first, then attach the LoRA adapter weights on top of it.
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model     = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.float32)
model     = PeftModel.from_pretrained(model, ADAPTER)
model.eval()  # inference mode: disables dropout

SYSTEM_PROMPT = (
    "You are TaskMind, an AI that reads WhatsApp messages and extracts structured task data. "
    "Always respond with valid JSON only. No explanation. No markdown."
)

def classify(message: str) -> dict:
    """Run one message through the model and return the parsed JSON prediction."""
    chat = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user",   "content": message},
    ]
    # Render the chat with the tokenizer's template and append the assistant prefix.
    ids = tokenizer.apply_chat_template(chat, return_tensors="pt", add_generation_prompt=True)
    with torch.no_grad():
        # Greedy decoding (do_sample=False) keeps the output deterministic.
        out = model.generate(ids, max_new_tokens=150, do_sample=False, pad_token_id=tokenizer.eos_token_id)
    # Decode only the newly generated tokens, skipping the prompt.
    text = tokenizer.decode(out[0][ids.shape[-1]:], skip_special_tokens=True).strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Return the raw text so the caller can inspect or repair it.
        return {"raw": text, "parse_success": False}

print(classify("@Agrim fix the growstreams deck ASAP"))
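
When decoding fails, `classify` returns the raw text with `parse_success: False`. The following is a minimal sketch of one way a caller might try to recover from that case, assuming the failure is extra text wrapped around an otherwise valid JSON object (an assumption; the card does not describe what the failed outputs look like). `extract_first_json_object` is a hypothetical helper that uses only the standard library.

```python
import json


def extract_first_json_object(text: str):
    """Return the first JSON object embedded in `text`, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores any trailing text after it.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = text.find("{", start + 1)
    return None


noisy = 'Sure! {"intent": "TASK_DONE", "title": null} Hope that helps.'
print(extract_first_json_object(noisy))
```

A caller could apply this to the `raw` field of a failed result and fall back to discarding the message if it returns `None`.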

Training Details

Parameter	Value
Base model	TinyLlama/TinyLlama-1.1B-Chat-v1.0
Method	LoRA (Low-Rank Adaptation) via SFT
LoRA rank	r = 16
LoRA alpha	32
Target modules	q_proj, v_proj
Trainable params	~4.2M / 1.1B (0.38%)
Dataset size	131 training + 20 validation examples
Epochs	5
Batch size	4
Max sequence length	512
Optimizer	AdamW (paged)
Learning rate	2e-4 with cosine schedule
Hardware	Apple M5 Max — MPS backend
Training time	2 minutes 12 seconds
Training cost	$0
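
The table's figures can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below rests on assumptions that are not stated in this card: the base-model dimensions (hidden size 2048, 22 decoder layers, 4 key/value heads of dimension 64) are taken from TinyLlama's public configuration, and the step count assumes no gradient accumulation. Both function names are hypothetical.

```python
import math


def lora_param_count(rank: int, shapes: list[tuple[int, int]], num_layers: int) -> int:
    """LoRA adds two matrices per adapted projection: A (rank x d_in) and B (d_out x rank)."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes)
    return per_layer * num_layers


def optimizer_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Optimizer steps when the last partial batch of each epoch is kept."""
    return math.ceil(num_examples / batch_size) * epochs


# ASSUMED base-model dimensions (TinyLlama public config, not this card).
HIDDEN, LAYERS, KV_DIM = 2048, 22, 4 * 64
shapes = [(HIDDEN, HIDDEN), (HIDDEN, KV_DIM)]  # q_proj, v_proj as (d_in, d_out)

print(lora_param_count(16, shapes, LAYERS))  # 2252800
print(optimizer_steps(131, 4, 5))            # 165
```

Under these assumed shapes the adapter comes to roughly 2.25M parameters, lower than the ~4.2M reported in the table, so the trained configuration may differ from the one assumed here. The reported figure remains the reference for the actual run.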

Performance

Metric	Before Fine-tuning	After Fine-tuning
Eval loss	2.28	0.39
Token accuracy	59%	92.8%
JSON parse success	~30%	~97%
Correct intent	Often wrong	Correct in tested cases
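
Metrics such as JSON parse success and intent correctness can be computed directly from raw model outputs. The following is a minimal sketch of such a computation, not the authors' evaluation script. The `evaluate` function and the toy data are hypothetical, and eval loss and token accuracy (which come from the training loop) are not covered.

```python
import json


def evaluate(outputs: list[str], gold_intents: list[str]) -> dict:
    """Compute JSON parse rate and intent accuracy over raw model outputs."""
    parsed_ok = 0
    intent_ok = 0
    for raw, gold in zip(outputs, gold_intents):
        try:
            pred = json.loads(raw)
        except json.JSONDecodeError:
            continue  # an unparseable output counts against both metrics
        parsed_ok += 1
        if isinstance(pred, dict) and pred.get("intent") == gold:
            intent_ok += 1
    n = len(gold_intents)
    return {"json_parse_rate": parsed_ok / n, "intent_accuracy": intent_ok / n}


# Toy data for illustration only; these are not the card's evaluation examples.
outputs = [
    '{"intent": "TASK_DONE"}',
    '{"intent": "TASK_ASSIGN"}',
    "not json",
    '{"intent": "TASK_UPDATE"}',
]
gold = ["TASK_DONE", "TASK_BLOCKED", "TASK_DONE", "TASK_UPDATE"]
print(evaluate(outputs, gold))
```

With only 20 validation examples, each misclassified message moves either metric by 5 percentage points, which is worth keeping in mind when comparing runs.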

Before vs After — Real Examples

Message	Base Model	TaskMind
`@Agrim fix deck ASAP`	Fake deadline 2021-01-01, assignee "John Doe"	`TASK_ASSIGN`, correct title
`done bhai, merged the PR`	Fake project "PR-123", wrong intent	`TASK_DONE`, null fields
`login page 60% ho gaya`	`TASK_ASSIGN`, hallucinated data	`TASK_UPDATE`, progressPercent=60
`getting 500 error`	Hallucinated task	`GENERAL_MESSAGE`
`Sure sir ready for it`	John Doe, fake task	`GENERAL_MESSAGE`, null

API Server

A production-ready FastAPI server wrapping this adapter is available in the companion repo.

git clone https://github.com/vijendradhanotiya/taskmind-ai
cd taskmind-ai
pip install -r requirements.txt
python3 -m uvicorn api.main:app --host 0.0.0.0 --port 8001

The server exposes a classification endpoint and an OpenAI-style chat completions endpoint:

# Classify a WhatsApp message
curl -X POST http://localhost:8001/v1/classify \
  -H "Content-Type: application/json" \
  -d '{"message": "@Vijendra deploy karo production pe aaj raat tak, urgent hai!"}'

# Generic chat completion
curl -X POST http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What is LoRA?"}], "max_tokens": 150}'
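
The classification call above can also be made from Python. The following is a minimal sketch using only the standard library. The endpoint path and request body mirror the curl example, while the helper names and the assumption that the server replies with a JSON body are mine, since the response format is not documented here.

```python
import json
import urllib.request


def build_classify_request(
    message: str, base_url: str = "http://localhost:8001"
) -> urllib.request.Request:
    """Build the same POST request as the curl example for /v1/classify."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/classify",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify_remote(message: str, base_url: str = "http://localhost:8001") -> dict:
    """Send the request and decode the reply; ASSUMES the server returns JSON."""
    request = build_classify_request(message, base_url)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not need a running server; sending it does.
req = build_classify_request("@Rohan review the PR I just pushed")
print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without starting the API server.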

Framework Versions

Library	Version
PEFT	0.18.1
TRL	1.1.0
Transformers	4.57.0
PyTorch	2.2.2
Datasets	4.8.4
Tokenizers	0.22.1

Contributors

Name	Role	GitHub
Satyam Singhal	Model training, dataset curation, API development	@SatyamSinghal
Vijendra Dhanotiya	Architecture, deployment, repo maintainer	@vijendradhanotiya

Full source, deployment guide, hardware benchmarks, and test suite: github.com/vijendradhanotiya/taskmind-ai

Citation

If you use this model or the TaskMind pipeline in your work:

@misc{taskmind2025,
  title   = {TaskMind: WhatsApp Intent Classification via LoRA Fine-tuning on TinyLlama},
  author  = {Singhal, Satyam and Dhanotiya, Vijendra},
  year    = {2025},
  url     = {https://huggingface.co/SatyamSinghal/taskmind-1.1b-chat-lora},
  note    = {LoRA adapter for TinyLlama-1.1B-Chat-v1.0, trained on Apple Silicon MPS}
}

@software{vonwerra2020trl,
  title   = {{TRL: Transformers Reinforcement Learning}},
  author  = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward
             and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif
             and Gallouedec, Quentin},
  license = {Apache-2.0},
  url     = {https://github.com/huggingface/trl},
  year    = {2020}
}
