---
base_model: Qwen/Qwen3.5-2B
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3_5
license: apache-2.0
language:
- hi
- en
- ta
- te
- kn
- bn
- mr
---
The model was finetuned on ~128,000 curated transcripts across different domains and language preferences.
- Expanded Training: Now optimized for CX Support, Healthcare, Loan Collection, Insurance, Ecommerce, and Concierge services.
- Feature Improvement: Significantly enhanced relative date-time extraction for more precise data processing.
Training Overview

You can plug it into your calling or voice AI stack to automatically extract:
- Enum-based classifications (e.g., call outcome, intent, disposition)
- Conversation summaries
- Action items / follow-ups
- Relative date-time artifacts

It is built to handle noisy real-world Hindi, English, and other Indic-language transcripts. Test out our even smaller SLM.
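Relative date-time extraction can be sanity-checked downstream by resolving a phrase against the call's start time. A minimal stdlib sketch, where the phrase-to-offset table and the call time are purely illustrative (the model does this resolution itself):

```python
from datetime import datetime, timedelta

# Hypothetical helper: resolve a relative phrase against the call's start time,
# producing the ISO 8601 format (YYYY-MM-DDTHH:MM:SS) the model is trained to emit.
def resolve_relative(phrase: str, call_time: datetime) -> str:
    offsets = {                        # illustrative phrase -> offset table
        "kal": timedelta(days=1),      # Hindi: "tomorrow"
        "tomorrow": timedelta(days=1),
        "in two hours": timedelta(hours=2),
    }
    resolved = call_time + offsets[phrase]
    return resolved.strftime("%Y-%m-%dT%H:%M:%S")

call_time = datetime(2025, 3, 10, 14, 30, 0)    # call started at 2:30 PM local time
print(resolve_relative("tomorrow", call_time))  # 2025-03-11T14:30:00
```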
Finetuning Parameters:

```python
rank = 64  # kept small to avoid changing the model's inherent intelligence while ensuring structured extraction is followed
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = train_dataset,
    eval_dataset = test_dataset,
    args = SFTConfig(
        dataset_text_field = "prompt",
        max_seq_length = max_seq_length,
        per_device_train_batch_size = 5,
        gradient_accumulation_steps = 5,
        warmup_steps = 10,
        num_train_epochs = 2,
        learning_rate = 2e-4,
        lr_scheduler_type = "linear",
        optim = "adamw_8bit",
        weight_decay = 0.01,  # Unsloth default (was 0.001)
        seed = SEED,
        logging_steps = 50,
        report_to = "wandb",
        eval_strategy = "steps",
        eval_steps = 5000,
        save_strategy = "steps",
        save_steps = 5000,
        load_best_model_at_end = True,
        metric_for_best_model = "eval_loss",
        output_dir = "outputs_qwen35_2b",
        dataset_num_proc = 8,
        fp16 = not torch.cuda.is_bf16_supported(),
        bf16 = torch.cuda.is_bf16_supported(),
    ),
)
```
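With these settings the effective batch size is `per_device_train_batch_size × gradient_accumulation_steps = 25`, so on the ~128,000-transcript dataset (assuming a single device and the full set in `train_dataset`) two epochs come to roughly 10,240 optimizer steps. A back-of-envelope check:

```python
# Step-count arithmetic for the SFT config above
# (assumes one GPU and all ~128k examples in the training split).
examples = 128_000
per_device_batch = 5
grad_accum = 5
epochs = 2

effective_batch = per_device_batch * grad_accum       # 25
steps_per_epoch = examples // effective_batch         # 5120
total_steps = steps_per_epoch * epochs                # 10240
print(effective_batch, steps_per_epoch, total_steps)  # 25 5120 10240
```

This is why `eval_steps = 5000` and `save_steps = 5000` yield only a couple of checkpoints over the full run.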
Provide the schema below for best output:

```python
response_schema = {
    "type": "object",
    "properties": {
        "key_points": {
            "type": "array",
            "items": {"type": "string"},
            "nullable": True,
        },
        "action_items": {
            "type": "array",
            "items": {"type": "string"},
            "nullable": True,
        },
        "summary": {"type": "string"},
        "classification": classification_schema,
        "callback_requested": {
            "type": "boolean",
            "nullable": False,
            "description": "True if the user requested a callback or mentions they are currently busy; otherwise false",
        },
        "callback_requested_time": {
            "type": "string",
            "nullable": True,
            "description": "ISO 8601 datetime string (YYYY-MM-DDTHH:MM:SS) in the call's timezone, if the user requested a callback",
        },
    },
    "required": ["summary", "classification"],
}
```
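Downstream, the model's JSON output can be parsed and lightly checked against the schema's required fields before use. A minimal stdlib sketch, where the sample response is illustrative rather than real model output:

```python
import json
from datetime import datetime

# Illustrative raw model output; real responses come from your inference stack.
raw = '''{
  "summary": "User asked to be called back tomorrow afternoon.",
  "classification": {"call_outcome": "callback"},
  "callback_requested": true,
  "callback_requested_time": "2025-03-11T15:00:00"
}'''

result = json.loads(raw)

# Enforce the schema's "required" fields.
for field in ("summary", "classification"):
    assert field in result, f"missing required field: {field}"

# callback_requested_time, when present, must parse as YYYY-MM-DDTHH:MM:SS.
if result.get("callback_requested") and result.get("callback_requested_time"):
    when = datetime.strptime(result["callback_requested_time"], "%Y-%m-%dT%H:%M:%S")
    print(when.isoformat())  # 2025-03-11T15:00:00
```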
Uploaded finetuned model
- Developed by: RinggAI
- License: apache-2.0
- Finetuned from model : Qwen/Qwen3.5-2B
This qwen3_5 model was trained 2x faster with Unsloth and Hugging Face's TRL library.


