The model was finetuned on ~128,000 curated transcripts spanning multiple domains and language preferences.

  • Expanded Training: Now optimized for CX Support, Healthcare, Loan Collection, Insurance, E-commerce, and Concierge services.
  • Feature Improvement: Significantly enhanced relative date-time extraction for more precise data processing.
  • Structured Extraction: Plug it into your calling or voice AI stack to automatically extract:
      • Enum-based classifications (e.g., call outcome, intent, disposition)
      • Conversation summaries
      • Action items / follow-ups
      • Relative DateTime Artifacts

It’s built to handle noisy real-world transcripts in Hindi, English, and other Indic languages. Also try out our even smaller SLM.
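As a sketch of what the relative date-time extraction resolves: a phrase like "kal sham 5 baje" / "tomorrow at 5 PM" becomes an absolute ISO 8601 timestamp anchored at the call's start time (the call timestamp below is hypothetical):

```python
from datetime import datetime, timedelta

# Assumed call start time; the model anchors relative phrases to this moment.
call_start = datetime(2025, 1, 14, 11, 30)

# "tomorrow at 5 PM" -> next day, 17:00, in the call's timezone
resolved = (call_start + timedelta(days=1)).replace(hour=17, minute=0)
iso = resolved.strftime("%Y-%m-%dT%H:%M:%S")
print(iso)  # 2025-01-15T17:00:00
```

This matches the `callback_requested_time` format used in the response schema further down.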

Training Overview

Finetuning Parameters:

rank = 64  # kept small so the model's inherent intelligence is preserved, while still ensuring structured extraction is followed
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = train_dataset,
    eval_dataset  = test_dataset,
    args = SFTConfig(
        dataset_text_field = "prompt",
        max_seq_length = max_seq_length,
        per_device_train_batch_size = 5,   
        gradient_accumulation_steps = 5,   

        warmup_steps       = 10,           
        num_train_epochs   = 2,            
        learning_rate      = 2e-4,         
        lr_scheduler_type  = "linear",     

        optim        = "adamw_8bit",
        weight_decay = 0.01,               # Unsloth default (was 0.001)
        seed         = SEED,

        logging_steps  = 50,
        report_to      = "wandb",

        eval_strategy  = "steps",
        eval_steps     = 5000,
        save_strategy  = "steps",
        save_steps     = 5000,
        load_best_model_at_end  = True,    
        metric_for_best_model   = "eval_loss",

        output_dir     = "outputs_qwen35_2b",
        dataset_num_proc = 8,
        fp16= not torch.cuda.is_bf16_supported(),
        bf16=  torch.cuda.is_bf16_supported()
    ),
)
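For reference, the training schedule implied by the config above, assuming a single GPU and the ~128k curated transcripts mentioned earlier:

```python
# Schedule math for the SFT config above (single-GPU assumption).
per_device_train_batch_size = 5
gradient_accumulation_steps = 5
num_train_epochs = 2
dataset_size = 128_000  # approximate, from the curated-transcript count above

effective_batch = per_device_train_batch_size * gradient_accumulation_steps
steps_per_epoch = dataset_size // effective_batch
total_steps = steps_per_epoch * num_train_epochs
print(effective_batch, steps_per_epoch, total_steps)  # 25 5120 10240
```

With `eval_steps = 5000` and `save_steps = 5000`, this works out to roughly one evaluation/checkpoint pass per epoch over the ~10k total steps.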

For best output, provide the response schema below:

response_schema = {
        "type": "object",
        "properties": {
            "key_points": {
                "type": "array",
                "items": {"type": "string"},
                "nullable": True,
            },
            "action_items": {
                "type": "array",
                "items": {"type": "string"},
                "nullable": True,
            },
            "summary": {"type": "string"},
            "classification": classification_schema,
            "callback_requested": {
                "type": "boolean",
                "nullable": False,
                "description": "If the user requested a callback or mentiones currently he is busy then value is true otherwise false",
            },
            "callback_requested_time": {
                "type": "string",
                "nullable": True,
                "description": "ISO 8601 datetime string (YYYY-MM-DDTHH:MM:SS) in the call's timezone, if user requested a callback",
            },
        },
        "required": ["summary", "classification"],
    }
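A minimal sketch of checking a parsed model response against the schema's `required` fields. The sample response and the classification fields are hypothetical; a real pipeline might use a full validator such as the third-party `jsonschema` package (note the schema uses OpenAPI-style `nullable`, which plain JSON Schema validators ignore):

```python
import json

# Required fields, taken from the response schema above.
REQUIRED = ["summary", "classification"]

raw_output = json.dumps({  # hypothetical model response
    "summary": "Customer asked to be called back tomorrow afternoon.",
    "classification": {"call_outcome": "callback_scheduled"},
    "callback_requested": True,
    "callback_requested_time": "2025-01-15T15:00:00",
})

parsed = json.loads(raw_output)
missing = [key for key in REQUIRED if key not in parsed]
print(missing)  # expected: []
```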

Uploaded finetuned model

  • Developed by: RinggAI
  • License: apache-2.0
  • Finetuned from model: Qwen/Qwen3.5-2B

This Qwen3.5 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Safetensors

  • Model size: 2B params
  • Tensor types: F32, BF16