Model Card for ringover-summaries-llama3b-instruct-v1.2-lora

Model Details

Model Description

This model is a LoRA (Low-Rank Adaptation) adapter for Llama-3.2-3B-Instruct, fine-tuned for high-quality multilingual (FR, EN, ES) summarization of phone call transcripts. It is optimized to handle long-form dialogue and extract key information across multiple European languages.

  • Training time: January 2026
  • Model type: LoRA adapter (PEFT)
  • Language(s) (NLP): Multilingual (fine-tuned on FR, EN, ES)
  • Finetuned from model: meta-llama/Llama-3.2-3B-Instruct

Quick Start

Since this is a LoRA adapter, you must load the base model first, then apply these adapters on top.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "meta-llama/Llama-3.2-3B-Instruct"
adapter_id = "ringover/ringover-summaries-llama3b-instruct-v1.2-lora"

# Load the base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(base_model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Load the LoRA adapter on top of the base model
ft_model = PeftModel.from_pretrained(base_model, adapter_id)

# Ready for inference
prompt = "Summarize the following phone call transcript: <transcript>"
inputs = tokenizer(prompt, return_tensors="pt").to(ft_model.device)
outputs = ft_model.generate(**inputs, max_new_tokens=700)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
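Since the base model is an instruct model, summarization prompts are best expressed as a chat message list and then rendered with tokenizer.apply_chat_template. A minimal sketch of the prompt construction (the system-prompt wording is an assumption for illustration, not the one used during fine-tuning):

```python
def build_messages(transcript: str, language: str = "fr") -> list[dict]:
    """Build a chat-style prompt for the summarization adapter.

    The system prompt below is illustrative only, not the exact prompt
    used during fine-tuning.
    """
    system = (
        "You are an assistant that summarizes phone call transcripts. "
        f"Answer in the language of the transcript ({language})."
    )
    return [
        {"role": "system", "content": system},
        {
            "role": "user",
            "content": "Summarize the following phone call transcript:\n\n" + transcript,
        },
    ]

messages = build_messages("Bonjour, je vous appelle au sujet de ma facture...")
# Pass `messages` to tokenizer.apply_chat_template(messages, add_generation_prompt=True)
```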

Training Details

Training Data

- 14,724 total transcriptions
- Train & test dataset: 13,724 transcriptions; eval dataset: 1,000 transcriptions
- 95% of transcriptions are ≤ 8,535 tokens
- Max length: 33,201 tokens
- Language distribution:

    Counter({'fr': 11079,
             'es': 3176,
             'en': 1393,
             'ca': 49,
             'it': 28,
             'pt': 13,
             'de': 3,
             'pl': 1})
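The distribution above is a collections.Counter over per-transcript language labels; it can be reproduced with a sketch like the following (the detection step that produces the labels, e.g. langdetect or fastText, is left out):

```python
from collections import Counter

# Hypothetical per-transcript language labels, as produced upstream by a
# language-detection step (e.g. langdetect or fastText).
labels = ["fr", "fr", "es", "en", "fr", "es", "ca"]

distribution = Counter(labels)
print(distribution.most_common())
# Most frequent language first: [('fr', 3), ('es', 2), ('en', 1), ('ca', 1)]
```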

Training Procedure

This model was fine-tuned using the SFTTrainer from the trl library.

  • Framework : PyTorch & Hugging Face Transformers

  • Library : PEFT (Parameter-Efficient Fine-Tuning)

  • Precision: BF16

Training Hyperparameters

  per_device_train_batch_size=2,
  gradient_accumulation_steps=4,
  num_train_epochs=3,
  learning_rate=2e-5,  # 1e-4 was too high
  lr_scheduler_type="cosine",
  warmup_ratio=0.1,
  bf16=True,

  logging_steps=50,
  eval_strategy="steps",
  eval_steps=200,
  save_strategy="steps",
  save_steps=400,
  save_total_limit=1,
  report_to="tensorboard",

  load_best_model_at_end=True,
  metric_for_best_model="eval_loss",
  greater_is_better=False,
  # metric_for_best_model="eval_rougeL",
  # greater_is_better=True,

  LoraConfig(
      r=16,  # rank
      lora_alpha=32,  # scaling alpha
      lora_dropout=0.1,
      target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "down_proj", "up_proj"],
      bias="none",
      task_type="CAUSAL_LM",
  )
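As a sanity check, the trainable-parameter count implied by this LoraConfig can be estimated from the projection shapes: each targeted module adds r × (d_in + d_out) parameters. The dimensions below are assumed Llama-3.2-3B values (hidden size 3072, MLP intermediate size 8192, 1024-dim K/V projections from grouped-query attention, 28 decoder layers), not values taken from this card:

```python
# Rough trainable-parameter estimate for the LoRA config above.
# Each adapted module adds two low-rank matrices: A (d_in x r) and B (r x d_out),
# i.e. r * (d_in + d_out) parameters per module.
r = 16
shapes = {
    "q_proj": (3072, 3072),
    "k_proj": (3072, 1024),
    "v_proj": (3072, 1024),
    "o_proj": (3072, 3072),
    "gate_proj": (3072, 8192),
    "up_proj": (3072, 8192),
    "down_proj": (8192, 3072),
}
per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
total = per_layer * 28  # assumed number of decoder layers
print(f"{total / 1e6:.1f}M trainable parameters")  # 24.3M trainable parameters
```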

Evaluation

Metrics

Multi-dimensional evaluation approach:

  • Base metrics: ROUGE, BLEU, BERTScore, LLM-as-a-judge (GPT-4o-mini)

  • Language-consistency metric: DetectLang

  • Lexical metrics (fine-tuned summary vs. gold summary): BLEU brevity penalty, chrF, METEOR, BLEURT

  • Factuality metrics (fine-tuned summary vs. context): AlignScore, UniEval
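The DetectLang check can be sketched as a match rate between the detected language of each transcript and that of its summary. In this sketch `detect` is any text → ISO-code callable; a stub stands in for a real detector such as langdetect:

```python
def language_match_rate(pairs, detect):
    """Fraction of (transcript, summary) pairs whose detected languages agree."""
    if not pairs:
        return 0.0
    return sum(detect(t) == detect(s) for t, s in pairs) / len(pairs)

# Stub detector for illustration only; swap in langdetect.detect or similar.
def stub_detect(text):
    return "fr" if "bonjour" in text.lower() else "en"

pairs = [
    ("Bonjour, je vous appelle pour...", "Bonjour : résumé de l'appel..."),
    ("Hello, I'm calling about...", "Summary: the caller asked about..."),
]
print(language_match_rate(pairs, stub_detect))  # 1.0
```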

Results

See Ringover Summarization Doc
