🇮🇳 IndiSum-AI: Indian News Summarizer

IndiSum-AI is a fine-tuned PRIMERA (LED-based) model for abstractive summarization of Indian news. It is trained to handle long-form articles on Indian finance, technology, space missions (ISRO), and government policy.

🚀 Model Description

  • Developed by: Mohd Musheer (Takshashila Mahavidyalaya, Amravati)
  • Model type: LED/PRIMERA (Longformer Encoder-Decoder)
  • Finetuned from: allenai/PRIMERA
  • Language: English
  • Context Window: 1024 tokens

📊 Evaluation Results

Evaluated on a test set of Indian news articles (2025-2026 contexts):

| Metric         | Score |
|----------------|-------|
| ROUGE-1        | 71.43 |
| ROUGE-2        | 46.15 |
| ROUGE-L        | 68.25 |
| BERTScore (F1) | 0.93  |
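
For readers unfamiliar with the ROUGE figures above, here is a minimal sketch of how ROUGE-N and ROUGE-L F-measures are defined. It is an illustration only: the evaluation tooling actually used for this model is not stated, and the helpers `rouge_n_f1` and `rouge_l_f1`, the whitespace tokenization, and the toy sentences are all assumptions. The functions return values in [0, 1], whereas the table appears to report ROUGE on a 0-100 scale.

```python
from collections import Counter


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """F1 over overlapping n-grams (lowercased whitespace tokens; an assumption)."""
    def ngrams(text: str) -> Counter:
        toks = text.lower().split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    overlap = sum((ref & cand).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def rouge_l_f1(reference: str, candidate: str) -> float:
    """F1 based on the longest common subsequence (LCS) of tokens."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    prev = [0] * (len(cand) + 1)  # previous row of the LCS table
    for r in ref:
        curr = [0]
        for j, c in enumerate(cand, start=1):
            curr.append(prev[j - 1] + 1 if r == c else max(prev[j], curr[j - 1]))
        prev = curr
    lcs = prev[-1]
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Toy strings for illustration only (not taken from the test set)
reference = "isro launched the satellite today"
candidate = "isro launched a satellite"
print(round(rouge_n_f1(reference, candidate, 1), 4))
print(round(rouge_n_f1(reference, candidate, 2), 4))
print(round(rouge_l_f1(reference, candidate), 4))
```

Established packages typically add stemming and more careful tokenization, so their scores can differ from this simplified version.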

🛠️ How to Use

You can use this model directly with the Hugging Face pipeline or AutoModelForSeq2SeqLM.

Simple Pipeline Usage:

from transformers import pipeline

# Load the fine-tuned checkpoint into a summarization pipeline
summarizer = pipeline("summarization", model="mohd-musheer/News-Summarizer-AI")
text = "PASTE_YOUR_LONG_NEWS_ARTICLE_HERE"

# The pipeline returns a list of dicts; the summary is under "summary_text"
result = summarizer(text, max_length=128, min_length=30, do_sample=False)
print(result[0]["summary_text"])

Manual Usage (Best for Performance):

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("mohd-musheer/News-Summarizer-AI")
model = AutoModelForSeq2SeqLM.from_pretrained("mohd-musheer/News-Summarizer-AI")

article = "..."
# Truncate the input to the model's 1024-token context window
inputs = tokenizer(article, truncation=True, max_length=1024, return_tensors="pt")

# Global attention on the first token is recommended for LED/PRIMERA
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

# Beam search; pass the attention mask explicitly alongside the input IDs
summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,
    max_length=128,
    num_beams=4,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
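
Because the input is truncated to 1024 tokens, anything past that point in a longer article is dropped. One possible workaround, sketched below, is to split the token IDs into overlapping windows and summarize each window separately. This is not part of the model's documented usage: the helper `chunk_token_ids` and its `overlap` default are hypothetical, and in practice the window may need to be slightly smaller than 1024 to leave room for special tokens.

```python
from typing import List


def chunk_token_ids(token_ids: List[int], window: int = 1024,
                    overlap: int = 128) -> List[List[int]]:
    """Split token IDs into windows of at most `window` tokens.

    Consecutive windows share `overlap` tokens so that a sentence cut at
    a boundary still appears whole in one of the two windows.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("require window > 0 and 0 <= overlap < window")
    step = window - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break  # this window already reaches the end of the input
    return chunks


# Stand-in for real token IDs, e.g. from a tokenizer call without truncation
fake_ids = list(range(2500))
chunks = chunk_token_ids(fake_ids)
print([len(c) for c in chunks])
```

Each chunk could then be decoded back to text and passed through either usage pattern above, with the per-chunk summaries concatenated or summarized again. Whether that produces good summaries for this model has not been evaluated here.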