🇮🇳 IndiSum-AI: Indian News Summarizer

IndiSum-AI is a fine-tuned PRIMERA (LED-based) model optimized for abstractive summarization of the Indian news ecosystem. It is specifically trained to handle long-form articles related to Indian finance, technology, space missions (ISRO), and government policy.

🚀 Model Description

Developed by: Mohd Musheer (Takshashila Mahavidyalaya, Amravati)
Model type: LED/PRIMERA (Longformer Encoder-Decoder)
Finetuned from: allenai/PRIMERA
Language: English
Context Window: 1024 tokens

📊 Evaluation Results

Evaluated on a test set of Indian news articles (2025-2026 contexts):

Metric	Score
ROUGE-1	71.43
ROUGE-2	46.15
ROUGE-L	68.25
BERTScore (F1)	0.93
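
For readers unfamiliar with the ROUGE numbers above, the following is a minimal sketch of how a ROUGE-N F1 score on a 0-100 scale can be computed from n-gram overlap. It is a simplified illustration, not the scorer used to produce the reported results: it assumes lowercased whitespace tokenization and skips the stemming that standard ROUGE implementations apply. The function names `ngrams` and `rouge_n_f1` and the example sentences are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1 on a 0-100 scale (lowercased whitespace tokens)."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Illustrative sentences only; they are not taken from the test set.
reference = "isro launched the satellite on friday"
candidate = "isro launched a satellite on friday"
print(round(rouge_n_f1(reference, candidate, n=1), 2))
print(round(rouge_n_f1(reference, candidate, n=2), 2))
```

ROUGE-1 counts shared single words and ROUGE-2 counts shared word pairs, which is why ROUGE-2 is usually the lower of the two. BERTScore is reported on a 0-1 scale and is not covered by this sketch.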

🛠️ How to Use

You can use this model directly with the Hugging Face pipeline or AutoModelForSeq2SeqLM.

Simple Pipeline Usage:

from transformers import pipeline

summarizer = pipeline("summarization", model="mohd-musheer/News-Summarizer-AI")
text = "PASTE_YOUR_LONG_NEWS_ARTICLE_HERE"
print(summarizer(text, max_length=128, min_length=30, do_sample=False))
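
The pipeline returns a list of dictionaries rather than a plain string, with the generated text stored under the `summary_text` key. The sketch below shows one way to summarize several articles and keep only the summary strings. The helper name `summarize_batch` is hypothetical, and the generation settings simply mirror the ones used above.

```python
def summarize_batch(summarizer, articles, max_length=128, min_length=30):
    """Summarize a list of articles and return the plain summary strings.

    `summarizer` is expected to be a Hugging Face summarization pipeline,
    which returns one {"summary_text": ...} dict per input article.
    """
    results = summarizer(
        articles,
        max_length=max_length,
        min_length=min_length,
        do_sample=False,
    )
    return [item["summary_text"].strip() for item in results]


# Usage (downloads the model on first run):
# from transformers import pipeline
# summarizer = pipeline("summarization", model="mohd-musheer/News-Summarizer-AI")
# for summary in summarize_batch(summarizer, [article_1, article_2]):
#     print(summary)
```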

Manual Usage (Best for Performance):

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("mohd-musheer/News-Summarizer-AI")
model = AutoModelForSeq2SeqLM.from_pretrained("mohd-musheer/News-Summarizer-AI")

article = "..."
inputs = tokenizer(article, truncation=True, max_length=1024, return_tensors="pt")

# Global attention on the first token is recommended for LED/PRIMERA
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],  # needed once inputs are padded/batched
    global_attention_mask=global_attention_mask,
    max_length=128,
    num_beams=4,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
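
Because the tokenizer call above uses `truncation=True` with `max_length=1024`, any part of an article beyond that limit is dropped silently before summarization. The following is a minimal sketch for checking how much of an article would be lost. The helper name `truncation_report` is hypothetical; it only assumes that the tokenizer, when called on a string, returns a mapping with an `input_ids` list, as Hugging Face tokenizers do.

```python
def truncation_report(tokenizer, article, max_length=1024):
    """Report the token count of an article and how many tokens truncation drops."""
    # Tokenize without truncation so the full length is visible.
    n_tokens = len(tokenizer(article, truncation=False)["input_ids"])
    dropped = max(0, n_tokens - max_length)
    return {"tokens": n_tokens, "dropped": dropped, "fits": dropped == 0}


# Usage with the tokenizer loaded above:
# report = truncation_report(tokenizer, article)
# if not report["fits"]:
#     print(f"{report['dropped']} tokens will be cut off before summarization")
```

With very long inputs the tokenizer may log a warning about exceeding the model's maximum sequence length; that is expected here, since the check deliberately skips truncation.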

Model size: 0.4B params (Safetensors)
Tensor type: F32

Self-reported evaluation results on cleaned-news-summ-no-outliers:

rouge1	71.430
bertscore	0.930