Model Description

BharatGPT mini is a Transformer-based language model pretrained on a large corpus of publicly available text using a self-supervised learning approach. This means the model was trained without any human-labeled annotations: it learns directly from raw text, with the training signal generated automatically from the text itself.

During pretraining, BharatGPT mini was optimized for the causal language modeling task: given a sequence of tokens, the model learns to predict the next token in the sequence. More specifically, it takes a sequence of continuous text as input and is trained to predict the next word or subword by shifting the target sequence one position to the right. A masking mechanism ensures that predictions for token i are based only on tokens from positions 1 to i, without peeking at future tokens. This preserves the autoregressive nature of language modeling.
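The two mechanics described above, shifted targets and causal masking, can be sketched in plain PyTorch with toy values. This is purely illustrative and is not the model's actual implementation:

```python
import torch

# Toy token sequence: the targets are the inputs shifted one position,
# i.e. at each position the model is trained to predict the *next* token.
tokens = torch.tensor([11, 42, 7, 99, 3])
inputs, targets = tokens[:-1], tokens[1:]

# Causal mask over the 4 input positions: position i may attend
# only to positions 0..i (True = attention allowed).
seq_len = inputs.size(0)
mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

# Applied to (toy) attention scores: future positions are set to -inf,
# so softmax assigns them exactly zero weight.
scores = torch.randn(seq_len, seq_len)
weights = torch.softmax(scores.masked_fill(~mask, float("-inf")), dim=-1)

print(targets)   # the next-token labels for each input position
print(weights)   # lower-triangular attention weights, each row summing to 1
```

Because the masked positions receive zero weight, the prediction at position i can never depend on tokens after position i, which is what preserves the autoregressive property.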

Through this training process, BharatGPT mini learns internal representations of language patterns, grammar, and semantics. While it can be fine-tuned for downstream tasks such as classification, summarization, or question answering, it performs best at text generation, which matches its original training objective.

import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load tokenizer and model
tokenizer = GPT2Tokenizer.from_pretrained("CoRover/BharatGPT-mini")
model = GPT2LMHeadModel.from_pretrained("CoRover/BharatGPT-mini")

model.eval()

# Input text
text = "Future of AI"

# Tokenize
inputs = tokenizer(
    text,
    return_tensors="pt"
)

# Generate text
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_length=100,
        do_sample=True,
        top_p=0.95,
        top_k=50,
        temperature=0.8,
        repetition_penalty=1.1,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.eos_token_id  # GPT-2-style models have no pad token; reuse EOS to avoid a warning
    )

# Decode output
generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(generated_text)
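The sampling parameters passed to generate above (top_k, top_p, temperature) can be sketched as a standalone function to show what each filter does. This is a minimal sketch of the standard top-k/nucleus sampling recipe, not Hugging Face's actual implementation:

```python
import torch

def sample_next_token(logits, top_k=50, top_p=0.95, temperature=0.8):
    """Sample one token id from a 1-D tensor of vocabulary logits."""
    # Temperature: <1 sharpens the distribution, >1 flattens it.
    logits = logits / temperature

    # Top-k: keep only the k highest-scoring tokens.
    topk_vals, topk_idx = torch.topk(logits, top_k)
    probs = torch.softmax(topk_vals, dim=-1)

    # Top-p (nucleus): keep the smallest prefix of tokens whose
    # cumulative probability reaches top_p; the best token always survives.
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    keep = cumulative - sorted_probs < top_p
    filtered = torch.where(keep, sorted_probs, torch.zeros_like(sorted_probs))
    filtered = filtered / filtered.sum()

    # Draw one token from the filtered distribution and map back to vocab ids.
    choice = torch.multinomial(filtered, 1)
    return topk_idx[sorted_idx[choice]]

logits = torch.randn(1000)          # toy logits over a 1000-token vocabulary
next_token = sample_next_token(logits)
```

The repetition_penalty argument adds one more step on top of this, down-weighting tokens that already appear in the generated sequence before sampling.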
  • Developed by: CoRover.ai
  • Model size: 0.5B params
  • Tensor type: F32
  • Format: Safetensors