---
library_name: transformers
tags: []
---

### Model Description

BharatGPT mini is a Transformer-based language model pretrained on a large corpus of publicly available text using self-supervised learning. The model was trained without any human-labeled annotations: the training signal is generated automatically from the raw text itself.

During pretraining, BharatGPT mini was optimized for causal language modeling: given a sequence of tokens, the model learns to predict the next token. Concretely, the input is a sequence of continuous text and the target is the same sequence shifted one position to the right, so each position is trained to predict the next word or subword. A causal mask ensures that the prediction for token i depends only on tokens at positions 1 to i and never on future tokens, which preserves the autoregressive nature of language modeling.

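The shifting and masking described above can be illustrated with a toy example in plain Python. This is only a sketch: the token list is hypothetical and whitespace-split for readability, whereas the real model operates on subword token IDs and applies the mask inside its attention layers. Positions here are 0-indexed.

```python
# Toy illustration of the causal language modeling setup (no model involved).
# The token list is a hypothetical example, not real tokenizer output.
tokens = ["The", "future", "of", "AI", "is", "bright"]

# Inputs are every token except the last; targets are the same sequence
# shifted by one, so position i is trained to predict token i + 1.
inputs = tokens[:-1]
targets = tokens[1:]

# Causal mask: row i has 1 for positions 0..i (visible) and 0 for later ones.
n = len(inputs)
causal_mask = [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

for i, target in enumerate(targets):
    visible = [tok for tok, keep in zip(inputs, causal_mask[i]) if keep]
    print(f"context={visible} -> target={target!r}")
```

Each printed line pairs the visible context with the token the model would be trained to predict at that position; no row ever includes a token that comes after its target.
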
Through this training, BharatGPT mini learns an internal representation of language patterns, grammar, and semantics. It can be fine-tuned for downstream tasks such as classification, summarization, or question answering, but it performs best at text generation, which matches its pretraining objective.

```python
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load tokenizer and model
tokenizer = GPT2Tokenizer.from_pretrained("CoRover/BharatGPT-mini")
model = GPT2LMHeadModel.from_pretrained("CoRover/BharatGPT-mini")

# Switch to inference mode (disables dropout)
model.eval()

# Input text
text = "Future of AI"

# Tokenize the prompt into PyTorch tensors
inputs = tokenizer(
    text,
    return_tensors="pt"
)

# Generate text without tracking gradients
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_length=100,          # total length, prompt tokens included
        do_sample=True,          # sample instead of greedy decoding
        top_p=0.95,
        top_k=50,
        temperature=0.8,
        repetition_penalty=1.1,
        eos_token_id=tokenizer.eos_token_id
    )

# Decode output
generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(generated_text)
```

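The sampling parameters in the snippet above (`temperature`, `top_k`, `top_p`) shape the distribution from which each next token is drawn. The following is a simplified plain-Python sketch of that idea, not the `transformers` implementation: the function name `filtered_distribution` and the toy logits are hypothetical, and the library's exact filtering order and tie-breaking may differ.

```python
import math
import random

def filtered_distribution(logits, temperature=0.8, top_k=50, top_p=0.95):
    """Illustrative only: temperature scaling, then top-k, then top-p filtering."""
    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [x / temperature for x in logits]

    # Top-k: keep the k highest-scoring token ids, best first.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]

    # Numerically stable softmax over the surviving tokens.
    peak = scaled[ranked[0]]
    exps = {i: math.exp(scaled[i] - peak) for i in ranked}
    total = sum(exps.values())
    probs = {i: e / total for i, e in exps.items()}

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize so the kept probabilities sum to 1.
    norm = sum(probs[i] for i in kept)
    return {i: probs[i] / norm for i in kept}

# Hypothetical logits for a five-token vocabulary.
toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]
dist = filtered_distribution(toy_logits, top_k=3)
next_id = random.choices(list(dist), weights=list(dist.values()), k=1)[0]
print(dist, next_id)
```

Lowering `top_p` or `top_k` narrows the set of candidate tokens and makes the output more predictable, while raising `temperature` spreads probability toward less likely tokens.
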
- **Developed by:** CoRover.ai