---
library_name: transformers
tags: []
---
### Model Description
BharatGPT mini is a Transformer-based language model pretrained on a large corpus of publicly available text using a self-supervised learning approach. This means the model was trained without any human-labeled annotations, learning directly from raw text, which itself supplies the training signal.
During pretraining, BharatGPT mini was optimized for the causal language modeling task: given a sequence of tokens, the model learns to predict the next token in the sequence. More specifically, it takes a sequence of continuous text as input and is trained to predict the next word or subword by shifting the target sequence one position to the right. A masking mechanism ensures that predictions for token i are based only on tokens from positions 1 to i, without peeking at future tokens. This preserves the autoregressive nature of language modeling.
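The shift-by-one labeling and causal masking described above can be sketched in plain Python (a framework-free illustration, not the model's actual implementation; the toy token list is invented for the example):

```python
# Toy token sequence for illustration only.
tokens = ["The", "future", "of", "AI", "is"]

# Shift-by-one: the input at position i is trained to predict the token
# at position i + 1.
inputs = tokens[:-1]
targets = tokens[1:]

for i, target in enumerate(targets):
    # Causal mask: position i may attend only to positions 0..i,
    # never to future tokens.
    visible = tokens[: i + 1]
    print(f"predict {target!r} given {visible}")
```

Each training position thus sees only its left context, which is what preserves the autoregressive property at generation time.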
Through this training process, BharatGPT mini learns internal representations of language patterns, grammar, and semantics. While it can be fine-tuned for downstream tasks such as classification, summarization, or question answering, it is strongest at text generation, which matches its original training objective.
### Usage

```python
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load tokenizer and model
tokenizer = GPT2Tokenizer.from_pretrained("CoRover/BharatGPT-mini")
model = GPT2LMHeadModel.from_pretrained("CoRover/BharatGPT-mini")
model.eval()

# Input text
text = "Future of AI"

# Tokenize
inputs = tokenizer(text, return_tensors="pt")

# Generate text with sampling; adjust the decoding parameters to taste
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_length=100,
        do_sample=True,
        top_p=0.95,
        top_k=50,
        temperature=0.8,
        repetition_penalty=1.1,
        eos_token_id=tokenizer.eos_token_id,
    )

# Decode output
generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(generated_text)
```
- **Developed by:** CoRover.ai