Instructions for using PyPranav/Bhagwat-Corpus with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Transformers
How to use PyPranav/Bhagwat-Corpus with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="PyPranav/Bhagwat-Corpus")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("PyPranav/Bhagwat-Corpus")
model = AutoModelForCausalLM.from_pretrained("PyPranav/Bhagwat-Corpus")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
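The snippets above use chat-style messages. This model card's Prompt Template section (below) describes an Alpaca-style instruction format, so passing the formatted prompt string directly may match the fine-tuning more closely. A minimal sketch under that assumption:

from transformers import pipeline

pipe = pipeline("text-generation", model="PyPranav/Bhagwat-Corpus")

# Alpaca-style prompt from the Prompt Template section below.
prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
Provide a Vedic philosophical response based on ancient scriptures. Always provide JSON output with the following keys: 'sanskrit_shloka', 'english_translation', 'explanation'.
### Input:
What is the Vedic perspective on forgiveness?
### Response:
"""
print(pipe(prompt, max_new_tokens=512)[0]["generated_text"])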
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use PyPranav/Bhagwat-Corpus with vLLM:
Install from pip and serve the model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "PyPranav/Bhagwat-Corpus"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "PyPranav/Bhagwat-Corpus",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'

Use Docker
docker model run hf.co/PyPranav/Bhagwat-Corpus
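With the server running (whether started via pip or Docker), it can also be called from Python through the OpenAI client, since vLLM exposes an OpenAI-compatible API. A minimal sketch, assuming the openai package is installed; the API key is a dummy value, as vLLM does not check it by default:

from openai import OpenAI

# Point the OpenAI client at the local vLLM server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="PyPranav/Bhagwat-Corpus",
    messages=[{"role": "user", "content": "What is the Vedic perspective on forgiveness?"}],
)
print(completion.choices[0].message.content)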
- SGLang
How to use PyPranav/Bhagwat-Corpus with SGLang:
Install from pip and serve the model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "PyPranav/Bhagwat-Corpus" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "PyPranav/Bhagwat-Corpus",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'

Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "PyPranav/Bhagwat-Corpus" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "PyPranav/Bhagwat-Corpus",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
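However the server is started, it can also be queried from Python. A minimal sketch using the requests package against the same OpenAI-compatible endpoint (host and port as configured above):

import requests

# POST a chat completion request to the SGLang server's
# OpenAI-compatible endpoint and print the model's reply.
resp = requests.post(
    "http://localhost:30000/v1/chat/completions",
    json={
        "model": "PyPranav/Bhagwat-Corpus",
        "messages": [
            {"role": "user", "content": "What is the Vedic perspective on forgiveness?"},
        ],
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])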
- Docker Model Runner
How to use PyPranav/Bhagwat-Corpus with Docker Model Runner:
docker model run hf.co/PyPranav/Bhagwat-Corpus
Bhagvad Corpus LLM (Instruction-tuned)
Abstract
Although large language models (LLMs) are increasingly used to answer complex questions, they frequently fail to provide philosophically sound answers grounded in scriptural traditions such as Vedic thought. This gap stems from the scarcity of specialized instruction-tuning datasets for such culturally rich contexts.
We present the Bhagvad Corpus, a synthetic dataset of approximately 90,000 examples built on the Itihasa corpus (Aralikatte et al., 2021) [1], which contains Sanskrit-English shloka pairs from the Mahabharata and Ramayana. Each instance in the Bhagvad Corpus comprises a synthetic user question, the original shloka, its English translation, and a detailed explanation connecting the verse to the intent of the question.
We demonstrate the dataset's usefulness by fine-tuning LLMs with QLoRA to produce structured, shloka-backed responses in formats such as JSON. Initial results show notable improvements in the relevance and depth of the generated philosophical responses. We release the Bhagvad Corpus publicly to support further research on culturally aware and spiritually aligned language models.
Model Usage
This model is instruction-tuned to generate Vedic philosophical responses, always outputting a JSON object with the following keys:
- sanskrit_shloka: The relevant Sanskrit verse
- english_translation: The English translation of the shloka
- explanation: A detailed explanation connecting the shloka to the user's query
Prompt Template
To use the model, format your prompt as follows:
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
Provide a Vedic philosophical response based on ancient scriptures. Always provide JSON output with the following keys: 'sanskrit_shloka', 'english_translation', 'explanation'.
### Input:
<YOUR_QUESTION_HERE>
### Response:
Example:
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
Provide a Vedic philosophical response based on ancient scriptures. Always provide JSON output with the following keys: 'sanskrit_shloka', 'english_translation', 'explanation'.
### Input:
What is the Vedic perspective on forgiveness?
### Response:
The model will generate a JSON object as the response, for example:
{
"sanskrit_shloka": "क्षिप्रं हि मानुषे लोके सिद्धिर्भवति कर्मजा। ...",
"english_translation": "Success is quickly achieved in the human world by actions...",
"explanation": "According to the Mahabharata, forgiveness is considered a great virtue..."
}
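Rather than assembling this template by hand for every question, a small helper can fill it in. A minimal sketch; PROMPT_TEMPLATE and build_prompt are illustrative names, not part of the released code:

PROMPT_TEMPLATE = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
Provide a Vedic philosophical response based on ancient scriptures. Always provide JSON output with the following keys: 'sanskrit_shloka', 'english_translation', 'explanation'.
### Input:
{question}
### Response:
"""

def build_prompt(question: str) -> str:
    # Insert the user's question into the instruction template.
    return PROMPT_TEMPLATE.format(question=question)

print(build_prompt("What is the Vedic perspective on forgiveness?"))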
How to Run
You can load and run this model with any standard Hugging Face-compatible inference script or library. Here is a minimal example using the Hugging Face Transformers library in Python:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and model; fall back to CPU if no GPU is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("<your-model-path-or-hub-name>")
model = AutoModelForCausalLM.from_pretrained("<your-model-path-or-hub-name>").to(device)
prompt = '''Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
Provide a Vedic philosophical response based on ancient scriptures. Always provide JSON output with the following keys: 'sanskrit_shloka', 'english_translation', 'explanation'.
### Input:
What is the Vedic perspective on forgiveness?
### Response:
'''
# Tokenize the prompt and generate the JSON response.
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=512)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
Replace <your-model-path-or-hub-name> with the local path to this model directory or its Hugging Face Hub name if uploaded.
Output Format
The model is designed to always return a JSON object with the following keys:
- sanskrit_shloka
- english_translation
- explanation
If the output is not valid JSON, you may need to post-process the string to extract the JSON part.
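For example, the first {...} span can be pulled out of the generated text with the standard library alone. A minimal sketch; extract_json is an illustrative helper, not part of the released code:

import json
import re

def extract_json(text: str):
    """Return the first {...} span in `text` that parses as JSON, or None."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None

result = extract_json(response)  # `response` from the How to Run example above
if result is not None:
    print(result["sanskrit_shloka"])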
Citation
If you use this model or dataset, please cite:
[1] Aralikatte, R., de Lhoneux, M., Kunchukuttan, A., & Søgaard, A. (2021). Itihāsa: A Large-Scale Corpus for Sanskrit to English Translation. arXiv preprint arXiv:2106.15621.
License
This model and dataset are released for research purposes. Please see the repository for license details.