Instructions for using PyPranav/Bhagwat-Corpus with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Transformers
How to use PyPranav/Bhagwat-Corpus with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="PyPranav/Bhagwat-Corpus")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("PyPranav/Bhagwat-Corpus")
model = AutoModelForCausalLM.from_pretrained("PyPranav/Bhagwat-Corpus")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
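The snippets above use chat-style messages. This model card's Prompt Template section (below) describes an Alpaca-style instruction format, so passing the formatted prompt string directly may match the fine-tuning more closely. A minimal sketch under that assumption:

from transformers import pipeline

pipe = pipeline("text-generation", model="PyPranav/Bhagwat-Corpus")

# Alpaca-style prompt from the Prompt Template section below.
prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
Provide a Vedic philosophical response based on ancient scriptures. Always provide JSON output with the following keys: 'sanskrit_shloka', 'english_translation', 'explanation'.
### Input:
What is the Vedic perspective on forgiveness?
### Response:
"""
print(pipe(prompt, max_new_tokens=512)[0]["generated_text"])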
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use PyPranav/Bhagwat-Corpus with vLLM:
Install from pip and serve the model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "PyPranav/Bhagwat-Corpus"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "PyPranav/Bhagwat-Corpus",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'

Use Docker
docker model run hf.co/PyPranav/Bhagwat-Corpus
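With the server running (whether started via pip or Docker), it can also be called from Python through the OpenAI client, since vLLM exposes an OpenAI-compatible API. A minimal sketch, assuming the openai package is installed; the API key is a dummy value, as vLLM does not check it by default:

from openai import OpenAI

# Point the OpenAI client at the local vLLM server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="PyPranav/Bhagwat-Corpus",
    messages=[{"role": "user", "content": "What is the Vedic perspective on forgiveness?"}],
)
print(completion.choices[0].message.content)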
- SGLang
How to use PyPranav/Bhagwat-Corpus with SGLang:
Install from pip and serve the model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "PyPranav/Bhagwat-Corpus" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "PyPranav/Bhagwat-Corpus",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'

Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "PyPranav/Bhagwat-Corpus" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "PyPranav/Bhagwat-Corpus",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
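However the server is started, it can also be queried from Python. A minimal sketch using the requests package against the same OpenAI-compatible endpoint (host and port as configured above):

import requests

# POST a chat completion request to the SGLang server's
# OpenAI-compatible endpoint and print the model's reply.
resp = requests.post(
    "http://localhost:30000/v1/chat/completions",
    json={
        "model": "PyPranav/Bhagwat-Corpus",
        "messages": [
            {"role": "user", "content": "What is the Vedic perspective on forgiveness?"},
        ],
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])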
- Docker Model Runner
How to use PyPranav/Bhagwat-Corpus with Docker Model Runner:
docker model run hf.co/PyPranav/Bhagwat-Corpus
Bhagvad Corpus LLM (Instruction-tuned)
Abstract
Although large language models (LLMs) are increasingly used to answer complex questions, they frequently fail to provide philosophically sound answers grounded in scriptural traditions such as Vedic thought. This gap stems from the scarcity of specialized instruction-tuning datasets for such culturally rich contexts.
We present the Bhagvad Corpus, a synthetic dataset of approximately 90,000 examples built on the Itihasa corpus (Aralikatte et al., 2021) [1], which contains Sanskrit-English shloka pairs from the Mahabharata and Ramayana. Each instance in the Bhagvad Corpus comprises a synthetic user question, the original shloka, its English translation, and a detailed explanation connecting the verse to the intent of the question.
We demonstrate the dataset's usefulness by fine-tuning LLMs with QLoRA to produce structured, shloka-backed responses in formats such as JSON. Initial results show notable improvements in the relevance and depth of the generated philosophical responses. We release the Bhagvad Corpus publicly to support further research on culturally aware and spiritually aligned language models.
Model Usage
This model is instruction-tuned to generate Vedic philosophical responses, always outputting a JSON object with the following keys:
- sanskrit_shloka: The relevant Sanskrit verse
- english_translation: The English translation of the shloka
- explanation: A detailed explanation connecting the shloka to the user's query
Prompt Template
To use the model, format your prompt as follows:
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
Provide a Vedic philosophical response based on ancient scriptures. Always provide JSON output with the following keys: 'sanskrit_shloka', 'english_translation', 'explanation'.
### Input:
<YOUR_QUESTION_HERE>
### Response:
Example:
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
Provide a Vedic philosophical response based on ancient scriptures. Always provide JSON output with the following keys: 'sanskrit_shloka', 'english_translation', 'explanation'.
### Input:
What is the Vedic perspective on forgiveness?
### Response:
The model will generate a JSON object as the response, for example:
{
"sanskrit_shloka": "क्षिप्रं हि मानुषे लोके सिद्धिर्भवति कर्मजा। ...",
"english_translation": "Success is quickly achieved in the human world by actions...",
"explanation": "According to the Mahabharata, forgiveness is considered a great virtue..."
}
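Rather than assembling this template by hand for every question, a small helper can fill it in. A minimal sketch; PROMPT_TEMPLATE and build_prompt are illustrative names, not part of the released code:

PROMPT_TEMPLATE = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
Provide a Vedic philosophical response based on ancient scriptures. Always provide JSON output with the following keys: 'sanskrit_shloka', 'english_translation', 'explanation'.
### Input:
{question}
### Response:
"""

def build_prompt(question: str) -> str:
    # Insert the user's question into the instruction template.
    return PROMPT_TEMPLATE.format(question=question)

print(build_prompt("What is the Vedic perspective on forgiveness?"))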
How to Run
You can load and run this model with any standard Hugging Face-compatible inference script or library. Here is a minimal example using the Hugging Face Transformers library in Python:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and model; fall back to CPU if no GPU is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("<your-model-path-or-hub-name>")
model = AutoModelForCausalLM.from_pretrained("<your-model-path-or-hub-name>").to(device)
prompt = '''Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
Provide a Vedic philosophical response based on ancient scriptures. Always provide JSON output with the following keys: 'sanskrit_shloka', 'english_translation', 'explanation'.
### Input:
What is the Vedic perspective on forgiveness?
### Response:
'''
# Tokenize the prompt and generate the JSON response.
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=512)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
Replace <your-model-path-or-hub-name> with the local path to this model directory or its Hugging Face Hub name if uploaded.
Output Format
The model is designed to always return a JSON object with the following keys:
- sanskrit_shloka
- english_translation
- explanation
If the output is not valid JSON, you may need to post-process the string to extract the JSON part.
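For example, the first {...} span can be pulled out of the generated text with the standard library alone. A minimal sketch; extract_json is an illustrative helper, not part of the released code:

import json
import re

def extract_json(text: str):
    """Return the first {...} span in `text` that parses as JSON, or None."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None

result = extract_json(response)  # `response` from the How to Run example above
if result is not None:
    print(result["sanskrit_shloka"])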
Citation
If you use this model or dataset, please cite:
[1] Aralikatte, R., de Lhoneux, M., Kunchukuttan, A., & Søgaard, A. (2021). Itihāsa: A Large-Scale Corpus for Sanskrit to English Translation. arXiv preprint arXiv:2106.15621.
License
This model and dataset are released for research purposes. Please see the repository for license details.