Instructions to use KedarPN/GrantsLLM with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use KedarPN/GrantsLLM with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="KedarPN/GrantsLLM")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("KedarPN/GrantsLLM")
model = AutoModelForCausalLM.from_pretrained("KedarPN/GrantsLLM")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

- llama-cpp-python
How to use KedarPN/GrantsLLM with llama-cpp-python:
```python
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="KedarPN/GrantsLLM",
    filename="unsloth.Q4_K_M.gguf",
)

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use KedarPN/GrantsLLM with llama.cpp:
Install from brew
```sh
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf KedarPN/GrantsLLM:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf KedarPN/GrantsLLM:Q4_K_M
```
Install from WinGet (Windows)
```sh
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf KedarPN/GrantsLLM:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf KedarPN/GrantsLLM:Q4_K_M
```
Use pre-built binary
```sh
# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf KedarPN/GrantsLLM:Q4_K_M

# Run inference directly in the terminal:
./llama-cli -hf KedarPN/GrantsLLM:Q4_K_M
```
Build from source code
```sh
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf KedarPN/GrantsLLM:Q4_K_M

# Run inference directly in the terminal:
./build/bin/llama-cli -hf KedarPN/GrantsLLM:Q4_K_M
```
Use Docker
docker model run hf.co/KedarPN/GrantsLLM:Q4_K_M
- LM Studio
- Jan
- vLLM
How to use KedarPN/GrantsLLM with vLLM:
Install from pip and serve model
```sh
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "KedarPN/GrantsLLM"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "KedarPN/GrantsLLM",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker
docker model run hf.co/KedarPN/GrantsLLM:Q4_K_M
- SGLang
How to use KedarPN/GrantsLLM with SGLang:
Install from pip and serve model
```sh
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "KedarPN/GrantsLLM" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "KedarPN/GrantsLLM",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker images

```sh
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "KedarPN/GrantsLLM" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "KedarPN/GrantsLLM",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

- Ollama
How to use KedarPN/GrantsLLM with Ollama:
ollama run hf.co/KedarPN/GrantsLLM:Q4_K_M
- Unsloth Studio
How to use KedarPN/GrantsLLM with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```sh
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for KedarPN/GrantsLLM to start chatting
```
Install Unsloth Studio (Windows)
```powershell
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for KedarPN/GrantsLLM to start chatting
```
Using HuggingFace Spaces for Unsloth
```sh
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for KedarPN/GrantsLLM to start chatting
```
- Pi
How to use KedarPN/GrantsLLM with Pi:
Start the llama.cpp server
```sh
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf KedarPN/GrantsLLM:Q4_K_M
```
Configure the model in Pi
```sh
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
```

Add to `~/.pi/agent/models.json`:

```json
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {"id": "KedarPN/GrantsLLM:Q4_K_M"}
      ]
    }
  }
}
```

Run Pi

```sh
# Start Pi in your project directory:
pi
```
- Hermes Agent
How to use KedarPN/GrantsLLM with Hermes Agent:
Start the llama.cpp server
```sh
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf KedarPN/GrantsLLM:Q4_K_M
```
Configure Hermes
```sh
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default KedarPN/GrantsLLM:Q4_K_M
```
Run Hermes
hermes
- Docker Model Runner
How to use KedarPN/GrantsLLM with Docker Model Runner:
docker model run hf.co/KedarPN/GrantsLLM:Q4_K_M
- Lemonade
How to use KedarPN/GrantsLLM with Lemonade:
Pull the model
```sh
# Download Lemonade from https://lemonade-server.ai/
lemonade pull KedarPN/GrantsLLM:Q4_K_M
```
Run and chat with the model
lemonade run user.GrantsLLM-Q4_K_M
List all available models
lemonade list
- GrantsLLM
- Model Description
- 🎯 Use Cases
- 🚀 Quick Start
- 📊 Training Data
- 🔧 Training Procedure
- 📈 Performance & Evaluation
- ⚠️ Bias, Risks, and Limitations
- 🔐 Ethical Considerations
- 📜 Licensing & Attribution
- 🛠️ Technical Specifications
- 📦 Model Variants
- 🤝 Acknowledgments
- 📞 Contact & Support
- 📌 Disclaimer
- 🔄 Version History
- Model Description
GrantsLLM
A specialized language model for STEM research grant writing and review
Developed by Evionex | Created by Kedar P. Navsariwala
Model Description
GrantsLLM is a domain-specialized language model fine-tuned on 78 STEM research grant applications to assist researchers in drafting, refining, and reviewing grant proposals. Built on Qwen3-4B, this model has been trained to understand the structure, terminology, and writing style of successful research grants across NIH, NSF, and similar funding mechanisms.
- Developed by: Kedar P. Navsariwala, CTO & Co-Founder at Evionex
- Model type: Causal Language Model (Decoder-only Transformer)
- Language(s): English
- License: CC BY 4.0 (requires attribution)
- Finetuned from: Qwen/Qwen3-4B
🎯 Use Cases
What GrantsLLM Can Do
- ✅ Generate complete grant proposals (NIH R03/R01/R21, NSF, etc.)
- ✅ Draft specific sections: Specific Aims, Significance, Innovation, Approach, Research Strategy
- ✅ Improve existing text for clarity, structure, and persuasiveness
- ✅ Provide review feedback on grant coherence and alignment
- ✅ Expand bullet points into full narrative sections
- ✅ Adapt tone to academic/scientific writing standards
Intended Users
- Principal Investigators (PIs) and research scientists
- Postdoctoral researchers and graduate students
- University grant support offices
- Biotech and research startups
- Academic research administrators
Out of Scope
- ❌ Automated funding decisions or grant scoring
- ❌ Legal, regulatory, or IRB compliance review
- ❌ Generating fabricated data or citations
- ❌ Non-STEM grants (humanities, arts, social sciences may have reduced quality)
- ❌ Non-English grant applications
🚀 Quick Start
Installation
pip install transformers torch accelerate
Basic Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "KedarPN/GrantsLLM"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype="auto",
device_map="auto"
)
prompt = """Write a Specific Aims section for an NIH R03 grant on developing novel CRISPR-based gene editing tools for treating sickle cell disease. Include 2-3 specific aims with clear objectives and expected outcomes."""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
**inputs,
max_new_tokens=512,
temperature=0.7,
top_p=0.9,
do_sample=True
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Using with Pipeline
from transformers import pipeline
generator = pipeline(
"text-generation",
model="KedarPN/GrantsLLM",
device_map="auto"
)
prompt = "Draft a Research Significance statement for a computational biology grant on protein folding prediction using deep learning."
output = generator(prompt, max_new_tokens=400, temperature=0.7, top_p=0.9)
print(output[0]['generated_text'])
Prompt Templates
For Section Generation:
Write a [Section] for a [Funder] [Mechanism] grant on [Topic].
Requirements: [Specific elements needed]
Word limit: [Number] words
For Review/Feedback:
Review the following [Section] and provide feedback on clarity, structure, and alignment with [Funder] guidelines:
[Paste text here]
Examples:
- "Write Specific Aims for an NIH R01 grant on cancer immunotherapy"
- "Draft Innovation section for NSF CAREER award on quantum computing"
- "Review this Research Strategy for logical flow and hypothesis clarity"
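The section-generation template above can also be filled in programmatically before the prompt is sent to the model. The helper below is an illustrative sketch (the function name and its parameters are not part of the model card):

```python
# Hypothetical helper that fills in the section-generation template
# from the model card: "Write a [Section] for a [Funder] [Mechanism]
# grant on [Topic]. / Requirements: ... / Word limit: ... words"
def build_section_prompt(section, funder, mechanism, topic,
                         requirements, word_limit):
    """Return a prompt that follows the card's section-generation template."""
    return (
        f"Write a {section} for a {funder} {mechanism} grant on {topic}.\n"
        f"Requirements: {requirements}\n"
        f"Word limit: {word_limit} words"
    )

prompt = build_section_prompt(
    section="Specific Aims",
    funder="NIH",
    mechanism="R03",
    topic="CRISPR-based therapies for sickle cell disease",
    requirements="2-3 aims with clear objectives and expected outcomes",
    word_limit=500,
)
print(prompt)
```

The resulting string can be passed directly to the `pipeline` or `model.generate` examples in the Quick Start section.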
📊 Training Data
Dataset Composition
- Size: 78 research grant applications
- Domains: Biotechnology, Molecular Biology, Computational Biology, Chemistry, Biomedical Sciences
- Formats: NIH (R01, R03, R21), NSF, and similar federal/institutional grant formats
- Sources: Publicly available grant examples, institutional repositories, and NIH RePORTER
- Language: English
Data Processing
Stage 1: Continued Pretraining (CPT)
- Raw grant text extracted and cleaned from PDFs/documents
- Structured into single-column `text` format (JSONL/Parquet)
- Preserves section structure and domain terminology
Stage 2: Supervised Fine-Tuning (SFT)
- Chat-style instruction pairs using ChatML template
- Tasks include: section generation, expansion, refinement, review
- Format:
{"messages": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}
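A quick sanity check of training examples in this `{"messages": [...]}` format can be done with the standard library alone. The validator below is an illustrative sketch, not part of the actual training pipeline:

```python
import json

# Illustrative validator for SFT examples in the chat format shown above.
def validate_sft_example(line):
    """Return True if a JSONL line matches the expected messages format."""
    record = json.loads(line)
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return False
    for msg in messages:
        if msg.get("role") not in {"system", "user", "assistant"}:
            return False
        if not isinstance(msg.get("content"), str):
            return False
    return True

good = ('{"messages": [{"role": "user", "content": "Draft Specific Aims."},'
        ' {"role": "assistant", "content": "Aim 1: ..."}]}')
bad = '{"messages": [{"role": "reviewer", "content": "..."}]}'
print(validate_sft_example(good))  # True
print(validate_sft_example(bad))   # False
```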
🔧 Training Procedure
Training Hyperparameters
- Base Model: Qwen/Qwen3-4B (~4B parameters)
- Training Framework: Unsloth + PyTorch
- Hardware: Google Colab (single GPU, T4/V100)
- Fine-tuning Method: LoRA/QLoRA (Parameter-Efficient Fine-Tuning)
- Training Stages:
- Continued Pretraining on grant corpus
- Supervised Instruction Fine-Tuning on QnA pairs
- Optimizer: AdamW
- Learning Rate: Low rate to prevent catastrophic forgetting
- Training monitored for: Overfitting, repetition, coherence
Training Details
Training Type: Parameter-efficient fine-tuning with LoRA adapters
Epochs: [Adjusted based on validation performance]
Batch Size: Optimized for 4B model on single GPU
Context Length: 262,144 tokens (256K)
Loss Function: Causal Language Modeling (CLM) loss
Validation Strategy: Qualitative evaluation on held-out grant examples
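The card does not publish the exact LoRA hyperparameters. A minimal PEFT-style configuration consistent with the description above might look like the following; the rank, alpha, dropout, and target modules are illustrative assumptions, not the values actually used:

```python
# Illustrative only: the actual LoRA configuration for GrantsLLM is not
# published. These are common defaults, not the real training values.
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,             # assumed adapter rank
    lora_alpha=32,    # assumed scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
)
# The adapter would then be attached with peft.get_peft_model(model, lora_config)
# and trained at a low learning rate, as the card notes, to limit
# catastrophic forgetting of the base model.
```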
📈 Performance & Evaluation
Evaluation Methodology
Qualitative Assessment:
- Human expert review of generated grant sections
- Evaluation criteria: coherence, structure, domain accuracy, persuasiveness
- Practical testing on mock NIH/NSF grant prompts
Known Strengths
- ✅ Strong grasp of STEM grant structure (Aims, Significance, Innovation, Approach)
- ✅ Effective expansion of bullet points to narrative
- ✅ Appropriate academic/scientific tone
- ✅ Good understanding of NIH/NSF terminology and conventions
- ✅ Maintains logical flow between sections
Known Limitations
- ⚠️ Hallucination Risk: May generate plausible but incorrect citations, grant numbers, or policies
- ⚠️ Format Bias: Optimized for NIH/NSF; other formats (European, private foundations) may be weaker
- ⚠️ Domain Bias: Best for biotech/life sciences; physics/engineering grants may be less polished
- ⚠️ Repetition: Can produce repetitive text if prompt lacks detail or structure
- ⚠️ Recency: Training data may not reflect latest funder guidelines (post-2025)
⚠️ Bias, Risks, and Limitations
Bias Sources
Domain Bias: Model is optimized for STEM fields represented in training data (biotech, molecular biology, computational biology). Grants in underrepresented fields may receive lower quality outputs.
Institutional Bias: Writing style may reflect patterns from R1 research universities and well-funded institutions present in training examples.
Funding Mechanism Bias: Strongest performance on NIH R-series and NSF standard grants; less reliable for fellowships, training grants, or international formats.
Historical Bias: May reinforce language patterns from historically funded research areas, potentially disadvantaging emerging or interdisciplinary fields.
Risks
Fabrication: Model may generate convincing but false information including:
- Non-existent citations and references
- Incorrect grant mechanism details
- Fabricated preliminary data or results
- Inaccurate funder policies
Over-reliance: Users may trust outputs without verification, risking submission of flawed proposals.
Privacy: Users may inadvertently input confidential research ideas or unpublished data.
Recommendations
- Always verify: Check all factual claims, citations, and funder guidelines
- Human review required: Never submit AI-generated grants without expert review
- Iterative refinement: Use as drafting assistant, not final author
- Protect IP: Don't input confidential or proprietary information
- Disclose usage: Be transparent with collaborators and (when appropriate) funders about AI assistance
- Update manually: Cross-reference current funder guidelines and requirements
🔐 Ethical Considerations
Responsible Use
- Transparency: Disclose AI assistance to co-authors and collaborators
- Human oversight: Keep domain experts in the loop for all submissions
- Academic integrity: Ensure outputs align with your institution's policies on AI use
- Verification: Validate all scientific claims and citations independently
- Privacy: Avoid inputting sensitive, unpublished, or identifiable information
Funder Policies
As of February 2026, grant-writing AI policies vary by funder:
- NIH: Generally permits AI assistance for writing, but PIs remain responsible for all content
- NSF: Similar stance; emphasizes researcher accountability
- Check specific RFAs for any AI-related restrictions or disclosure requirements
When in doubt: Contact your program officer or sponsored research office.
📜 Licensing & Attribution
License: CC BY 4.0
This model is licensed under Creative Commons Attribution 4.0 International.
You Must:
✅ Give appropriate credit to Evionex and Kedar P. Navsariwala
✅ Provide a link to the license
✅ Indicate if changes were made to the model
✅ Retain attribution in any derivative works or applications
Citation
If you use GrantsLLM in your research or projects, please cite:
@software{grantsllm2026,
author = {Navsariwala, Kedar P.},
title = {GrantsLLM: A Fine-Tuned Language Model for STEM Grant Writing},
year = {2026},
publisher = {Hugging Face},
organization = {Evionex},
howpublished = {\url{https://huggingface.co/KedarPN/GrantsLLM}},
license = {CC-BY-4.0}
}
Attribution Example
Grant drafting assistance provided by GrantsLLM (Navsariwala, 2026), developed by Evionex.
Available at https://huggingface.co/KedarPN/GrantsLLM
🛠️ Technical Specifications
Model Architecture
- Architecture: Qwen3 (Decoder-only Transformer)
- Parameters: ~4 billion
- Layers: 36
- Hidden Size: 2560
- Attention Heads: 32
- Vocabulary Size: 151,936
- Context Window: 262,144 tokens (256K)
Software Stack
- Training: Unsloth, PyTorch, Hugging Face Transformers
- Fine-tuning: LoRA/QLoRA with PEFT
- Environment: Google Colab (GPU)
- Export Formats:
- Hugging Face Transformers checkpoint (BF16 + BNB NF4 4-bit)
- GGUF (Q4_K_M, Q5_K_M, Q8_0)
Hardware Requirements
Inference:
- Minimum: 8GB VRAM (with GGUF quantization) or 16GB RAM (CPU)
- Recommended: 16GB+ VRAM for full precision
- CPU inference: Supported via GGUF quantized versions
📦 Model Variants
| Variant | File | Size | Use Case | Hardware |
|---|---|---|---|---|
| Full precision (BF16) | `model-0000[1-2]-of-00002.safetensors` | ~8.05 GB | Maximum quality | 16GB+ VRAM |
| BNB NF4 4-bit | `model.safetensors` | ~3.51 GB | Memory-efficient fine-tuning checkpoint | 8GB+ VRAM |
| GGUF Q8_0 | `unsloth.Q8_0.gguf` | ~4.28 GB | Balanced quality/speed | 8GB+ VRAM or CPU |
| GGUF Q5_K_M | `unsloth.Q5_K_M.gguf` | ~2.89 GB | Good quality, reduced size | 6GB+ VRAM or CPU |
| GGUF Q4_K_M | `unsloth.Q4_K_M.gguf` | ~2.5 GB | Fast inference, minimal VRAM | 4GB+ VRAM or CPU |
🤝 Acknowledgments
Built With
- Base Model: Qwen3-4B by Alibaba/Qwen Team
- Training Framework: Unsloth for efficient fine-tuning
- ML Libraries: PyTorch, Hugging Face Transformers
- Infrastructure: Google Colab
Special Thanks
- Open-source grant examples from NIH RePORTER and NSF Award Search
- Academic institutions sharing grant templates and examples
- Unsloth team for efficient fine-tuning tools
- Hugging Face for model hosting and inference infrastructure
📞 Contact & Support
Developer: Kedar P. Navsariwala
Organization: Evionex
Website: www.evionex.com
Model Repository: KedarPN/GrantsLLM
Issues & Feedback
- Report bugs or issues in the Discussion tab
- Share use cases and success stories
- Request features or improvements
- Contribute to model evaluation
📌 Disclaimer
GrantsLLM is an assistive tool designed to support the grant writing process. It does not:
- Guarantee grant success or funding approval
- Replace domain expertise or scientific judgment
- Ensure compliance with all funder requirements
- Eliminate the need for human review and verification
Always consult official funder guidelines and domain experts before grant submission.
🔄 Version History
v1.0 (February 2026)
- Initial release
- Trained on 78 STEM grant applications
- Base model: Qwen/Qwen3-4B
- Supports NIH and NSF formats
© 2026 Evionex | Licensed under CC BY 4.0
Made with ❤️ for the research community
This Qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.