Gemma 2B NIRF Lookup 2025 - GGUF Version 2
Overview
This repository contains GGUF-converted versions of the coderop12/gemma2b-nirf-lookup-2025 model, optimized for efficient inference with llama.cpp and compatible frameworks.
Model Details
- Base Model: google/gemma-2-2b-it
- Fine-tuning: QLoRA (4-bit) on NIRF 2025 institutional data
- Specialty: Indian higher education institutional ranking lookups
- Training Data: 100 NIRF 2025 lookup samples
- Conversion: HuggingFace → GGUF format
Files Included
- gemma2b-nirf-lookup-2025-f16.gguf (4.88 GB) - Original FP16 precision
- gemma2b-nirf-lookup-2025-q4_k_m.gguf (1.59 GB) - Q4_K_M quantized (recommended)
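To fetch a single file without cloning the whole repository, the huggingface_hub client works; a minimal sketch (the repo_id matches the URL in the Citation section below):
from huggingface_hub import hf_hub_download
# Download only the quantized file (~1.59 GB)
model_path = hf_hub_download(
    repo_id="coderop12/gemma2b-nirf-lookup-2025-gguf-v2",
    filename="gemma2b-nirf-lookup-2025-q4_k_m.gguf",
)
print(model_path)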
Quick Start
Option 1: llama.cpp
# Clone llama.cpp
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
# Build
cmake -B build
cmake --build build --config Release
# Run inference
./build/bin/llama-cli \
--model gemma2b-nirf-lookup-2025-q4_k_m.gguf \
--prompt "What is the ranking of IIT Madras in NIRF 2025?" \
--n-predict 100 \
--temp 0.7
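The same build also produces llama-server, which serves the model over an OpenAI-compatible HTTP API; a minimal sketch (the port is an arbitrary choice, not part of this repo):
# Start an OpenAI-compatible server
./build/bin/llama-server \
--model gemma2b-nirf-lookup-2025-q4_k_m.gguf \
--port 8080
# Query it from another terminal
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "What is the ranking of IIT Madras in NIRF 2025?"}]}'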
Option 2: Ollama
# Create Modelfile
echo 'FROM ./gemma2b-nirf-lookup-2025-q4_k_m.gguf' > Modelfile
# Import model
ollama create nirf-lookup -f Modelfile
# Run
ollama run nirf-lookup "What is the ranking of IIT Delhi in NIRF 2025?"
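Beyond the one-line Modelfile above, Ollama can also bake sampling defaults and a system prompt into the imported model; a minimal sketch (the temperature mirrors the llama.cpp example, and the SYSTEM text is an illustrative assumption, not part of this repo):
# Modelfile with defaults baked in
FROM ./gemma2b-nirf-lookup-2025-q4_k_m.gguf
PARAMETER temperature 0.7
SYSTEM """You are a lookup assistant for NIRF 2025 institutional rankings."""
# Re-import with the richer Modelfile
ollama create nirf-lookup -f Modelfile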
Option 3: Python with llama-cpp-python
from llama_cpp import Llama
# Load model
llm = Llama(model_path="gemma2b-nirf-lookup-2025-q4_k_m.gguf")
# Generate response
output = llm("What is the ranking of IIT Bombay in NIRF 2025?",
             max_tokens=100, temperature=0.7)
print(output['choices'][0]['text'])
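For an instruction-tuned base like gemma-2-2b-it, the chat API in llama-cpp-python is usually a better fit because it applies the model's chat template automatically; a minimal sketch:
from llama_cpp import Llama
llm = Llama(model_path="gemma2b-nirf-lookup-2025-q4_k_m.gguf")
# Chat-style call; the chat template is applied for you
output = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the ranking of IIT Bombay in NIRF 2025?"}],
    max_tokens=100,
    temperature=0.7,
)
print(output["choices"][0]["message"]["content"])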
Sample Queries
"What is the ranking of IIT Madras in NIRF 2025?"
"Which engineering college ranks #2 in NIRF 2025?"
"Tell me about the top 3 universities in NIRF 2025 overall ranking"
"What is the NIRF score of IIT Delhi in 2025?"
Expected Output Format
The model provides structured responses with:
- Institution ranking and score
- Source references (e.g., [NIRF2025-OVERALL-IR-O-U-0456])
- Additional contextual information
Performance
- Q4_K_M Version: ~12 tokens/second on a T4 GPU
- Memory Usage: ~2 GB VRAM for the Q4_K_M version
- Quality: Minimal degradation from the original FP16 model
Technical Specifications
- Architecture: Gemma2ForCausalLM
- Parameters: 2.61B
- Context Length: 8192 tokens (see the note below)
- Quantization: Q4_K_M (recommended) / FP16 (maximum quality)
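One practical note: llama-cpp-python does not open the full trained context window by default, so pass n_ctx explicitly if you need long prompts; a minimal sketch:
from llama_cpp import Llama
# Request the full 8192-token context rather than the library's much smaller default
llm = Llama(
    model_path="gemma2b-nirf-lookup-2025-q4_k_m.gguf",
    n_ctx=8192,
)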
Hardware Recommendations
- CPU: 4+ cores, 8GB+ RAM
- GPU: T4/RTX 3060 or better for optimal performance (see the offload sketch below)
- Storage: 2GB+ free space
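With a CUDA-enabled build of llama.cpp, GPU-level throughput requires offloading layers to VRAM via the -ngl flag; a sketch (99 simply means "offload all layers", and the 2B model fits comfortably in a T4's memory):
# Offload all layers to the GPU (requires a CUDA-enabled build)
./build/bin/llama-cli \
--model gemma2b-nirf-lookup-2025-q4_k_m.gguf \
-ngl 99 \
--prompt "What is the ranking of IIT Madras in NIRF 2025?" \
--n-predict 100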
Conversion Process
1. Downloaded the base fine-tuned model from HuggingFace
2. Converted it to GGUF F16 format using llama.cpp (see the sketch below)
3. Quantized to Q4_K_M for an optimal size/quality balance
4. Validated functionality with NIRF-specific queries
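The exact commands are not recorded in this card, but a typical llama.cpp workflow for steps 2-3 looks like this (local paths are placeholders):
# Step 2: convert the HF checkpoint to GGUF F16 (run from the llama.cpp repo)
python convert_hf_to_gguf.py /path/to/gemma2b-nirf-lookup-2025 \
--outfile gemma2b-nirf-lookup-2025-f16.gguf --outtype f16
# Step 3: quantize F16 to Q4_K_M
./build/bin/llama-quantize \
gemma2b-nirf-lookup-2025-f16.gguf \
gemma2b-nirf-lookup-2025-q4_k_m.gguf \
Q4_K_M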
License
This model derivative follows Google's Gemma Terms of Use. See the original base model's license for details.
Citation
If you use this model, please cite:
@misc{gemma2b-nirf-gguf-v2,
  title={Gemma 2B NIRF Lookup 2025 - GGUF Version 2},
  author={coderop12},
  year={2025},
  url={https://huggingface.co/coderop12/gemma2b-nirf-lookup-2025-gguf-v2}
}
Limitations
- Specialized for NIRF 2025 data only
- Limited training dataset (100 samples)
- May not generalize to other ranking systems
- Verify critical information against official NIRF sources
Support
For issues or questions, please open an issue in this repository.