# Gemma 2B NIRF Lookup 2025 - GGUF Version 2

## Overview

This repository contains GGUF-converted versions of the coderop12/gemma2b-nirf-lookup-2025 model, optimized for efficient inference with llama.cpp and compatible frameworks.

## Model Details

- Base Model: google/gemma-2-2b-it
- Fine-tuning: QLoRA (4-bit) on NIRF 2025 institutional data
- Specialty: Indian higher education institutional ranking lookups
- Training Data: 100 NIRF 2025 lookup samples
- Conversion: HuggingFace → GGUF format

## Files Included

- gemma2b-nirf-lookup-2025-f16.gguf (4.88 GB) - Original FP16 precision
- gemma2b-nirf-lookup-2025-q4_k_m.gguf (1.59 GB) - Q4_K_M quantized (recommended; see the download sketch below)
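
If you want to fetch the quantized file programmatically rather than through the browser, the huggingface_hub client can do it; a minimal sketch (the repo id is taken from this card's citation URL):

```python
from huggingface_hub import hf_hub_download

# Downloads the Q4_K_M file into the local HF cache and returns its path.
model_path = hf_hub_download(
    repo_id="coderop12/gemma2b-nirf-lookup-2025-gguf-v2",
    filename="gemma2b-nirf-lookup-2025-q4_k_m.gguf",
)
print(model_path)
```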

## Quick Start

### Option 1: llama.cpp

```bash
# Clone llama.cpp
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp

# Build
cmake -B build
cmake --build build --config Release

# Run inference
./build/bin/llama-cli \
    --model gemma2b-nirf-lookup-2025-q4_k_m.gguf \
    --prompt "What is the ranking of IIT Madras in NIRF 2025?" \
    --n-predict 100 \
    --temp 0.7
```
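
The same build also produces llama-server, which exposes the model over an OpenAI-compatible HTTP endpoint; a minimal sketch (flag names follow current llama.cpp and may differ on older builds):

```bash
# Serve the model locally on port 8080
./build/bin/llama-server \
    --model gemma2b-nirf-lookup-2025-q4_k_m.gguf \
    --port 8080
```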

### Option 2: Ollama

```bash
# Create Modelfile
echo 'FROM ./gemma2b-nirf-lookup-2025-q4_k_m.gguf' > Modelfile

# Import model
ollama create nirf-lookup -f Modelfile

# Run
ollama run nirf-lookup "What is the ranking of IIT Delhi in NIRF 2025?"
```
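
The one-line Modelfile above is enough to import the model; Ollama Modelfiles also accept generation parameters and a system prompt, as in this sketch (the values shown are illustrative, not shipped with the model):

```
FROM ./gemma2b-nirf-lookup-2025-q4_k_m.gguf
PARAMETER temperature 0.7
PARAMETER num_ctx 8192
SYSTEM """You answer NIRF 2025 institutional ranking lookup questions."""
```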

### Option 3: Python with llama-cpp-python

```python
from llama_cpp import Llama

# Load model
llm = Llama(model_path="gemma2b-nirf-lookup-2025-q4_k_m.gguf")

# Generate response
output = llm("What is the ranking of IIT Bombay in NIRF 2025?",
             max_tokens=100, temperature=0.7)
print(output['choices'][0]['text'])
```
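
llama-cpp-python can also route prompts through the chat template embedded in the GGUF, which generally matches how an instruction-tuned base like gemma-2-2b-it expects input; a sketch:

```python
from llama_cpp import Llama

llm = Llama(model_path="gemma2b-nirf-lookup-2025-q4_k_m.gguf")

# Chat-style call; the library applies the GGUF's built-in chat template.
output = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the NIRF score of IIT Delhi in 2025?"}],
    max_tokens=100,
    temperature=0.7,
)
print(output["choices"][0]["message"]["content"])
```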

## Sample Queries

- "What is the ranking of IIT Madras in NIRF 2025?"
- "Which engineering college ranks #2 in NIRF 2025?"
- "Tell me about the top 3 universities in NIRF 2025 overall ranking"
- "What is the NIRF score of IIT Delhi in 2025?"

## Expected Output Format

The model provides structured responses with:

- Institution ranking and score
- Source references (e.g., [NIRF2025-OVERALL-IR-O-U-0456]; see the parsing sketch below)
- Additional contextual information
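
If you need those source references programmatically, a simple regex over the response text is enough; this sketch assumes the tags follow the bracketed pattern shown in the example above:

```python
import re

# Illustrative response string containing a reference tag.
response = "Example model output with a reference tag [NIRF2025-OVERALL-IR-O-U-0456]"

# Pattern inferred from the sample tag; adjust if your outputs differ.
refs = re.findall(r"\[(NIRF2025-[A-Z0-9-]+)\]", response)
print(refs)  # ['NIRF2025-OVERALL-IR-O-U-0456']
```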

## Performance

- Q4_K_M version: ~12 tokens/second on a T4 GPU (a timing sketch follows below)
- Memory usage: ~2 GB VRAM for the Q4_K_M version
- Quality: minimal degradation from the original model
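
To reproduce a rough tokens-per-second figure on your own hardware, you can time a completion with llama-cpp-python; a sketch (numbers vary with hardware, thread count, and build flags):

```python
import time
from llama_cpp import Llama

llm = Llama(model_path="gemma2b-nirf-lookup-2025-q4_k_m.gguf")

start = time.perf_counter()
out = llm("What is the ranking of IIT Madras in NIRF 2025?", max_tokens=100)
elapsed = time.perf_counter() - start

# The completion dict reports how many tokens were generated.
n_generated = out["usage"]["completion_tokens"]
print(f"{n_generated / elapsed:.1f} tokens/s")
```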

## Technical Specifications

- Architecture: Gemma2ForCausalLM
- Parameters: 2.61B
- Context Length: 8192 tokens
- Quantization: Q4_K_M (recommended) / FP16 (maximum quality)

## Hardware Recommendations

- CPU: 4+ cores, 8GB+ RAM
- GPU: T4 / RTX 3060 or better for optimal performance (see the offload sketch below)
- Storage: 2GB+ free space
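
On a GPU, most of the speedup comes from offloading model layers; with llama-cpp-python (installed with GPU support, e.g. a CUDA build) that is a single constructor argument, as sketched here:

```python
from llama_cpp import Llama

# n_gpu_layers=-1 offloads all layers to the GPU; lower it on smaller cards.
llm = Llama(
    model_path="gemma2b-nirf-lookup-2025-q4_k_m.gguf",
    n_gpu_layers=-1,
    n_ctx=8192,
)
```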

## Conversion Process

1. Downloaded the fine-tuned model (coderop12/gemma2b-nirf-lookup-2025) from HuggingFace
2. Converted it to GGUF F16 format using llama.cpp
3. Quantized it to Q4_K_M for an optimal size/quality balance
4. Validated functionality with NIRF-specific queries
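
For reference, the equivalent llama.cpp commands look roughly like this (a sketch: script and binary names have shifted across llama.cpp versions, and the local checkpoint path is illustrative):

```bash
# Steps 1-2: convert the downloaded HF checkpoint to GGUF F16
python convert_hf_to_gguf.py ./gemma2b-nirf-lookup-2025 \
    --outfile gemma2b-nirf-lookup-2025-f16.gguf --outtype f16

# Step 3: quantize F16 down to Q4_K_M
./build/bin/llama-quantize \
    gemma2b-nirf-lookup-2025-f16.gguf \
    gemma2b-nirf-lookup-2025-q4_k_m.gguf Q4_K_M
```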

## License

This model derivative follows Google's Gemma Terms of Use; see the original base model's license.

## Citation

If you use this model, please cite:

```bibtex
@misc{gemma2b-nirf-gguf-v2,
  title={Gemma 2B NIRF Lookup 2025 - GGUF Version 2},
  author={coderop12},
  year={2025},
  url={https://huggingface.co/coderop12/gemma2b-nirf-lookup-2025-gguf-v2}
}
```

## Limitations

- Specialized for NIRF 2025 data only
- Limited training dataset (100 samples)
- May not generalize to other ranking systems
- Verify critical information against official NIRF sources

## Support

For issues or questions, please open an issue in this repository.