# LegalParam GGUF Models
GGUF quantized versions of bharatgenai/LegalParam for use with Ollama.
## Model Information

- Original model: bharatgenai/LegalParam
- Architecture: ParamBharatGen (LLaMA-based)
- Parameters: 2.9B
- Context Length: 2048 tokens
- Purpose: Specialized AI assistant for Indian law
## Available Quantizations
| Quantization | File Size | Description | Use Case |
|---|---|---|---|
| Q4_K_M | 1.7GB | 4-bit quantized | Recommended for most use cases |
| Q6_K | 2.2GB | 6-bit quantized | Higher quality, moderate resource usage |
| F16 | 5.4GB | 16-bit float (no quantization) | Highest quality, requires more memory |
## Quick Start

### 1. Install Ollama

```bash
curl -fsSL https://ollama.com/install.sh | sh
```
### 2. Create the Model

Choose a quantization level:

```bash
# Q4_K_M (recommended - 1.7GB)
ollama create legalparam:q4 -f Modelfile

# Q6_K (higher quality - 2.2GB)
ollama create legalparam:q6 -f Modelfile-q6

# F16 (highest quality - 5.4GB)
ollama create legalparam:f16 -f Modelfile-f16
```
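The bundled Modelfiles themselves are not reproduced in this card. For orientation, a minimal Modelfile for the Q4_K_M build would look roughly like the sketch below; the GGUF filename is hypothetical, and the template and stop tokens follow the Chat Template Format section:

```
# Hypothetical filename -- point FROM at the actual Q4_K_M GGUF file
FROM ./legalparam-q4_k_m.gguf

TEMPLATE """<user>
{{ .Prompt }}
<assistant>
"""

PARAMETER stop "</s>"
PARAMETER stop "<user>"
PARAMETER stop "<assistant>"
PARAMETER num_ctx 2048
```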
### 3. Run the Model

```bash
# Interactive chat
ollama run legalparam:q4

# Single query
ollama run legalparam:q4 "What steps should a farmer take to legally transfer agricultural land ownership?"
```
## Python Usage

```python
from ollama import Client

client = Client()
response = client.chat(model='legalparam:q4', messages=[
    {'role': 'user', 'content': 'What are the fundamental rights in the Indian Constitution?'}
])
print(response['message']['content'])
```
## Model File Details

All Modelfiles include:

- The correct chat template matching the tokenizer's format
- Stop tokens (`</s>`, `<user>`, `<assistant>`) to prevent infinite generation loops
- Parameters optimized for legal question answering
## Chat Template Format

```
<user>
{user_message}
<assistant>
{assistant_response}
```
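If you format prompts by hand (outside Ollama's own template handling), the template above can be applied with a small helper. This is a sketch based only on the format shown here, not code from the original repository:

```python
def format_prompt(messages):
    """Render a list of {'role', 'content'} dicts into the
    <user>/<assistant> chat template shown above."""
    parts = []
    for msg in messages:
        tag = "<user>" if msg["role"] == "user" else "<assistant>"
        parts.append(f"{tag}\n{msg['content']}")
    # End with an open <assistant> turn for the model to complete.
    parts.append("<assistant>\n")
    return "\n".join(parts)

prompt = format_prompt([{"role": "user", "content": "What is judicial review?"}])
print(prompt)
```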
## Context Window
- Default: 2048 tokens (combined input + output)
- Scaling: Can be extended with RoPE scaling in Ollama (experimental)
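Because the prompt and the completion share the 2048-token window, it helps to budget before sending a long query. A back-of-the-envelope check can be sketched as follows (the ~4 characters per token figure is a rough assumption for English text, not a property of this model's tokenizer):

```python
CONTEXT_WINDOW = 2048  # tokens, shared by prompt and completion

def rough_token_count(text: str) -> int:
    # Crude estimate: ~4 characters per English token (assumption).
    return max(1, len(text) // 4)

def completion_budget(prompt: str, window: int = CONTEXT_WINDOW) -> int:
    """Approximate tokens left for the model's answer after the prompt."""
    return max(0, window - rough_token_count(prompt))

print(completion_budget("Explain judicial review in India."))
```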
## Example Queries
The model excels at Indian legal queries:
- "Explain the First Amendment of the Indian Constitution"
- "What is the procedure for filing a civil suit in India?"
- "What are the key provisions of the Land Acquisition Act?"
- "Explain the concept of judicial review in India"
- "What are the powers of the Supreme Court of India?"
## Technical Specifications

### Model Architecture
- Hidden size: 2048
- Layers: 32
- Attention heads: 16
- KV heads: 8 (Grouped Query Attention)
- Vocabulary: 256,006 tokens
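These figures allow a back-of-the-envelope estimate of the KV-cache footprint at the full 2048-token context (assuming an FP16 cache and head_dim = hidden_size / attention_heads; actual memory use in Ollama will differ somewhat):

```python
hidden_size = 2048
layers = 32
attn_heads = 16
kv_heads = 8          # grouped-query attention
ctx = 2048
bytes_per_value = 2   # FP16 cache (assumption)

head_dim = hidden_size // attn_heads  # 128
# K and V each store kv_heads * head_dim values per layer per token.
kv_bytes = 2 * layers * kv_heads * head_dim * ctx * bytes_per_value
print(f"KV cache at full context: {kv_bytes / 2**20:.0f} MiB")
```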
### Special Tokens

- `<s>`: beginning of sequence (BOS)
- `</s>`: end of sequence (EOS)
- `<user>`: user message marker
- `<assistant>`: assistant message marker
## Limitations
- Context limited to 2048 tokens
- Training data cutoff: August 2023
- Optimized for Indian law queries
- May not perform well on non-legal topics
## Original Model
This is a quantized version of bharatgenai/LegalParam. For the original PyTorch model, training details, and full documentation, please refer to the original repository.
## License
Please refer to the original model repository for licensing information.
## Conversion Process

These models were converted from the original Hugging Face format to GGUF using llama.cpp:

1. Loaded the original model with transformers
2. Converted it to GGUF (F16)
3. Quantized the F16 conversion to Q4_K_M and Q6_K
4. Validated the results with the Ollama inference engine
## Troubleshooting

### Model repeats or loops

- Ensure you're using the provided Modelfiles
- Stop tokens are pre-configured to prevent infinite loops
### Out of memory errors

- Try a smaller quantization (Q4_K_M instead of Q6_K)
- Reduce the `num_ctx` parameter in Ollama
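For example, a derived Modelfile could halve the context window (an illustrative config fragment; build it with `ollama create legalparam-small -f Modelfile-small`):

```
FROM legalparam:q4
PARAMETER num_ctx 1024
```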
### Poor quality responses

- Try the F16 build for the highest quality
- Ensure proper prompt formatting with `<user>` and `<assistant>` tags
## Acknowledgments
- Original model: bharatgenai/LegalParam
- GGUF conversion: llama.cpp
- Inference engine: Ollama