---
license: apache-2.0
library_name: transformers
base_model: google/gemma-2b
tags:
- text-generation
- fine-tuned
- pdf-grounded
- zero-hallucination
- technical-provenance
language:
- en
pipeline_tag: text-generation
---

# 🏗️ Solvrays LLM PDF (Grounded Version)

## 🌟 Overview

This is a specialized, fine-tuned version of **Gemma 2B** optimized for **Ground-Truth Technical Retrieval**. Unlike standard LLMs, this model has been conditioned through specific "Senior AI Engineering" grounding templates to minimize hallucinations and to prioritize information extracted directly from technical documentation.

### 🛠 Key Capabilities

- **Zero-Hallucination Mode**: Configured for deterministic greedy decoding.
- **Direct Provenance**: Trained to recognize specific technical documents as "Ground Truth".
- **Infrastructure Focused**: Fine-tuned on complex architectural guidelines (e.g., Saturn Project components).
- **Merged Weights**: The fine-tuned weights are merged into the base model, so the checkpoint runs standalone at native inference speed.

## 💻 Grounded Quick Start (Precise Inference)

To get the most accurate, grounded responses, use the following prompt template:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "singtan/solvrays-llm-pdf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16,
)

# MUST use the Ground-Truth prompt template
prompt = "Based strictly on the provided architectural documentation, provide a precise summary of technical insights."

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=False,  # Greedy decoding: deterministic, fact-oriented output
        repetition_penalty=1.5,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## 📊 Engineering Specifications

- **Strategy**: SFT (Supervised Fine-Tuning) with grounding headers.
- **Rank (r)**: 16 (high capacity for technical fact retention).
- **Epochs**: 5 (heavy fact reinforcement).
- **Context Window**: 512 tokens per training chunk, with a 128-token overlap between consecutive chunks for fact continuity (a reconstruction sketch appears at the end of this card).

## ⚠️ Usage Recommendations

For production-grade accuracy, always verify specific numeric values against the original PDF. This model is intended for summarizing and retrieving architectural concepts documented in its training corpus. An illustrative example of pairing the grounding prompt with text extracted from a PDF follows below.
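## 📄 Example: Grounding on Extracted PDF Text

The snippet below is a minimal sketch of one way to feed document text to the grounding prompt at inference time. It is an illustration, not this model's published pipeline: the `pypdf` dependency, the `saturn_architecture.pdf` filename, and the way the context is appended to the prompt are all assumptions; only the grounding instruction itself comes from the Quick Start above.

```python
from pypdf import PdfReader  # assumed extractor; any PDF-to-text tool works
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "singtan/solvrays-llm-pdf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.float16
)

# Hypothetical source document; substitute your own technical PDF.
reader = PdfReader("saturn_architecture.pdf")
context = "\n".join(page.extract_text() or "" for page in reader.pages)

# Append the extracted context to the grounding instruction (assumed layout).
prompt = (
    "Based strictly on the provided architectural documentation, "
    "provide a precise summary of technical insights.\n\n"
    f"Documentation:\n{context}"
)

# Truncate to the 512-token window the card specifies for training chunks.
inputs = tokenizer(
    prompt, return_tensors="pt", truncation=True, max_length=512
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=False,
        repetition_penalty=1.5,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Keeping the whole prompt inside a single 512-token window matches the card's training context; longer documents should be chunked (see the appendix below) and summarized per chunk.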
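## 🧩 Appendix: 512/128 Chunking Sketch

The "512 with 128-token overlap" specification above describes training-time chunking. The helper below is a hypothetical reconstruction of that scheme, assuming a plain sliding window over the tokenized text; the actual preprocessing code for this model has not been published.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")  # base-model tokenizer

def chunk_with_overlap(text: str, window: int = 512, overlap: int = 128) -> list[str]:
    """Split `text` into windows of `window` tokens, each sharing `overlap`
    tokens with its predecessor (hypothetical reconstruction of the card's scheme)."""
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    stride = window - overlap  # 384 fresh tokens per chunk
    return [
        tokenizer.decode(ids[start : start + window])
        for start in range(0, max(len(ids) - overlap, 1), stride)
    ]
```

With these defaults, consecutive chunks share 128 tokens, so a fact split across a chunk boundary is always seen whole in at least one training example.

---

**Scientifically Fine-tuned by Bibek Lama Singtan**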