---
license: apache-2.0
library_name: transformers
base_model: google/gemma-2b
tags:
  - text-generation
  - standalone
  - merged-weights
  - pdf-optimized
  - gemma
  - vision-guided-training
language:
  - en
pipeline_tag: text-generation
---

# 🚀 Solvrays Finetuned PDF (Standalone Merged Weights)

## 🌟 Overview
This model is a standalone version of **Gemma 2B** (`google/gemma-2b`), fine-tuned for **document understanding and metadata extraction from technical PDFs**. Unlike a standard PEFT adapter, this version ships with **merged weights**: the adapter has been folded into the base model, so it loads as a single checkpoint and needs no separate adapter layers at inference time.

### 🛠 Key Features
- **No Adapter Overhead**: Merged weights load as a native `AutoModelForCausalLM` checkpoint; `peft` is not required at inference time.
- **Document Intelligence**: Fine-tuned on technical PDF structures, including infrastructure guides and architectural documentation.
- **Vision-Guided Data Pipeline**: Trained on text recovered through a hybrid Digital/OCR pipeline, so both born-digital and scanned pages contribute training text.
- **Target Tasks**: Extraction and summarization over technical corpora.
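
The card does not describe the hybrid Digital/OCR pipeline in detail. One common reading is "use the PDF's embedded text layer when it exists, and fall back to OCR for scanned pages". The sketch below illustrates that routing only; `extract_digital`, `extract_ocr`, and the `min_chars` threshold are hypothetical names chosen for illustration, not part of the original training code.

```python
from typing import Any, Callable, Iterable, List, Tuple


def recover_page_text(
    page: Any,
    extract_digital: Callable[[Any], str],
    extract_ocr: Callable[[Any], str],
    min_chars: int = 50,  # assumed threshold; tune for your documents
) -> Tuple[str, str]:
    """Return (text, source) for one page, preferring the digital text layer."""
    text = (extract_digital(page) or "").strip()
    if len(text) >= min_chars:
        return text, "digital"
    # Text layer is missing or too sparse: treat the page as a scan.
    return (extract_ocr(page) or "").strip(), "ocr"


def recover_document(
    pages: Iterable[Any],
    extract_digital: Callable[[Any], str],
    extract_ocr: Callable[[Any], str],
    min_chars: int = 50,
) -> List[Tuple[str, str]]:
    """Apply the digital-first, OCR-fallback rule to every page."""
    return [
        recover_page_text(p, extract_digital, extract_ocr, min_chars)
        for p in pages
    ]
```

In practice the two callables would wrap a PDF text extractor and an OCR engine of your choice; keeping them as parameters makes the routing rule easy to test in isolation.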

## 💻 Quick Start (Inference)
You can load this model with the standard Hugging Face `transformers` API.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "singtan/solvrays-finetuned-pdf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",          # requires the `accelerate` package
    torch_dtype=torch.float16,  # matches the FP16 merged weights
)

prompt = "Analyze the provided technical documentation and summarize the key infrastructure recommendations."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    # do_sample=True is needed for temperature and top_p to take effect
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## 📊 Training Specifications
- **Base Model**: google/gemma-2b
- **Training Strategy**: QLoRA (4-bit quantization) followed by FP16 weight merging.
- **Final Training Loss**: Not reported
- **Learning Rate**: 0.0001
- **Epochs**: 3
- **Hardware**: Intended for NVIDIA L4, V100, and H100 GPUs.
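
The "QLoRA, then FP16 merge" strategy listed above can be sketched with `peft` as follows. This is a minimal illustration, not the author's training script: the function name `merge_adapter` and the example paths are hypothetical, and it assumes a LoRA adapter directory produced by a prior QLoRA run.

```python
def merge_adapter(base_id: str, adapter_dir: str, out_dir: str) -> None:
    """Fold LoRA adapter weights into an FP16 copy of the base model."""
    # Imports are local so the sketch can be imported without the GPU stack.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base model unquantized in FP16; the merge is done on
    # full-precision weights rather than on the 4-bit training copy.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)

    # Attach the trained adapter, then bake it into the base weights.
    model = PeftModel.from_pretrained(base, adapter_dir)
    merged = model.merge_and_unload()

    # Save a standalone checkpoint plus the tokenizer files.
    merged.save_pretrained(out_dir, safe_serialization=True)
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)


# Example call (placeholder paths, not from the original card):
# merge_adapter("google/gemma-2b", "./qlora-adapter", "./merged-fp16")
```

The resulting directory loads with `AutoModelForCausalLM.from_pretrained` alone, which is what makes the published checkpoint "standalone".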

## ⚠️ Limitations & Bias
While optimized for technical documentation, this model remains a generative LLM and may hallucinate when the input context is missing or highly ambiguous. For mission-critical data extraction, use **Retrieval-Augmented Generation (RAG)** or **strict prompting** that confines the model to the supplied document text.
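
As one way to apply strict prompting, the helper below wraps a retrieved excerpt and a question in a template that tells the model to answer only from the excerpt. The template wording, the `NOT FOUND` sentinel, and the function name `build_grounded_prompt` are assumptions for illustration; the card does not document the prompt format used during fine-tuning.

```python
def build_grounded_prompt(context: str, question: str) -> str:
    """Build a prompt that restricts the answer to the given excerpt."""
    context = context.strip()
    if not context:
        # Refuse to build a prompt with no grounding text at all.
        raise ValueError("context must not be empty")
    return (
        "Answer using only the document excerpt below. "
        'If the answer is not in the excerpt, reply exactly "NOT FOUND".\n\n'
        f"### Excerpt\n{context}\n\n"
        f"### Question\n{question.strip()}\n\n"
        "### Answer\n"
    )
```

The returned string can be passed to the tokenizer in the Quick Start snippet in place of `prompt`. Checking the output for the sentinel gives a simple signal that the excerpt did not contain the requested field.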

## 📜 License
This model card declares the **Apache-2.0** license. Because the model is derived from `google/gemma-2b`, usage must also adhere to the Gemma Terms of Use, including the Google Gemma Prohibited Use Policy.

---
**Fine-tuned and Merged by Bibek Lama Singtan**