solvrays-llm-pdf / README.md
singtan's picture
Upload README.md with huggingface_hub
c405433 verified
---
license: apache-2.0
library_name: transformers
base_model: google/gemma-2b
tags:
- text-generation
- fine-tuned
- pdf-grounded
- zero-hallucination
- technical-provenance
language:
- en
pipeline_tag: text-generation
---
# πŸ—οΈ Solvrays Llm Pdf (Grounded Version)
## 🌟 Overview
This is a specialized, fine-tuned version of **Gemma 2B** optimized for **Ground-Truth Technical Retrieval**. Unlike standard LLMs, this model has been conditioned through specific "Senior AI Engineering" grounding templates to minimize hallucinations and prioritize information extracted directly from technical documentation.
### πŸ›  Key Capabilities
- **Zero-Hallucination Mode**: Configured for deterministic greedy decoding.
- **Direct Provenance**: Trained to recognize specific technical documents as "Ground Truth".
- **Infrastructure Focused**: Fine-tuned on complex architectural guidelines (e.g., Saturn Project components).
- **Merged Weights**: Standalone weights for high-speed, native inference.
## πŸ’» Grounded Quick Start (Precise Inference)
To get the most accurate, non-hallucinatory responses, use the following grounded prompt:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_id = "singtan/solvrays-llm-pdf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.float16)
# MUST use the Ground-Truth prompt template
prompt = "Based strictly on the provided architectural documentation, provide a precise summary of technical insights."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
outputs = model.generate(
**inputs,
max_new_tokens=256,
do_sample=False, # Force deterministic facts
repetition_penalty=1.5
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## πŸ“Š Engineering Specifications
- **Strategy**: SFT (Supervised Fine-Tuning) with Grounding Headers.
- **Rank (r)**: 16 (High capacity for technical fact retention).
- **Epochs**: 5 (Heavy Fact Reinforcement).
- **Context Window**: 512 with 128-token overlap for fact-continuity.
## ⚠️ Usage Recommendations
For production-grade accuracy, always verify specific numeric values with the original PDF. This model is intended for summarizing and retrieving architectural concepts documented in its training corpus.
---
**Scientifically Fine-tuned by Bibek Lama Singtan**