---
license: apache-2.0
library_name: transformers
base_model: google/gemma-2b
tags:
  - text-generation
  - fine-tuned
  - pdf-grounded
  - zero-hallucination
  - technical-provenance
language:
  - en
pipeline_tag: text-generation
---

# ๐Ÿ—๏ธ Solvrays Llm Pdf (Grounded Version)

## 🌟 Overview
This is a fine-tuned version of **Gemma 2B** specialized for **Ground-Truth Technical Retrieval**. Unlike a general-purpose LLM, it was conditioned with "Senior AI Engineering" grounding templates so that it prioritizes information extracted directly from technical documentation and hallucinates less.

### 🛠 Key Capabilities
- **Zero-Hallucination Mode**: Configured for deterministic greedy decoding (`do_sample=False`), so a given prompt yields reproducible output.
- **Direct Provenance**: Trained to recognize specific technical documents as "Ground Truth".
- **Infrastructure Focused**: Fine-tuned on complex architectural guidelines (e.g., Saturn Project components).
- **Merged Weights**: Standalone weights for high-speed, native inference.

## 💻 Grounded Quick Start (Precise Inference)
To get the most accurate, non-hallucinatory responses, use the following grounded prompt:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "singtan/solvrays-llm-pdf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.float16)

# MUST use the Ground-Truth prompt template
prompt = "Based strictly on the provided architectural documentation, provide a precise summary of technical insights."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs, 
        max_new_tokens=256, 
        do_sample=False,      # Force deterministic facts
        repetition_penalty=1.5
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
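The two generation settings above can be illustrated with a toy, pure-Python sketch. It assumes the commonly documented behavior of a repetition penalty (scores of already-generated tokens are divided by the penalty when positive and multiplied by it when negative) followed by an argmax pick. The function names and the four-token logits are hypothetical and only serve the illustration; this is not the model's actual decoding code.

```python
def apply_repetition_penalty(logits, seen_ids, penalty=1.5):
    """Return a copy of logits with already-generated tokens made less likely."""
    adjusted = list(logits)
    for tok in set(seen_ids):
        score = adjusted[tok]
        # Positive scores shrink; negative scores become more negative.
        adjusted[tok] = score / penalty if score > 0 else score * penalty
    return adjusted


def greedy_step(logits, seen_ids, penalty=1.5):
    """Pick the highest-scoring token after the penalty; no randomness is involved."""
    adjusted = apply_repetition_penalty(logits, seen_ids, penalty)
    return max(range(len(adjusted)), key=adjusted.__getitem__)


# Toy vocabulary of four tokens; these logits are made up for illustration.
toy_logits = [2.0, 1.8, -0.5, 0.1]
first = greedy_step(toy_logits, seen_ids=[])    # nothing generated yet
second = greedy_step(toy_logits, seen_ids=[0])  # token 0 was already generated
print(first, second)
```

Because the pick is an argmax, repeated calls with the same inputs return the same token, which is what `do_sample=False` relies on. A penalty as high as 1.5 pushes the model away from repeating tokens, including ones that legitimately recur, so it may be worth lowering if outputs drop repeated technical terms.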

## 📊 Engineering Specifications
- **Strategy**: SFT (Supervised Fine-Tuning) with Grounding Headers.
- **Rank (r)**: 16 (High capacity for technical fact retention).
- **Epochs**: 5 (Heavy Fact Reinforcement).
- **Context Window**: 512 tokens, with a 128-token overlap between consecutive chunks for fact continuity.
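
The windowing described in the last item can be sketched as a sliding window over token IDs. This is a minimal sketch that assumes the training text was split into 512-token chunks advancing by 512 - 128 = 384 tokens; the function name `chunk_token_ids` is hypothetical and the actual preprocessing script is not part of this card.

```python
def chunk_token_ids(token_ids, window=512, overlap=128):
    """Split a token sequence into windows that share `overlap` tokens."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must be non-negative and smaller than window")
    stride = window - overlap  # 384 with the values listed above
    chunks = []
    for start in range(0, len(token_ids), stride):
        chunks.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break  # the last window already reaches the end of the sequence
    return chunks


# Stand-in for real tokenizer output: 1000 consecutive integers.
example_ids = list(range(1000))
chunks = chunk_token_ids(example_ids)
print([len(c) for c in chunks])
```

Each chunk begins with the final 128 tokens of the previous one, so a fact that straddles a boundary appears whole in at least one chunk.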

## โš ๏ธ Usage Recommendations
For production-grade accuracy, always verify specific numeric values with the original PDF. This model is intended for summarizing and retrieving architectural concepts documented in its training corpus.
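
One way to automate part of that check is to flag every number in a model answer that does not appear in the text extracted from the original PDF. The sketch below is a rough first-pass filter under stated assumptions: `unverified_numbers` is a hypothetical helper, the two example strings are invented, and the pattern only handles plain integers and decimals (no thousands separators or units).

```python
import re

# Matches plain integers and decimals such as "250" or "3.5".
NUMBER_PATTERN = re.compile(r"\d+(?:\.\d+)?")


def unverified_numbers(answer, source_text):
    """Return numbers found in `answer` that never occur in `source_text`."""
    source_numbers = set(NUMBER_PATTERN.findall(source_text))
    missing = []
    for number in NUMBER_PATTERN.findall(answer):
        if number not in source_numbers and number not in missing:
            missing.append(number)
    return missing


# Invented strings, used only to show the call.
source_text = "The gateway allows 250 connections and a 30 second timeout."
answer = "The gateway allows 250 connections with a 45 second timeout."
flagged = unverified_numbers(answer, source_text)
print(flagged)
```

A flagged number is a prompt for manual review against the PDF. A number that passes is not thereby confirmed, since it may appear in the source in a different context.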

--- 
**Scientifically Fine-tuned by Bibek Lama Singtan**