singtan
/

solvrays-llm-pdf

Text Generation

zero-hallucination

technical-provenance

text-generation-inference

Model card Files Files and versions

solvrays-llm-pdf / README.md

singtan's picture

Upload README.md with huggingface_hub

c405433 verified 16 days ago

|

history blame contribute delete

2.5 kB

	---
	license: apache-2.0
	library_name: transformers
	base_model: google/gemma-2b
	tags:
	- text-generation
	- fine-tuned
	- pdf-grounded
	- zero-hallucination
	- technical-provenance
	language:
	- en
	pipeline_tag: text-generation
	---

	# 🏗️ Solvrays Llm Pdf (Grounded Version)

	## 🌟 Overview
	This is a specialized, fine-tuned version of Gemma 2B optimized for Ground-Truth Technical Retrieval. Unlike standard LLMs, this model has been conditioned through specific "Senior AI Engineering" grounding templates to minimize hallucinations and prioritize information extracted directly from technical documentation.

	### 🛠 Key Capabilities
	- Zero-Hallucination Mode: Configured for deterministic greedy decoding.
	- Direct Provenance: Trained to recognize specific technical documents as "Ground Truth".
	- Infrastructure Focused: Fine-tuned on complex architectural guidelines (e.g., Saturn Project components).
	- Merged Weights: Standalone weights for high-speed, native inference.

	## 💻 Grounded Quick Start (Precise Inference)
	To get the most accurate, non-hallucinatory responses, use the following grounded prompt:

	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM
	import torch

	model_id = "singtan/solvrays-llm-pdf"
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.float16)

	# MUST use the Ground-Truth prompt template
	prompt = "Based strictly on the provided architectural documentation, provide a precise summary of technical insights."
	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

	with torch.no_grad():
	outputs = model.generate(
	**inputs,
	max_new_tokens=256,
	do_sample=False, # Force deterministic facts
	repetition_penalty=1.5
	)

	print(tokenizer.decode(outputs[0], skip_special_tokens=True))
	```

	## 📊 Engineering Specifications
	- Strategy: SFT (Supervised Fine-Tuning) with Grounding Headers.
	- Rank (r): 16 (High capacity for technical fact retention).
	- Epochs: 5 (Heavy Fact Reinforcement).
	- Context Window: 512 with 128-token overlap for fact-continuity.

	## ⚠️ Usage Recommendations
	For production-grade accuracy, always verify specific numeric values with the original PDF. This model is intended for summarizing and retrieving architectural concepts documented in its training corpus.

	---
	Scientifically Fine-tuned by Bibek Lama Singtan