---
license: apache-2.0
library_name: transformers
base_model: google/gemma-2b
tags:
- text-generation
- fine-tuned
- pdf-grounded
- zero-hallucination
- technical-provenance
language:
- en
pipeline_tag: text-generation
---

# 🏗️ Solvrays LLM PDF (Grounded Version)

## 🌟 Overview

This is a specialized, fine-tuned version of **Gemma 2B** optimized for **Ground-Truth Technical Retrieval**. Unlike standard LLMs, this model has been conditioned through specific "Senior AI Engineering" grounding templates to minimize hallucinations and to prioritize information extracted directly from technical documentation.

### 🛠 Key Capabilities

- **Zero-Hallucination Mode**: Configured for deterministic greedy decoding.
- **Direct Provenance**: Trained to recognize specific technical documents as "Ground Truth".
- **Infrastructure Focused**: Fine-tuned on complex architectural guidelines (e.g., Saturn Project components).
- **Merged Weights**: The fine-tuned weights are merged into the base model, so the checkpoint runs standalone at native inference speed.

## 💻 Grounded Quick Start (Precise Inference)

To get the most accurate, grounded responses, use the following prompt template:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "singtan/solvrays-llm-pdf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16,
)

# MUST use the Ground-Truth prompt template
prompt = "Based strictly on the provided architectural documentation, provide a precise summary of technical insights."

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=False,  # Greedy decoding: deterministic, fact-oriented output
        repetition_penalty=1.5,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## 📊 Engineering Specifications

- **Strategy**: SFT (Supervised Fine-Tuning) with grounding headers.
- **Rank (r)**: 16 (high capacity for technical fact retention).
- **Epochs**: 5 (heavy fact reinforcement).
- **Context Window**: 512 tokens per training chunk, with a 128-token overlap between consecutive chunks for fact continuity (a reconstruction sketch appears at the end of this card).

## ⚠️ Usage Recommendations

For production-grade accuracy, always verify specific numeric values against the original PDF. This model is intended for summarizing and retrieving architectural concepts documented in its training corpus. An illustrative example of pairing the grounding prompt with text extracted from a PDF follows below.
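## 📄 Example: Grounding on Extracted PDF Text

The snippet below is a minimal sketch of one way to feed document text to the grounding prompt at inference time. It is an illustration, not this model's published pipeline: the `pypdf` dependency, the `saturn_architecture.pdf` filename, and the way the context is appended to the prompt are all assumptions; only the grounding instruction itself comes from the Quick Start above.

```python
from pypdf import PdfReader  # assumed extractor; any PDF-to-text tool works
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "singtan/solvrays-llm-pdf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.float16
)

# Hypothetical source document; substitute your own technical PDF.
reader = PdfReader("saturn_architecture.pdf")
context = "\n".join(page.extract_text() or "" for page in reader.pages)

# Append the extracted context to the grounding instruction (assumed layout).
prompt = (
    "Based strictly on the provided architectural documentation, "
    "provide a precise summary of technical insights.\n\n"
    f"Documentation:\n{context}"
)

# Truncate to the 512-token window the card specifies for training chunks.
inputs = tokenizer(
    prompt, return_tensors="pt", truncation=True, max_length=512
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=False,
        repetition_penalty=1.5,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Keeping the whole prompt inside a single 512-token window matches the card's training context; longer documents should be chunked (see the appendix below) and summarized per chunk.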
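## 🧩 Appendix: 512/128 Chunking Sketch

The "512 with 128-token overlap" specification above describes training-time chunking. The helper below is a hypothetical reconstruction of that scheme, assuming a plain sliding window over the tokenized text; the actual preprocessing code for this model has not been published.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")  # base-model tokenizer

def chunk_with_overlap(text: str, window: int = 512, overlap: int = 128) -> list[str]:
    """Split `text` into windows of `window` tokens, each sharing `overlap`
    tokens with its predecessor (hypothetical reconstruction of the card's scheme)."""
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    stride = window - overlap  # 384 fresh tokens per chunk
    return [
        tokenizer.decode(ids[start : start + window])
        for start in range(0, max(len(ids) - overlap, 1), stride)
    ]
```

With these defaults, consecutive chunks share 128 tokens, so a fact split across a chunk boundary is always seen whole in at least one training example.

---

**Scientifically Fine-tuned by Bibek Lama Singtan**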