---
license: apache-2.0
library_name: transformers
base_model: google/gemma-2b
tags:
- text-generation
- standalone
- merged-weights
- pdf-optimized
- gemma
- vision-guided-training
language:
- en
pipeline_tag: text-generation
---

# 🚀 Solvrays Finetuned PDF (Standalone Merged Weights)

## 🌟 Overview

This model is a standalone version of **Gemma 2B** fine-tuned for **complex document understanding and technical metadata extraction**. Unlike standard PEFT adapters, it ships with **merged weights**, so it can be loaded directly in production pipelines without attaching separate adapter layers.

### 🛠 Key Features

- **Zero-Overhead Inference**: Merged weights load as a native causal LM through `AutoModelForCausalLM`, with no adapter to attach at runtime.
- **Document Intelligence**: Fine-tuned on technical PDF structures, including infrastructure guides and architectural documentation.
- **Vision-Guided Data Pipeline**: Trained on text recovered through a hybrid digital/OCR pipeline to preserve data fidelity.
- **Optimized Context**: Tailored for high-precision extraction and summarization tasks on technical corpora.

## 💻 Quick Start (Inference)

Load the model with the standard Hugging Face `transformers` API.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "singtan/solvrays-finetuned-pdf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Merged weights load like any causal LM; no adapter needs to be attached.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

prompt = "Analyze the provided technical documentation and summarize the key infrastructure recommendations."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    # do_sample=True is required for temperature and top_p to take effect.
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## 📊 Training Specifications

- **Base Model**: google/gemma-2b
- **Training Strategy**: QLoRA (4-bit quantization) followed by FP16 weight merging.
- **Final Loss**: N/A
- **Learning Rate**: 0.0001
- **Epochs**: 3
- **Hardware**: Optimized for NVIDIA L4/V100/H100 environments.

## ⚠️ Limitations & Bias

Although it is optimized for technical documentation, this model remains a generative LLM and may hallucinate when the input context is missing or highly ambiguous. For mission-critical data extraction, use **Retrieval-Augmented Generation (RAG)** or **strict prompting**.

## 📜 License

This model is released under the **Apache-2.0** license. Usage must also comply with the Google Gemma Prohibited Use Policy.

---

**Fine-tuned and Merged by Bibek Lama Singtan**
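
## 🧮 Appendix: What Weight Merging Does

The training strategy above ends with a weight-merging step. As a rough illustration of the idea (not the script used to build this model), the sketch below folds a toy LoRA adapter into a toy weight matrix using the standard LoRA update `W' = W + (alpha / r) * B @ A`, then checks that the merged matrix gives the same output as the base-plus-adapter path. The function names, matrices, and the values of `alpha` and `r` are all hypothetical and chosen only to keep the arithmetic readable.

```python
# Illustrative only: a tiny pure-Python LoRA merge, not the author's script.
# All names and numbers below are hypothetical.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def matvec(matrix, x):
    """Apply a weight matrix to an input vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def merge_lora(w, lora_a, lora_b, alpha, r):
    """Return W + (alpha / r) * (B @ A), the standard LoRA merge."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # same shape as W: (out, in)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Toy layer: 2 outputs, 3 inputs, adapter rank r = 1.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]
A = [[1.0, 2.0, 3.0]]   # shape (r, in)
B = [[0.5], [-1.0]]     # shape (out, r)
alpha, r = 2, 1

W_merged = merge_lora(W, A, B, alpha, r)

x = [1.0, 1.0, 1.0]
# Adapter path: base output plus the scaled low-rank correction.
low_rank = matvec(B, matvec(A, x))
adapter_out = [b + (alpha / r) * d for b, d in zip(matvec(W, x), low_rank)]
# Merged path: a single matrix, no adapter at inference time.
merged_out = matvec(W_merged, x)

print(merged_out, adapter_out)  # [9.0, -11.0] [9.0, -11.0]
```

On real checkpoints this fold is usually done by a library rather than by hand; PEFT, for example, exposes `merge_and_unload()` for LoRA models. Whether this model was merged that way is not stated here.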