---
base_model: google/gemma-2b-it
language: en
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
tags:
- precision-grounding
- document-qa
- zero-hallucination
- legal-tech
- technical-analysis
---
# Solvrays Finetuned PDF - Document AI

## Model Overview
This model is a high-precision fine-tune of **google/gemma-2b-it**, specifically architected for **Zero-Hallucination Technical Retrieval**. It was trained on a proprietary dataset of technical and architectural documentation to ensure deep contextual grounding.

### Key Capabilities
- **Technical Grounding**: Prioritizes factual documentation over generative speculation.
- **Chunk-Aware Memory**: Optimized for overlapping document segments (256-token window); see the chunking sketch after this list.
- **Deterministic Precision**: Best used with `do_sample=False` for architectural accuracy.
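
Because the model was tuned on overlapping 256-token segments, long documents are best pre-split into the same shape before querying. Below is a minimal chunking sketch; the 64-token overlap is an illustrative assumption, since the card documents only the 256-token window:

```python
# Split a document into overlapping 256-token segments, matching the
# window the model was fine-tuned on. The 64-token overlap is an
# assumption; only the 256-token window is documented.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('solvrays/solvrays-finetuned-pdf')

def chunk_document(text, window=256, overlap=64):
    token_ids = tokenizer(text, add_special_tokens=False)['input_ids']
    stride = window - overlap
    chunks = []
    for start in range(0, len(token_ids), stride):
        chunks.append(tokenizer.decode(token_ids[start:start + window]))
        if start + window >= len(token_ids):
            break
    return chunks
```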

## Professional Implementation
The model requires a specific prompt structure to trigger its 'Knowledge Retrieval' mode:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

model_id = 'solvrays/solvrays-finetuned-pdf'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map='auto',
    torch_dtype=torch.bfloat16,
    # Use a proper BitsAndBytesConfig (not a raw dict) for 4-bit loading.
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
)

def query_model(user_query):
    # High-Precision Retrieval Template: the model expects this exact structure.
    prompt = f'### Knowledge Retrieval Content: {user_query}\n### Verified Response: '
    inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
    # Greedy decoding for deterministic, grounded output.
    outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
    # Return only the text after the response marker.
    return tokenizer.decode(outputs[0], skip_special_tokens=True).split('### Verified Response:')[-1].strip()
```
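
A quick smoke test; the question is purely illustrative:

```python
# Example call against the helper defined above.
answer = query_model('What context window does the document pipeline use?')
print(answer)
```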

## Technical Specifications
| | Feature | Configuration | |
| | :--- | :--- | |
| | **Base Model** | google/gemma-2b-it | |
| | **Precision** | BrainFloat16 (BF16) | |
| | **Fine-tuning** | QLoRA (4-bit Normalized Float) | |
| | **LoRA Rank (r)** | 16 | |
| | **LoRA Alpha** | 32 | |
| **Target Modules** | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| | **Training Epochs** | 25 | |
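
The adapter settings above map directly onto a `peft` `LoraConfig`. A minimal sketch, reconstructed from the table rather than taken from the original training script:

```python
# QLoRA adapter matching the specification table; reconstructed from
# the listed hyperparameters, not copied from the training script.
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,           # LoRA rank
    lora_alpha=32,  # scaling factor
    target_modules=[
        'q_proj', 'k_proj', 'v_proj', 'o_proj',
        'gate_proj', 'up_proj', 'down_proj',
    ],
    task_type='CAUSAL_LM',
)
```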

## Training Environment
- **Hardware**: 2× NVIDIA L4 GPUs
- **Optimizer**: Paged AdamW 8-bit (see the training-arguments sketch after this list)
- **Context Length**: 256 tokens per block
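
The paged 8-bit AdamW optimizer can be selected directly through `TrainingArguments`. A minimal sketch consistent with this card; the output path, batch size, and learning rate are illustrative assumptions, since only the optimizer and epoch count are documented:

```python
# Training arguments consistent with the environment above.
# NOTE: output_dir, batch size, and learning rate are assumptions;
# only optim and num_train_epochs come from this model card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir='./solvrays-finetuned-pdf',  # hypothetical path
    num_train_epochs=25,
    per_device_train_batch_size=4,          # assumption
    learning_rate=2e-4,                     # assumption
    bf16=True,
    optim='paged_adamw_8bit',               # Paged AdamW 8-bit
)
```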

## Constraints & Risk Mitigation
- **Out-of-Scope**: This model is not intended for general conversation or creative writing. It is a specialized document analyst.
- **Hallucination Control**: When the requested information is absent from its internal knowledge, the model is trained to answer 'Not Documented' or return an empty response so that callers can flag the query for manual verification (see the guard sketch after this list).
- **Numerical Accuracy**: Always cross-verify critical measurements against the original PDF source material.
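
Calling code can treat the trained refusal sentinel and empty output as "no grounded answer" rather than usable content. A small hedged wrapper around the `query_model` helper defined earlier (the function name here is hypothetical):

```python
# Hypothetical guard: treat the trained refusal sentinel and empty
# output as "no grounded answer" instead of returning it as content.
def grounded_answer(user_query):
    response = query_model(user_query)
    if not response or response.strip() == 'Not Documented':
        return None  # caller should fall back to the source PDF
    return response
```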

---
**Senior AI Architect & Developer**: Solvrays