---
base_model: openai/gpt-oss-20b
base_model_relation: merge
library_name: transformers
pipeline_tag: text-generation
tags:
- sft
- transformers
- trl
license: apache-2.0
language:
- en
- ko
---
# Vayne-V1
**Vayne-V1** is a **compact, efficient, and high-performance enterprise LLM** optimized for **AI agent frameworks**, **MCP-based tool orchestration**, **Retrieval-Augmented Generation (RAG) pipelines**, and **secure on-premise deployment**.
- ✅ Lightweight architecture for fast inference and low resource usage
- ⚙️ Seamless integration with modern AI agent frameworks
- 🔗 Built-in compatibility for MCP-based multi-tool orchestration
- 🔍 Optimized for enterprise-grade RAG systems
- 🛡️ Secure deployment in private or regulated environments
---
## Key Design Principles
| Feature | Description |
|----------|-------------|
| 🔐 Private AI Ready | Deploy fully **on-premise** or in **air-gapped** secure environments |
| ⚡ Lightweight Inference | **Single-GPU optimized** architecture for fast and cost-efficient deployment |
| 🧠 Enterprise Reasoning | Structured output and instruction-following for **business automation** |
| 🔧 Agent & MCP Native | Built for **AI agent frameworks** and **MCP-based tool orchestration** |
| 🔍 RAG Enhanced | Optimized for **retrieval workflows** with vector DBs (FAISS, Milvus, pgvector, etc.) |
---
## Model Architecture & Training
| Specification | Details |
|---------------|---------|
| 🧬 Base Model | GPT-OSS-20B |
| 🔢 Parameters | ~20B |
| 🎯 Precision | FP16 / BF16 |
| 🧱 Architecture | Decoder-only Transformer |
| 📏 Context Length | 4K tokens |
| ⚡ Inference | Single / Multi-GPU compatible |
### Training Data
Fine-tuned with supervised fine-tuning (SFT) on:
- Enterprise QA datasets
- Task reasoning + tool usage instructions
- RAG-style retrieval prompts
- Business reports & structured communication
- Korean–English bilingual QA and translation
- Synthetic instructions with safety curation
---
## Secure On-Premise Deployment
Vayne-V1 is built for **enterprise AI inside your firewall**.
✅ No external API dependency
✅ Compatible with **offline environments**
✅ Suitable for regulated and air-gapped deployments
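As a minimal sketch of a fully offline setup (assuming the model weights were already downloaded into the local Hugging Face cache while online, e.g. via `huggingface-cli download PoSTMEDIA/Vayne-V1`), network access can be disabled through standard environment variables:

```bash
# Force offline operation: no Hugging Face Hub network calls are made,
# and transformers resolves the model from the local cache only.
export HF_HUB_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
```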
---
## MCP (Model Context Protocol) Integration
Vayne-V1 supports **MCP-based agent tooling**, making it straightforward to integrate into tool-using agent pipelines.
Works seamlessly with:
* Claude MCP-compatible agent systems
* Local agent runtimes
* JSON-structured tool execution
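The card does not pin down Vayne-V1's exact tool-call format, so the following is an illustrative sketch only: it assumes the model emits an MCP-style JSON object naming a tool and its arguments, which a local runtime parses and dispatches. The `search_docs` tool and the JSON field names are hypothetical.

```python
import json

# Hypothetical registry of locally exposed tools (names are illustrative).
TOOLS = {
    "search_docs": lambda query: f"Top result for '{query}'",
}

def dispatch(model_output: str) -> str:
    """Parse an MCP-style JSON tool call emitted by the model and run it."""
    call = json.loads(model_output)      # e.g. {"tool": ..., "arguments": {...}}
    tool = TOOLS[call["tool"]]           # look up the requested tool
    return tool(**call["arguments"])     # invoke with the model-supplied arguments

# Example: a tool call as the model might emit it.
raw = '{"tool": "search_docs", "arguments": {"query": "vacation policy"}}'
print(dispatch(raw))  # → Top result for 'vacation policy'
```

In a real deployment the dispatcher would validate the tool name and argument schema before execution, and feed the tool's result back to the model as the next turn.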
---
## RAG Compatibility
Designed for **hybrid reasoning + retrieval**.
✅ Works with FAISS, Chroma, Elasticsearch
✅ Handles long-context document QA
✅ Ideal for enterprise knowledge bases
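As a rough sketch of how retrieval plugs into the prompt (using naive word-overlap scoring as a stand-in for a real vector database such as FAISS; all names and the sample corpus are illustrative):

```python
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank passages by word overlap with the query (stand-in for a vector DB)."""
    q = set(query.lower().split())
    scored = sorted(corpus, key=lambda p: len(q & set(p.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Prepend retrieved passages so the model answers grounded in them."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Vacation requests must be filed two weeks in advance.",
    "The cafeteria opens at 8 AM.",
    "Expense reports are due by the fifth of each month.",
]
passages = retrieve("When are expense reports due?", corpus, k=1)
print(build_prompt("When are expense reports due?", passages))
```

The resulting prompt string is what would be passed to `tokenizer`/`model.generate` in the Quick Start example below; in production the overlap scorer would be replaced by embedding similarity search.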
---
## Quick Start
```bash
pip install transformers peft accelerate bitsandbytes
```
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "PoSTMEDIA/Vayne-V1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "Explain the benefits of private AI for enterprise security."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# max_new_tokens bounds the generated continuation regardless of prompt length
# (unlike max_length, which counts the prompt tokens as well).
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
---
## Use Cases
✅ Internal enterprise AI assistant
✅ Private AI document analysis
✅ Business writing (reports, proposals, strategy)
✅ AI automation agents
✅ Secure RAG search systems
---
## Safety & Limitations
* Not intended for medical, legal, or financial decision-making
* May occasionally generate hallucinations
* Use human validation for critical outputs
* Recommended: enable output guardrails for production
---
## Citation
```bibtex
@misc{vayne2025,
  title={Vayne-V1: Private On-Premise LLM Optimized for Agents and RAG},
  author={PoSTMEDIA AI Lab},
  year={2025},
  publisher={Hugging Face}
}
```
---
## Contact
**PoSTMEDIA AI Lab**
📧 [dev.postmedia@gmail.com](mailto:dev.postmedia@gmail.com)
🌐 [https://postmedia.ai](https://postmedia.ai)
🌐 [https://postmedia.co.kr](https://postmedia.co.kr)
---