---
base_model: openai/gpt-oss-20b
base_model_relation: merge
library_name: transformers
pipeline_tag: text-generation
tags:
- sft
- transformers
- trl
license: apache-2.0
language:
- en
- ko
---

# Vayne-V1

**Vayne-V1** is a **compact, efficient, high-performance enterprise LLM** optimized for **AI agent frameworks**, **MCP-based tool orchestration**, **Retrieval-Augmented Generation (RAG) pipelines**, and **secure on-premise deployment**.

- βœ… Lightweight architecture for fast inference and low resource usage
- βš™οΈ Seamless integration with modern AI agent frameworks
- πŸ”— Built-in compatibility with MCP-based multi-tool orchestration
- πŸ” Optimized for enterprise-grade RAG systems
- πŸ›‘οΈ Secure deployment in private or regulated environments

---

## Key Design Principles

| Feature | Description |
|----------|-------------|
| πŸ” Private AI Ready | Deploy fully **on-premise** or in **air-gapped** secure environments |
| ⚑ Lightweight Inference | **Single-GPU optimized** architecture for fast, cost-efficient deployment |
| 🧠 Enterprise Reasoning | Structured output and instruction following for **business automation** |
| πŸ”§ Agent & MCP Native | Built for **AI agent frameworks** and **MCP-based tool orchestration** |
| πŸ” RAG Enhanced | Optimized for **retrieval workflows** with vector DBs (FAISS, Milvus, pgvector, etc.) |
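The retrieval workflow referenced above can be illustrated with a minimal sketch. This toy in-memory vector store (a stand-in for FAISS, Milvus, or pgvector) and the hypothetical `build_prompt` helper are illustrative only; in a real pipeline the vectors come from an embedding model:

```python
import math

# Toy in-memory vector store standing in for FAISS / Milvus / pgvector.
# Vectors are hand-made for illustration; in practice they come from an embedding model.
DOCS = {
    "vpn-policy": ([0.9, 0.1, 0.0], "All remote access must go through the corporate VPN."),
    "rag-intro":  ([0.1, 0.9, 0.2], "RAG augments prompts with retrieved documents."),
    "gpu-budget": ([0.0, 0.2, 0.9], "Single-GPU inference keeps deployment costs low."),
}

def cosine(a, b):
    # Cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, k=1):
    # Rank documents by similarity to the query vector, return the top-k texts.
    ranked = sorted(DOCS.values(), key=lambda v: cosine(query_vec, v[0]), reverse=True)
    return [text for _, text in ranked[:k]]

def build_prompt(question, query_vec):
    # Prepend retrieved context to the user question (the classic RAG prompt shape).
    context = "\n".join(retrieve(query_vec))
    return f"Context:\n{context}\n\nQuestion: {question}"

print(build_prompt("How do I connect remotely?", [1.0, 0.0, 0.1]))
```

The assembled prompt is then passed to the model exactly as in the Quick Start below; only the retrieval backend changes in production.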
---

## Model Architecture & Training

| Specification | Details |
|---------------|---------|
| 🧬 Base Model | GPT-OSS-20B |
| πŸ”’ Parameters | ~20B |
| 🎯 Precision | FP16 / BF16 |
| 🧱 Architecture | Decoder-only Transformer |
| πŸ“ Context Length | 4K tokens |
| ⚑ Inference | Single- / multi-GPU compatible |

### Training Data

Fine-tuned with supervised fine-tuning (SFT) on:

- Enterprise QA datasets
- Task reasoning and tool-usage instructions
- RAG-style retrieval prompts
- Business reports and structured communication
- Korean–English bilingual QA and translation
- Synthetic instructions with safety curation

---

## Secure On-Premise Deployment

Vayne-V1 is built for **enterprise AI inside your firewall**.

- βœ… No external API dependency
- βœ… Compatible with **offline environments**
- βœ… Proven in secure deployments

---

## MCP (Model Context Protocol) Integration

Vayne-V1 supports **MCP-based agent tooling**, making it easy to build tool-using AI. Works seamlessly with:

* Claude MCP-compatible agent systems
* Local agent runtimes
* JSON-structured execution

---

## RAG Compatibility

Designed for **hybrid reasoning + retrieval**.

- βœ… Works with FAISS, Chroma, Elasticsearch
- βœ… Handles long-context document QA
- βœ… Ideal for enterprise knowledge bases

---

## Quick Start

```bash
pip install transformers peft accelerate bitsandbytes
```

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "PoSTMEDIA/Vayne-V1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

prompt = "Explain the benefits of private AI for enterprise security."
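# Hedged alternative: chat-tuned gpt-oss derivatives usually expect the
# tokenizer's chat template rather than a raw string. If the model ships a
# chat template, the prompt above can instead be built as:
#
#   messages = [{"role": "user", "content": prompt}]
#   prompt = tokenizer.apply_chat_template(
#       messages, tokenize=False, add_generation_prompt=True
#   )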
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# max_new_tokens bounds the generated continuation, independent of prompt length
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

## Use Cases

- βœ… Internal enterprise AI assistant
- βœ… Private AI document analysis
- βœ… Business writing (reports, proposals, strategy)
- βœ… AI automation agents
- βœ… Secure RAG search systems

---

## Safety & Limitations

* Not intended for medical, legal, or financial decision-making
* May occasionally hallucinate
* Use human validation for critical outputs
* Recommended: enable output guardrails in production

---

## Citation

```bibtex
@misc{vayne2025,
  title={Vayne-V1: Private On-Premise LLM Optimized for Agents and RAG},
  author={PoSTMEDIA AI Lab},
  year={2025},
  publisher={Hugging Face}
}
```

---

## Contact

**PoSTMEDIA AI Lab**

πŸ“§ [dev.postmedia@gmail.com](mailto:dev.postmedia@gmail.com)

🌐 [https://postmedia.ai](https://postmedia.ai)

🌐 [https://postmedia.co.kr](https://postmedia.co.kr)

---