---
license: apache-2.0
language:
- ko
- en
library_name: transformers
tags:
- finance
- korean
- stock-analysis
- reasoning
- dpo
- gguf
- llama-cpp
base_model: Qwen/Qwen2.5-7B-Instruct
pipeline_tag: text-generation
---
# VELA (Vector-Encoded Learning Agent)
**AI analyst specializing in the Korean stock market**
VELA is a 7B-parameter language model specialized for Korean stock-market news analysis and investment research.
It performs news impact analysis for 2,135 stocks, interprets brokerage reports, and produces structured investment analysis based on Reasoning Traces.
## Model Details
| Item | Details |
|------|------|
| **Base Model** | Qwen/Qwen2.5-7B-Instruct |
| **Training** | SFT (36,713) + DPO (24,779 pairs) |
| **Parameters** | 7.6B |
| **Context Length** | 8,192 tokens |
| **Stock Coverage** | 2,135 stocks (KOSPI + KOSDAQ) |
| **License** | Apache 2.0 |
### Available Formats
| Format | File | Size | Use Case |
|--------|------|------|----------|
| **BF16** (safetensors) | `model.safetensors` | 15 GB | Unquantized weights, GPU inference |
| **GGUF Q8_0** | `vela-q8_0.gguf` | 7.6 GB | High quality quantized, GPU/CPU |
| **GGUF Q4_K_M** | `vela-q4_k_m.gguf` | 4.4 GB | Fast & lightweight, GPU/CPU |
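As a rough aid for choosing among these files, the sketch below picks the largest format whose file size fits a given memory budget. The sizes are copied from the table; `pick_format` is a hypothetical helper, and the 2 GB `headroom_gb` allowance for KV cache and runtime overhead is an assumption rather than a measured value.

```python
from typing import Optional

# File sizes in GB, copied from the Available Formats table (largest first).
FORMATS = [
    ("model.safetensors", 15.0),  # BF16
    ("vela-q8_0.gguf", 7.6),      # GGUF Q8_0
    ("vela-q4_k_m.gguf", 4.4),    # GGUF Q4_K_M
]


def pick_format(free_memory_gb: float, headroom_gb: float = 2.0) -> Optional[str]:
    """Return the largest file that fits in memory, or None if none fits.

    headroom_gb is an assumed allowance for KV cache and runtime overhead.
    """
    for filename, size_gb in FORMATS:
        if size_gb + headroom_gb <= free_memory_gb:
            return filename
    return None


print(pick_format(12.0))  # budget of a 12 GB card under the assumed headroom
```

Actual memory use depends on context length and runtime, so treat the result as a starting point and confirm by loading the model.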
## Training Pipeline
```
Qwen2.5-7B-Instruct
        ↓
SFT (36,713 samples)
  - News classification analysis                       10,830
  - Extreme signal analysis                             9,603
  - Brokerage reports                                   5,117
  - News impact analysis                                4,839
  - Tool Calling                                        1,965
  - Other (comparison, earnings, risk,
    supply-demand, sector, macro)                       4,359
        ↓
DPO (24,779 pairs)
  - Deduplicated base pairs                            12,000
  - Multilingual leak augmentation                      5,997
  - VELA ChatML alignment                               5,000
  - Chinese leak correction v2                          1,216
  - Reasoning Trace alignment                             566
        ↓
VELA
```
## Training Data Distribution
### SFT (36,713 samples, 2,135 stocks)
| Source | Samples | Ratio | Description |
|--------|---------|-------|-------------|
| **classified_news** | 10,830 | 29.5% | News classified by GPT-4o → Reasoning Trace generation |
| **extreme_signals** | 9,603 | 26.2% | Analysis of news with surge/plunge signals |
| **securities_report_gpt4o** | 5,117 | 13.9% | Brokerage reports restructured by GPT-4o (Naver stock analysis + Mirae Asset) |
| **analysis_news** | 4,839 | 13.2% | General news impact analysis |
| **tool_calling** | 1,965 | 5.4% | Training on Search/Price/Investor tool calls |
| **multi_stock_comparison** | 981 | 2.7% | Multi-stock comparison analysis |
| **earnings_impact** | 971 | 2.6% | Earnings announcement impact analysis |
| **risk_alert** | 948 | 2.6% | Risk alert analysis |
| **supply_demand** | 492 | 1.3% | Supply-demand trend analysis |
| **sector_theme** | 486 | 1.3% | Sector/theme analysis |
| **macro_impact** | 481 | 1.3% | Macro indicator impact analysis |
> Average response length: 2,337 characters (including Reasoning Trace JSON + analysis report)
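The ratios in the SFT table follow directly from the sample counts. A small sketch that recomputes them, along with the 4,359 "Other" line from the Training Pipeline diagram, is shown here; the dictionary and variable names are illustrative only.

```python
# SFT sample counts, copied from the SFT table.
SFT_COUNTS = {
    "classified_news": 10_830,
    "extreme_signals": 9_603,
    "securities_report_gpt4o": 5_117,
    "analysis_news": 4_839,
    "tool_calling": 1_965,
    "multi_stock_comparison": 981,
    "earnings_impact": 971,
    "risk_alert": 948,
    "supply_demand": 492,
    "sector_theme": 486,
    "macro_impact": 481,
}

# Sources listed individually in the pipeline diagram; the rest form its "Other" line.
PIPELINE_MAIN = {
    "classified_news",
    "extreme_signals",
    "securities_report_gpt4o",
    "analysis_news",
    "tool_calling",
}

total = sum(SFT_COUNTS.values())
ratios = {name: round(100 * count / total, 1) for name, count in SFT_COUNTS.items()}
other = sum(count for name, count in SFT_COUNTS.items() if name not in PIPELINE_MAIN)

print(total)  # 36713
print(other)  # 4359
for name, pct in ratios.items():
    print(f"{name}: {pct}%")
```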
### DPO (24,779 pairs)
| Source | Pairs | Ratio | Description |
|--------|-------|-------|-------------|
| **dpo_dedup** | 12,000 | 48.4% | Deduplicated base DPO pairs |
| **multilingual_aug** | 5,997 | 24.2% | Chinese/English leak augmentation (leaks inserted into the rejected response) |
| **vela_chatml** | 5,000 | 20.2% | VELA system prompt alignment |
| **chinese_leak_v2** | 1,216 | 4.9% | Focused Chinese leak correction |
| **reasoning_trace_2k** | 566 | 2.3% | Reasoning Trace format alignment |
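To make the "leaks inserted into the rejected response" idea concrete, here is a minimal sketch of how one such preference pair could be built. It is not the authors' actual pipeline: the leak fragments, the `make_leak_pair` helper, and the `prompt`/`chosen`/`rejected` record layout (a common convention for preference data) are all assumptions.

```python
import random

# Hypothetical leak fragments; the fragments actually used for training are not published.
LEAK_FRAGMENTS = ["市场份额", "因此", "However,", "in terms of"]


def make_leak_pair(prompt: str, clean_response: str, rng: random.Random) -> dict:
    """Build one preference pair.

    chosen is the clean Korean response; rejected is the same response with
    one foreign-language fragment inserted at a random word boundary.
    """
    words = clean_response.split()
    position = rng.randrange(len(words) + 1)
    leaked = words[:position] + [rng.choice(LEAK_FRAGMENTS)] + words[position:]
    return {
        "prompt": prompt,
        "chosen": clean_response,
        "rejected": " ".join(leaked),
    }


pair = make_leak_pair(
    "삼성전자 실적을 분석해주세요.",  # "Please analyze Samsung Electronics' earnings."
    "삼성전자 실적 개선 전망",        # "Samsung Electronics earnings improvement outlook"
    random.Random(0),
)
print(pair["rejected"])
```

Because the two responses differ only by the inserted fragment, the preference signal is concentrated on the leak itself.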
## Capabilities
- **News impact analysis**: predicts the market impact of stock-related news
- **Brokerage report interpretation**: investment analysis based on analyst reports
- **Research report generation**: structured investment analysis reports (7 sections)
- **Reasoning Trace**: step-by-step analytical reasoning (JSON format)
- **Multi-source synthesis**: integrated analysis of news, price, and supply-demand data
## Quantization Benchmark
Measured on an RTX 3060 12GB with llama-cpp-python (`n_gpu_layers=-1`, `n_ctx=4096`).
| Format | Speed (tok/s) | Chinese Leak | Quality |
|--------|--------------|--------------|---------|
| **Q4_K_M** | **36 tok/s** | 0/5 CLEAN | Reasoning Trace + Report OK |
| **Q8_0** | 25 tok/s | 0/5 CLEAN | Reasoning Trace + Report OK |
> Stress test: 5 consecutive runs (alternating Synthesis and 3K Reasoning Trace), with zero Chinese leaks in both formats
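The card does not say how a "Chinese leak" was detected. One simple way to reproduce a check of this kind, sketched here as an assumption, is to count characters in the CJK Unified Ideographs block. This also flags Hanja or Japanese kanji, so it is a conservative filter rather than a language detector.

```python
import re

# CJK Unified Ideographs block. Hangul syllables (U+AC00-U+D7A3) are not matched.
CJK_IDEOGRAPH = re.compile(r"[\u4e00-\u9fff]")


def chinese_leak_count(text: str) -> int:
    """Count CJK ideographs in a model output."""
    return len(CJK_IDEOGRAPH.findall(text))


def is_clean(text: str) -> bool:
    """True when the output contains no CJK ideographs."""
    return chinese_leak_count(text) == 0


outputs = [
    "삼성전자 HBM 전망은 긍정적입니다.",  # Korean only
    "삼성전자 市场 전망",                  # contains two Chinese characters
]
print(sum(is_clean(o) for o in outputs), "of", len(outputs), "clean")
```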
## Usage
### llama-cpp-python (Recommended for GGUF)
```python
from llama_cpp import Llama

model = Llama(
    model_path="vela-q4_k_m.gguf",  # or vela-q8_0.gguf
    n_ctx=4096,
    n_gpu_layers=-1,  # Full GPU offload
    chat_format="chatml",
)

response = model.create_chat_completion(
    messages=[
        # System: "You are an analyst specializing in Korean stocks."
        {"role": "system", "content": "당신은 한국 주식 전문 애널리스트입니다."},
        # User: "Please analyze the outlook for Samsung Electronics' HBM business."
        {"role": "user", "content": "삼성전자 HBM 사업 전망을 분석해주세요."},
    ],
    max_tokens=1024,
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])
```
### Transformers (BF16)
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "intrect/VELA",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("intrect/VELA")

messages = [
    # System: "You are an analyst specializing in Korean stocks."
    {"role": "system", "content": "당신은 한국 주식 전문 애널리스트입니다."},
    # User: "Please analyze the outlook for Samsung Electronics' HBM business."
    {"role": "user", "content": "삼성전자 HBM 사업 전망을 분석해주세요."},
]

# Render the conversation with the chat template and append the assistant prefix.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    do_sample=True,
)

# Decode only the newly generated tokens so the prompt is not echoed back.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```
### vLLM
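```python
from vllm import LLM, SamplingParams

llm = LLM(model="intrect/VELA", dtype="bfloat16")
params = SamplingParams(temperature=0.7, max_tokens=1024)

# llm.chat applies the model's chat template; llm.generate on a raw string would skip it.
# User: "Please analyze the outlook for Samsung Electronics' HBM market."
messages = [{"role": "user", "content": "삼성전자 HBM 시장 전망을 분석해주세요."}]
outputs = llm.chat(messages, params)
print(outputs[0].outputs[0].text)
```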
### Ollama
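```bash
# Modelfile
# Save these lines as a file named Modelfile, then build and run it with, for example:
#   ollama create vela -f Modelfile && ollama run vela
# (the model name "vela" is only an example)
FROM ./vela-q4_k_m.gguf
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER temperature 0.7
PARAMETER num_ctx 4096
```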
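The `TEMPLATE` above is the ChatML layout. For runtimes that accept only a raw prompt string, the same layout can be rendered by hand, as in the sketch below. `to_chatml` is a hypothetical helper; when a tokenizer is available, `apply_chat_template` is the safer route.

```python
def to_chatml(system: str, prompt: str) -> str:
    """Render one system + user turn in ChatML and open the assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


print(
    to_chatml(
        "당신은 한국 주식 전문 애널리스트입니다.",  # "You are an analyst specializing in Korean stocks."
        "삼성전자 HBM 사업 전망을 분석해주세요.",   # "Please analyze the outlook for Samsung's HBM business."
    )
)
```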
## Output Format
VELA supports two output modes:
### 1. Reasoning Trace (analysis process)
In this example the `thought` reads "Confirmed news on Samsung Electronics' HBM3E 12-layer mass production. Need to check additional order status and market share." and the `query` reads "Samsung Electronics HBM3E 12-layer orders market share".
```json
{
  "step": 1,
  "thought": "삼성전자 HBM3E 12단 양산 관련 뉴스 확인. 추가 수주 현황과 시장 점유율 파악 필요.",
  "action": "search",
  "query": "삼성전자 HBM3E 12단 수주 시장점유율",
  "confidence": 0.45
}
```
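A caller typically has to parse each trace step before acting on it. The sketch below validates one step against the fields shown in the example. The `TraceStep` class, the optional `query`, and the 0 to 1 range for `confidence` are assumptions inferred from that single example, not a published schema.

```python
import json
from dataclasses import dataclass


@dataclass
class TraceStep:
    step: int
    thought: str
    action: str
    query: str
    confidence: float


def parse_trace_step(raw: str) -> TraceStep:
    """Parse one Reasoning Trace step; raises KeyError or ValueError on bad input."""
    data = json.loads(raw)
    parsed = TraceStep(
        step=int(data["step"]),
        thought=str(data["thought"]),
        action=str(data["action"]),
        query=str(data.get("query", "")),  # assumed optional for non-search actions
        confidence=float(data["confidence"]),
    )
    if not 0.0 <= parsed.confidence <= 1.0:  # assumed range
        raise ValueError(f"confidence out of range: {parsed.confidence}")
    return parsed


raw = '{"step": 1, "thought": "뉴스 확인", "action": "search", "query": "삼성전자 HBM3E", "confidence": 0.45}'
step = parse_trace_step(raw)
if step.action == "search":
    print("search for:", step.query)
```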
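### 2. Synthesis Report (final report)
The Korean text in the template means: 리포트 (report), 삼성전자 (Samsung Electronics), 지표 / 수치 (metric / value), 시장 동향 분석 (market trend analysis), 수급 분석 (supply-demand analysis), 뉴스 영향 분석 (news impact analysis), 리스크 요인 (risk factors), 투자 의견 (investment opinion).
```markdown
# EOD 리포트: 삼성전자 (005930.KS)
## Executive Summary
[2-3 sentence key summary]
## Key Metrics
| 지표 | 수치 |
|------|------|
## 시장 동향 분석
## 수급 분석
## 뉴스 영향 분석
## 리스크 요인
## 투자 의견
```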
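Since the report has a fixed seven-section layout, a caller can split it on `## ` headings and check that nothing is missing. The sketch below assumes the model emits the headings exactly as in the template; `split_sections` and `missing_sections` are hypothetical helpers.

```python
# The seven section headings from the report template.
EXPECTED_SECTIONS = [
    "Executive Summary",
    "Key Metrics",
    "시장 동향 분석",  # market trend analysis
    "수급 분석",       # supply-demand analysis
    "뉴스 영향 분석",  # news impact analysis
    "리스크 요인",     # risk factors
    "투자 의견",       # investment opinion
]


def split_sections(report: str) -> dict:
    """Map each '## ' heading to the text that follows it."""
    sections = {}
    current = None
    for line in report.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {title: "\n".join(body).strip() for title, body in sections.items()}


def missing_sections(report: str) -> list:
    """Return expected headings that the report does not contain, in template order."""
    found = split_sections(report)
    return [title for title in EXPECTED_SECTIONS if title not in found]


sample = "# EOD 리포트: 삼성전자 (005930.KS)\n## Executive Summary\n요약\n## Key Metrics\n| 지표 | 수치 |"
print(missing_sections(sample))
```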
## DPO Improvements
- ✅ **Chinese leak removal**: Stress test 10/10 CLEAN
- ✅ **English leak reduction**: minimizes unnecessary use of English
- ✅ **Format compliance**: Reasoning Trace JSON + 7-section Report
- ✅ **Korean quality**: natural Korean phrasing
## Limitations
- No access to real-time price data (an external API is required)
- Intended for information purposes, not investment advice
- The 8K context limit restricts processing of long documents
- May hallucinate figures (numeric data should be verified externally)
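Because generated figures need external verification, one lightweight safeguard is to extract every number from a report and flag those that do not appear in data fetched from a trusted source. The sketch below is one possible approach under that assumption. The helper names are hypothetical, and exact matching is deliberately crude: rounding or unit changes would need extra handling.

```python
import re

# Digits with optional thousands separators and an optional decimal part.
NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")


def extract_numbers(text: str) -> set:
    """Return every number found in the text as a float."""
    return {float(match.replace(",", "")) for match in NUMBER.findall(text)}


def unverified_numbers(report: str, trusted: set) -> set:
    """Numbers in the report that are absent from the trusted reference values."""
    return extract_numbers(report) - trusted


report = "종가 71,000원, PER 12.5배"  # "Close 71,000 won, PER 12.5x"
trusted_values = {71000.0}            # e.g. values fetched from an external price API
print(unverified_numbers(report, trusted_values))
```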
## Citation
```bibtex
@misc{vela2026,
title={VELA: Vector-Encoded Learning Agent for Korean Stock Analysis},
author={intrect},
year={2026},
publisher={Hugging Face},
url={https://huggingface.co/intrect/VELA}
}
```
## Version History
| Version | Date | Changes |
|------|------|----------|
| v1.1 | 2026-02-12 | Added GGUF quantized models (Q4_K_M, Q8_0), benchmark, and training data distribution |
| v1.0 | 2026-01-28 | Merged DPO; resolved Chinese/English leaks |
| v0.9 | 2026-01-15 | Released SFT base model |
---
**Disclaimer**: The output of this model is not investment advice. All investment decisions must be made on your own judgment and at your own responsibility.