license: apache-2.0
language:
  - ko
  - en
library_name: transformers
tags:
  - finance
  - korean
  - stock-analysis
  - reasoning
  - dpo
  - gguf
  - llama-cpp
base_model: Qwen/Qwen2.5-7B-Instruct
pipeline_tag: text-generation

VELA (Vector-Encoded Learning Agent)

An AI analyst specialized in the Korean stock market

VELA is a 7B-parameter language model specialized for Korean stock market news analysis and investment research. It covers 2,135 stocks and performs news impact analysis, brokerage report interpretation, and structured investment analysis based on Reasoning Traces.

Model Details

| Item | Details |
|------|------|
| Base Model | Qwen/Qwen2.5-7B-Instruct |
| Training | SFT (36,713 samples) + DPO (24,779 pairs) |
| Parameters | 7.6B |
| Context Length | 8,192 tokens |
| Stock Coverage | 2,135 stocks (KOSPI + KOSDAQ) |
| License | Apache 2.0 |

Available Formats

| Format | File | Size | Use Case |
|------|------|------|------|
| BF16 (safetensors) | model.safetensors | 15 GB | Full precision, GPU inference |
| GGUF Q8_0 | vela-q8_0.gguf | 7.6 GB | High-quality quantized, GPU/CPU |
| GGUF Q4_K_M | vela-q4_k_m.gguf | 4.4 GB | Fast and lightweight, GPU/CPU |

Training Pipeline

Qwen2.5-7B-Instruct
        ↓
   SFT (36,713 samples)
   - News classification analysis 10,830
   - Extreme signal analysis 9,603
   - Brokerage reports 5,117
   - News impact analysis 4,839
   - Tool Calling 1,965
   - Other (comparison, earnings, risk, supply/demand, sector, macro) 4,359
        ↓
   DPO (24,779 pairs)
   - Deduplicated base pairs 12,000
   - Multilingual leak augmentation 5,997
   - VELA ChatML alignment 5,000
   - Chinese leak correction v2 1,216
   - Reasoning Trace alignment 566
        ↓
      VELA

Training Data Distribution

SFT (36,713 samples, 2,135 stocks)

| Source | Samples | Ratio | Description |
|------|------|------|------|
| classified_news | 10,830 | 29.5% | News classified by GPT-4o → Reasoning Trace generation |
| extreme_signals | 9,603 | 26.2% | Analysis of news carrying sharp-rise or sharp-fall signals |
| securities_report_gpt4o | 5,117 | 13.9% | Brokerage reports restructured with GPT-4o (Naver stock analysis + Mirae Asset) |
| analysis_news | 4,839 | 13.2% | General news impact analysis |
| tool_calling | 1,965 | 5.4% | Training on Search/Price/Investor tool calls |
| multi_stock_comparison | 981 | 2.7% | Multi-stock comparison analysis |
| earnings_impact | 971 | 2.6% | Earnings announcement impact analysis |
| risk_alert | 948 | 2.6% | Risk alert analysis |
| supply_demand | 492 | 1.3% | Supply and demand trend analysis |
| sector_theme | 486 | 1.3% | Sector/theme analysis |
| macro_impact | 481 | 1.3% | Macro indicator impact analysis |

Average response length: 2,337 characters (including the Reasoning Trace JSON and the analysis report)

DPO (24,779 pairs)

| Source | Pairs | Ratio | Description |
|------|------|------|------|
| dpo_dedup | 12,000 | 48.4% | Deduplicated base DPO pairs |
| multilingual_aug | 5,997 | 24.2% | Chinese/English leak augmentation (leaks inserted into the rejected response) |
| vela_chatml | 5,000 | 20.2% | VELA system prompt alignment |
| chinese_leak_v2 | 1,216 | 4.9% | Focused Chinese leak correction |
| reasoning_trace_2k | 566 | 2.3% | Reasoning Trace format alignment |
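
The ratios in both tables follow directly from the sample counts. The sketch below recomputes them as a sanity check; the counts are copied from the tables above, while the helper name `ratios` is only illustrative and not part of any VELA tooling.

```python
# Sample counts copied from the SFT and DPO tables in this README.
SFT_COUNTS = {
    "classified_news": 10_830,
    "extreme_signals": 9_603,
    "securities_report_gpt4o": 5_117,
    "analysis_news": 4_839,
    "tool_calling": 1_965,
    "multi_stock_comparison": 981,
    "earnings_impact": 971,
    "risk_alert": 948,
    "supply_demand": 492,
    "sector_theme": 486,
    "macro_impact": 481,
}
DPO_COUNTS = {
    "dpo_dedup": 12_000,
    "multilingual_aug": 5_997,
    "vela_chatml": 5_000,
    "chinese_leak_v2": 1_216,
    "reasoning_trace_2k": 566,
}


def ratios(counts: dict[str, int]) -> dict[str, float]:
    """Return each source's share of the total, in percent, rounded to 0.1."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    for label, counts in (("SFT", SFT_COUNTS), ("DPO", DPO_COUNTS)):
        print(label, "total:", sum(counts.values()))
        for name, pct in ratios(counts).items():
            print(f"  {name}: {pct}%")
```

The totals come out to 36,713 SFT samples and 24,779 DPO pairs, matching the figures quoted in Model Details.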

Capabilities

  • News impact analysis: predicts the market impact of stock-related news
  • Brokerage report interpretation: investment analysis based on analyst reports
  • Research report generation: structured investment analysis reports (7 sections)
  • Reasoning Trace: step-by-step analytical reasoning (JSON format)
  • Multi-source synthesis: integrated analysis of news, price, and supply/demand data

Quantization Benchmark

RTX 3060 12GB, llama-cpp-python, n_gpu_layers=-1, n_ctx=4096

| Format | Speed (tok/s) | Chinese Leak | Quality |
|------|------|------|------|
| Q4_K_M | 36 | 0/5 (clean) | Reasoning Trace + Report OK |
| Q8_0 | 25 | 0/5 (clean) | Reasoning Trace + Report OK |

Stress test: 5 consecutive runs per format, alternating Synthesis and 3K Reasoning Trace prompts. Both formats produced zero Chinese leaks.
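
This README does not describe how leaks were counted. As one hypothetical way to reproduce such a check, the sketch below flags any output containing Han characters. It assumes that every CJK ideograph counts as a leak, which would also flag legitimate Hanja in Korean text; the function names are illustrative.

```python
import re

# Assumption: a "Chinese leak" is any CJK Unified Ideograph in the output.
# Hangul lives in separate Unicode blocks, so plain Korean text does not match.
HAN_PATTERN = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff]")


def find_han_characters(text: str) -> list[str]:
    """Return every Han character found in `text`, in order of appearance."""
    return HAN_PATTERN.findall(text)


def is_clean(text: str) -> bool:
    """True when the text contains no Han characters."""
    return not find_han_characters(text)


if __name__ == "__main__":
    samples = [
        "삼성전자 HBM 사업 전망은 긍정적입니다.",  # Korean only
        "삼성전자 HBM 事业 전망은 긍정적입니다.",  # Han characters mixed in
    ]
    leaks = sum(not is_clean(s) for s in samples)
    print(f"{leaks}/{len(samples)} outputs contain Han characters")
```

Running this over the generations from a stress test gives a leak count in the same `n/total` form as the table above.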

Usage

llama-cpp-python (Recommended for GGUF)

from llama_cpp import Llama

model = Llama(
    model_path="vela-q4_k_m.gguf",  # or vela-q8_0.gguf
    n_ctx=4096,
    n_gpu_layers=-1,    # Full GPU offload
    chat_format="chatml",
)

response = model.create_chat_completion(
    messages=[
        # System: "You are an analyst specializing in Korean stocks."
        {"role": "system", "content": "당신은 한국 주식 전문 애널리스트입니다."},
        # User: "Please analyze the outlook for Samsung Electronics' HBM business."
        {"role": "user", "content": "삼성전자 HBM 사업 전망을 분석해주세요."},
    ],
    max_tokens=1024,
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])

Transformers (BF16)

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "intrect/VELA",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("intrect/VELA")

messages = [
    # System: "You are an analyst specializing in Korean stocks."
    {"role": "system", "content": "당신은 한국 주식 전문 애널리스트입니다."},
    # User: "Please analyze the outlook for Samsung Electronics' HBM business."
    {"role": "user", "content": "삼성전자 HBM 사업 전망을 분석해주세요."}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    do_sample=True
)
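# Decode only the newly generated tokens, not the echoed prompt
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))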

vLLM

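from vllm import LLM, SamplingParams

llm = LLM(model="intrect/VELA", dtype="bfloat16")
params = SamplingParams(temperature=0.7, max_tokens=1024)

# llm.chat applies the model's chat template (ChatML) before generating
messages = [
    # "Please analyze the outlook for Samsung Electronics' HBM market."
    {"role": "user", "content": "삼성전자 HBM 시장 전망을 분석해주세요."},
]
outputs = llm.chat(messages, params)
print(outputs[0].outputs[0].text)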

Ollama

# Modelfile
FROM ./vela-q4_k_m.gguf
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER temperature 0.7
PARAMETER num_ctx 4096
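
The TEMPLATE in the Modelfile is the ChatML layout. For runtimes that take a raw prompt string, the same layout can be rendered by hand; the sketch below does this for a single system turn and a single user turn. The function name `build_chatml_prompt` is hypothetical and not part of Ollama or VELA.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Render one system turn and one user turn in ChatML, ending with an
    open assistant turn so the model continues from there."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    prompt = build_chatml_prompt(
        "당신은 한국 주식 전문 애널리스트입니다.",  # "You are an analyst specializing in Korean stocks."
        "삼성전자 HBM 사업 전망을 분석해주세요.",  # "Please analyze the outlook for Samsung Electronics' HBM business."
    )
    print(prompt)
```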

Output Format

VELA supports two output modes:

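1. Reasoning Trace (analysis process)

{
  "step": 1,
  "thought": "삼성전자 HBM3E 12단 양산 관련 뉴스 확인. 추가 수주 현황과 시장 점유율 파악 필요.",
  "action": "search",
  "query": "삼성전자 HBM3E 12단 수주 시장점유율",
  "confidence": 0.45
}

In English, the thought reads "Checked news on Samsung Electronics' 12-layer HBM3E mass production; additional order status and market share still need to be determined", and the query reads "Samsung Electronics HBM3E 12-layer orders market share".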
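
A consumer of Reasoning Trace output usually needs to parse and validate each step before acting on it. The sketch below is one way to do that with the standard library. The field set is taken from the single example above; treating `query` as optional and `confidence` as a score in [0, 1] are assumptions, and `TraceStep` and `parse_trace_step` are illustrative names.

```python
import json
from dataclasses import dataclass


@dataclass
class TraceStep:
    step: int
    thought: str
    action: str
    query: str
    confidence: float


def parse_trace_step(raw: str) -> TraceStep:
    """Parse one Reasoning Trace step from JSON and validate its fields."""
    data = json.loads(raw)
    parsed = TraceStep(
        step=int(data["step"]),
        thought=str(data["thought"]),
        action=str(data["action"]),
        # Assumption: actions other than "search" may omit the query.
        query=str(data.get("query", "")),
        confidence=float(data["confidence"]),
    )
    # Assumption: confidence is a probability-like score in [0, 1].
    if not 0.0 <= parsed.confidence <= 1.0:
        raise ValueError(f"confidence out of range: {parsed.confidence}")
    return parsed


if __name__ == "__main__":
    raw = (
        '{"step": 1, "thought": "HBM3E 뉴스 확인", "action": "search", '
        '"query": "삼성전자 HBM3E", "confidence": 0.45}'
    )
    print(parse_trace_step(raw))
```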

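2. Synthesis Report (final report)

# EOD 리포트: 삼성전자 (005930.KS)

## Executive Summary
[2-3 sentence key summary]

## Key Metrics
| 지표 | 수치 |
|------|------|

## 시장 동향 분석
## 수급 분석
## 뉴스 영향 분석
## 리스크 요인
## 투자 의견

The Korean parts of the template translate as follows: 리포트 is "report", 삼성전자 is "Samsung Electronics", 지표 and 수치 are "indicator" and "value", and the five Korean section headings are "market trend analysis", "supply and demand analysis", "news impact analysis", "risk factors", and "investment opinion".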
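
Because the report follows a fixed seven-section layout, format compliance can be checked mechanically. The sketch below looks for the level-2 headings listed in the template; the heading strings are copied from the template, while the function name and the exact-match rule are assumptions made for illustration.

```python
import re

# Section headings copied from the report template, in template order.
EXPECTED_SECTIONS = [
    "Executive Summary",
    "Key Metrics",
    "시장 동향 분석",
    "수급 분석",
    "뉴스 영향 분석",
    "리스크 요인",
    "투자 의견",
]


def missing_sections(report: str) -> list[str]:
    """Return the expected level-2 headings that do not appear in `report`."""
    headings = re.findall(r"^##\s+(.+)$", report, flags=re.MULTILINE)
    found = {h.strip() for h in headings}
    return [name for name in EXPECTED_SECTIONS if name not in found]


if __name__ == "__main__":
    partial = "# EOD 리포트: 삼성전자 (005930.KS)\n\n## Executive Summary\n...\n## Key Metrics\n"
    print("missing:", missing_sections(partial))
```

An empty result means all seven sections are present; the check does not verify their order or content.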

DPO Improvements

  • ✅ Chinese leak removal: stress test 10/10 clean
  • ✅ English leak reduction: unnecessary English usage minimized
  • ✅ Format compliance: Reasoning Trace JSON + 7-section Report
  • ✅ Korean quality: natural Korean phrasing

Limitations

  • No access to real-time price data (an external API is required)
  • Intended for information only, not investment advice
  • The 8K context limit restricts processing of long documents
  • Figures may be hallucinated (numeric data needs external verification)

Citation

@misc{vela2026,
  title={VELA: Vector-Encoded Learning Agent for Korean Stock Analysis},
  author={intrect},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/intrect/VELA}
}

Version History

| Version | Date | Changes |
|------|------|------|
| v1.1 | 2026-02-12 | Added GGUF quantized models (Q4_K_M, Q8_0) and benchmark; published training data distribution |
| v1.0 | 2026-01-28 | Merged DPO; resolved Chinese/English leaks |
| v0.9 | 2026-01-15 | Released SFT base model |

Disclaimer: The output of this model is not investment advice. All investment decisions must be made on your own judgment and at your own responsibility.