# QuettaLLMs-27B-Koreasoner-V3

## Model Overview
QuettaLLMs-27B-Koreasoner-V3 is a Korean large language model (LLM) fine-tuned using data refined by NewenAI. Built on the Qwen/Qwen3.5-27B base model, it underwent LoRA-based fine-tuning to improve its logical reasoning performance.
A notable feature of this model is its output efficiency. While it processes reasoning internally, it is trained to deliver direct and concise final answers without generating verbose explanations. This approach reduces token usage and improves response speed, making it highly efficient for QA and automated evaluation environments.
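A minimal inference sketch with the Hugging Face `transformers` library is shown below. The repo id is an assumption (this card does not state the Hub path); substitute the actual id of the released checkpoint.

```python
# Hypothetical usage sketch; the repo id below is an assumption, not confirmed
# by this card -- replace it with the actual Hub id of the released checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "QuettaLLMs/QuettaLLMs-27B-Koreasoner-V3"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "대한민국의 수도는 어디인가요?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# The model is trained to give direct, concise final answers, so a small
# max_new_tokens budget is usually sufficient.
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```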
## Evaluation Benchmarks
The V3 model scored higher than the base model on Korean reasoning and knowledge benchmarks. Based on the most recent official AI Hub submission, its average score is 0.560.
| Model | Avg | KMMLU-Pro | CLIcK | HLE(Ko) | MuSR(Ko) | Com2-main | KMMLU | COPA | HellaSwag |
|---|---|---|---|---|---|---|---|---|---|
| QuettaLLMs-27B-Koreasoner-V3 | 0.560 | 0.676 | 0.794 | 0.070 | 0.604 | 0.654 | 73.8 | 81.9 | 52.4 |
| Qwen3.5-27B (Base) | 0.530 | 0.622 | 0.770 | 0.058 | 0.577 | 0.622 | 68.5 | 79.3 | 49.0 |
Result Summary: KMMLU-Pro (+0.054), CLIcK (+0.024), and KMMLU (+5.3 points) all improved over the base model, indicating overall gains in Korean domain knowledge and contextual understanding.
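The per-metric gains quoted in the summary can be checked directly against the table (a quick sketch; the scores are copied from the rows above):

```python
# Scores copied from the benchmark table above (V3 vs. the Qwen3.5-27B base).
v3   = {"KMMLU-Pro": 0.676, "CLIcK": 0.794, "KMMLU": 73.8}
base = {"KMMLU-Pro": 0.622, "CLIcK": 0.770, "KMMLU": 68.50}

# Deltas reported in the result summary: +0.054, +0.024, +5.3 points.
deltas = {k: round(v3[k] - base[k], 3) for k in v3}
print(deltas)
```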
## Training Methodology & Data Composition
The model was trained using datasets collected from external sources, which were then filtered for noise and refined according to internal standards.
### 1. Data Composition
Over 230,000 refined SFT samples were used. The main components are:
- Korean Knowledge Data: KMMLU-format questions and datasets covering Korean law, culture, and industry.
- Logical Reasoning Data: Public and synthetic datasets requiring multi-step reasoning.
- Math & Quantitative Reasoning Data: Mathematical datasets requiring step-by-step verification and calculation.
### 2. SFT Environment
- Training Method: LoRA (Low-Rank Adaptation)
- Optimization Goal: Output efficiency, guiding the model to identify the core of the question and generate concise, accurate answers without generating unnecessary tokens.
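LoRA freezes the base weights W and trains only a low-rank update ΔW = (α/r)·BA, which is why the adapter has far fewer trainable parameters than the full model. A toy sketch of the idea in plain Python (the dimensions, rank, and α here are chosen for illustration only; the values actually used for this model are not stated in this card):

```python
# Toy illustration of LoRA (Low-Rank Adaptation): instead of updating the
# full weight matrix W (d_out x d_in), only two small matrices are trained:
# B (d_out x r) and A (r x d_in). The effective weight is W + (alpha/r) * B @ A.

def matmul(X, Y):
    """Naive matrix multiply for small illustrative matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_weight(W, A, B, alpha, r):
    """Merge the scaled low-rank update into the frozen base weight."""
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(wr, dr)]
            for wr, dr in zip(W, delta)]

d_out, d_in, r = 8, 8, 2                    # illustrative dimensions, not the model's
W = [[0.0] * d_in for _ in range(d_out)]    # frozen base weight (zeros for clarity)
A = [[1.0] * d_in for _ in range(r)]        # trainable, r x d_in
B = [[0.5] * r for _ in range(d_out)]       # trainable, d_out x r

W_eff = lora_weight(W, A, B, alpha=16, r=r)

full_params = d_out * d_in                  # 64 params if fully fine-tuned
lora_params = r * (d_out + d_in)            # 32 trainable params with LoRA
print(W_eff[0][0], full_params, lora_params)  # -> 8.0 64 32
```

In actual training, B is initialized to zeros so that the adapter starts as an identity update; the adapter weights can later be merged into W for inference at no extra cost.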
## Key Features
- Reduced Hallucination: Decreased information distortion in questions involving multi-step logic.
- Korean Context Processing: Improved understanding of background knowledge through Korean-specific training data.
- Math Problem Solving: Improved calculation accuracy by deriving results through internal reasoning processes.
- Zero-Shot Performance: Provides consistently formatted answers without requiring additional prompt examples.
## License & Usage
- License: apache-2.0
- Commercial Inquiries: sheun@newen.ai