Mindcast Topic Classifier

Model Description

A model that classifies the topic of Korean text.

This model was fine-tuned efficiently using LoRA (Low-Rank Adaptation); the adapter was then merged into the base model for deployment.

Training Date: 2025-12-12

Performance

Test Set Results

Metric               Score
Accuracy             0.5583
F1 Score (Macro)     0.1024
F1 Score (Weighted)  0.4001

Confusion Matrix

Rows are true labels and columns are predicted labels, both in the order 사회, 정치, 생활문화, 세계, 경제, IT과학, 스포츠. Note that the model predicts 사회 (the majority class) for every test sample.

[[67  0  0  0  0  0  0]
 [24  0  0  0  0  0  0]
 [ 6  0  0  0  0  0  0]
 [15  0  0  0  0  0  0]
 [ 6  0  0  0  0  0  0]
 [ 1  0  0  0  0  0  0]
 [ 1  0  0  0  0  0  0]]

Detailed Classification Report

              precision    recall  f1-score   support

          사회     0.5583    1.0000    0.7166        67
          정치     0.0000    0.0000    0.0000        24
        생활문화     0.0000    0.0000    0.0000         6
          세계     0.0000    0.0000    0.0000        15
          경제     0.0000    0.0000    0.0000         6
        IT과학     0.0000    0.0000    0.0000         1
         스포츠     0.0000    0.0000    0.0000         1

   micro avg     0.5583    0.5583    0.5583       120
   macro avg     0.0798    0.1429    0.1024       120
weighted avg     0.3117    0.5583    0.4001       120
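The headline metrics above can be recovered directly from the confusion matrix. A minimal sketch in plain Python (no external libraries; values rounded to four decimals as in the report):

```python
# Confusion matrix from the report: rows = true labels, columns = predicted labels.
cm = [
    [67, 0, 0, 0, 0, 0, 0],
    [24, 0, 0, 0, 0, 0, 0],
    [ 6, 0, 0, 0, 0, 0, 0],
    [15, 0, 0, 0, 0, 0, 0],
    [ 6, 0, 0, 0, 0, 0, 0],
    [ 1, 0, 0, 0, 0, 0, 0],
    [ 1, 0, 0, 0, 0, 0, 0],
]

n = sum(sum(row) for row in cm)                      # 120 test samples
accuracy = sum(cm[i][i] for i in range(len(cm))) / n

def f1(i):
    """Per-class F1 from the confusion matrix (0.0 when undefined)."""
    tp = cm[i][i]
    pred = sum(row[i] for row in cm)                 # column sum = predicted count
    true = sum(cm[i])                                # row sum = support
    if tp == 0:
        return 0.0
    p, r = tp / pred, tp / true
    return 2 * p * r / (p + r)

f1s = [f1(i) for i in range(len(cm))]
macro_f1 = sum(f1s) / len(f1s)
weighted_f1 = sum(f1s[i] * sum(cm[i]) for i in range(len(cm))) / n

print(round(accuracy, 4))     # 0.5583
print(round(macro_f1, 4))     # 0.1024
print(round(weighted_f1, 4))  # 0.4001
```

Because every prediction is the majority class, accuracy equals the majority-class share (67/120) while macro F1 collapses to the single non-zero per-class F1 divided by seven.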

Training Details

Hyperparameters

Hyperparameter  Value
Base Model      klue/roberta-base
Batch Size      64
Epochs          1
Learning Rate   0.0001
Warmup Ratio    0.1
Weight Decay    0.01
LoRA r          8
LoRA alpha      16
LoRA dropout    0.05
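For reference, the LoRA settings above mean each adapted weight matrix W receives a low-rank update W + (alpha/r) · B · A, where A is r×k and B is d×r; with r=8 and alpha=16 the scaling factor is 2.0. A minimal plain-Python sketch of that merged update (illustrative only, not the actual training code, which would use an adapter library such as peft):

```python
r, alpha = 8, 16
scaling = alpha / r          # LoRA scales the low-rank update by alpha / r = 2.0

def matmul(B, A):
    """Plain-Python matrix product: B (d x r) @ A (r x k)."""
    return [[sum(B[i][t] * A[t][j] for t in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def lora_merge(W, A, B):
    """W' = W + scaling * B @ A -- the merged weight that gets deployed."""
    delta = matmul(B, A)
    return [[W[i][j] + scaling * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]
```

During training only A and B (rank 8) are updated, which is why LoRA fine-tuning is cheap; merging folds them back so inference needs no extra parameters.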

Training Data

  • Train samples: 970
  • Valid samples: 108
  • Test samples: 120
  • Number of labels: 7
  • Labels: 사회, 정치, 생활문화, 세계, 경제, IT과학, 스포츠

Usage

Installation

pip install transformers torch

Quick Start

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load model
model_name = "merrybabyxmas/mindcast-topic-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Create pipeline
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Predict
text = "오늘 날씨가 정말 좋네요"  # "The weather is really nice today"
result = classifier(text)
print(result)
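If the uploaded config does not carry human-readable label names, the pipeline may return generic ids such as `LABEL_0`. A hypothetical mapping back to the Korean topic names, assuming the label order matches the list under Training Data:

```python
# Hypothetical id-to-label mapping; assumes the training label order listed above.
ID2LABEL = {
    0: "사회", 1: "정치", 2: "생활문화",
    3: "세계", 4: "경제", 5: "IT과학", 6: "스포츠",
}

def readable(prediction):
    """Map a pipeline result like {'label': 'LABEL_0', 'score': 0.93}
    to the Korean topic name; pass through results already named."""
    label = prediction["label"]
    if label.startswith("LABEL_"):
        label = ID2LABEL[int(label.split("_")[1])]
    return {"label": label, "score": prediction["score"]}

print(readable({"label": "LABEL_0", "score": 0.93}))  # {'label': '사회', 'score': 0.93}
```

If the config does include an `id2label` mapping, `classifier` already returns named labels and this helper is a no-op.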

Model Architecture

  • Base Model: klue/roberta-base
  • Task: Sequence Classification
  • Number of Labels: 7

Citation

If you use this model, please cite:

@misc{mindcast-model,
  author = {Mindcast Team},
  title = {Mindcast Topic Classifier},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/merrybabyxmas/mindcast-emotion-sc-only}},
}

Contact

For questions or feedback, please open an issue on the model repository.


This model card was automatically generated.

Model size: 0.1B params (F32, Safetensors)