---
license: other
license_name: exaone
license_link: https://huggingface.co/LGAI-EXAONE/EXAONE-Deep-2.4B/blob/main/LICENSE
base_model: LGAI-EXAONE/EXAONE-4.0-1.2B
tags:
  - exaone
  - lora
  - finetune
  - korean
  - tagger
  - text-classification
  - text-generation
library_name: transformers
---

# EXAONE-4.0-1.2B Tagger (Merged)

This repository contains a **merged** checkpoint of:
- **Base**: `LGAI-EXAONE/EXAONE-4.0-1.2B`
- **LoRA fine-tune**: a lightweight SFT adapter trained to behave as a **Korean tag generator**.

The model is designed to output **a JSON array of 3–10 high-level tags** for a given Korean sentence.

GGUF: https://huggingface.co/FloatDo/exaone-4.0-1.2b-float-right-tagger-GGUF

## Intended Behavior

Given an input sentence, the model should output **ONLY** a JSON array:
- 3–10 tags
- high-level topics (not overly detailed)
- no underscores `_`
- **no extra text** (ideally)

In practice, some runs may emit extra text (e.g., reasoning markers).  
For production, parse the first JSON array from the output.
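
The output contract above can be enforced with a small post-processor. A minimal sketch (`parse_and_validate` is an illustrative helper name; the limits mirror the rules listed in this section):

```python
import re
import json

def parse_and_validate(raw: str):
    """Extract the first JSON array from raw model output and check the
    contract stated above: 3-10 string tags, no underscores.
    Returns the tag list, or None if the output violates the contract."""
    # Grab the first "[...]" span (non-greedy, matches across newlines).
    m = re.search(r"\[[\s\S]*?\]", raw)
    if not m:
        return None
    try:
        tags = json.loads(m.group(0))
    except json.JSONDecodeError:
        return None
    if (isinstance(tags, list) and 3 <= len(tags) <= 10
            and all(isinstance(t, str) and "_" not in t for t in tags)):
        return tags
    return None

# Some runs wrap the array in reasoning text; the extractor skips it.
print(parse_and_validate('<thought>...</thought> ["직장", "스트레스", "퇴사"]'))
```

Outputs that fail any rule (too few tags, underscores, malformed JSON) come back as `None`, so callers can retry or fall back instead of crashing on `json.loads`.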

## Quick Start (Transformers)

```python
import re, json, torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL = "<this_repo_or_local_path>"

def extract_first_json_array(s: str):
    # Grab the first "[...]" span (non-greedy, matches across newlines).
    m = re.search(r"\[[\s\S]*?\]", s)
    return json.loads(m.group(0)) if m else None

tok = AutoTokenizer.from_pretrained(MODEL, trust_remote_code=True, use_fast=True)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token

model = AutoModelForCausalLM.from_pretrained(
    MODEL, trust_remote_code=True, torch_dtype="auto", device_map="cuda"
).eval()

messages = [
  {"role":"system","content":"너는 태그 생성기다. 반드시 JSON 배열만 출력한다. 다른 글자 금지."},
  {"role":"user","content":"규칙: 태그 3~10개, 큰 주제, 언더스코어 금지, JSON 배열만. 문장: 직장 상사가 계속 야근을 시켜서 스트레스 받는다. 퇴사 고민 중."}
]

prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
enc = tok(prompt, return_tensors="pt").to("cuda")

out = model.generate(**enc, max_new_tokens=64, do_sample=False,
                     pad_token_id=tok.pad_token_id, eos_token_id=tok.eos_token_id)

text = tok.decode(out[0], skip_special_tokens=True)
tags = extract_first_json_array(text)
print("RAW:", text)
print("TAGS:", tags)


Training Notes
	•	This is not a general chat model tuning.
	•	The objective is to improve consistency of tag-only outputs for Korean input.
	•	If you need strict JSON-only output, use a post-processor that extracts the first JSON array.

Quantization / GGUF

A GGUF / quantized release may be provided separately.