---
license: other
license_name: exaone
license_link: https://huggingface.co/LGAI-EXAONE/EXAONE-Deep-2.4B/blob/main/LICENSE
base_model: LGAI-EXAONE/EXAONE-4.0-1.2B
tags:
- exaone
- lora
- finetune
- korean
- tagger
- text-classification
- text-generation
library_name: transformers
---

# EXAONE-4.0-1.2B Tagger (Merged)

This repository contains a **merged** checkpoint of:
- **Base**: `LGAI-EXAONE/EXAONE-4.0-1.2B`
- **LoRA fine-tune**: a lightweight SFT adapter trained to behave as a **Korean tag generator**.

The model is designed to output **a JSON array of 3–10 high-level tags** for a given Korean sentence.

GGUF: https://huggingface.co/FloatDo/exaone-4.0-1.2b-float-right-tagger-GGUF

|
## Intended Behavior

Given an input sentence, the model should output **ONLY** a JSON array:
- 3–10 tags
- high-level topics (not overly detailed)
- no underscores `_`
- **no extra text** (ideally)

In practice, some runs may emit extra text (e.g., reasoning markers).
For production, parse the first JSON array from the output.
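The rules above can also be checked mechanically after parsing. A minimal validator sketch (the name `validate_tags` is illustrative, not part of this release):

```python
def validate_tags(tags) -> bool:
    """Check a parsed tag list against the rules above:
    3-10 entries, all non-empty strings, no underscores."""
    if not isinstance(tags, list) or not (3 <= len(tags) <= 10):
        return False
    return all(isinstance(t, str) and t and "_" not in t for t in tags)

print(validate_tags(["직장", "스트레스", "퇴사"]))   # → True
print(validate_tags(["only", "two"]))                # → False (too few tags)
print(validate_tags(["has_underscore", "a", "b"]))   # → False (underscore)
```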

## Quick Start (Transformers)

```python
import re, json, torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL = "<this_repo_or_local_path>"

def extract_first_json_array(s: str):
    # Grab the first [...] span and parse it as JSON.
    m = re.search(r"\[[\s\S]*?\]", s)
    return json.loads(m.group(0)) if m else None

tok = AutoTokenizer.from_pretrained(MODEL, trust_remote_code=True, use_fast=True)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token

model = AutoModelForCausalLM.from_pretrained(
    MODEL, trust_remote_code=True, torch_dtype="auto", device_map="cuda"
).eval()

messages = [
    # System: "You are a tag generator. Output only a JSON array. No other characters."
    {"role": "system", "content": "너는 태그 생성기다. 반드시 JSON 배열만 출력한다. 다른 글자 금지."},
    # User: "Rules: 3-10 tags, broad topics, no underscores, JSON array only.
    #        Sentence: My boss keeps making me work overtime and I'm stressed. Thinking about quitting."
    {"role": "user", "content": "규칙: 태그 3~10개, 큰 주제, 언더스코어 금지, JSON 배열만. 문장: 직장 상사가 계속 야근을 시켜서 스트레스 받는다. 퇴사 고민 중."},
]

prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
enc = tok(prompt, return_tensors="pt").to("cuda")

# Greedy decoding; temperature is not needed when do_sample=False.
out = model.generate(**enc, max_new_tokens=64, do_sample=False,
                     pad_token_id=tok.pad_token_id, eos_token_id=tok.eos_token_id)

text = tok.decode(out[0], skip_special_tokens=True)
tags = extract_first_json_array(text)
print("RAW:", text)
print("TAGS:", tags)
```

## Training Notes

- This is not a general chat-model tuning.
- The objective is to improve the consistency of tag-only outputs for Korean input.
- If you need strict JSON-only output, use a post-processor that extracts the first JSON array.

## Quantization / GGUF

A GGUF / quantized release may be provided separately.
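For the GGUF build linked above, a hedged usage sketch with `llama-cpp-python`. The local file name `exaone-tagger.gguf` and the helper `tags_from_completion` are placeholders introduced here for illustration, not part of this release:

```python
import json
import re

def tags_from_completion(resp: dict):
    """Pull the first JSON array out of a llama.cpp chat-completion dict."""
    text = resp["choices"][0]["message"]["content"]
    m = re.search(r"\[[\s\S]*?\]", text)
    return json.loads(m.group(0)) if m else None

if __name__ == "__main__":
    from llama_cpp import Llama  # pip install llama-cpp-python

    # Placeholder path: substitute the actual file from the GGUF repo.
    llm = Llama(model_path="exaone-tagger.gguf", n_ctx=2048, verbose=False)
    resp = llm.create_chat_completion(
        messages=[
            # System: "You are a tag generator. Output only a JSON array."
            {"role": "system", "content": "너는 태그 생성기다. 반드시 JSON 배열만 출력한다."},
            # User: "Sentence: My boss keeps making me work overtime and I'm stressed."
            {"role": "user", "content": "문장: 직장 상사가 계속 야근을 시켜서 스트레스 받는다."},
        ],
        max_tokens=64,
        temperature=0.0,
    )
    print(tags_from_completion(resp))
```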