Spaces: dev-strender/proofread-20261h-demo
dev-strender Claude Opus 4.7 (1M context) committed 25 days ago

Commit c318017

1 Parent(s): 7cd2a55

feat(title-proofread): address customer feedback round 1

- Add Chosun style-book thousands-comma rule (PRIORITY 0, MANDATORY).
  Examples: 7870 -> 7,870; 1만2000 -> 1만2,000.
  Year/code exceptions documented (2030, KF-21).
- Strip stray </think> tokens from model output.
  Handles both paired <think>...</think> and orphan </think> patterns
  occasionally emitted by solar-pro2.
- Lock model to solar-pro2 (production extension target) and drop the
  pro3 comparison column. Single-column result UI only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Files changed (3)

title_proofread/prompts/prompt_dev_v1/system.txt +24 -1
title_proofread/runner.py +36 -14
title_proofread/ui.py +28 -68

title_proofread/prompts/prompt_dev_v1/system.txt

@@ -21,6 +21,28 @@
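 # CRITICAL preservation rules (in PRIORITY order)
 ## PRIORITY 1: Preserve meaning and structure absolutely (MANDATORY)
 NEVER do any of the following:
@@ -42,8 +64,9 @@ NEVER do any of the following:
   → Do not spell them out. NEVER `高수익` → `고수익`
 - **Ellipsis `…`**: intentional punctuation that links a fact to its implication or conclusion
   → Keep its position and count unchanged
-- **Numeric notation format**: `40조원`, `1만9585건`, `22만8000명`, `30%`, `2배`
   → Do not change the mixed Hangul-unit notation (만·억·조); do not change units or symbols (`%`·`배`·`명`·`원`)
 - **Never convert quotation-mark types**: swapping double ↔ single quotes changes the meaning completely
 ## PRIORITY 3: No adding or deleting information (MANDATORY)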

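 # CRITICAL preservation rules (in PRIORITY order)
+## PRIORITY 0: Mandatory Chosun style-book corrections (MANDATORY: *active correction*, not preservation)
+The following patterns violate the Chosun Ilbo style book and **must be corrected**. This takes precedence over the preservation rules.
+### Thousands comma (Arabic numerals with 4 or more digits)
+If a number with 4 or more digits is missing commas in the original, *always* add thousands commas:
+- `7870` → `7,870`
+- `12345` → `12,345`
+- `1234567` → `1,234,567`
+Apply the same rule to digits that follow a Hangul unit (만/억/조):
+- `1만2000` → `1만2,000`
+- `5억3000만` → `5억3,000만`
+- `2만8000명` → `2만8,000명`
+**Exceptions (do not add commas)**:
+- Years: `2030`, `1980`, `2024학번`
+- Codes, numbers, and identifiers: `KF-21`, `B-1`, `5G`
+Applies to every number that *precedes* a unit such as `% / %p / 배 / 명 / 원 / 건 / km / kg`.
 ## PRIORITY 1: Preserve meaning and structure absolutely (MANDATORY)
 NEVER do any of the following:
   → Do not spell them out. NEVER `高수익` → `고수익`
 - **Ellipsis `…`**: intentional punctuation that links a fact to its implication or conclusion
   → Keep its position and count unchanged
+- **Numeric notation format**: `40조원`, `1만9,585건`, `22만8,000명`, `30%`, `2배`
   → Do not change the mixed Hangul-unit notation (만·억·조); do not change units or symbols (`%`·`배`·`명`·`원`)
+  → The thousands-comma rule is a mandatory correction under *PRIORITY 0* above, so *if the original is missing commas, always add them*
 - **Never convert quotation-mark types**: swapping double ↔ single quotes changes the meaning completely
 ## PRIORITY 3: No adding or deleting information (MANDATORY)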
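
The PRIORITY 0 rule is enforced through the prompt only, so it may be useful to have a deterministic check for the model's output. The following is a minimal sketch, not part of this commit: `add_thousands_commas` is a hypothetical helper, and the year heuristic (a bare 4-digit value in 1900-2099 is a year unless it follows 만/억/조 or directly precedes a unit) is an assumption that the style book itself does not state.

```python
import re

# Hypothetical helper, not part of the commit: a deterministic approximation
# of the PRIORITY 0 thousands-comma rule from system.txt.
# Digit runs glued to letters or hyphens (codes such as KF-21) are skipped.
_DIGIT_RUN = re.compile(r"(?<![\d,.A-Za-z-])\d{4,}(?![\d,.])")
_HANGUL_UNITS = ("만", "억", "조")
_COUNT_UNITS = ("%", "배", "명", "원", "건", "km", "kg") + _HANGUL_UNITS


def _group(digits: str) -> str:
    """Insert a comma before every trailing group of three digits."""
    return re.sub(r"(?<=\d)(?=(?:\d{3})+$)", ",", digits)


def add_thousands_commas(title: str) -> str:
    def _fix(m: re.Match) -> str:
        digits = m.group(0)
        prev = title[m.start() - 1] if m.start() > 0 else ""
        rest = title[m.end():]
        # Assumption: a bare 4-digit value in 1900-2099 is a year unless it
        # follows 만/억/조 or is directly followed by a unit.
        year_like = len(digits) == 4 and 1900 <= int(digits) <= 2099
        if (
            year_like
            and prev not in _HANGUL_UNITS
            and not rest.startswith(_COUNT_UNITS)
        ):
            return digits
        return _group(digits)

    return _DIGIT_RUN.sub(_fix, title)
```

Comparing `add_thousands_commas(original)` against the model output would flag titles where the model skipped a mandatory comma. Ambiguous cases such as a count that happens to fall in the year range would still need human review.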

title_proofread/runner.py

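@@ -1,18 +1,40 @@
-"""Single-call runner for title proofreading, for comparing solar-pro2 / solar-pro3.
-All prompt arguments are passed in from outside so the UI can edit the system and user prompts directly.
-The model is specified as a call argument (the UI calls both models in parallel).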
 """
 from __future__ import annotations
 import time
 from pathlib import Path
 from typing import Any
-MODELS = ["solar-pro2", "solar-pro3"]
 DEFAULT_PROMPT_DIR = Path(__file__).resolve().parent / "prompts" / "prompt_dev_v1"
 def load_default_prompts() -> tuple[str, str]:
     """Return system.txt + user.txt from `prompt_dev_v1` unchanged."""
@@ -33,28 +55,27 @@ def run_title_proofread(
     category: str,
     system_prompt: str,
     user_template: str,
-    model: str,
     temperature: float = 0.0,
     reasoning_effort: str = "low",
     max_tokens: int = 2000,
 ) -> dict[str, Any]:
-    """Single LLM call. The UI calls it once per model (in parallel).
     Returns:
         {
-          "output": str,        # model response (stripped)
-          "user_message": str,  # actual user content after placeholder substitution (for debugging)
           "model": str,
           "latency_ms": int,
           "usage": dict,        # {prompt_tokens, completion_tokens, total_tokens}
-          "error": str | None,  # message on error
         }
     """
     user_msg = render_user_message(user_template, original, category)
     start = time.time()
     try:
         kwargs: dict[str, Any] = {
-            "model": model,
             "messages": [
                 {"role": "system", "content": system_prompt},
                 {"role": "user", "content": user_msg},
@@ -69,14 +90,15 @@ def run_title_proofread(
         return {
             "output": "",
             "user_message": user_msg,
-            "model": model,
             "latency_ms": int((time.time() - start) * 1000),
             "usage": {},
             "error": f"{type(exc).__name__}: {exc}",
         }
     elapsed_ms = int((time.time() - start) * 1000)
-    raw = (resp.choices[0].message.content or "").strip()
     usage = getattr(resp, "usage", None)
     usage_dict: dict[str, int] = {}
     if usage:
@@ -86,9 +108,9 @@ def run_title_proofread(
                 usage_dict[k] = v
     return {
-        "output": raw,
         "user_message": user_msg,
-        "model": model,
         "latency_ms": elapsed_ms,
         "usage": usage_dict,
         "error": None,

+"""Single-call runner for title proofreading, pinned to solar-pro2.
+The production environment (extension) uses solar-pro2, so the demo is pinned to the same model.
 """
 from __future__ import annotations
+import re
 import time
 from pathlib import Path
 from typing import Any
+MODEL = "solar-pro2"
 DEFAULT_PROMPT_DIR = Path(__file__).resolve().parent / "prompts" / "prompt_dev_v1"
+# solar-pro2 occasionally emits a reasoning trace before the response body and closes it
+# with `</think>` (normally a paired `<think>...</think>`, but unpaired cases occur). These
+# also slip past the upstage provider's paired-tag strip, so the demo removes them deterministically.
+_PAIRED_THINK = re.compile(r"<think>.*?</think>", re.DOTALL)
+_ORPHAN_THINK_PREFIX = re.compile(r"^.*?</think>\s*", re.DOTALL)
+def _strip_think(raw: str) -> str:
+    """Remove `<think>...</think>` blocks and everything before an unpaired `</think>`.
+    Guard order:
+      1. Remove paired `<think>...</think>` blocks
+      2. If a `</think>` still remains, treat everything before its first occurrence
+         as reasoning trace and cut it off (the most common leak pattern)
+      3. Remove any leftover `<think>` / `</think>` tokens
+    """
+    s = _PAIRED_THINK.sub("", raw)
+    if "</think>" in s:
+        s = _ORPHAN_THINK_PREFIX.sub("", s, count=1)
+    s = s.replace("</think>", "").replace("<think>", "")
+    return s.strip()
 def load_default_prompts() -> tuple[str, str]:
     """Return system.txt + user.txt from `prompt_dev_v1` unchanged."""
     category: str,
     system_prompt: str,
     user_template: str,
     temperature: float = 0.0,
     reasoning_effort: str = "low",
     max_tokens: int = 2000,
 ) -> dict[str, Any]:
+    """Single LLM call. The model is always `solar-pro2`.
     Returns:
         {
+          "output": str,        # model response (after strip + think-token removal)
+          "user_message": str,  # actual user content after placeholder substitution
           "model": str,
           "latency_ms": int,
           "usage": dict,        # {prompt_tokens, completion_tokens, total_tokens}
+          "error": str | None,
         }
     """
     user_msg = render_user_message(user_template, original, category)
     start = time.time()
     try:
         kwargs: dict[str, Any] = {
+            "model": MODEL,
             "messages": [
                 {"role": "system", "content": system_prompt},
                 {"role": "user", "content": user_msg},
         return {
             "output": "",
             "user_message": user_msg,
+            "model": MODEL,
             "latency_ms": int((time.time() - start) * 1000),
             "usage": {},
             "error": f"{type(exc).__name__}: {exc}",
         }
     elapsed_ms = int((time.time() - start) * 1000)
+    raw = resp.choices[0].message.content or ""
+    cleaned = _strip_think(raw)
     usage = getattr(resp, "usage", None)
     usage_dict: dict[str, int] = {}
     if usage:
                 usage_dict[k] = v
     return {
+        "output": cleaned,
         "user_message": user_msg,
+        "model": MODEL,
         "latency_ms": elapsed_ms,
         "usage": usage_dict,
         "error": None,

title_proofread/ui.py

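@@ -1,22 +1,19 @@
-"""Gradio UI: title-proofreading sandbox tab.
-Without exposing the system/user prompts or call metadata to the user,
-takes only the input title, category, and inference options, and compares
-solar-pro2 / solar-pro3 results side by side in left and right columns.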
 """
 from __future__ import annotations
-from concurrent.futures import ThreadPoolExecutor
 from typing import Any
 import gradio as gr
 from diff_utils import highlight_diff
-from .runner import MODELS, load_default_prompts, run_title_proofread
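 # To keep the UI simple, category is not taken as input and is fixed internally.
-# No meaningful per-category difference in model behavior was observed, so the selection UI was removed.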
 DEFAULT_CATEGORY = "일반기사"
@@ -26,13 +23,9 @@ def build_title_proofread_tab(client: Any) -> None:
     Args:
         client: openai.OpenAI 호환 클라이언트 (Upstage base_url 설정).
     """
-    # Default prompts loaded once at module init — never exposed to users.
     default_system, default_user = load_default_prompts()
-    gr.Markdown(
-        "## 제목 교열\n"
-        f"한국 신문 제목 교열 결과를 `{MODELS[0]}` 와 `{MODELS[1]}` 두 모델로 비교."
-    )
     original = gr.Textbox(
         label="제목 입력",
@@ -40,7 +33,7 @@ def build_title_proofread_tab(client: Any) -> None:
         lines=1,
     )
-    with gr.Accordion("추론 옵션 (양쪽 모델 공통 적용)", open=False):
         with gr.Row():
             temperature = gr.Slider(
                 minimum=0.0,
@@ -61,24 +54,12 @@ def build_title_proofread_tab(client: Any) -> None:
         elem_id="title-proofread-run-btn",
     )
-    # ─── Results: compare the two models in left/right columns ───
-    with gr.Row(equal_height=False):
-        with gr.Column():
-            gr.Markdown(f"### {MODELS[0]}")
-            output_pro2 = gr.Textbox(label="교정 결과", lines=2, interactive=False)
-            diff_pro2 = gr.HTML(label="원본 대비 diff")
-        with gr.Column():
-            gr.Markdown(f"### {MODELS[1]}")
-            output_pro3 = gr.Textbox(label="교정 결과", lines=2, interactive=False)
-            diff_pro3 = gr.HTML(label="원본 대비 diff")
     def _empty():
-        return (
-            gr.update(value=""),
-            gr.update(value=""),
-            gr.update(value=""),
-            gr.update(value=""),
-        )
     def _on_run(
         original_text: str,
@@ -94,43 +75,24 @@ def build_title_proofread_tab(client: Any) -> None:
         orig_clean = original_text.strip()
-        # Call both models in parallel (total latency = max(pro2, pro3)).
-        # System and user prompts are fixed to internal defaults and not exposed in the UI.
-        with ThreadPoolExecutor(max_workers=2) as ex:
-            futures = {
-                m: ex.submit(
-                    run_title_proofread,
-                    client=client,
-                    original=orig_clean,
-                    category=DEFAULT_CATEGORY,
-                    system_prompt=default_system,
-                    user_template=default_user,
-                    model=m,
-                    temperature=temperature_val,
-                    reasoning_effort=reasoning_effort_val,
-                )
-                for m in MODELS
-            }
-            results = {m: f.result() for m, f in futures.items()}
-        r2 = results[MODELS[0]]
-        r3 = results[MODELS[1]]
-        # If the model call itself fails, notify via toast and clear the results.
-        for m, r in ((MODELS[0], r2), (MODELS[1], r3)):
-            if r.get("error"):
-                gr.Warning(f"{m} 호출 실패: {r['error']}")
-        def _diff(out: str) -> str:
-            return highlight_diff(orig_clean, out) if out else ""
-        return (
-            gr.update(value=r2["output"]),
-            gr.update(value=_diff(r2["output"])),
-            gr.update(value=r3["output"]),
-            gr.update(value=_diff(r3["output"])),
         )
     run_btn.click(
         _on_run,
         inputs=[
@@ -139,9 +101,7 @@ def build_title_proofread_tab(client: Any) -> None:
             reasoning_effort,
         ],
         outputs=[
-            output_pro2,
-            diff_pro2,
-            output_pro3,
-            diff_pro3,
         ],
     )

+"""Gradio UI: title-proofreading sandbox tab (solar-pro2 only).
+Pinned to the same model (solar-pro2) as the production extension. The system/user prompts
+use internal defaults and are not exposed in the UI. Call metadata is not shown either.
 """
 from __future__ import annotations
 from typing import Any
 import gradio as gr
 from diff_utils import highlight_diff
+from .runner import MODEL, load_default_prompts, run_title_proofread
 # To keep the UI simple, category is not taken as input and is fixed internally.
 DEFAULT_CATEGORY = "일반기사"
     Args:
         client: openai.OpenAI 호환 클라이언트 (Upstage base_url 설정).
     """
     default_system, default_user = load_default_prompts()
+    gr.Markdown(f"## 제목 교열 ({MODEL})")
     original = gr.Textbox(
         label="제목 입력",
         lines=1,
     )
+    with gr.Accordion("추론 옵션", open=False):
         with gr.Row():
             temperature = gr.Slider(
                 minimum=0.0,
         elem_id="title-proofread-run-btn",
     )
+    # ─── Results: single column ───
+    output = gr.Textbox(label="교정 결과", lines=2, interactive=False)
+    diff_html = gr.HTML(label="원본 대비 diff")
     def _empty():
+        return gr.update(value=""), gr.update(value="")
     def _on_run(
         original_text: str,
         orig_clean = original_text.strip()
+        result = run_title_proofread(
+            client=client,
+            original=orig_clean,
+            category=DEFAULT_CATEGORY,
+            system_prompt=default_system,
+            user_template=default_user,
+            temperature=temperature_val,
+            reasoning_effort=reasoning_effort_val,
         )
+        if result.get("error"):
+            gr.Warning(f"{MODEL} 호출 실패: {result['error']}")
+            return _empty()
+        out_text = result["output"]
+        diff = highlight_diff(orig_clean, out_text) if out_text else ""
+        return gr.update(value=out_text), gr.update(value=diff)
     run_btn.click(
         _on_run,
         inputs=[
             reasoning_effort,
         ],
         outputs=[
+            output,
+            diff_html,
         ],
     )