#!/usr/bin/env python3
"""
prepare_preference_combined.py: preference data merging and format normalization script.

Phase 0F: ORPO pipeline preparation

Input directory: data/preference/
Output file: data/preference/combined_preference.jsonl

Supported formats:
  - {prompt, chosen, rejected} (standard DPO/ORPO format)
  - {question, chosen, rejected, [system]} (heegyu, kuotient orca-math family)
  - {instruction, chosen, rejected} (instruction key variant)
  - {orig_instruction, orig_response_A/B, orig_preference} (nayohan preference-collection)
  - {prompt, response_a, response_b, preferred} (response_a/b + preferred key)
  - {prompt, response_a, response_b, winner} (winner key variant)
  - {instruction, preferred, dispreferred} (preferred/dispreferred keys)
  - {prompt, winning_response, losing_response} (Ultrafeedback family)
  - {conversations, chosen, rejected} (conversations list format)

Quality filters:
  - chosen and rejected are both non-empty
  - chosen != rejected
  - chosen is at least 20 characters long
  - prompt is non-empty (ORPO requires a prompt)

Usage:
    python data/prepare_preference_combined.py [--input_dir data/preference] [--output data/preference/combined_preference.jsonl]
"""
from __future__ import annotations

import argparse
import json
import logging
import sys
from pathlib import Path
from typing import Optional

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)
log = logging.getLogger(__name__)


# ---------------------------------------------------------------------------
# Field-name auto-detection logic
# ---------------------------------------------------------------------------
def _extract_text(val) -> str:
    """Return a str stripped; for a list (conversations format), join each item's content."""
    if val is None:
        # Avoid turning a missing value into the literal string "None".
        return ""
    if isinstance(val, str):
        return val.strip()
    if isinstance(val, list):
        # Shape: [{"role": ..., "content": ...}, ...]
        parts = []
        for item in val:
            if isinstance(item, dict):
                content = item.get("content") or item.get("value") or item.get("text") or ""
                parts.append(str(content))
            else:
                parts.append(str(item))
        return "\n".join(parts).strip()
    if isinstance(val, dict):
        return str(val.get("content") or val.get("value") or val.get("text") or "").strip()
    return str(val).strip()


def _build_prompt(record: dict) -> str:
    """Extract the prompt string from a record."""
    # Standard prompt keys
    for key in ("prompt", "instruction", "question", "input", "user_prompt", "orig_instruction"):
        if key in record and record[key]:
            val = _extract_text(record[key])
            if val:
                # Prepend the system field when it is present
                system = str(record.get("system") or "").strip()
                if system:
                    return f"{system}\n{val}"
                return val
    # conversations format: use the first human turn
    if "conversations" in record:
        convs = record["conversations"]
        if isinstance(convs, list):
            for item in convs:
                if not isinstance(item, dict):
                    continue
                role = (item.get("role") or item.get("from") or "").lower()
                if role in ("human", "user"):
                    return _extract_text(item.get("content") or item.get("value") or "")
    return ""


def normalize_record(record: dict, source_name: str) -> Optional[dict]:
    """
    Normalize a single record to {prompt, chosen, rejected}.

    Returns None when the record cannot be converted.
    """
    chosen = ""
    rejected = ""
    # --- Pattern 1: standard {chosen, rejected} ---
    if "chosen" in record and "rejected" in record:
        chosen = _extract_text(record["chosen"])
        rejected = _extract_text(record["rejected"])
    # --- Pattern 2: nayohan preference-collection (orig_preference + orig_response_A/B) ---
    elif "orig_preference" in record:
        resp_a = _extract_text(record.get("orig_response_A", record.get("response_A", "")))
        resp_b = _extract_text(record.get("orig_response_B", record.get("response_B", "")))
        pref = str(record.get("orig_preference", "")).strip().upper()
        if pref == "B":
            chosen, rejected = resp_b, resp_a
        else:
            chosen, rejected = resp_a, resp_b
    # --- Pattern 3: preferred/dispreferred ---
    elif "preferred" in record and "dispreferred" in record:
        chosen = _extract_text(record["preferred"])
        rejected = _extract_text(record["dispreferred"])
    # --- Pattern 4: response_a/b + preferred or winner key ---
    elif "response_a" in record and "response_b" in record:
        resp_a = _extract_text(record["response_a"])
        resp_b = _extract_text(record["response_b"])
        winner_key = record.get("preferred") or record.get("winner") or ""
        winner = str(winner_key).strip().lower()
        if winner in ("b", "response_b", "model_b"):
            chosen, rejected = resp_b, resp_a
        else:
            # Default: A is chosen
            chosen, rejected = resp_a, resp_b
    # --- Pattern 5: winning_response / losing_response (Ultrafeedback family) ---
    elif "winning_response" in record and "losing_response" in record:
        chosen = _extract_text(record["winning_response"])
        rejected = _extract_text(record["losing_response"])
    # --- Pattern 6: completions list (some HH-RLHF variants) ---
    elif "completions" in record:
        completions = record["completions"]
        if isinstance(completions, list) and len(completions) >= 2:
            # Sort in descending order by rating (or score) when present
            def rating(c):
                if not isinstance(c, dict):
                    return 0
                # "or 0" keeps an explicit None rating from breaking the sort
                return c.get("rating", c.get("score", 0)) or 0

            sorted_c = sorted(completions, key=rating, reverse=True)
            best, worst = sorted_c[0], sorted_c[-1]
            chosen = _extract_text(best.get("text", best) if isinstance(best, dict) else best)
            rejected = _extract_text(worst.get("text", worst) if isinstance(worst, dict) else worst)
    else:
        return None  # Unknown format
    prompt = _build_prompt(record)
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}


# ---------------------------------------------------------------------------
# Quality filter
# ---------------------------------------------------------------------------
MIN_LEN = 20


def passes_quality_filter(record: dict) -> bool:
    """Quality filter: prompt, chosen, and rejected are non-empty, chosen differs from rejected, and chosen meets the minimum length."""
    prompt = record.get("prompt", "")
    chosen = record.get("chosen", "")
    rejected = record.get("rejected", "")
    if not chosen or not rejected:
        return False
    if chosen == rejected:
        return False
    if len(chosen) < MIN_LEN:
        return False
    if not prompt:
        # ORPO requires a prompt, so records without one are excluded.
        return False
    return True


# ---------------------------------------------------------------------------
# Per-file loader
# ---------------------------------------------------------------------------
def load_jsonl(path: Path):
    """Generator that parses a JSONL file line by line."""
    with path.open("r", encoding="utf-8") as f:
        for lineno, line in enumerate(f, 1):
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError as e:
                # Malformed lines are logged and skipped rather than aborting the run.
                log.warning(f"  JSON parse error {path.name}:{lineno}: {e}")


def process_file(src_path: Path, out_f, stats: dict) -> None:
    """Read one JSONL file, normalize it, and write the result to out_f. Updates the stats dict."""
    source_name = src_path.stem
    loaded = 0
    written = 0
    skipped_format = 0
    skipped_quality = 0
    log.info(f"  Loading: {src_path.name}")
    for record in load_jsonl(src_path):
        loaded += 1
        normalized = normalize_record(record, source_name)
        if normalized is None:
            skipped_format += 1
            continue
        if not passes_quality_filter(normalized):
            skipped_quality += 1
            continue
        out_f.write(json.dumps(normalized, ensure_ascii=False) + "\n")
        written += 1
    log.info(
        f"  {source_name}: loaded {loaded:,} -> format skips {skipped_format:,} -> quality skips {skipped_quality:,} -> written {written:,}"
    )
    stats[source_name] = {
        "loaded": loaded,
        "skipped_format": skipped_format,
        "skipped_quality": skipped_quality,
        "written": written,
    }


# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------
# Files to process (fixed order for reproducibility)
TARGET_FILES = [
    "heegyu_orca-math-korean-preference-cleaned.jsonl",
    "kuotient_orca-math-korean-dpo-pairs.jsonl",
    "nayohan_preference-collection-ko-full.jsonl",
    "maywell_ko_Ultrafeedback_binarized.jsonl",
    "jojo0217_korean_rlhf_dataset.jsonl",
    "lemon-mint_korean-realqa-reasoning-v01-preference.jsonl",
    "tellang_yeji-preference-ko-v1.jsonl",
]


def main():
    parser = argparse.ArgumentParser(
        description="Preference data merging + format normalization (ORPO-compatible)"
    )
    parser.add_argument(
        "--input_dir",
        type=str,
        default="data/preference",
        help="Input directory (default: data/preference)",
    )
    parser.add_argument(
        "--output",
        type=str,
        default="data/preference/combined_preference.jsonl",
        help="Output file path",
    )
    parser.add_argument(
        "--include_all",
        action="store_true",
        help="Also include .jsonl files that are not listed in TARGET_FILES",
    )
    args = parser.parse_args()
    input_dir = Path(args.input_dir)
    output_path = Path(args.output)
    if not input_dir.is_dir():
        log.error(f"Input directory not found: {input_dir}")
        sys.exit(1)
    # Decide which files to process
    if args.include_all:
        src_files = sorted(input_dir.glob("*.jsonl"))
        # Exclude the output file (combined_preference.jsonl) itself
        src_files = [f for f in src_files if f.name != output_path.name]
    else:
        src_files = []
        for fname in TARGET_FILES:
            p = input_dir / fname
            if p.exists():
                src_files.append(p)
            else:
                log.warning(f"File not found (skipping): {p}")
    if not src_files:
        log.error("No JSONL files to process.")
        sys.exit(1)
    output_path.parent.mkdir(parents=True, exist_ok=True)
    log.info("=" * 60)
    log.info("Phase 0F: Preference data merging")
    log.info(f"  Input file count: {len(src_files)}")
    log.info(f"  Output file     : {output_path}")
    log.info(f"  Minimum length  : {MIN_LEN} chars")
    log.info("=" * 60)
    stats: dict = {}
    total_written = 0
    with output_path.open("w", encoding="utf-8") as out_f:
        for src_path in src_files:
            process_file(src_path, out_f, stats)
            total_written += stats.get(src_path.stem, {}).get("written", 0)
    # Final statistics summary
    log.info("")
    log.info("=" * 60)
    log.info("Final statistics summary")
    log.info("=" * 60)
    log.info(f"{'Dataset':<50} {'Loaded':>8} {'FmtSkip':>8} {'QualSkip':>8} {'Written':>8}")
    log.info("-" * 86)
    grand_loaded = 0
    grand_fmt_skip = 0
    grand_qual_skip = 0
    for name, s in stats.items():
        log.info(
            f"{name:<50} {s['loaded']:>8,} {s['skipped_format']:>8,} {s['skipped_quality']:>8,} {s['written']:>8,}"
        )
        grand_loaded += s["loaded"]
        grand_fmt_skip += s["skipped_format"]
        grand_qual_skip += s["skipped_quality"]
    log.info("-" * 86)
    log.info(
        f"{'Total':<50} {grand_loaded:>8,} {grand_fmt_skip:>8,} {grand_qual_skip:>8,} {total_written:>8,}"
    )
    log.info("=" * 60)
    log.info(f"Output complete: {output_path} ({total_written:,} records)")


if __name__ == "__main__":
    main()
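
The normalize-then-filter flow of the script can be tried on a few in-memory records. The sketch below is a simplified stand-in, not the script itself: `to_pair` and `keep` are hypothetical helpers that cover only two of the supported layouts (`chosen`/`rejected` and `response_a`/`response_b` with a `winner` key), and the sample records are invented for illustration.

```python
import json
from typing import Optional

MIN_LEN = 20  # same minimum chosen length as the script's quality filter


def to_pair(record: dict) -> Optional[dict]:
    """Hypothetical, reduced version of the normalization step (two layouts only)."""
    if "chosen" in record and "rejected" in record:
        chosen, rejected = record["chosen"], record["rejected"]
    elif "response_a" in record and "response_b" in record:
        a, b = record["response_a"], record["response_b"]
        winner = str(record.get("winner", "")).strip().lower()
        # B wins only when the winner key says so; otherwise A is chosen.
        chosen, rejected = (b, a) if winner in ("b", "response_b", "model_b") else (a, b)
    else:
        return None  # unknown layout
    prompt = record.get("prompt") or record.get("question") or ""
    return {"prompt": prompt.strip(), "chosen": chosen.strip(), "rejected": rejected.strip()}


def keep(pair: dict) -> bool:
    """Hypothetical, reduced version of the quality filter."""
    return bool(
        pair["prompt"]
        and pair["chosen"]
        and pair["rejected"]
        and pair["chosen"] != pair["rejected"]
        and len(pair["chosen"]) >= MIN_LEN
    )


# Invented sample records, one per case of interest.
samples = [
    {"question": "What is 2 + 3?",
     "chosen": "2 + 3 equals 5, because adding three to two gives five.",
     "rejected": "It is 6."},
    {"prompt": "Name a primary color.",
     "response_a": "Green-ish.",
     "response_b": "Red is one of the primary colors.",
     "winner": "model_b"},
    {"prompt": "Say hi.", "chosen": "Hi!", "rejected": "Go away."},  # chosen is too short
    {"text": "unrelated row"},  # unknown layout
]

pairs = [to_pair(r) for r in samples]
kept = [p for p in pairs if p is not None and keep(p)]
for p in kept:
    print(json.dumps(p, ensure_ascii=False))
```

Only the first two samples survive: the third fails the minimum-length check on `chosen`, and the fourth has no recognizable layout, which corresponds to the script's separate "format skip" and "quality skip" counters.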