Instructions to use kistepAI/SPARK-Summarization-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use kistepAI/SPARK-Summarization-GGUF with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="kistepAI/SPARK-Summarization-GGUF",
	filename="kistep-mistral-nemo-summarization-bf16.gguf",
)

llm.create_chat_completion(
	messages = "\"The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct.\""
)

Notebooks
Google Colab
Kaggle
Local Apps Settings

llama.cpp

How to use kistepAI/SPARK-Summarization-GGUF with llama.cpp:

Install (macOS, Linux)

curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf kistepAI/SPARK-Summarization-GGUF:BF16
# Run inference directly in the terminal:
llama cli -hf kistepAI/SPARK-Summarization-GGUF:BF16

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf kistepAI/SPARK-Summarization-GGUF:BF16
# Run inference directly in the terminal:
llama cli -hf kistepAI/SPARK-Summarization-GGUF:BF16

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf kistepAI/SPARK-Summarization-GGUF:BF16
# Run inference directly in the terminal:
./llama-cli -hf kistepAI/SPARK-Summarization-GGUF:BF16

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf kistepAI/SPARK-Summarization-GGUF:BF16
# Run inference directly in the terminal:
./build/bin/llama-cli -hf kistepAI/SPARK-Summarization-GGUF:BF16

Use Docker

docker model run hf.co/kistepAI/SPARK-Summarization-GGUF:BF16

LM Studio
Jan
Ollama
How to use kistepAI/SPARK-Summarization-GGUF with Ollama:
```
ollama run hf.co/kistepAI/SPARK-Summarization-GGUF:BF16
```

Unsloth Studio

How to use kistepAI/SPARK-Summarization-GGUF with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for kistepAI/SPARK-Summarization-GGUF to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for kistepAI/SPARK-Summarization-GGUF to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for kistepAI/SPARK-Summarization-GGUF to start chatting

How to use kistepAI/SPARK-Summarization-GGUF with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf kistepAI/SPARK-Summarization-GGUF:BF16

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "kistepAI/SPARK-Summarization-GGUF:BF16"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent new

How to use kistepAI/SPARK-Summarization-GGUF with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf kistepAI/SPARK-Summarization-GGUF:BF16

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default kistepAI/SPARK-Summarization-GGUF:BF16

Run Hermes

hermes

Atomic Chat new

OpenClaw new

How to use kistepAI/SPARK-Summarization-GGUF with OpenClaw:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf kistepAI/SPARK-Summarization-GGUF:BF16

Configure OpenClaw

# Install OpenClaw:
npm install -g openclaw@latest
# Register the local server and set it as the default model:
openclaw onboard --non-interactive --mode local \
  --auth-choice custom-api-key \
  --custom-base-url http://127.0.0.1:8080/v1 \
  --custom-model-id "kistepAI/SPARK-Summarization-GGUF:BF16" \
  --custom-provider-id llama-cpp \
  --custom-compatibility openai \
  --custom-text-input \
  --accept-risk \
  --skip-health

Run OpenClaw

openclaw agent --local --agent main --message "Hello from Hugging Face"

Docker Model Runner
How to use kistepAI/SPARK-Summarization-GGUF with Docker Model Runner:
```
docker model run hf.co/kistepAI/SPARK-Summarization-GGUF:BF16
```

Lemonade

How to use kistepAI/SPARK-Summarization-GGUF with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull kistepAI/SPARK-Summarization-GGUF:BF16

Run and chat with the model

lemonade run user.SPARK-Summarization-GGUF-BF16

List all available models

lemonade list

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Usage Guide

개인은 자유롭게 사용할 수 있습니다.
기업 및 기관은 비상업적 목적으로 이용해 주시기 바랍니다.
또한, 추후 협업 및 네트워크 구축을 위해 기관 정보와 AI 모델 사용 담당자 정보를 메일로 보내주시면 연락드리겠습니다.

CONTACT : kistep_ax@kistep.re.kr

Individuals are free to use this without restrictions.
For companies and institutions, please use it for non-commercial purposes.
Additionally, to facilitate future collaboration and network building, please send us an email with your institution's information and the contact details of the person responsible for using the AI model. We will get in touch with you.

1. Description

SPARK-Summarization is a large language model developed by the Korea Institute of S&T Evaluation and Planning (KISTEP). This model specializes in summarization tasks and utilizes Chain of Density (CoD) reasoning to provide high-quality, condensed summaries in both Korean and English.

2. Key Features

Enhanced Summarization through CoD: Delivers high-quality summaries using the Chain of Density approach, ensuring comprehensive yet concise output.
Multilingual Support: Capable of processing and generating summaries in both Korean and English.
Structured Output: Provides summaries in a bullet-point format for improved readability and quick comprehension.
Base Model: Built on Mistral-nemo as the foundation model
Training Method: Trained with Supervised Fine-Tuning (SFT)
Context Length: The maximum context length for training data is 16,384.

3. Data

source	KISTEP Documents
count	24,417

4. Usage

When using ollama, you can utilize the Modelfile.
Recommended Prompt Template (input: {TITLE}, {DOCUMENT})

propmt_template: |
    당신은 요약 전문가입니다. 주어진 텍스트를 참고하여 요약을 작성하세요.
    
    ## 요약 단계:
    1. 텍스트 분석:
        - 문서 제목과 텍스트를 주의 깊게 읽고, 문서의 주요 주제를 파악하세요.
    2. 주요 주장(key_argument) 식별:
        - 다음 질문에 답변하기: "이 텍스트의 주요 주장 또는 핵심 논점은 무엇인가?"
    3. 주요 개체(entities) 추출: 
        - 5단어 이하의 주요 개체 3개를 뽑아주세요.
    4. 요약문의 주제(title) 생성: 
        - 제공된 텍스트에 대한 간결한 한문장의 주제를 생성하세요.
    5. 요약(summary) 작성: 
        - 주요 주장과 주요 개체, 주제를 참고하여 텍스트의 주요 내용을 요약하세요.
        
    ## 향상 단계
    6. 밀도 향상:
        - 초기 요약에 포함되지 않은 1~3개의 추가 설명 개체를 식별하세요.
        - 이전 및 새 개체를 모두 통합하여 요약의 밀도가 높은 버전을 작성하세요.
    7. 중요도 평가:
        - 이전 요약에서 필수적인 부분을 강조하고 덜 중요한 부분을 줄여서 수정하세요.
        - 새 요약이 주요 주장과 밀접하게 일치하는지 확인하세요.
    8. 유창성 향상:
        - 문법, 단어 선택, 표현을 다듬어 가독성과 자연스러운 흐름을 향상시키세요.
        - 요약 세부내용의 정확성과 완전성을 유지하면서 문장 구조를 개선하세요.
    
    ## 작성 방식:
        - 문서를 소개하는 대신 요약 내용만 작성하세요.
        - 구체적인 데이터나 수치보다는 전체 흐름과 방향을 설명하세요.
        - 주어진 내용에만 기반해 객관적으로 작성하세요.
        - 한국어로 작성하되, 영어 기술 용어와 고유 명사는 그대로 사용하세요.
    
    
    ## 입력:
    ### 문서 제목:
    {TITLE}
    ### 텍스트:
    {DOCUMENT}
    ## 출력 형식:
    <reason>
    초기 주요 주장: [초기 주요 주장]
    초기 주요 개체: [초기 주요 개체 목록]
    초기 제목: [초기 제목]
    초기 요약: [초기 요약 내용]
    
    밀도 향상 단계:
    새로 추가된 주요 개체: [새로 추가된 주요 개체 목록(with bullet points)]
    사고 과정: [주요 개체 선택 및 요약 작성에 대한 설명]
    업데이트 제목: [업데이트 제목]
    업데이트 요약: [업데이트 요약 내용]
    
    중요도 평가 단계:
    사고 과정: [요약 관련성 향상을 위한 중요도 평가 및 변경된 사항에 대한 설명]
    업데이트 제목: [업데이트 제목]
    업데이트 요약: [업데이트 요약 내용]
    
    언어 유청성 단계:
    사고 과정: [언어 명확성과 유창성을 개선하기 위해 변경된 사항에 대한 설명]
    업데이트 제목: [업데이트 제목]
    Updated Summary: [요약의 각 문장 목록(with bullet points)]
    </reason>
    
    <output>
        <key_argument>[주요 주장(한국어)]</key_argument>
        <entities>[주요 개체 목록, 쉼표로 구분]</entities>
        <title>[주제(한국어)]</title>
        <summary>
            <point>[첫번째 요약 문장(한국어)]</point>
            <point>[두번째 요약 문장(한국어)]</point>
            ...
        </summary>
    </output>

5. Benchmark

TBD

Downloads last month: -

GGUF

Model size

12B params

Architecture

llama

Hardware compatibility

16-bit

Model tree for kistepAI/SPARK-Summarization-GGUF

Base model

mistralai/Mistral-Nemo-Base-2407

Finetuned

mistralai/Mistral-Nemo-Instruct-2407

Quantized

(172)

this model