How to use from the
Use from the
llama-cpp-python library
# Gated model: Login with a HF token with gated access permission
hf auth login
# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="kistepAI/SPARK-Summarization-GGUF",
	filename="kistep-mistral-nemo-summarization-bf16.gguf",
)
llm.create_chat_completion(
	messages = "\"The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct.\""
)

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Log in or Sign Up to review the conditions and access this model content.

Usage Guide

๊ฐœ์ธ์€ ์ž์œ ๋กญ๊ฒŒ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
๊ธฐ์—… ๋ฐ ๊ธฐ๊ด€์€ ๋น„์ƒ์—…์  ๋ชฉ์ ์œผ๋กœ ์ด์šฉํ•ด ์ฃผ์‹œ๊ธฐ ๋ฐ”๋ž๋‹ˆ๋‹ค.
๋˜ํ•œ, ์ถ”ํ›„ ํ˜‘์—… ๋ฐ ๋„คํŠธ์›Œํฌ ๊ตฌ์ถ•์„ ์œ„ํ•ด ๊ธฐ๊ด€ ์ •๋ณด์™€ AI ๋ชจ๋ธ ์‚ฌ์šฉ ๋‹ด๋‹น์ž ์ •๋ณด๋ฅผ ๋ฉ”์ผ๋กœ ๋ณด๋‚ด์ฃผ์‹œ๋ฉด ์—ฐ๋ฝ๋“œ๋ฆฌ๊ฒ ์Šต๋‹ˆ๋‹ค.

CONTACT : kistep_ax@kistep.re.kr

Individuals are free to use this without restrictions.
For companies and institutions, please use it for non-commercial purposes.
Additionally, to facilitate future collaboration and network building, please send us an email with your institution's information and the contact details of the person responsible for using the AI model. We will get in touch with you.

1. Description

SPARK-Summarization is a large language model developed by the Korea Institute of S&T Evaluation and Planning (KISTEP). This model specializes in summarization tasks and utilizes Chain of Density (CoD) reasoning to provide high-quality, condensed summaries in both Korean and English.

2. Key Features

  • Enhanced Summarization through CoD: Delivers high-quality summaries using the Chain of Density approach, ensuring comprehensive yet concise output.
  • Multilingual Support: Capable of processing and generating summaries in both Korean and English.
  • Structured Output: Provides summaries in a bullet-point format for improved readability and quick comprehension.
  • Base Model: Built on Mistral-nemo as the foundation model
  • Training Method: Trained with Supervised Fine-Tuning (SFT)
  • Context Length: The maximum context length for training data is 16,384.

3. Data

source KISTEP Documents
count 24,417

4. Usage

  • When using ollama, you can utilize the Modelfile.
  • Recommended Prompt Template (input: {TITLE}, {DOCUMENT})
propmt_template: |
    ๋‹น์‹ ์€ ์š”์•ฝ ์ „๋ฌธ๊ฐ€์ž…๋‹ˆ๋‹ค. ์ฃผ์–ด์ง„ ํ…์ŠคํŠธ๋ฅผ ์ฐธ๊ณ ํ•˜์—ฌ ์š”์•ฝ์„ ์ž‘์„ฑํ•˜์„ธ์š”.
    
    ## ์š”์•ฝ ๋‹จ๊ณ„:
    1. ํ…์ŠคํŠธ ๋ถ„์„:
        - ๋ฌธ์„œ ์ œ๋ชฉ๊ณผ ํ…์ŠคํŠธ๋ฅผ ์ฃผ์˜ ๊นŠ๊ฒŒ ์ฝ๊ณ , ๋ฌธ์„œ์˜ ์ฃผ์š” ์ฃผ์ œ๋ฅผ ํŒŒ์•…ํ•˜์„ธ์š”.
    2. ์ฃผ์š” ์ฃผ์žฅ(key_argument) ์‹๋ณ„:
        - ๋‹ค์Œ ์งˆ๋ฌธ์— ๋‹ต๋ณ€ํ•˜๊ธฐ: "์ด ํ…์ŠคํŠธ์˜ ์ฃผ์š” ์ฃผ์žฅ ๋˜๋Š” ํ•ต์‹ฌ ๋…ผ์ ์€ ๋ฌด์—‡์ธ๊ฐ€?"
    3. ์ฃผ์š” ๊ฐœ์ฒด(entities) ์ถ”์ถœ: 
        - 5๋‹จ์–ด ์ดํ•˜์˜ ์ฃผ์š” ๊ฐœ์ฒด 3๊ฐœ๋ฅผ ๋ฝ‘์•„์ฃผ์„ธ์š”.
    4. ์š”์•ฝ๋ฌธ์˜ ์ฃผ์ œ(title) ์ƒ์„ฑ: 
        - ์ œ๊ณต๋œ ํ…์ŠคํŠธ์— ๋Œ€ํ•œ ๊ฐ„๊ฒฐํ•œ ํ•œ๋ฌธ์žฅ์˜ ์ฃผ์ œ๋ฅผ ์ƒ์„ฑํ•˜์„ธ์š”.
    5. ์š”์•ฝ(summary) ์ž‘์„ฑ: 
        - ์ฃผ์š” ์ฃผ์žฅ๊ณผ ์ฃผ์š” ๊ฐœ์ฒด, ์ฃผ์ œ๋ฅผ ์ฐธ๊ณ ํ•˜์—ฌ ํ…์ŠคํŠธ์˜ ์ฃผ์š” ๋‚ด์šฉ์„ ์š”์•ฝํ•˜์„ธ์š”.
        
    ## ํ–ฅ์ƒ ๋‹จ๊ณ„
    6. ๋ฐ€๋„ ํ–ฅ์ƒ:
        - ์ดˆ๊ธฐ ์š”์•ฝ์— ํฌํ•จ๋˜์ง€ ์•Š์€ 1~3๊ฐœ์˜ ์ถ”๊ฐ€ ์„ค๋ช… ๊ฐœ์ฒด๋ฅผ ์‹๋ณ„ํ•˜์„ธ์š”.
        - ์ด์ „ ๋ฐ ์ƒˆ ๊ฐœ์ฒด๋ฅผ ๋ชจ๋‘ ํ†ตํ•ฉํ•˜์—ฌ ์š”์•ฝ์˜ ๋ฐ€๋„๊ฐ€ ๋†’์€ ๋ฒ„์ „์„ ์ž‘์„ฑํ•˜์„ธ์š”.
    7. ์ค‘์š”๋„ ํ‰๊ฐ€:
        - ์ด์ „ ์š”์•ฝ์—์„œ ํ•„์ˆ˜์ ์ธ ๋ถ€๋ถ„์„ ๊ฐ•์กฐํ•˜๊ณ  ๋œ ์ค‘์š”ํ•œ ๋ถ€๋ถ„์„ ์ค„์—ฌ์„œ ์ˆ˜์ •ํ•˜์„ธ์š”.
        - ์ƒˆ ์š”์•ฝ์ด ์ฃผ์š” ์ฃผ์žฅ๊ณผ ๋ฐ€์ ‘ํ•˜๊ฒŒ ์ผ์น˜ํ•˜๋Š”์ง€ ํ™•์ธํ•˜์„ธ์š”.
    8. ์œ ์ฐฝ์„ฑ ํ–ฅ์ƒ:
        - ๋ฌธ๋ฒ•, ๋‹จ์–ด ์„ ํƒ, ํ‘œํ˜„์„ ๋‹ค๋“ฌ์–ด ๊ฐ€๋…์„ฑ๊ณผ ์ž์—ฐ์Šค๋Ÿฌ์šด ํ๋ฆ„์„ ํ–ฅ์ƒ์‹œํ‚ค์„ธ์š”.
        - ์š”์•ฝ ์„ธ๋ถ€๋‚ด์šฉ์˜ ์ •ํ™•์„ฑ๊ณผ ์™„์ „์„ฑ์„ ์œ ์ง€ํ•˜๋ฉด์„œ ๋ฌธ์žฅ ๊ตฌ์กฐ๋ฅผ ๊ฐœ์„ ํ•˜์„ธ์š”.
    
    ## ์ž‘์„ฑ ๋ฐฉ์‹:
        - ๋ฌธ์„œ๋ฅผ ์†Œ๊ฐœํ•˜๋Š” ๋Œ€์‹  ์š”์•ฝ ๋‚ด์šฉ๋งŒ ์ž‘์„ฑํ•˜์„ธ์š”.
        - ๊ตฌ์ฒด์ ์ธ ๋ฐ์ดํ„ฐ๋‚˜ ์ˆ˜์น˜๋ณด๋‹ค๋Š” ์ „์ฒด ํ๋ฆ„๊ณผ ๋ฐฉํ–ฅ์„ ์„ค๋ช…ํ•˜์„ธ์š”.
        - ์ฃผ์–ด์ง„ ๋‚ด์šฉ์—๋งŒ ๊ธฐ๋ฐ˜ํ•ด ๊ฐ๊ด€์ ์œผ๋กœ ์ž‘์„ฑํ•˜์„ธ์š”.
        - ํ•œ๊ตญ์–ด๋กœ ์ž‘์„ฑํ•˜๋˜, ์˜์–ด ๊ธฐ์ˆ  ์šฉ์–ด์™€ ๊ณ ์œ  ๋ช…์‚ฌ๋Š” ๊ทธ๋Œ€๋กœ ์‚ฌ์šฉํ•˜์„ธ์š”.
    
    
    ## ์ž…๋ ฅ:
    ### ๋ฌธ์„œ ์ œ๋ชฉ:
    {TITLE}
    ### ํ…์ŠคํŠธ:
    {DOCUMENT}
    ## ์ถœ๋ ฅ ํ˜•์‹:
    <reason>
    ์ดˆ๊ธฐ ์ฃผ์š” ์ฃผ์žฅ: [์ดˆ๊ธฐ ์ฃผ์š” ์ฃผ์žฅ]
    ์ดˆ๊ธฐ ์ฃผ์š” ๊ฐœ์ฒด: [์ดˆ๊ธฐ ์ฃผ์š” ๊ฐœ์ฒด ๋ชฉ๋ก]
    ์ดˆ๊ธฐ ์ œ๋ชฉ: [์ดˆ๊ธฐ ์ œ๋ชฉ]
    ์ดˆ๊ธฐ ์š”์•ฝ: [์ดˆ๊ธฐ ์š”์•ฝ ๋‚ด์šฉ]
    
    ๋ฐ€๋„ ํ–ฅ์ƒ ๋‹จ๊ณ„:
    ์ƒˆ๋กœ ์ถ”๊ฐ€๋œ ์ฃผ์š” ๊ฐœ์ฒด: [์ƒˆ๋กœ ์ถ”๊ฐ€๋œ ์ฃผ์š” ๊ฐœ์ฒด ๋ชฉ๋ก(with bullet points)]
    ์‚ฌ๊ณ  ๊ณผ์ •: [์ฃผ์š” ๊ฐœ์ฒด ์„ ํƒ ๋ฐ ์š”์•ฝ ์ž‘์„ฑ์— ๋Œ€ํ•œ ์„ค๋ช…]
    ์—…๋ฐ์ดํŠธ ์ œ๋ชฉ: [์—…๋ฐ์ดํŠธ ์ œ๋ชฉ]
    ์—…๋ฐ์ดํŠธ ์š”์•ฝ: [์—…๋ฐ์ดํŠธ ์š”์•ฝ ๋‚ด์šฉ]
    
    ์ค‘์š”๋„ ํ‰๊ฐ€ ๋‹จ๊ณ„:
    ์‚ฌ๊ณ  ๊ณผ์ •: [์š”์•ฝ ๊ด€๋ จ์„ฑ ํ–ฅ์ƒ์„ ์œ„ํ•œ ์ค‘์š”๋„ ํ‰๊ฐ€ ๋ฐ ๋ณ€๊ฒฝ๋œ ์‚ฌํ•ญ์— ๋Œ€ํ•œ ์„ค๋ช…]
    ์—…๋ฐ์ดํŠธ ์ œ๋ชฉ: [์—…๋ฐ์ดํŠธ ์ œ๋ชฉ]
    ์—…๋ฐ์ดํŠธ ์š”์•ฝ: [์—…๋ฐ์ดํŠธ ์š”์•ฝ ๋‚ด์šฉ]
    
    ์–ธ์–ด ์œ ์ฒญ์„ฑ ๋‹จ๊ณ„:
    ์‚ฌ๊ณ  ๊ณผ์ •: [์–ธ์–ด ๋ช…ํ™•์„ฑ๊ณผ ์œ ์ฐฝ์„ฑ์„ ๊ฐœ์„ ํ•˜๊ธฐ ์œ„ํ•ด ๋ณ€๊ฒฝ๋œ ์‚ฌํ•ญ์— ๋Œ€ํ•œ ์„ค๋ช…]
    ์—…๋ฐ์ดํŠธ ์ œ๋ชฉ: [์—…๋ฐ์ดํŠธ ์ œ๋ชฉ]
    Updated Summary: [์š”์•ฝ์˜ ๊ฐ ๋ฌธ์žฅ ๋ชฉ๋ก(with bullet points)]
    </reason>
    
    <output>
        <key_argument>[์ฃผ์š” ์ฃผ์žฅ(ํ•œ๊ตญ์–ด)]</key_argument>
        <entities>[์ฃผ์š” ๊ฐœ์ฒด ๋ชฉ๋ก, ์‰ผํ‘œ๋กœ ๊ตฌ๋ถ„]</entities>
        <title>[์ฃผ์ œ(ํ•œ๊ตญ์–ด)]</title>
        <summary>
            <point>[์ฒซ๋ฒˆ์งธ ์š”์•ฝ ๋ฌธ์žฅ(ํ•œ๊ตญ์–ด)]</point>
            <point>[๋‘๋ฒˆ์งธ ์š”์•ฝ ๋ฌธ์žฅ(ํ•œ๊ตญ์–ด)]</point>
            ...
        </summary>
    </output>

5. Benchmark

TBD

Downloads last month
-
GGUF
Model size
12B params
Architecture
llama
Hardware compatibility
Log In to add your hardware

16-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for kistepAI/SPARK-Summarization-GGUF