Mistral-7B-Teletype / Modelfile
Tiara Rodney
Mistral-7B-Teletype 1.0.0: self-directed shell-operation adapter for Mistral-7B
c3e401b unverified
Raw
History Blame Contribute Delete
1.13 kB
# mistral:7b-teletype -- Ollama Modelfile
#
# Applies the LoRA adapter over the base as a GGUF adapter, so the base is
# pulled (not redistributed) and this artifact stays small. The base's own
# default chat template and EOS are used: ccpty serves via the OpenAI protocol
# and the endpoint renders with the model's default template, so train and
# serve must both use that default.
#
# Convert the PEFT adapter to GGUF (llama.cpp):
# python llama.cpp/convert_lora_to_gguf.py . \
# --base mistralai/Mistral-7B-Instruct-v0.2 \
# --outfile teletype-lora-f16.gguf
# then:
# ollama create teletype -f Modelfile
#
# Alternative: merge first (PeftModel.merge_and_unload over the fp16 base),
# convert to a single quantized GGUF, and use `FROM ./merged-q4_k_m.gguf` with
# no ADAPTER line. Standalone but a full ~4GB upload.
FROM mistral:7b-instruct-v0.2
ADAPTER ./teletype-lora-f16.gguf
# The base image already carries Mistral's default chat template and stop
# tokens; do not override them. Operate deterministically -- this is a shell
# driver, not a chat partner.
PARAMETER temperature 0.2
PARAMETER num_ctx 4096