versae/filiberto-7B-instruct-exp1

This model was converted to MLX format from mistralai/Mistral-7B-Instruct-v0.2 using mlx-lm version 0.9.0. Refer to the original model card for more details on the model.

Use with mlx

pip install mlx-lm

from mlx_lm import load, generate

# Download (if needed) and load the converted weights and tokenizer.
model, tokenizer = load("versae/filiberto-7B-instruct-exp1")

OCR correction

text = """Otra vez, Don Iuan, me dad,
y otras mil vezes los braços.
Otra, y otras mil sean lazos
de nuestra antigua amistad.
Como venis?
Yo me siento
tan alegre, tan vfano,
tan venturoso, tan vano,
que no podrà el pensamiento
encareceros jamàs
las venturas que posseo,
porque el pensamiento creo"""

prompt = f"""<s>[INST] Dado el siguiente texto OCR, corrige los fallos que encuentres y devuelve el texto corregido:

{text} [/INST]"""

response = generate(model, tokenizer, prompt=prompt, verbose=True)
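
To see what the model actually changed, it can help to compare the OCR input with the corrected output line by line. The sketch below is one possible way to do that with the standard library's `difflib`; the helper name `changed_lines` and the sample strings are hypothetical and hand-written for illustration (they are not model output), and it assumes the corrected text comes back as a plain string such as `response`.

```python
import difflib


def changed_lines(original: str, corrected: str) -> list[tuple[str, str]]:
    """Return (original, corrected) pairs for the line groups that differ."""
    orig_lines = original.splitlines()
    corr_lines = corrected.strip().splitlines()
    matcher = difflib.SequenceMatcher(a=orig_lines, b=corr_lines, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # identical lines are not interesting here
        pairs.append(("\n".join(orig_lines[i1:i2]), "\n".join(corr_lines[j1:j2])))
    return pairs


# Hand-written sample, NOT a real model response.
sample_original = "tan alegre, tan vfano,\ntan venturoso, tan vano,"
sample_corrected = "tan alegre, tan ufano,\ntan venturoso, tan vano,"

for before, after in changed_lines(sample_original, sample_corrected):
    print(f"{before!r} -> {after!r}")
```

With a real run, `changed_lines(text, response)` would list the lines the model rewrote, assuming `response` holds only the corrected text.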

Stanza identification

text = """Alcázares finjo más altos que montes;
escalo las bóvedas de ingrávido tul
asida a las ruedas de alados Faetones;
ensueño quimeras; oteo horizontes
de nieve, de rosa, de nácar, de azul."""

prompt = f"""<s>[INST] Indique el nombre de la siguiente estrofa:

{text} [/INST]"""

response = generate(model, tokenizer, prompt=prompt, verbose=True)
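
Both examples wrap an instruction and an input text in the same `<s>[INST] ... [/INST]` layout. A small helper can keep that layout in one place; the function name `build_inst_prompt` is an assumption introduced here for illustration and is not part of `mlx_lm`.

```python
def build_inst_prompt(instruction: str, text: str) -> str:
    """Wrap an instruction and its input text in the [INST] layout used above."""
    return f"<s>[INST] {instruction}\n\n{text} [/INST]"


# Instructions copied from the two examples in this card.
ocr_instruction = (
    "Dado el siguiente texto OCR, corrige los fallos que encuentres "
    "y devuelve el texto corregido:"
)
stanza_instruction = "Indique el nombre de la siguiente estrofa:"

# Placeholder input text, only to show the resulting layout.
example = build_inst_prompt(stanza_instruction, "verso uno\nverso dos")
print(example)
```

The returned string can then be passed as `prompt=` to `generate(model, tokenizer, prompt=..., verbose=True)` exactly as in the examples above.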