
nhe-ai/Llasa-3B-mlx-8Bit

The model nhe-ai/Llasa-3B-mlx-8Bit was converted to MLX format from HKUSTAudio/Llasa-3B using mlx-lm version 0.22.3.
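The conversion command itself is not shown in this card. The following is a minimal sketch of how a comparable conversion might be reproduced with the `convert` function from mlx-lm. The output directory `Llasa-3B-mlx-8Bit`, the helper names, and the group size of 64 are assumptions; the card only states 8-bit quantization and mlx-lm 0.22.3, and the keyword names are assumed to match that release.

```python
def conversion_kwargs(hf_path, mlx_path, bits=8, group_size=64):
    """Collect keyword arguments for mlx_lm.convert.

    group_size=64 is an assumed default; this card does not state it.
    """
    return {
        "hf_path": hf_path,
        "mlx_path": mlx_path,
        "quantize": True,
        "q_bits": bits,
        "q_group_size": group_size,
    }


def convert_to_mlx(hf_path="HKUSTAudio/Llasa-3B", mlx_path="Llasa-3B-mlx-8Bit"):
    """Download the original weights and write an 8-bit MLX copy (hypothetical helper)."""
    # Imported lazily so the helper above can be used without MLX installed.
    from mlx_lm import convert

    convert(**conversion_kwargs(hf_path, mlx_path))
```

Calling `convert_to_mlx()` would download the original weights and write the quantized model to the assumed output directory.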

⚠️ Important: This model was converted automatically for experimentation. The usage guide below was not written for this model and may not work as expected. Do not expect it to work out of the box; use it at your own risk.

Use with mlx

pip install mlx-lm

from mlx_lm import load, generate

model, tokenizer = load("nhe-ai/Llasa-3B-mlx-8Bit")

prompt = "hello"

if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
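The chat-template check above can be wrapped in a small helper so that it is easier to reuse and to test without loading the model. This is a sketch, not part of mlx-lm: `build_prompt` is a hypothetical name, and it uses `getattr` so that tokenizers without a `chat_template` attribute fall back to the raw text.

```python
def build_prompt(tokenizer, user_text):
    """Return a chat-formatted prompt if a chat template exists, else the raw text."""
    has_template = (
        hasattr(tokenizer, "apply_chat_template")
        and getattr(tokenizer, "chat_template", None) is not None
    )
    if not has_template:
        return user_text
    # Single-turn conversation with the user's text as the only message.
    messages = [{"role": "user", "content": user_text}]
    return tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
```

With the `model` and `tokenizer` loaded as in the snippet above, `generate(model, tokenizer, prompt=build_prompt(tokenizer, "hello"), verbose=True)` would replace the inline check.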


Format: Safetensors, MLX
Model size: 1.0B params
Tensor type: F16, U32
Quantization: 8-bit
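The reported size of 1.0B params for a 3B model is likely an artifact of how quantized tensors are stored, not a sign of a smaller network. As a rough sketch, assume that four 8-bit weights are packed into each U32 element and that each group of 64 weights carries one F16 scale and one F16 bias. The group size and the helper names are assumptions; this card does not state them. Under those assumptions the stored element count can be estimated as follows.

```python
def stored_elements(num_weights, bits=8, group_size=64):
    """Estimate tensor elements stored for group-wise quantized weights.

    Assumes weights are packed into 32-bit words and each group carries
    one scale and one bias (both assumptions, not stated in this card).
    """
    packed_words = num_weights * bits // 32               # U32 elements holding the weights
    scales_and_biases = 2 * (num_weights // group_size)   # F16 elements
    return packed_words + scales_and_biases


def effective_bits_per_weight(bits=8, group_size=64, scale_bits=16):
    """Average storage cost per weight, including the per-group scale and bias."""
    return bits + 2 * scale_bits / group_size
```

Under these assumptions, 64 weights occupy 18 stored elements, so roughly 3.5 quantized weights sit behind each counted element. That would be consistent with a 3B-class model being listed at about 1.0B params.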

Model tree for nhe-ai/Llasa-3B-mlx-8Bit

Base model: meta-llama/Llama-3.2-3B-Instruct
Finetuned: HKUSTAudio/Llasa-3B
Quantized (8): this model