animaslabs/multitalker-parakeet-streaming-0.6b-v1-mlx-4bit

This model was converted to MLX format and quantized to 4 bits from nvidia/multitalker-parakeet-streaming-0.6b-v1 using the scripts in this GitHub repo. Please refer to the original model card for more details on the model.

Usage

Quantized models require calling `mlx.nn.quantize()` on the model before loading the weights, so that the layer shapes match the quantized checkpoint.

```python
import json
import mlx.nn as nn
from huggingface_hub import hf_hub_download
from parakeet_mlx.utils import from_config

# Download and load config
config_path = hf_hub_download("animaslabs/multitalker-parakeet-streaming-0.6b-v1-mlx-4bit", "config.json")
with open(config_path) as f:
    config = json.load(f)

# Build model and apply quantization structure
model = from_config(config)
nn.quantize(
    model,
    bits=config["quantization"]["bits"],
    group_size=config["quantization"]["group_size"],
)

# Load quantized weights
weights_path = hf_hub_download("animaslabs/multitalker-parakeet-streaming-0.6b-v1-mlx-4bit", "model.safetensors")
model.load_weights(weights_path)

# Transcribe
result = model.transcribe("audio.wav")
print(result.text)
```
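As a rough guide to why the 4-bit variant is attractive, here is a back-of-envelope memory estimate for a 0.6B-parameter model. The group size and the assumption of one fp16 scale and bias per group are illustrative defaults, not values read from this repo's config:

```python
# Back-of-envelope memory estimate for 4-bit affine quantization.
# group_size=64 and fp16 scale + bias per group are assumptions.
params = 0.6e9
fp16_bytes = params * 2                     # 2 bytes per weight at fp16
bits, group_size = 4, 64
overhead_bits = 2 * 16 / group_size         # fp16 scale + bias shared per group
quant_bytes = params * (bits + overhead_bits) / 8

print(f"fp16:  {fp16_bytes / 1e9:.2f} GB")  # ~1.20 GB
print(f"4-bit: {quant_bytes / 1e9:.2f} GB") # ~0.34 GB
```

Actual on-disk size depends on the real group size in `config.json` and on any layers left unquantized.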