Duplicated from salakash/Minimalism

Minimalism Usage

Quick Start

1. Install dependencies

pip install mlx-lm

2. Start the server

# Serve the base model with this adapter applied.
# Run from the directory that contains the adapter files, since --adapter-path is "."
python -m mlx_lm.server \
  --model mlx-community/Qwen2.5-Coder-0.5B-Instruct-4bit \
  --adapter-path . \
  --host 127.0.0.1 \
  --port 8080

3. Test with curl

curl http://127.0.0.1:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "Minimalism",
    "messages": [
      {"role": "user", "content": "Write a Python function to add two numbers"}
    ],
    "max_tokens": 256
  }'
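
The same request can be sent from Python. The following is a minimal sketch using only the standard library. It assumes the server from step 2 is listening on 127.0.0.1:8080 and that the reply follows the OpenAI-style chat completion shape (choices[0].message.content). The helper names build_payload and ask are hypothetical, chosen for illustration.

```python
import json
import urllib.request

# Assumed from the Quick Start: the server listens on 127.0.0.1:8080.
URL = "http://127.0.0.1:8080/v1/chat/completions"


def build_payload(prompt, max_tokens=256):
    """Build the same JSON body as the curl example."""
    return {
        "model": "Minimalism",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def ask(prompt, url=URL, timeout=60):
    """POST the prompt to the server and return the assistant's reply text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: OpenAI-style response layout.
    return body["choices"][0]["message"]["content"]


# With the server running:
# print(ask("Write a Python function to add two numbers"))
```

Keeping payload construction separate from the network call makes it easy to inspect or adjust the request body before anything is sent.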

Response Format

Minimalism answers with runnable code first, organized into these sections:

Solution: Main implementation
Usage: Smallest runnable example
Sanity test: Tiny test snippet (when appropriate)
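
If a reply needs to be split into these sections programmatically, something like the sketch below may work. It assumes each section starts on its own line with the label followed by a colon (for example "Solution:"). The exact markers the model emits are not specified here, so the pattern may need adjusting. The function name split_sections is hypothetical.

```python
import re

# Assumption: each section begins with "<label>:" at the start of a line.
# The real output may decorate or word the labels differently.
_HEADER = re.compile(r"^\s*(Solution|Usage|Sanity test)\s*:\s*(.*)$")


def split_sections(text):
    """Return a dict mapping each section label found to its body text."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = _HEADER.match(line)
        if match:
            current = match.group(1)
            rest = match.group(2).strip()
            # Keep any text that follows the label on the same line.
            sections[current] = [rest] if rest else []
        elif current is not None:
            sections[current].append(line)
    # Trim surrounding blank lines but preserve code indentation.
    return {name: "\n".join(lines).strip("\n") for name, lines in sections.items()}
```

Sections missing from a reply, such as the optional sanity test, are simply absent from the returned dict.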