Instructions for using mlx-community/Falcon-H1-Tiny-R-0.6B-8bit with libraries and local apps.

Libraries

How to use mlx-community/Falcon-H1-Tiny-R-0.6B-8bit with MLX:

# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Falcon-H1-Tiny-R-0.6B-8bit")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)

Local Apps

Pi

How to use mlx-community/Falcon-H1-Tiny-R-0.6B-8bit with Pi:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "mlx-community/Falcon-H1-Tiny-R-0.6B-8bit"

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "mlx-community/Falcon-H1-Tiny-R-0.6B-8bit"
        }
      ]
    }
  }
}
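
If ~/.pi/agent/models.json already lists other providers, editing it by hand risks overwriting them. The following is a minimal Python sketch that merges the entry shown above into an existing configuration. The helper names `add_mlx_provider` and `update_models_file` are hypothetical and not part of Pi; the only assumption about the file is the JSON layout shown above.

```python
import json
from pathlib import Path

MODEL_ID = "mlx-community/Falcon-H1-Tiny-R-0.6B-8bit"


def add_mlx_provider(config: dict, base_url: str = "http://localhost:8080/v1") -> dict:
    """Return a copy of `config` with the "mlx-lm" provider added or replaced.

    Hypothetical helper: it only mirrors the JSON layout shown above.
    """
    merged = json.loads(json.dumps(config))  # deep copy via a JSON round-trip
    providers = merged.setdefault("providers", {})
    providers["mlx-lm"] = {
        "baseUrl": base_url,
        "api": "openai-completions",
        "apiKey": "none",
        "models": [{"id": MODEL_ID}],
    }
    return merged


def update_models_file(path: Path) -> None:
    """Merge the provider into the JSON file at `path`, creating it if missing."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_mlx_provider(existing), indent=2) + "\n")


# Example (writes to disk, so it is left commented out):
# update_models_file(Path.home() / ".pi" / "agent" / "models.json")
```

Any providers already present in the file are preserved; only the "mlx-lm" key is added or replaced.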

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use mlx-community/Falcon-H1-Tiny-R-0.6B-8bit with Hermes Agent:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "mlx-community/Falcon-H1-Tiny-R-0.6B-8bit"

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default mlx-community/Falcon-H1-Tiny-R-0.6B-8bit

Run Hermes

hermes

MLX LM

How to use mlx-community/Falcon-H1-Tiny-R-0.6B-8bit with MLX LM:

Generate or start a chat session

# Install MLX LM
uv tool install mlx-lm
# Interactive chat REPL
mlx_lm.chat --model "mlx-community/Falcon-H1-Tiny-R-0.6B-8bit"

Run an OpenAI-compatible server

# Install MLX LM
uv tool install mlx-lm
# Start the server
mlx_lm.server --model "mlx-community/Falcon-H1-Tiny-R-0.6B-8bit"
# Calling the OpenAI-compatible server with curl
curl -X POST "http://localhost:8080/v1/chat/completions" \
   -H "Content-Type: application/json" \
   --data '{
     "model": "mlx-community/Falcon-H1-Tiny-R-0.6B-8bit",
     "messages": [
       {"role": "user", "content": "Hello"}
     ]
   }'
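
The same request can be sent from Python using only the standard library. This is a sketch, assuming the server listens on port 8080 (the port used in the Pi and Hermes configurations above) and returns the usual OpenAI chat-completions response shape. The helper names `build_chat_request` and `chat` are hypothetical; pass a different `base_url` if you started the server on another port.

```python
import json
import urllib.request

MODEL_ID = "mlx-community/Falcon-H1-Tiny-R-0.6B-8bit"
DEFAULT_BASE_URL = "http://localhost:8080/v1"  # assumption: server on port 8080


def build_chat_request(prompt: str, base_url: str = DEFAULT_BASE_URL) -> urllib.request.Request:
    """Build a POST request for the /chat/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str, base_url: str = DEFAULT_BASE_URL) -> str:
    """Send the prompt to a running server and return the assistant's reply text.

    Assumes an OpenAI-style response: choices[0].message.content.
    """
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example (requires a running mlx_lm.server):
# print(chat("Hello"))
```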

Falcon-H1-Tiny-R-0.6B-8bit / chat_template.jinja

Last commit: 681495c by prince-canuma, "Add files using upload-large-folder tool" (verified, 4 months ago). File size: 1.73 kB.

{%- if tools %} {{bos_token}}<|system|>
{%- if messages[0]['role'] == 'system' %}
{{ messages[0]['content'] }}
{%- set remaining_messages = messages[1:] %}
{%- else %}
{%- set remaining_messages = messages %}
{%- endif %}
{{ 'You are a Falcon assistant skilled in function calling. You are helpful, respectful, and concise.

# Tools

You have access to the following functions. You MUST use them to answer questions when needed. For each function call, you MUST return a JSON object inside <tool_call></tool_call> tags.

<tools>' + tools|tojson(indent=2) + '</tools>

# Output Format

Your response MUST follow this format when making function calls:
<tool_call>
[
{"name": "function_name", "arguments": {"arg1": "value1", "arg2": "value2"}},
{"name": "another_function", "arguments": {"arg": "value"}}
]
</tool_call>
If no function calls are needed, respond normally without the tool_call tags.' }}
{%- for message in remaining_messages %}
{%- if message['role'] == 'user' %}
<|im_start|>user
{{ message['content'] }}<|im_end|>
{%- elif message['role'] == 'assistant' %}
{%- if message.content %}
<|im_start|>assistant
{{ message['content'] }}
<|im_end|>
{%- endif %}
{%- if message.tool_calls %}
<tool_call>
{{ message.tool_calls|tojson(indent=2) }}
</tool_call>
{%- endif %}
{%- elif message['role'] == 'tool' %}
<|im_start|>assistant
<tool_response>
{{ message['content'] }}
</tool_response><|im_end|>
{%- endif %}
{%- endfor %}
{{ '<|im_start|>assistant
' if add_generation_prompt }}
{%- else %} {{bos_token}}{% for message in messages %} {{ '<|im_start|>' + message['role'] + '
' + message['content'] + '<|im_end|>
' }} {% endfor %} {% if add_generation_prompt %}{{ '<|im_start|>assistant
' }}{% endif %} {%- endif %}
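
To make the template's behavior concrete, the sketch below approximates its two main ideas in plain Python. `render_plain_chat` mirrors the branch used when no tools are passed (each message wrapped in `<|im_start|>` and `<|im_end|>`), ignoring the incidental spaces the template emits between tags. `parse_tool_calls` extracts the JSON arrays that the system prompt asks the model to place inside `<tool_call></tool_call>` tags. Both function names are hypothetical, and in practice `tokenizer.apply_chat_template` should be used to build prompts; this is only an illustration of the format.

```python
import json
import re


def render_plain_chat(messages, bos_token="", add_generation_prompt=True):
    """Approximate the no-tools branch of the template (whitespace simplified)."""
    parts = [bos_token]
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the prompt open so the model continues as the assistant.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Matches the body of every <tool_call>...</tool_call> block in the output.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_tool_calls(text):
    """Return a flat list of {"name": ..., "arguments": ...} dicts found in `text`."""
    calls = []
    for block in TOOL_CALL_RE.findall(text):
        parsed = json.loads(block)  # raises ValueError on malformed JSON
        calls.extend(parsed if isinstance(parsed, list) else [parsed])
    return calls
```

A response without `<tool_call>` tags yields an empty list, which matches the template's instruction to respond normally when no function call is needed.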