---
library_name: mlx
license: other
license_name: lfm1.0
license_link: LICENSE
language:
- en
- ja
- ko
- fr
- es
- de
- it
- pt
- ar
- zh
pipeline_tag: text-generation
tags:
- liquid
- lfm2.5
- edge
- mlx
- reasoning
base_model: LiquidAI/LFM2.5-1.2B-Thinking
---

<div align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/2b08LKpev0DNEk6DlnWkY.png" alt="Liquid AI" style="width: 100%; max-width: 100%;">

<p>
<a href="https://playground.liquid.ai/"><strong>Try LFM</strong></a> •
<a href="https://docs.liquid.ai/lfm"><strong>Documentation</strong></a> •
<a href="https://leap.liquid.ai/"><strong>LEAP</strong></a> •
<a href="https://www.liquid.ai/blog/"><strong>Blog</strong></a>
</p>
</div>

# LFM2.5-1.2B-Thinking-5bit

An MLX export of [LFM2.5-1.2B-Thinking](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking), quantized to 5-bit for inference on Apple Silicon.

LFM2.5-Thinking is a reasoning model that generates a chain-of-thought explanation before producing its final answer.

## Model Details

| Property | Value |
|----------|-------|
| Parameters | 1.2B |
| Precision | 5-bit |
| Group Size | 64 |
| Size | 768 MB |
| Context Length | 128K tokens |
54
+ ## Recommended Sampling Parameters
55
+
56
+ | Parameter | Value |
57
+ |-----------|-------|
58
+ | temperature | 0.1 |
59
+ | top_k | 50 |
60
+ | top_p | 0.1 |
61
+ | repetition_penalty | 1.05 |
62
+ | max_tokens | 512 |
63
+
## Use with mlx

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler, make_logits_processors

model, tokenizer = load("LiquidAI/LFM2.5-1.2B-Thinking-5bit")

prompt = "solve 2+2*2"

if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

# Recommended sampling parameters (see table above)
sampler = make_sampler(temp=0.1, top_k=50, top_p=0.1)
logits_processors = make_logits_processors(repetition_penalty=1.05)

response = generate(
    model,
    tokenizer,
    prompt=prompt,
    max_tokens=512,
    sampler=sampler,
    logits_processors=logits_processors,
    verbose=True,
)
```

## Chat Format

The model uses the ChatML format:

```
<|startoftext|><|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
{assistant_response}<|im_end|>
```
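
As a sketch of how this layout expands, the helper below builds the same ChatML string by hand. It is illustrative only (not part of mlx-lm or the tokenizer); in practice `tokenizer.apply_chat_template` does this for you.

```python
def to_chatml(messages, add_generation_prompt=True):
    # Render a list of {"role", "content"} dicts into the ChatML layout above.
    text = "<|startoftext|>"
    for m in messages:
        text += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        text += "<|im_start|>assistant\n"
    return text

prompt = to_chatml([{"role": "user", "content": "solve 2+2*2"}])
```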

## Thinking Format

The model outputs chain-of-thought reasoning inside `<think>` tags before the final answer:

```
<think>
[Internal reasoning process...]
</think>

[Final answer]
```

### Example

**Prompt:**

```
<|startoftext|><|im_start|>user
solve 2+2*2<|im_end|>
<|im_start|>assistant
```

**Model output:**

```
<think> Okay, let me try to solve this problem: "2+2*2". Hmm, first I need to make sure I understand what exactly is being asked here. The expression is 2 + 2*2. Alright, so maybe it's just a simple arithmetic problem. Let me break it down step by step.

First, looking at the expression 2 + 2*2. I know in math, the order of operations is important here. Multiplication comes before addition, right? So the 2*2 would be calculated first. So 2 times 2 is 4. Then add that to the initial 2. So 2 + 4 equals 6.
</think>

The answer is **6**.

Following the order of operations (PEMDAS/BODMAS), multiplication is performed before addition:
- First: 2 × 2 = 4
- Then: 2 + 4 = 6
```
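
When consuming output programmatically, you usually want the final answer without the reasoning. A minimal sketch (the helper name is hypothetical) that splits the two parts:

```python
import re

def split_thinking(text):
    # Separate the <think>...</think> reasoning from the final answer.
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # No thinking block: the whole output is the answer.
        return "", text.strip()
    thinking = match.group(1).strip()
    answer = text[match.end():].strip()
    return thinking, answer

thinking, answer = split_thinking("<think>2*2=4, 2+4=6</think>\n\nThe answer is **6**.")
# answer → "The answer is **6**."
```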

### Stripping Thinking from History

In multi-turn conversations, the chat template automatically strips `<think>` content from previous assistant messages. To preserve it instead, pass `keep_past_thinking=True`:

```python
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    keep_past_thinking=True,  # preserve thinking in history
)
```

## Tool Calling

The model supports function calling using the formats described below.

### Tool Definition

Tools are defined as a JSON list in the system prompt:

```
List of tools: [{"name": "tool_name", "description": "...", "parameters": {...}}]
```
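
For example, a system prompt with one tool could be assembled as follows. The `get_weather` schema here is a hypothetical illustration, not a tool shipped with the model:

```python
import json

# Hypothetical example tool; replace with your own function schemas.
tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

# Matches the "List of tools: [...]" layout shown above.
system_prompt = "List of tools: " + json.dumps(tools)
```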

### Tool Call Format

The model generates tool calls using special tokens:

```
<|tool_call_start|>[function_name(arg1="value1", arg2="value2")]<|tool_call_end|>
```
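
A minimal sketch of parsing this span on the client side. The token pattern is taken from above; the helper name is hypothetical, and the simple regex assumes string-valued keyword arguments as in the example:

```python
import re

def parse_tool_call(text):
    # Extract the function name and keyword arguments from a tool-call span.
    match = re.search(
        r"<\|tool_call_start\|>\[(\w+)\((.*?)\)\]<\|tool_call_end\|>",
        text,
        flags=re.DOTALL,
    )
    if match is None:
        return None
    name, arg_str = match.group(1), match.group(2)
    # Collect arg="value" pairs into a dict.
    args = dict(re.findall(r'(\w+)="(.*?)"', arg_str))
    return name, args

call = '<|tool_call_start|>[get_weather(city="Boston", unit="celsius")]<|tool_call_end|>'
# parse_tool_call(call) → ("get_weather", {"city": "Boston", "unit": "celsius"})
```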

### Tool Response Format

Tool results are provided in a `tool` role message:

```
<|im_start|>tool
[{"result": "..."}]<|im_end|>
```

## License

This model is released under the [LFM 1.0 License](LICENSE).