furukama committed on
Commit a32a8d3 · verified · 1 Parent(s): d248a5e

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md +83 -31
README.md CHANGED
@@ -1,50 +1,102 @@
  ---
- library_name: mlx
- license: other
- license_name: lfm1.0
- license_link: LICENSE
  language:
  - en
- - ar
- - zh
- - fr
- - de
- - ja
- - ko
- - es
  pipeline_tag: text-generation
- tags:
- - liquid
- - lfm2.5
- - edge
- - mlx
- base_model: mlx-community/LFM2.5-1.2B-Instruct-4bit
  ---
 
- # hybridaione/LFM2.5-1.2B-Text2SQL
 
- This model [hybridaione/LFM2.5-1.2B-Text2SQL](https://huggingface.co/hybridaione/LFM2.5-1.2B-Text2SQL) was
- converted to MLX format from [mlx-community/LFM2.5-1.2B-Instruct-4bit](https://huggingface.co/mlx-community/LFM2.5-1.2B-Instruct-4bit)
- using mlx-lm version **0.29.1**.
 
- ## Use with mlx
 
- ```bash
- pip install mlx-lm
  ```
 
  ```python
  from mlx_lm import load, generate
 
  model, tokenizer = load("hybridaione/LFM2.5-1.2B-Text2SQL")
 
- prompt = "hello"
 
- if tokenizer.chat_template is not None:
-     messages = [{"role": "user", "content": prompt}]
-     prompt = tokenizer.apply_chat_template(
-         messages, add_generation_prompt=True
-     )
 
- response = generate(model, tokenizer, prompt=prompt, verbose=True)
  ```
  ---
+ license: apache-2.0
+ base_model: LiquidAI/LFM2.5-1.2B-Instruct
+ tags:
+ - text-to-sql
+ - sql
+ - fine-tuned
+ - mlx
+ - lora
+ datasets:
+ - synthetic
  language:
  - en
  pipeline_tag: text-generation
  ---
 
+ # LFM2.5-1.2B-Text2SQL
+
+ A fine-tuned version of [LiquidAI/LFM2.5-1.2B-Instruct](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct) optimized for text-to-SQL generation.
+
+ ## Model Description
+
+ This model was fine-tuned using LoRA on 2000 synthetic text-to-SQL examples generated via knowledge distillation from DeepSeek V3. The fine-tuning was performed using MLX on Apple Silicon.
 
+ ## Performance
 
+ | Metric | Teacher (DeepSeek V3) | Base (LFM2.5 1.2B) | This Model |
+ |--------|----------------------|-------------------|------------|
+ | **Exact Match** | 60% | 48% | **66%** |
+ | **LLM-as-Judge** | 90% | 75% | 87% |
+ | **ROUGE-L** | 0.917 | 0.830 | **0.931** |
+ | **BLEU** | 0.852 | 0.695 | **0.870** |
+ | **Semantic Similarity** | 0.965 | 0.926 | **0.970** |
+
+ The fine-tuned model **beats the teacher on 4 out of 5 metrics** despite being significantly smaller.
+
+ ## Training Details
+
+ - **Base Model:** LiquidAI/LFM2.5-1.2B-Instruct
+ - **Fine-tuning Method:** LoRA (rank 8)
+ - **Training Data:** 2000 synthetic examples
+ - **Epochs:** 2 (checkpoint 1800)
+ - **Hardware:** Apple Silicon (MLX)
+
+ ## Usage
+
+ ### With vLLM
+
+ ```python
+ from vllm import LLM, SamplingParams
 
+ llm = LLM(model="hybridaione/LFM2.5-1.2B-Text2SQL")
+ sampling_params = SamplingParams(temperature=0, max_tokens=512)
+
+ prompt = """<|im_start|>system
+ You are an expert SQL writer. Given a database schema and natural language question, write the precise SQL query that answers it. Output only the SQL query with no explanation.<|im_end|>
+ <|im_start|>user
+ Schema:
+ CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
+
+ Question: How many users are there?<|im_end|>
+ <|im_start|>assistant
+ """
+
+ output = llm.generate([prompt], sampling_params)
+ print(output[0].outputs[0].text)
  ```
 
+ ### With Transformers
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model = AutoModelForCausalLM.from_pretrained("hybridaione/LFM2.5-1.2B-Text2SQL")
+ tokenizer = AutoTokenizer.from_pretrained("hybridaione/LFM2.5-1.2B-Text2SQL")
+ ```
+
+ ### With MLX (Apple Silicon)
+
  ```python
  from mlx_lm import load, generate
 
  model, tokenizer = load("hybridaione/LFM2.5-1.2B-Text2SQL")
+ response = generate(model, tokenizer, prompt="...", max_tokens=512)
+ ```
 
+ ## Prompt Format
 
+ ```
+ <|im_start|>system
+ You are an expert SQL writer. Given a database schema and natural language question, write the precise SQL query that answers it. Output only the SQL query with no explanation.<|im_end|>
+ <|im_start|>user
+ Schema:
+ {CREATE TABLE statements}
 
+ Question: {natural language question}<|im_end|>
+ <|im_start|>assistant
  ```
+
+ ## License
+
+ Apache 2.0
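
The prompt format added in this commit can be assembled programmatically instead of being pasted by hand. The following is a minimal sketch, assuming the ChatML-style markers and the system prompt exactly as quoted in the new README; `SYSTEM_PROMPT` and `build_prompt` are illustrative names and are not part of the model repository or of any library.

```python
# Hypothetical helper for illustration only; not shipped with the model.
# The system prompt text and the <|im_start|>/<|im_end|> markers are copied
# from the "Prompt Format" section of the README in this commit.
SYSTEM_PROMPT = (
    "You are an expert SQL writer. Given a database schema and natural "
    "language question, write the precise SQL query that answers it. "
    "Output only the SQL query with no explanation."
)


def build_prompt(schema: str, question: str) -> str:
    """Fill the README's prompt template with a schema and a question."""
    return (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        "<|im_start|>user\n"
        f"Schema:\n{schema.strip()}\n\n"
        f"Question: {question.strip()}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


# Example input taken from the vLLM snippet in the README.
example_schema = "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT);"
example_prompt = build_prompt(example_schema, "How many users are there?")
print(example_prompt)
```

The returned string can be passed as the prompt in the README's own snippets, for example `llm.generate([example_prompt], sampling_params)` with vLLM or `generate(model, tokenizer, prompt=example_prompt, max_tokens=512)` with MLX.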