Ayushnaik committed on
Commit
35550df
·
verified ·
1 Parent(s): 0b99b16

Finalize card: correct legal entity (Alpine Pacific Trading Inc.) + review fixes

Files changed (1)
  1. README.md +7 -6
README.md CHANGED
@@ -26,7 +26,7 @@ tags:
26
  flash-1-mini is a 4-billion-parameter model fine-tuned from Qwen3.5-4B for Canadian legal tasks. It is built for the parts of legal work that have to be right: producing correctly-formatted legal citations and following detailed instructions, across both of Canada's official languages and both of its legal traditions (common law and Quebec civil law). It retains the full general-reasoning and vision capability of its base model.
27
 
28
  - **Version:** `flash-1-mini-20260602`
29
- - **Developed by:** SimpleDirect Inc.
30
  - **Base model:** Qwen3.5-4B (Apache-2.0)
31
  - **License:** Apache-2.0
32
  - **Languages:** English, Canadian French
@@ -47,7 +47,7 @@ flash-1-mini is a 4-billion-parameter model fine-tuned from Qwen3.5-4B for Canad
47
  Measured against its base model under identical conditions (same prompts, same scoring):
48
 
49
  - **2.7× more reliable legal citations** — citation-integrity accuracy 42.1% vs 15.8% on the CBLRE benchmark.
50
- - **+23 points on instruction-following** — IFEval prompt-strict 53.2% vs 30.3%.
51
  - **Balanced bilingual competence** — privacy-compliance parity ratio of 1.00 (English 90.9% / French 90.9%).
52
  - **Stronger English legal reasoning** — MMLU international law 76.0% vs 70.3%.
53
  - **No loss of general capability** — MMLU unchanged (~69.8%); complex multi-step reasoning improves (BBH 79.0% vs 68.6%).
@@ -104,7 +104,7 @@ When serving via vLLM, pass `--reasoning-parser qwen3`; to disable thinking per
104
 
105
  ### Serving
106
 
107
- The model serves with **vLLM** for production text and multimodal inference (Transformers ≥ 5.5). Greedy decoding (temperature 0) is recommended for legal tasks where determinism matters. For text-only workloads, vLLM's `--language-model-only` flag skips the vision encoder to free memory for additional KV cache.
108
 
109
  ### Quantized / GGUF / Ollama
110
 
@@ -128,9 +128,10 @@ All figures are flash-1-mini vs the Qwen3.5-4B base under identical conditions (
128
 
129
  Specialization carried measurable costs, reported here in full:
130
 
131
- - **Retrieval (RAG):** source-attribution accuracy regressed (0.81 → 0.76 on a leak-proof held-out set). flash-1-mini is not a retrieval/RAG leader.
132
  - **Function-calling (BFCL v4):** overall regressed (37.7% → 28.6%), with multi-turn the weakest sub-category.
133
- - **French professional-law MCQ** and a few French-general tasks regressed modestly.
 
134
 
135
  If your workload is primarily retrieval-grounded QA or tool/function-calling orchestration, evaluate carefully against these numbers.
136
 
@@ -155,7 +156,7 @@ flash-1-mini is released under the **Apache License 2.0**. It is a modified deri
155
  ```bibtex
156
  @misc{simpledirect2026flash1mini,
157
  title = {flash-1-mini: A Bilingual Canadian Legal Language Model},
158
- author = {SimpleDirect Inc.},
159
  year = {2026},
160
  note = {Version flash-1-mini-20260602. Derivative of Qwen3.5-4B (Apache-2.0).},
161
  howpublished = {\url{https://huggingface.co/simpledirect/flash-1-mini}}
 
26
  flash-1-mini is a 4-billion-parameter model fine-tuned from Qwen3.5-4B for Canadian legal tasks. It is built for the parts of legal work that have to be right: producing correctly-formatted legal citations and following detailed instructions, across both of Canada's official languages and both of its legal traditions (common law and Quebec civil law). It retains the full general-reasoning and vision capability of its base model.
27
 
28
  - **Version:** `flash-1-mini-20260602`
29
+ - **Developed by:** Alpine Pacific Trading Inc. (operating as SimpleDirect®)
30
  - **Base model:** Qwen3.5-4B (Apache-2.0)
31
  - **License:** Apache-2.0
32
  - **Languages:** English, Canadian French
 
47
  Measured against its base model under identical conditions (same prompts, same scoring):
48
 
49
  - **2.7× more reliable legal citations** — citation-integrity accuracy 42.1% vs 15.8% on the CBLRE benchmark.
50
+ - **+22.9 points on instruction-following** — IFEval prompt-strict 53.2% vs 30.3%.
51
  - **Balanced bilingual competence** — privacy-compliance parity ratio of 1.00 (English 90.9% / French 90.9%).
52
  - **Stronger English legal reasoning** — MMLU international law 76.0% vs 70.3%.
53
  - **No loss of general capability** — MMLU unchanged (~69.8%); complex multi-step reasoning improves (BBH 79.0% vs 68.6%).
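
The headline deltas in the list above can be re-derived from the quoted percentages. The following is a minimal sketch for checking that arithmetic, not part of the model card or its evaluation harness; the dictionary name `HEADLINE` and the helpers `point_gain` and `ratio` are hypothetical names introduced here for illustration.

```python
# Figures copied from the list above as (fine-tuned, base), in percent.
# HEADLINE, point_gain and ratio are illustrative names, not from the card.
HEADLINE = {
    "CBLRE citation integrity": (42.1, 15.8),
    "IFEval prompt-strict": (53.2, 30.3),
    "MMLU international law": (76.0, 70.3),
    "BBH": (79.0, 68.6),
}


def point_gain(tuned: float, base: float) -> float:
    """Absolute difference in percentage points, rounded to one decimal."""
    return round(tuned - base, 1)


def ratio(tuned: float, base: float) -> float:
    """Relative improvement as a multiple of the base score, one decimal."""
    return round(tuned / base, 1)


for name, (tuned, base) in HEADLINE.items():
    print(f"{name}: {point_gain(tuned, base):+.1f} points, {ratio(tuned, base):.1f}x")
```

Under these inputs the IFEval gain works out to 22.9 points and the citation-integrity ratio to about 2.7, which matches the revised wording in this commit.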
 
104
 
105
  ### Serving
106
 
107
+ The model serves with **vLLM** for production text and multimodal inference (Transformers ≥ 5.5). Greedy decoding (temperature 0) is recommended for legal tasks where determinism matters.
108
 
109
  ### Quantized / GGUF / Ollama
110
 
 
128
 
129
  Specialization carried measurable costs, reported here in full:
130
 
131
+ - **Retrieval (RAG):** source-attribution accuracy regressed (80.5% → 75.5% on a leak-proof held-out set). flash-1-mini is not a retrieval/RAG leader.
132
  - **Function-calling (BFCL v4):** overall regressed (37.7% → 28.6%), with multi-turn the weakest sub-category.
133
+ - **French professional-law MCQ (Global-MMLU FR):** regressed (49.0% → 44.6%).
134
+ - **CBLRE Quebec civil law:** regressed (95.0% → 90.0%).
135
 
136
  If your workload is primarily retrieval-grounded QA or tool/function-calling orchestration, evaluate carefully against these numbers.
137
 
 
156
  ```bibtex
157
  @misc{simpledirect2026flash1mini,
158
  title = {flash-1-mini: A Bilingual Canadian Legal Language Model},
159
+ author = {{Alpine Pacific Trading Inc. (operating as SimpleDirect)}},
160
  year = {2026},
161
  note = {Version flash-1-mini-20260602. Derivative of Qwen3.5-4B (Apache-2.0).},
162
  howpublished = {\url{https://huggingface.co/simpledirect/flash-1-mini}}