ales27pm committed · Commit 63f4ec0 · verified · 1 Parent(s): 2be566f

Update README.md

Files changed (1): README.md (+105, −3)
---
license: mit
---

## Model Card: Dolphin3.0-Llama3.2-3B (Core ML)

### Model summary

This workflow produces **Core ML model packages (`.mlpackage`)** converted from the Hugging Face model **`cognitivecomputations/Dolphin3.0-Llama3.2-3B`**, outputting three variants:

* **FP16**: `Dolphin3.0-Llama3.2-3B-fp16.mlpackage`
* **INT8**: `Dolphin3.0-Llama3.2-3B-int8.mlpackage`
* **INT4-LUT**: `Dolphin3.0-Llama3.2-3B-int4-lut.mlpackage` (palettized / lookup-table compressed weights) ([Hugging Face][1])

The upstream model is a **Dolphin instruction-tuned** variant built on **Meta Llama 3.2 3B**. ([Hugging Face][1])

---

### Model details

* **Model family / architecture:** decoder-only Transformer LLM (Llama family), ~3B parameters (as implied by the model name and base). ([Hugging Face][1])
* **Primary use mode:** chat / instruction-following using a **ChatML-style** formatting template. ([Hugging Face][1])
* **Core ML format:** converted as an **`mlprogram`** and therefore saved as a **model package (`.mlpackage`)** rather than `.mlmodel`. ([apple.github.io][2])

---

### What’s in the artifacts

* `*.mlpackage`: Core ML “ML Program” packages (weights + program) suitable for on-device inference. ML Programs target **iOS 15 / macOS 12+** by default (unless the converter explicitly overrides the deployment target). ([apple.github.io][2])
* `coreml_artifacts.json`: conversion metadata emitted by the conversion script (contents depend on `scripts/convert_to_coreml.py`, but commonly include conversion settings and model/tokenizer info).
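
Because the exact schema of `coreml_artifacts.json` depends on the conversion script, reading it defensively is a good habit. A minimal sketch (the `base_model` and `variants` keys here are hypothetical, not the script's confirmed schema):

```python
import json
from pathlib import Path

def summarize_artifacts(path: str) -> dict:
    """Load conversion metadata and return a small summary.

    NOTE: "base_model" and "variants" are hypothetical key names --
    the real schema is whatever scripts/convert_to_coreml.py emits,
    so every access uses .get() with a fallback.
    """
    meta = json.loads(Path(path).read_text())
    return {
        "base_model": meta.get("base_model", "<unknown>"),
        "num_variants": len(meta.get("variants", [])),
    }

# Synthetic stand-in for the real coreml_artifacts.json:
Path("coreml_artifacts.json").write_text(json.dumps({
    "base_model": "cognitivecomputations/Dolphin3.0-Llama3.2-3B",
    "variants": ["fp16", "int8", "int4-lut"],
}))
print(summarize_artifacts("coreml_artifacts.json"))
```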

---

### Intended use

**Intended:** on-device text generation (assistant/chat, summarization, brainstorming, general Q&A) inside Apple ecosystem apps, with the speed/size tradeoffs offered by FP16 / INT8 / INT4-LUT variants. ([apple.github.io][2])

**Not intended / high-risk:** medical/legal/financial decision-making, safety-critical control, or uses restricted by the Llama 3.2 Acceptable Use Policy (see “License & use policy”). ([Oracle Docs][3])

---

### Prompting / chat template

The upstream Dolphin model card indicates a **ChatML** template and provides an example “system/user/assistant” structure. Use the same formatting (or an equivalent wrapper in your app) to match expected behaviour. ([Hugging Face][1])
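
As a sketch, a ChatML-style prompt can be assembled in a few lines of Python. The system-prompt text is a placeholder; your app supplies its own:

```python
def build_chatml_prompt(messages):
    """Render a list of {"role", "content"} dicts in ChatML form.

    ChatML wraps each turn as:
        <|im_start|>role\ncontent<|im_end|>
    and ends with an opened assistant turn for the model to complete.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are Dolphin, a helpful AI assistant."},
    {"role": "user", "content": "Give me three uses for a paperclip."},
])
print(prompt)
```

In practice, prefer the tokenizer's own `apply_chat_template` (if you use `transformers` on the conversion side) so the template always matches the upstream card.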

---

### Training / data provenance (upstream)

This Core ML model is a **format conversion** of the upstream weights; it does **not** introduce new training data by itself.

The upstream Dolphin model card lists a mixture of instruction/chat datasets and related sources used in the fine-tuning pipeline (e.g., FLAN, OASST, Capybara). ([Hugging Face][1])

---

### Quantization / compression notes (Core ML variants)

* **FP16 (`-fp16`)**: float16 weights and execution (Core ML Tools defaults ML Programs to float16 precision unless overridden). ([apple.github.io][2])
* **INT8 (`-int8`)**: linear quantization of weights to reduce size; Core ML supports INT8 weight quantization as a compression technique. ([apple.github.io][4])
* **INT4-LUT (`-int4-lut`)**: **palettization (weight clustering)** where weights are represented via indices into a **lookup table (LUT)** of centroids; this can achieve very aggressive compression. ([apple.github.io][5])
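
To make the two compression schemes concrete, here is a toy NumPy sketch of per-tensor linear INT8 quantization and a 16-entry (4-bit) lookup table. It is illustrative only: Core ML Tools uses its own algorithms (e.g., k-means palettization), and production pipelines typically quantize per-channel or per-block rather than per-tensor:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)   # stand-in for a weight tensor

# INT8 linear quantization: w ≈ scale * q, with q an int8 in [-127, 127].
scale = float(np.abs(w).max()) / 127.0
q8 = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
w8 = q8.astype(np.float32) * scale             # dequantized reconstruction

# 4-bit LUT palettization: each weight becomes a 4-bit index into a
# 16-entry lookup table of centroids. Quantile-bin midpoints stand in
# here for the k-means clustering Core ML Tools actually performs.
edges = np.quantile(w, np.linspace(0.0, 1.0, 17))
lut = (edges[:-1] + edges[1:]) / 2             # 16 centroids
idx = np.clip(np.searchsorted(edges[1:-1], w), 0, 15)
w4 = lut[idx]                                  # dequantized reconstruction

print("int8 max abs error:    ", float(np.abs(w - w8).max()))
print("int4-lut max abs error:", float(np.abs(w - w4).max()))
```

Storage-wise, the 4-bit variant keeps only the indices plus the tiny LUT, which is where the aggressive size reduction comes from; the cost is coarser reconstruction, visible in the larger error above.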

**Deployment caution:** palettized weight representation for `mlprogram` is available for **iOS 16 / macOS 13+** (per Core ML Tools docs). Plan your app’s minimum OS accordingly if you ship the INT4-LUT package. ([apple.github.io][5])

---

### Limitations

Like other LLMs, this model can:

* **Hallucinate** facts and citations.
* Reflect **biases** present in training data.
* Produce unsafe or policy-violating content if prompted.

Additionally, the upstream Dolphin card explicitly positions the model as having **reduced built-in “ethical guardrails”** relative to many assistant-tuned models, meaning **application-level safety controls** (filters, refusal policies, logging, rate limits) are strongly recommended. ([Hugging Face][1])
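
As an illustration of what an application-level control can look like, the sketch below shows the shape of a trivial output filter. A real deployment would use a proper moderation model or service rather than a hand-written denylist; the patterns here are placeholders:

```python
import re

# Placeholder patterns -- a real app-layer filter would call a moderation
# model/service, log hits, and apply refusal policies, not match regexes.
DENYLIST = [r"\bcredit card number\b", r"\bhow to make a bomb\b"]

def passes_output_policy(text: str) -> bool:
    """Return False if generated text matches any denylisted pattern."""
    return not any(re.search(p, text, flags=re.IGNORECASE) for p in DENYLIST)

print(passes_output_policy("Here are three uses for a paperclip."))  # True
```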

---

### License & use policy (important)

This model inherits licensing obligations from **Meta’s Llama 3.2 Community License** (and any additional terms from the Dolphin distribution, if present).

Key requirements highlighted in the Llama 3.2 license text include:

* If you redistribute the model (or a derivative), you must **provide a copy of the license** and prominently display **“Built with Llama”** in relevant product/docs. ([Hugging Face][6])
* Use must comply with the **Llama 3.2 Acceptable Use Policy**, which prohibits (among other things) illegal activity, harassment, unlicensed professional practice, malware creation, and other harmful uses. ([Oracle Docs][3])

---

### Evaluation

The upstream Dolphin model card lists **evaluations as TBD**. Treat real-world performance (especially after quantization) as **application-specific** and validate on your target device(s). ([Hugging Face][1])

---

### Responsible deployment recommendations

* Use the **FP16** model as your baseline for quality testing; measure deltas for **INT8** and **INT4-LUT** on your real prompts.
* Add **safety and policy enforcement** in the app layer (particularly given Dolphin’s stated stance on guardrails). ([Hugging Face][1])
* Document OS requirements clearly: **ML Program ⇒ iOS 15+**, **INT4-LUT palettization ⇒ iOS 16+**. ([apple.github.io][2])
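
One way to "measure deltas on your real prompts" is to score each quantized variant's outputs against the FP16 baseline. The sketch below uses a crude token-level Jaccard overlap as a stand-in for a real task metric; the prompts and outputs are made up:

```python
def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of whitespace tokens -- a crude similarity proxy."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if (ta | tb) else 1.0

# Hypothetical per-prompt outputs from the FP16 and a quantized variant:
baseline = {"p1": "A paperclip can hold papers together.", "p2": "The sky is blue."}
quantized = {"p1": "A paperclip can hold sheets together.", "p2": "The sky is blue."}

scores = {pid: token_jaccard(baseline[pid], quantized[pid]) for pid in baseline}
mean_overlap = sum(scores.values()) / len(scores)
print(scores, mean_overlap)
```

A low mean overlap on your own prompt set is a signal to investigate that variant more carefully with task-specific metrics before shipping it.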

[1]: https://huggingface.co/dphn/Dolphin3.0-Llama3.2-3B/blob/ebfdc372541d6f699d05e83a2b0e0d4e1fdda828/README.md "README.md · dphn/Dolphin3.0-Llama3.2-3B"
[2]: https://apple.github.io/coremltools/docs-guides/source/convert-to-ml-program.html "Convert Models to ML Programs — Guide to Core ML Tools"
[3]: https://docs.oracle.com/cd/E17952_01/mysql-ai-9.5-license-com-en/license-llama-3-2-3b-instruct.html "Llama-3.2-3B-Instruct — license text"
[4]: https://apple.github.io/coremltools/docs-guides/source/opt-overview.html "Optimization Overview — Guide to Core ML Tools"
[5]: https://apple.github.io/coremltools/docs-guides/source/opt-palettization-overview.html "Palettization Overview — Guide to Core ML Tools"
[6]: https://huggingface.co/meta-llama/Llama-3.2-3B "meta-llama/Llama-3.2-3B · Hugging Face"