---
license: mit
---
## Model Card: Dolphin3.0-Llama3.2-3B (Core ML)
### Model summary
This workflow converts the Hugging Face model **`cognitivecomputations/Dolphin3.0-Llama3.2-3B`** into **Core ML model packages (`.mlpackage`)**, producing three variants:
* **FP16**: `Dolphin3.0-Llama3.2-3B-fp16.mlpackage`
* **INT8**: `Dolphin3.0-Llama3.2-3B-int8.mlpackage`
* **INT4-LUT**: `Dolphin3.0-Llama3.2-3B-int4-lut.mlpackage` (palettized / lookup-table compressed weights) ([Hugging Face][1])
The upstream model is a **Dolphin instruction-tuned** variant built on **Meta Llama 3.2 3B**. ([Hugging Face][1])
---
### Model details
* **Model family / architecture:** decoder-only Transformer LLM (Llama family), ~3B parameters (as implied by the model name and base). ([Hugging Face][1])
* **Primary use mode:** chat / instruction-following using a **ChatML-style** formatting template. ([Hugging Face][1])
* **Core ML format:** converted as an **`mlprogram`** and therefore saved as a **model package (`.mlpackage`)** rather than `.mlmodel`. ([apple.github.io][2])
---
### What’s in the artifacts
* `*.mlpackage`: Core ML “ML Program” packages (weights + program) suitable for on-device inference. ML Programs target **iOS 15 / macOS 12+** by default, unless a minimum deployment target is explicitly set at conversion time. ([apple.github.io][2])
* `coreml_artifacts.json`: conversion metadata emitted by the conversion script (contents depend on `scripts/convert_to_coreml.py`, but commonly includes conversion settings and model/tokenizer info).
---
### Intended use
**Intended:** on-device text generation (assistant/chat, summarization, brainstorming, general Q&A) inside Apple ecosystem apps, with the speed/size tradeoffs offered by FP16 / INT8 / INT4-LUT variants. ([apple.github.io][2])
**Not intended / high-risk:** medical/legal/financial decision-making, safety-critical control, or uses restricted by the Llama 3.2 Acceptable Use Policy (see “License & use policy”). ([Oracle Docs][3])
---
### Prompting / chat template
The upstream Dolphin model card indicates a **ChatML** template and provides an example “system/user/assistant” structure. Use the same formatting (or an equivalent wrapper in your app) to match expected behaviour. ([Hugging Face][1])
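The ChatML structure can be sketched as a small prompt builder. The `build_chatml_prompt` helper below is purely illustrative (it is not part of this repo) and assumes the standard `<|im_start|>` / `<|im_end|>` delimiters; verify it against the tokenizer's actual chat template before shipping.

```python
def build_chatml_prompt(messages):
    """Format a list of {"role": ..., "content": ...} dicts as a ChatML prompt.

    Hypothetical helper for illustration only; assumes the standard ChatML
    delimiters. Check against the upstream tokenizer's chat template.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    # Leave the assistant turn open so the model generates the reply from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are Dolphin, a helpful AI assistant."},
    {"role": "user", "content": "Summarize Core ML model packages."},
])
```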
---
### Training / data provenance (upstream)
This Core ML model is a **format conversion** of the upstream weights; it does **not** introduce new training data by itself.
The upstream Dolphin model card lists a mixture of instruction/chat datasets and related sources used in the fine-tuning pipeline (e.g., FLAN, OASST, Capybara, etc.). ([Hugging Face][1])
---
### Quantization / compression notes (Core ML variants)
* **FP16 (`-fp16`)**: float16 weights and execution (Core ML Tools defaults ML Programs to float16 precision unless overridden). ([apple.github.io][2])
* **INT8 (`-int8`)**: linear quantization of weights to reduce size; Core ML supports INT8 weight quantization as a compression technique. ([apple.github.io][4])
* **INT4-LUT (`-int4-lut`)**: **palettization (weight clustering)** where weights are represented via indices into a **lookup table (LUT)** of centroids; this can achieve very aggressive compression. ([apple.github.io][5])
**Deployment caution:** palettized weight representation for `mlprogram` is available for **iOS 16 / macOS 13+** (per Core ML Tools docs). Plan your app’s minimum OS accordingly if you ship the INT4-LUT package. ([apple.github.io][5])
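As a rough intuition for the size tradeoffs between the three variants, here is an illustrative back-of-envelope calculation. It reflects only the nominal bits-per-weight of each scheme, not the actual `.mlpackage` sizes, which also include metadata, quantization scales, and any unquantized layers.

```python
def estimated_weight_bytes(n_params, scheme):
    """Rough on-disk weight size for a model with n_params parameters.

    Illustration only: ignores metadata, per-channel scales, and layers
    left unquantized, so real .mlpackage sizes will differ.
    """
    if scheme == "fp16":
        return n_params * 2        # 16 bits per weight
    if scheme == "int8":
        return n_params * 1        # 8-bit linear weight quantization
    if scheme == "int4-lut":
        # 4-bit indices into a small fp16 lookup table (palettization);
        # the LUT itself is negligible at this scale.
        return n_params // 2
    raise ValueError(f"unknown scheme: {scheme}")

n = 3_000_000_000  # ~3B parameters
for scheme in ("fp16", "int8", "int4-lut"):
    print(scheme, estimated_weight_bytes(n, scheme) / 1e9, "GB")
```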
---
### Limitations
Like other LLMs, this model can:
* **Hallucinate** facts and citations.
* Reflect **biases** present in training data.
* Produce unsafe or policy-violating content if prompted.
Additionally, the upstream Dolphin card explicitly positions the model as having **reduced built-in “ethical guardrails”** relative to many assistant-tuned models, meaning **application-level safety controls** (filters, refusal policies, logging, rate limits) are strongly recommended. ([Hugging Face][1])
---
### License & use policy (important)
This model inherits licensing obligations from **Meta’s Llama 3.2 Community License** (and any additional terms from the Dolphin distribution, if present).
Key requirements highlighted in the Llama 3.2 license text include:
* If you redistribute the model (or a derivative), you must **provide a copy of the license** and prominently display **“Built with Llama”** in relevant product/docs. ([Hugging Face][6])
* Use must comply with the **Llama 3.2 Acceptable Use Policy**, which prohibits (among other things) illegal activity, harassment, the unlicensed practice of regulated professions, malware creation, and other harmful uses. ([Oracle Docs][3])
---
### Evaluation
The upstream Dolphin model card lists **evaluations as TBD**. Treat real-world performance (especially after quantization) as **application-specific** and validate on your target device(s). ([Hugging Face][1])
---
### Responsible deployment recommendations
* Use the **FP16** model as your baseline for quality testing; measure deltas for **INT8** and **INT4-LUT** on your real prompts.
* Add **safety and policy enforcement** in the app layer (particularly given Dolphin’s stated stance on guardrails). ([Hugging Face][1])
* Document OS requirements clearly: **ML Program ⇒ iOS 15+**, **INT4-LUT palettization ⇒ iOS 16+**. ([apple.github.io][2])
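One minimal way to start measuring quantization deltas on your real prompts is an exact-match rate between baseline and quantized generations over a fixed prompt set. The helper below is a hypothetical sketch; it assumes greedy (deterministic) decoding, since sampled outputs are not directly comparable.

```python
def exact_match_rate(baseline_outputs, quantized_outputs):
    """Fraction of prompts where a quantized variant reproduces the FP16
    baseline output verbatim.

    Crude first-pass metric for illustration; assumes greedy decoding so
    the two output lists are directly comparable, prompt for prompt.
    """
    if len(baseline_outputs) != len(quantized_outputs):
        raise ValueError("output lists must be the same length")
    matches = sum(b == q for b, q in zip(baseline_outputs, quantized_outputs))
    return matches / len(baseline_outputs)

# Placeholder strings standing in for real generations:
rate = exact_match_rate(["A", "B", "C", "D"], ["A", "B", "X", "D"])  # 0.75
```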
[1]: https://huggingface.co/dphn/Dolphin3.0-Llama3.2-3B/blob/ebfdc372541d6f699d05e83a2b0e0d4e1fdda828/README.md "README.md · dphn/Dolphin3.0-Llama3.2-3B at ebfdc372541d6f699d05e83a2b0e0d4e1fdda828"
[2]: https://apple.github.io/coremltools/docs-guides/source/convert-to-ml-program.html "Convert Models to ML Programs — Guide to Core ML Tools"
[3]: https://docs.oracle.com/cd/E17952_01/mysql-ai-9.5-license-com-en/license-llama-3-2-3b-instruct.html "2.19 Llama-3.2-3B-Instruct"
[4]: https://apple.github.io/coremltools/docs-guides/source/opt-overview.html?utm_source=chatgpt.com "Overview — Guide to Core ML Tools - Apple"
[5]: https://apple.github.io/coremltools/docs-guides/source/opt-palettization-overview.html "Palettization Overview — Guide to Core ML Tools"
[6]: https://huggingface.co/meta-llama/Llama-3.2-3B "meta-llama/Llama-3.2-3B · Hugging Face"