Update README.md
README.md
@@ -16,15 +16,6 @@ If you intend to use this model outside of React Native ExecuTorch, make sure yo
 
 The **MLX** variant requires a physical Apple Silicon device (it does not run on the iOS simulator). The **XNNPACK** variant runs everywhere.
 
-## Variant Matrix
-
-| Delegate | Precision | File | Size | Notes |
-|----------|-----------|------------------------------------------------------------|---------|------------------------------------------------------------------------------------------------------------------------------------------------|
-| XNNPACK | 8da4w | `xnnpack/lfm_2_5_embedding_350m_xnnpack_8da4w.pte` | 431 MB | Int8 dynamic activation + Int4 weight (torchao), group_size=32, fp32 compute. Works on Android / iOS / generic CPU. |
-| MLX | int4 | `mlx/lfm_2_5_embedding_350m_mlx_int4.pte` | 287 MB | Int4 weight (group_size=64) with bf16 compute. Apple GPU; smallest variant. Requires a physical Apple Silicon device. |
-
-Both variants reproduce the upstream fp32 embedding with cosine ≈ 0.97 on a held-out set. Pick the variant that matches your platform; the MLX variant is iOS-only.
-
 ## Repository Structure
 
 - `xnnpack/` — `.pte` file partitioned for the XNNPACK delegate.