---
library_name: onnx
tags:
  - text-generation
  - phi-3
  - onnx
  - int4
  - cpu
  - inference4j
license: mit
pipeline_tag: text-generation
---

# Phi-3-mini-4k-instruct — ONNX (INT4)

INT4-quantized ONNX export of [Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct), a lightweight 3.8B-parameter language model from Microsoft. The weights are quantized to 4 bits with round-to-nearest (RTN) in blocks of 32, which targets CPU inference.

Mirrored for use with [inference4j](https://github.com/inference4j/inference4j), an inference-only AI library for Java.

## Original Source

- **Repository:** [Microsoft](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx)
- **License:** MIT

## Usage with inference4j

```java
try (TextGenerator gen = TextGenerator.builder().build()) {
    GenerationResult result = gen.generate("What is Java in one sentence?");
    System.out.println(result.text());
}
```

## Model Details

| Property | Value |
|----------|-------|
| Architecture | Phi-3 (3.8B parameters, 32 layers, 3072 hidden) |
| Task | Text generation / chat |
| Context length | 4096 tokens |
| Quantization | INT4 RTN block-32 acc-level-4 |
| Original framework | PyTorch (transformers) |
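
To make the "INT4 RTN block-32" entry concrete, here is a minimal sketch of block-wise round-to-nearest quantization. It is an illustration only, not the exporter's actual code or storage layout: the class name `Int4RtnSketch`, the symmetric signed range of -8 to 7, the per-block max-abs scale, and storing one code per byte are all assumptions made for readability.

```java
public class Int4RtnSketch {
    // Block size taken from the "block-32" label; everything else is an assumption.
    static final int BLOCK_SIZE = 32;
    static final int Q_MIN = -8; // smallest signed 4-bit value
    static final int Q_MAX = 7;  // largest signed 4-bit value

    /** Picks one scale per block so the largest magnitude maps to Q_MAX. */
    public static float scaleFor(float[] block) {
        float maxAbs = 0f;
        for (float w : block) {
            maxAbs = Math.max(maxAbs, Math.abs(w));
        }
        // An all-zero block gets a dummy scale of 1 to avoid dividing by zero.
        return maxAbs == 0f ? 1f : maxAbs / Q_MAX;
    }

    /** Rounds each weight to the nearest 4-bit code (one code per byte for clarity). */
    public static byte[] quantize(float[] block, float scale) {
        byte[] codes = new byte[block.length];
        for (int i = 0; i < block.length; i++) {
            int q = Math.round(block[i] / scale); // nearest integer, ties round up
            codes[i] = (byte) Math.max(Q_MIN, Math.min(Q_MAX, q));
        }
        return codes;
    }

    /** Reconstructs approximate weights from the codes and the block scale. */
    public static float[] dequantize(byte[] codes, float scale) {
        float[] weights = new float[codes.length];
        for (int i = 0; i < codes.length; i++) {
            weights[i] = codes[i] * scale;
        }
        return weights;
    }

    public static void main(String[] args) {
        // Synthetic block of 32 weights spread over [-1, 1).
        float[] block = new float[BLOCK_SIZE];
        for (int i = 0; i < BLOCK_SIZE; i++) {
            block[i] = (i - 16) / 16f;
        }

        float scale = scaleFor(block);
        float[] restored = dequantize(quantize(block, scale), scale);

        float maxError = 0f;
        for (int i = 0; i < BLOCK_SIZE; i++) {
            maxError = Math.max(maxError, Math.abs(restored[i] - block[i]));
        }
        System.out.println("scale = " + scale + ", max abs error = " + maxError);
    }
}
```

Under these assumptions the reconstruction error of each weight is at most half the block's scale, which is why smaller blocks (more scales) generally trade file size for accuracy.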

## License

This model is licensed under the [MIT License](https://opensource.org/licenses/MIT). Original model by [Microsoft](https://huggingface.co/microsoft).