metadata
library_name: onnx
tags:
  - text-generation
  - phi-3
  - onnx
  - int4
  - cpu
  - inference4j
license: mit
pipeline_tag: text-generation

Phi-3-mini-4k-instruct — ONNX (INT4)

INT4-quantized ONNX export of Phi-3-mini-4k-instruct, a lightweight 3.8B-parameter language model from Microsoft. The weights use INT4 round-to-nearest (RTN) quantization with a block size of 32, which targets CPU inference.
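To make "RTN block-32" concrete, here is a minimal sketch of block-wise round-to-nearest quantization to 4-bit codes. It is an illustration only, not the code used to produce this export: the class name `RtnInt4Sketch`, the symmetric scale (`maxAbs / 7`), and the one-code-per-byte storage are all assumptions made for readability. A real exporter may use zero points and pack two codes per byte.

```java
// Hypothetical illustration of block-wise RTN quantization; not the exporter's code.
final class RtnInt4Sketch {
    static final int BLOCK_SIZE = 32;

    /** Number of blocks needed to cover n weights. */
    static int blockCount(int n) {
        return (n + BLOCK_SIZE - 1) / BLOCK_SIZE;
    }

    /**
     * Quantizes weights block by block. Each block gets its own scale
     * (assumed symmetric: maxAbs / 7), written into scalesOut.
     * Returns one signed code in [-8, 7] per weight.
     */
    static byte[] quantize(float[] weights, float[] scalesOut) {
        byte[] codes = new byte[weights.length];
        for (int start = 0, b = 0; start < weights.length; start += BLOCK_SIZE, b++) {
            int end = Math.min(start + BLOCK_SIZE, weights.length);
            float maxAbs = 0f;
            for (int i = start; i < end; i++) {
                maxAbs = Math.max(maxAbs, Math.abs(weights[i]));
            }
            // Avoid division by zero for an all-zero block.
            float scale = maxAbs == 0f ? 1f : maxAbs / 7f;
            scalesOut[b] = scale;
            for (int i = start; i < end; i++) {
                int q = Math.round(weights[i] / scale); // round to nearest integer
                codes[i] = (byte) Math.max(-8, Math.min(7, q));
            }
        }
        return codes;
    }

    /** Reconstructs approximate weights from codes and per-block scales. */
    static float[] dequantize(byte[] codes, float[] scales) {
        float[] out = new float[codes.length];
        for (int i = 0; i < codes.length; i++) {
            out[i] = codes[i] * scales[i / BLOCK_SIZE];
        }
        return out;
    }
}
```

Because every block carries its own scale, one large weight only coarsens the 32 values around it rather than the whole tensor. In this sketch the reconstruction error for each weight is at most about half of its block's scale.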

Mirrored for use with inference4j, an inference-only AI library for Java.

Original Source

Usage with inference4j

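// try-with-resources closes the generator when the block exits.
try (TextGenerator gen = TextGenerator.builder().build()) {
    // generate() returns a GenerationResult; text() holds the generated string.
    GenerationResult result = gen.generate("What is Java in one sentence?");
    System.out.println(result.text());
}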

Model Details

| Property | Value |
|---|---|
| Architecture | Phi-3 (3.8B parameters, 32 layers, 3072 hidden) |
| Task | Text generation / chat |
| Context length | 4096 tokens |
| Quantization | INT4 RTN block-32 acc-level-4 |
| Original framework | PyTorch (transformers) |
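As a rough back-of-the-envelope check on what these numbers imply for weight storage, the sketch below estimates the packed size from the parameter count and block size. It rests on assumptions that this card does not state: that every parameter is quantized to 4 bits and that each block of 32 weights stores one 2-byte scale. The actual files will differ, since some tensors may stay at higher precision and the model also carries graph and tokenizer data. The class and method names are hypothetical.

```java
// Hypothetical estimate only; the real on-disk size depends on the export.
final class Int4FootprintSketch {
    /**
     * Bytes for packed weights plus one scale per block.
     * Assumes all parameters are quantized and scaleBytes bytes per block scale.
     */
    static long estimateBytes(long params, int bitsPerWeight, int blockSize, int scaleBytes) {
        long weightBytes = params * bitsPerWeight / 8;
        long blocks = (params + blockSize - 1) / blockSize;
        return weightBytes + blocks * scaleBytes;
    }

    public static void main(String[] args) {
        // 3.8B parameters, 4 bits each, block size 32, assumed 2-byte scales.
        long bytes = estimateBytes(3_800_000_000L, 4, 32, 2);
        System.out.printf("estimated weight storage: %.2f GB%n", bytes / 1e9);
    }
}
```

Under these assumptions the weights alone come to roughly 1.9 GB, with the per-block scales adding a little over 0.2 GB on top.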

License

This model is licensed under the MIT License. Original model by Microsoft.