About

This repository provides a model compiled and optimized for Mobilint NPU hardware.
It is packaged for deployment on Mobilint's acceleration stack and is intended for use within that environment.

Model: mobilint/Llama-3.1-8B-Instruct-Batch32 (quantized)