language:
  - en
license: other
tags:
  - mnn
  - on-device
  - android
  - ios
  - quantization
  - int4
  - text-generation
  - qwen3
pipeline_tag: text-generation
library_name: mnn
base_model: ArliAI/DS-R1-Qwen3-8B-ArliAI-RpR-v4-Small

ArliAI/DS-R1-Qwen3-8B-ArliAI-RpR-v4-Small (MNN Quantized)

Original model:

https://huggingface.co/ArliAI/DS-R1-Qwen3-8B-ArliAI-RpR-v4-Small

This is a 4-bit quantized version of DS-R1-Qwen3-8B-ArliAI-RpR-v4-Small, optimized for on-device inference (Android/iOS) with the Alibaba MNN framework.

🚀 Fast Deployment on Android

1. Download the App

Don't build from scratch! Use the official MNN Chat Android app:

Download APK (GitHub)

2. Setup

Download the files from this repo (llm.mnn, llm.mnn.weight, config.json).
Create a folder on your phone: /sdcard/MNN/DS-R1-Qwen3-8B-ArliAI-RpR-v4-Small.
Copy the files into that folder.
Open the MNN Chat app and select that folder.
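
If the phone is connected to a computer, the copy steps above can also be scripted. The following is a minimal sketch, not an official tool: it assumes `adb` is installed and on the PATH, USB debugging is enabled on the phone, and the three files have already been downloaded into a local folder. The helper names `missing_files`, `build_adb_commands`, and `push_model` are hypothetical and exist only for this illustration.

```python
import subprocess
from pathlib import Path

# File names and target folder are taken from the setup steps in this card.
MODEL_FILES = ["llm.mnn", "llm.mnn.weight", "config.json"]
DEVICE_DIR = "/sdcard/MNN/DS-R1-Qwen3-8B-ArliAI-RpR-v4-Small"


def missing_files(local_dir, files=MODEL_FILES):
    """Return the expected model files that are not present in local_dir."""
    local = Path(local_dir)
    return [name for name in files if not (local / name).is_file()]


def build_adb_commands(local_dir, device_dir=DEVICE_DIR, files=MODEL_FILES):
    """Build the adb commands that create the folder and copy each file."""
    local = Path(local_dir)
    commands = [["adb", "shell", "mkdir", "-p", device_dir]]
    for name in files:
        commands.append(["adb", "push", str(local / name), f"{device_dir}/{name}"])
    return commands


def push_model(local_dir):
    """Check that all files exist locally, then run the adb commands in order."""
    missing = missing_files(local_dir)
    if missing:
        raise FileNotFoundError(f"Missing model files: {missing}")
    for command in build_adb_commands(local_dir):
        subprocess.run(command, check=True)  # stops at the first failing command
```

Calling `push_model("path/to/downloaded/files")` would create the folder on the phone and copy the three files into it; the last setup step (selecting the folder in the app) still has to be done by hand.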

💻 Technical Details

Framework: MNN
Quantization: 4-bit Asymmetric (Int4)
Model Type: Qwen3-8B (Uncensored)
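
For readers unfamiliar with the term, the sketch below illustrates what 4-bit asymmetric quantization means in general: a group of weights is mapped onto the 16 integer codes 0..15 using a scale and an offset (the group minimum), so the range does not have to be centered on zero. This is a generic textbook illustration in plain Python, not MNN's actual implementation; the function names and the per-list grouping are assumptions made for the example.

```python
def quantize_int4_asymmetric(values):
    """Map floats to integer codes 0..15 using a scale and a minimum offset."""
    lo, hi = min(values), max(values)
    # 4 bits give 16 levels, so the range is split into 15 steps.
    scale = (hi - lo) / 15 if hi > lo else 1.0
    codes = [min(15, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo


def dequantize_int4_asymmetric(codes, scale, lo):
    """Reconstruct approximate floats from the integer codes."""
    return [c * scale + lo for c in codes]


weights = [-1.0, 0.0, 0.4, 2.0]
codes, scale, lo = quantize_int4_asymmetric(weights)
restored = dequantize_int4_asymmetric(codes, scale, lo)
```

Each restored value differs from the original by at most half a step (`scale / 2`), which is the precision traded away for the smaller file size.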