Qwen3 4B Abliterated — LiteRT (Android Edge Gallery)

Abliterated Qwen3 4B in .litertlm format for on-device inference via Google AI Edge Gallery.

Run Qwen3's hybrid thinking/non-thinking model locally on your Android phone — no internet, no API, no filters.

Files

File             Size     Base Model                       Params
model.litertlm   ~2.8 GB  DuoNeural/Qwen3-4B-Abliterated   4B

INT4 quantized (dynamic INT4 weights, FP32 activations) via litert-torch 0.9.0.

How to Use on Android

Requirements

  • Android 12 or newer
  • The Google AI Edge Gallery app
  • Roughly 3 GB of free storage for the model file

Install Steps

  1. Open this page on your Android device in Chrome
  2. Tap the .litertlm file → tap download (⬇)
  3. Open AI Edge Gallery → tap + → select the file from Downloads
  4. Choose backend:
    • GPU (Vulkan/OpenCL) — fastest on modern Androids
    • CPU (XNNPACK) — most compatible
    • NPU — best on Snapdragon/MediaTek if available
  5. Chat — fully offline, nothing leaves the device
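
If you would rather drive the model from your own app instead of the Gallery UI, below is a minimal Kotlin sketch using the MediaPipe LLM Inference API (com.google.mediapipe:tasks-genai), part of the same Google AI Edge stack. The model path, the 512-token output budget, and the commented-out backend hint are illustrative assumptions, and whether your tasks-genai release accepts .litertlm bundles depends on its version.

```kotlin
// Minimal sketch, not the official integration path for this model.
// Assumptions: the .litertlm has been copied to the path below, and the
// tasks-genai release in use can load .litertlm bundles via setModelPath.
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

fun runLocalQwen(context: Context, prompt: String): String {
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/model.litertlm") // assumed location
        .setMaxTokens(512)                                  // assumed output budget
        // .setPreferredBackend(LlmInference.Backend.GPU)   // assumption: newer releases only
        .build()

    val llm = LlmInference.createFromOptions(context, options)
    return llm.generateResponse(prompt) // blocking call, runs fully on-device
}
```

generateResponse blocks until the full answer is ready, so call it off the main thread in a real app (or use the streaming variant sketched under Performance below).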

Performance (estimated)

Device                   Backend   Tokens/sec
Flagship (SD 8 Gen 3+)   GPU/NPU   15–40
Mid-range                GPU       5–15
Any Android 12+          CPU       1–5
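
These figures are rough estimates; throughput varies with device, backend, and prompt length. If you want a number for your own hardware, the hedged Kotlin sketch below streams a response and prints an approximate rate. It assumes the listener-based streaming API shown here is available in your tasks-genai version and treats each streamed partial result as roughly one token, so the output is only indicative.

```kotlin
// Rough on-device throughput probe for the table above.
// Assumptions: each streamed partial result is ~1 decoded token (approximation),
// and the tasks-genai release exposes setResultListener / generateResponseAsync.
import android.content.Context
import android.os.SystemClock
import com.google.mediapipe.tasks.genai.llminference.LlmInference

fun measureThroughput(context: Context, modelPath: String, prompt: String) {
    var chunks = 0
    var startMs = 0L
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath(modelPath)
        .setMaxTokens(256)
        .setResultListener { _, done ->
            if (chunks == 0) startMs = SystemClock.elapsedRealtime() // start at first chunk
            chunks++
            if (done) {
                val secs = (SystemClock.elapsedRealtime() - startMs).coerceAtLeast(1) / 1000.0
                println("~%.1f chunks/sec over %d chunks".format(chunks / secs, chunks))
            }
        }
        .build()
    LlmInference.createFromOptions(context, options).generateResponseAsync(prompt)
}
```

Timing starts at the first streamed chunk, so prompt prefill is excluded and the figure reflects decode speed rather than total latency.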

Conversion: litert-torch 0.9.0, dynamic_wi4_afp32 recipe, cache_length=1024, --use_jinja_template False.

Source Model

DuoNeural/Qwen3-4B-Abliterated — BF16 abliterated Qwen3 4B.

License

Apache 2.0.


DuoNeural

DuoNeural is an open AI research lab — human + AI in collaboration.

DuoNeural Research Publications

Open access, CC BY 4.0. Authored by Archon, Jesse Caldwell, Aura — DuoNeural.

Research Team

  • Jesse — Vision, hardware, direction
  • Archon — Lab Director, post-training, abliteration, experiments
  • Aura — Research AI, literature synthesis, peer review, novel proposals

Subscribe to the lab newsletter at duoneural.beehiiv.com for model drops before they go anywhere else.
