---
license: apache-2.0
language:
- en
- zh
tags:
- GGUF
- llama.cpp
- apex
- quantized
- Mixture of Experts
base_model:
- AIDC-AI/Marco-Nano-Instruct
- mradermacher/Marco-Nano-Instruct-GGUF
pipeline_tag: text-generation
---

# Marco-Nano-Instruct APEX Quantized (GGUF)

This repository contains APEX-quantized GGUF files for [AIDC-AI's Marco-Nano-Instruct](https://huggingface.co/AIDC-AI/Marco-Nano-Instruct).

The files were quantized with the [mudler/apex-quant](https://github.com/mudler/apex-quant) project, using importance matrix (imatrix) guided quantization to maximize the quality-to-size ratio.

## 📥 Source & Credits

- **Base Model**: [AIDC-AI's Marco-Nano-Instruct](https://huggingface.co/AIDC-AI/Marco-Nano-Instruct)
- **F16 GGUF & Imatrix**: Both the F16 source model and the importance matrix file used for quantization come from [mradermacher's GGUF repository](https://huggingface.co/mradermacher/Marco-Nano-Instruct-i1-GGUF)

> **Special thanks to [@mradermacher](https://huggingface.co/mradermacher) for providing the high-quality imatrix file!**

## ⚠️ For technical validation only

- Quantization causes severe accuracy loss; outputs may contain hallucinations or gibberish, or fail basic tasks.
- Suitable **only** for researching quantization noise, debugging conversion scripts, or comparing compression artifacts.
- No post-training calibration, fine-tuning, or recovery techniques were applied.
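
When debugging a conversion script, a quick first check is whether the output file has a valid GGUF header at all. The sketch below is not part of apex-quant or llama.cpp; it assumes the GGUF version 2 or 3 layout (a 4-byte `GGUF` magic, a little-endian `uint32` version, then two `uint64` counts), and the helper name `read_gguf_header` is hypothetical.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4-byte magic + uint32 version + two uint64 counts


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file.

    Assumes the GGUF v2/v3 header layout; raises ValueError otherwise.
    """
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE:
        raise ValueError("file too short to contain a GGUF header")
    if header[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file: bad magic bytes")
    # "<" selects little-endian with no padding: I = uint32, Q = uint64.
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:])
    return version, tensor_count, kv_count
```

Calling `read_gguf_header("model.gguf")` (a placeholder path, not a file name from this repository) on a freshly converted file should return a plausible version and non-zero counts; a `ValueError` points to a truncated or mis-written file rather than to quantization noise.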