Instructions to use mlx-community/humanizer-1B-OptIQ-4bit with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use mlx-community/humanizer-1B-OptIQ-4bit with MLX:

# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/humanizer-1B-OptIQ-4bit")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)

Local Apps

How to use mlx-community/humanizer-1B-OptIQ-4bit with Pi:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "mlx-community/humanizer-1B-OptIQ-4bit"

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "mlx-community/humanizer-1B-OptIQ-4bit"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use mlx-community/humanizer-1B-OptIQ-4bit with Hermes Agent:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "mlx-community/humanizer-1B-OptIQ-4bit"

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default mlx-community/humanizer-1B-OptIQ-4bit

Run Hermes

hermes

MLX LM

How to use mlx-community/humanizer-1B-OptIQ-4bit with MLX LM:

Generate or start a chat session

# Install MLX LM
uv tool install mlx-lm
# Interactive chat REPL
mlx_lm.chat --model "mlx-community/humanizer-1B-OptIQ-4bit"

Run an OpenAI-compatible server

# Install MLX LM
uv tool install mlx-lm
# Start the server
mlx_lm.server --model "mlx-community/humanizer-1B-OptIQ-4bit"
# Call the OpenAI-compatible server with curl (mlx_lm.server listens on port 8080 by default)
curl -X POST "http://localhost:8080/v1/chat/completions" \
   -H "Content-Type: application/json" \
   --data '{
     "model": "mlx-community/humanizer-1B-OptIQ-4bit",
     "messages": [
       {"role": "user", "content": "Hello"}
     ]
   }'
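
The same request can be sketched in Python with only the standard library. This is a minimal illustration, not part of the model card: it assumes the server listens on port 8080, as in the Pi and Hermes configurations above, and the helper names `build_chat_request` and `send_chat` are hypothetical.

```python
import json
import urllib.request

# Assumption: the server runs locally on port 8080.
BASE_URL = "http://localhost:8080/v1"
MODEL_ID = "mlx-community/humanizer-1B-OptIQ-4bit"


def build_chat_request(user_text, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request for /chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(user_text):
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(build_chat_request(user_text)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(send_chat("Hello"))` should print the model's reply. Building the request is kept separate from sending it so the payload can be inspected without a live server.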

codelion committed 6 days ago

Commit

76b5ff7

verified ·

1 Parent(s): 093b7c8

humanizer-1B-OptIQ-4bit v0.1.4: stacked SFT + DPO LoRAs on MiniCPM5-1B-OptIQ-4bit

Files changed (15)

README.md +128 -0
adapters/humanizer-dpo/adapter_config.json +206 -0
adapters/humanizer-dpo/adapters.safetensors +3 -0
adapters/humanizer-dpo/optiq_lora_config.json +207 -0
adapters/humanizer-sft/adapter_config.json +206 -0
adapters/humanizer-sft/adapters.safetensors +3 -0
adapters/humanizer-sft/optiq_lora_config.json +206 -0
chat_template.jinja +179 -0
config.json +1399 -0
generation_config.json +13 -0
model.safetensors +3 -0
model.safetensors.index.json +567 -0
optiq_metadata.json +688 -0
tokenizer.json +0 -0
tokenizer_config.json +17 -0

README.md ADDED

	@@ -0,0 +1,128 @@

+---
+license: apache-2.0
+language:
+  - en
+library_name: mlx
+tags:
+  - text-generation
+  - humanizer
+  - ai-detection
+  - lora
+  - mlx
+  - mlx-optiq
+  - apple-silicon
+base_model: mlx-community/MiniCPM5-1B-OptiQ-4bit
+pipeline_tag: text-generation
+---
+# humanizer-1B-OptIQ-4bit
+**A 1 B model that matches human writing on the RADAR AI detector.** Stacked SFT + DPO LoRA adapters on top of `mlx-community/MiniCPM5-1B-OptIQ-4bit` close 100 % of the gap to the human reference on a 200-draft held-out evaluation.
+| | P(AI) (RADAR-Vicuna-7B) |
+| --- | ---: |
+| Source AI drafts (Qwen3.5-4B + Gemma-4-e4b output) | 0.51 |
+| **`humanizer-1B-OptIQ-4bit` (SFT + DPO stacked)** | **0.37** |
+| Human reference (EditLens ICLR 2026, n=200) | 0.37 |
+Build, recipe, and discussion: <https://mlx-optiq.com/blog/humanizer-stacked-lora>
+## What's in this repo
+```
+humanizer-1B-OptIQ-4bit/
+├── model.safetensors + config.json + tokenizer*        base MiniCPM5-1B-OptIQ-4bit
+├── optiq_metadata.json                                  per-layer bit assignments
+└── adapters/
+    ├── humanizer-sft/                                   SFT humanizer LoRA
+    │   ├── adapters.safetensors
+    │   ├── adapter_config.json
+    │   └── optiq_lora_config.json
+    └── humanizer-dpo/                                   DPO continuation LoRA
+        ├── adapters.safetensors
+        ├── adapter_config.json
+        └── optiq_lora_config.json
+```
+- **Base** — `mlx-community/MiniCPM5-1B-OptiQ-4bit`. OptIQ mixed-precision quant of `openbmb/MiniCPM5-1B`. 875 MB on disk, Capability Score 30.28.
+- **SFT adapter** — trained on canonical SFT data derived from the EditLens ICLR 2026 corpus. `--preset large` (ranks 32-64 with `by_bits` overlay), 600 iters, `mask_prompt=True`.
+- **DPO adapter** — trained as a *delta* on top of the SFT adapter via `optiq lora train --method dpo --mount-adapter`. The reference KL is anchored against base + SFT (the textbook SFT → DPO continuation), and the saved adapter contains only the DPO delta. 300 iters, β=0.1, LR 5e-5 with linear warmup → cosine decay (the OptIQ DPO defaults).
+The DPO adapter is meaningful **only when applied alongside the SFT adapter** — it's a delta from the SFT distribution, not a standalone LoRA. Apply both at inference for the headline result.
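
A toy sketch of why the DPO delta depends on the SFT adapter. It assumes the two adapters compose additively on the same weight matrix, which is the usual way LoRA updates combine; the README does not say how mlx-optiq implements stacking internally, and all numbers below are made up.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def lora_delta(b, a, scale=1.0):
    """Low-rank update scale * (B @ A)."""
    return [[scale * v for v in row] for row in matmul(b, a)]


# Toy 2x2 base weight and two rank-1 adapters (illustrative values only).
W = [[1.0, 0.0], [0.0, 1.0]]
sft = lora_delta([[1.0], [0.0]], [[0.5, 0.5]])
dpo = lora_delta([[0.0], [1.0]], [[0.25, -0.25]])

stacked = add(add(W, sft), dpo)  # the weights the DPO delta was trained against
dpo_only = add(W, dpo)           # DPO delta applied without its SFT anchor

print(stacked)
print(dpo_only)
```

Under this assumption, dropping the SFT term leaves the model at a different point in weight space from the one DPO was optimized around.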
+## Use
+You need `mlx-optiq >= 0.1.4` for the multi-LoRA serving and stacking syntax:
+```bash
+pip install 'mlx-optiq>=0.1.4'
+# Download the repo
+huggingface-cli download mlx-community/humanizer-1B-OptIQ-4bit \
+  --local-dir ./humanizer-1B-OptIQ-4bit
+# Serve with both adapters mounted
+optiq serve \
+  --model ./humanizer-1B-OptIQ-4bit \
+  --adapter ./humanizer-1B-OptIQ-4bit/adapters/humanizer-sft \
+  --adapter ./humanizer-1B-OptIQ-4bit/adapters/humanizer-dpo \
+  --port 8080
+```
+Then send requests with both adapters active via the `+` stacking syntax in the request body:
+```bash
+curl http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "./humanizer-1B-OptIQ-4bit",
+    "adapter": "humanizer-sft+humanizer-dpo",
+    "messages": [
+      {"role": "system", "content": "Rewrite AI-generated drafts into natural human-style prose, preserving meaning, facts, names, numbers, citations, URLs, quotes, and formatting."},
+      {"role": "user", "content": "STYLE: direct technical blog\nTONE: analytical, clear, non-corporate\nLENGTH: preserve within 15%\n\nDraft to rewrite:\n\n[your AI-generated draft here]"}
+    ],
+    "temperature": 0.4,
+    "max_tokens": 1600,
+    "chat_template_kwargs": {"enable_thinking": false}
+  }'
+```
+The OpenAI-compatible endpoint is a drop-in for Open WebUI, Continue, Cursor, your own scripts, etc. Send `"adapter": "humanizer-sft"` to use the SFT adapter alone, or `"adapter": "base"` to bypass the adapters entirely (handy for A/B comparisons).
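
A small sketch of how a script might build request bodies for an A/B comparison across the three adapter settings. The `adapter` field and the `+` stacking syntax come from the README; the helper names `adapter_field` and `build_payload` are hypothetical, and the system prompt and STYLE/TONE/LENGTH headers are left out for brevity.

```python
def adapter_field(names):
    """Join adapter names with '+'; an empty list selects the bare base model."""
    return "+".join(names) if names else "base"


def build_payload(draft, adapters, model="./humanizer-1B-OptIQ-4bit"):
    """Request body for /v1/chat/completions with the chosen adapter stack."""
    return {
        "model": model,
        "adapter": adapter_field(adapters),
        "messages": [
            {"role": "user", "content": f"Draft to rewrite:\n\n{draft}"},
        ],
        "temperature": 0.4,
        "max_tokens": 1600,
        "chat_template_kwargs": {"enable_thinking": False},
    }


# One payload per configuration, all for the same draft.
draft = "[your AI-generated draft here]"
variants = {
    "base": build_payload(draft, []),
    "sft": build_payload(draft, ["humanizer-sft"]),
    "sft+dpo": build_payload(draft, ["humanizer-sft", "humanizer-dpo"]),
}
for name, payload in variants.items():
    print(name, "->", payload["adapter"])
```

Each payload can then be serialized with `json.dumps` and posted to the server started by `optiq serve`.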
+## Held-out evaluation
+200 AI-generated drafts from the [EditLens ICLR 2026](https://huggingface.co/datasets/pangram/editlens_iclr) held-out set, rewritten by each system and scored by [RADAR-Vicuna-7B](https://huggingface.co/TrustSafeAI/RADAR-Vicuna-7B). Lower P(AI) is more human-like.
+| Pipeline | P(AI) | Delta vs source | Slop / 1 K tokens |
+| --- | ---: | ---: | ---: |
+| Source AI draft (Qwen3.5-4B + Gemma-4-e4b) | 0.51 | — | 0.6 |
+| SFT humanizer alone | 0.50 | -0.01 | 0.2 |
+| **SFT + DPO stacked (this repo)** | **0.37** | **-0.14** | **0.0** |
+| Human reference (target) | 0.37 | -0.14 | 0.1 |
+The stacked pipeline produces fewer slop phrases per 1 K tokens (0.0) than the human reference itself (0.1).
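
The "gap closed" figure can be recomputed from the table. The sketch below assumes gap closure means the share of the source-to-human P(AI) difference that a system removes; the README does not give a formula, so this definition is an interpretation. The inputs are the rounded table values, so the percentages are approximate.

```python
def gap_closure(source, system, human):
    """Fraction of the source-to-human P(AI) gap that a system closes."""
    return (source - system) / (source - human)


SOURCE, HUMAN = 0.51, 0.37  # P(AI) for source drafts and the human reference
scores = {"SFT alone": 0.50, "SFT + DPO stacked": 0.37}

for name, p_ai in scores.items():
    print(f"{name}: {gap_closure(SOURCE, p_ai, HUMAN):.0%}")
# SFT alone: 7%
# SFT + DPO stacked: 100%
```

On these numbers the SFT adapter alone accounts for only a small part of the gap, and nearly all of the improvement appears once the DPO adapter is stacked on top.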
+## Intended use & limitations
+- **Intended use**: rewriting AI-generated drafts (blog posts, articles, reports) into more natural-sounding prose. Preserves facts, names, numbers, URLs, citations.
+- **Trained on**: the EditLens ICLR 2026 corpus filtered through the OptIQ Labs dataset-building pipeline (Qwen3.5-4B and Gemma-4-e4b as the source AI models; the original EditLens human-written prose as target).
+- **AI-detector caveat**: RADAR-Vicuna-7B is one detector among many. Matching the human reference on RADAR means the rewrites land at the same point on RADAR's scale as the EditLens human-written set; other detectors will give different numbers, and detector arms races mean any specific score has a shelf life. The reproducible claim is the **delta from source** and the **gap closure against a fixed human reference**, both of which held up across the entire 200-draft held-out set.
+- **Length**: the rewrites tend to over-generate (length ratio about 3-4x the source). Apply a max-tokens or post-truncation step if you need length-faithful output.
+- **Capability outside humanization**: this LoRA stack is heavily specialized for the rewrite-this-AI-draft format. Out-of-format prompts will degrade behavior. Serve `"adapter": "base"` for general MiniCPM5-1B inference.
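
One possible post-truncation step for the length limitation above, as a rough sketch. It counts whitespace-separated words as a stand-in for tokens and cuts at sentence boundaries; the 1.15 ratio mirrors the "preserve within 15%" instruction in the example prompt. The function name and the sentence-splitting rule are assumptions, not part of mlx-optiq.

```python
import re


def truncate_to_ratio(source, rewrite, max_ratio=1.15):
    """Keep whole sentences of `rewrite` until the word budget is spent.

    The budget is len(source words) * max_ratio. The first sentence is
    always kept so the result is never empty.
    """
    budget = int(len(source.split()) * max_ratio)
    sentences = re.split(r"(?<=[.!?])\s+", rewrite.strip())
    kept, used = [], 0
    for sentence in sentences:
        words = len(sentence.split())
        if kept and used + words > budget:
            break
        kept.append(sentence)
        used += words
    return " ".join(kept)


source = "one two three four."
rewrite = "A b c. D e f. G h i."
print(truncate_to_ratio(source, rewrite))  # A b c.
```

Cutting at sentence boundaries avoids dangling half-sentences, at the cost of sometimes landing well under the budget.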
+## License
+- Base model: `openbmb/MiniCPM5-1B` (Apache-2.0).
+- LoRA adapters: Apache-2.0, this release.
+- Training data: derived from [EditLens ICLR 2026](https://huggingface.co/datasets/pangram/editlens_iclr) (research use).
+## Citation
+```bibtex
+@misc{mlxoptiq2026humanizer1b,
+  title  = {humanizer-1B-OptIQ-4bit: a stacked SFT + DPO LoRA on a 1 B model that matches human writing on RADAR},
+  author = {{mlx-optiq team}},
+  year   = {2026},
+  url    = {https://huggingface.co/mlx-community/humanizer-1B-OptIQ-4bit},
+}
+```

adapters/humanizer-dpo/adapter_config.json ADDED

	@@ -0,0 +1,206 @@

+{
+  "fine_tune_type": "lora",
+  "num_layers": -1,
+  "lora_parameters": {
+    "rank": 32,
+    "scale": 1.0,
+    "dropout": 0.0,
+    "keys": null
+  },
+  "base_model_name_or_path": "optiq_output/openbmb_MiniCPM5-1B/optiq_mixed",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": false,
+  "init_lora_weights": true,
+  "layers_to_transform": null,
+  "layers_pattern": null,
+  "lora_alpha": 32,
+  "lora_dropout": 0.0,
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 32,
+  "revision": null,
+  "target_modules": [
+    "q_proj",
+    "k_proj",
+    "v_proj",
+    "o_proj",
+    "gate_proj",
+    "up_proj",
+    "down_proj"
+  ],
+  "task_type": "CAUSAL_LM",
+  "optiq": {
+    "rank_scaling": "by_bits",
+    "applied_ranks": {
+      "layer_0.mlp.up_proj": 64,
+      "layer_0.mlp.down_proj": 64,
+      "layer_0.mlp.gate_proj": 64,
+      "layer_0.self_attn.o_proj": 64,
+      "layer_0.self_attn.v_proj": 64,
+      "layer_0.self_attn.k_proj": 64,
+      "layer_0.self_attn.q_proj": 64,
+      "layer_1.mlp.up_proj": 32,
+      "layer_1.mlp.down_proj": 64,
+      "layer_1.mlp.gate_proj": 32,
+      "layer_1.self_attn.o_proj": 64,
+      "layer_1.self_attn.v_proj": 64,
+      "layer_1.self_attn.k_proj": 32,
+      "layer_1.self_attn.q_proj": 32,
+      "layer_2.mlp.up_proj": 32,
+      "layer_2.mlp.down_proj": 32,
+      "layer_2.mlp.gate_proj": 32,
+      "layer_2.self_attn.o_proj": 64,
+      "layer_2.self_attn.v_proj": 64,
+      "layer_2.self_attn.k_proj": 32,
+      "layer_2.self_attn.q_proj": 32,
+      "layer_3.mlp.up_proj": 32,
+      "layer_3.mlp.down_proj": 32,
+      "layer_3.mlp.gate_proj": 32,
+      "layer_3.self_attn.o_proj": 64,
+      "layer_3.self_attn.v_proj": 64,
+      "layer_3.self_attn.k_proj": 32,
+      "layer_3.self_attn.q_proj": 32,
+      "layer_4.mlp.up_proj": 32,
+      "layer_4.mlp.down_proj": 64,
+      "layer_4.mlp.gate_proj": 32,
+      "layer_4.self_attn.o_proj": 64,
+      "layer_4.self_attn.v_proj": 64,
+      "layer_4.self_attn.k_proj": 32,
+      "layer_4.self_attn.q_proj": 32,
+      "layer_5.mlp.up_proj": 32,
+      "layer_5.mlp.down_proj": 32,
+      "layer_5.mlp.gate_proj": 32,
+      "layer_5.self_attn.o_proj": 32,
+      "layer_5.self_attn.v_proj": 64,
+      "layer_5.self_attn.k_proj": 32,
+      "layer_5.self_attn.q_proj": 64,
+      "layer_6.mlp.up_proj": 32,
+      "layer_6.mlp.down_proj": 32,
+      "layer_6.mlp.gate_proj": 32,
+      "layer_6.self_attn.o_proj": 64,
+      "layer_6.self_attn.v_proj": 64,
+      "layer_6.self_attn.k_proj": 32,
+      "layer_6.self_attn.q_proj": 32,
+      "layer_7.mlp.up_proj": 64,
+      "layer_7.mlp.down_proj": 32,
+      "layer_7.mlp.gate_proj": 32,
+      "layer_7.self_attn.o_proj": 64,
+      "layer_7.self_attn.v_proj": 32,
+      "layer_7.self_attn.k_proj": 32,
+      "layer_7.self_attn.q_proj": 64,
+      "layer_8.mlp.up_proj": 32,
+      "layer_8.mlp.down_proj": 32,
+      "layer_8.mlp.gate_proj": 32,
+      "layer_8.self_attn.o_proj": 64,
+      "layer_8.self_attn.v_proj": 32,
+      "layer_8.self_attn.k_proj": 32,
+      "layer_8.self_attn.q_proj": 64,
+      "layer_9.mlp.up_proj": 32,
+      "layer_9.mlp.down_proj": 32,
+      "layer_9.mlp.gate_proj": 32,
+      "layer_9.self_attn.o_proj": 64,
+      "layer_9.self_attn.v_proj": 64,
+      "layer_9.self_attn.k_proj": 32,
+      "layer_9.self_attn.q_proj": 32,
+      "layer_10.mlp.up_proj": 64,
+      "layer_10.mlp.down_proj": 32,
+      "layer_10.mlp.gate_proj": 32,
+      "layer_10.self_attn.o_proj": 64,
+      "layer_10.self_attn.v_proj": 64,
+      "layer_10.self_attn.k_proj": 32,
+      "layer_10.self_attn.q_proj": 32,
+      "layer_11.mlp.up_proj": 32,
+      "layer_11.mlp.down_proj": 32,
+      "layer_11.mlp.gate_proj": 32,
+      "layer_11.self_attn.o_proj": 64,
+      "layer_11.self_attn.v_proj": 64,
+      "layer_11.self_attn.k_proj": 32,
+      "layer_11.self_attn.q_proj": 32,
+      "layer_12.mlp.up_proj": 32,
+      "layer_12.mlp.down_proj": 32,
+      "layer_12.mlp.gate_proj": 32,
+      "layer_12.self_attn.o_proj": 64,
+      "layer_12.self_attn.v_proj": 64,
+      "layer_12.self_attn.k_proj": 32,
+      "layer_12.self_attn.q_proj": 32,
+      "layer_13.mlp.up_proj": 64,
+      "layer_13.mlp.down_proj": 32,
+      "layer_13.mlp.gate_proj": 32,
+      "layer_13.self_attn.o_proj": 64,
+      "layer_13.self_attn.v_proj": 64,
+      "layer_13.self_attn.k_proj": 32,
+      "layer_13.self_attn.q_proj": 32,
+      "layer_14.mlp.up_proj": 32,
+      "layer_14.mlp.down_proj": 32,
+      "layer_14.mlp.gate_proj": 32,
+      "layer_14.self_attn.o_proj": 64,
+      "layer_14.self_attn.v_proj": 64,
+      "layer_14.self_attn.k_proj": 32,
+      "layer_14.self_attn.q_proj": 32,
+      "layer_15.mlp.up_proj": 32,
+      "layer_15.mlp.down_proj": 32,
+      "layer_15.mlp.gate_proj": 32,
+      "layer_15.self_attn.o_proj": 64,
+      "layer_15.self_attn.v_proj": 64,
+      "layer_15.self_attn.k_proj": 32,
+      "layer_15.self_attn.q_proj": 32,
+      "layer_16.mlp.up_proj": 64,
+      "layer_16.mlp.down_proj": 32,
+      "layer_16.mlp.gate_proj": 32,
+      "layer_16.self_attn.o_proj": 64,
+      "layer_16.self_attn.v_proj": 64,
+      "layer_16.self_attn.k_proj": 32,
+      "layer_16.self_attn.q_proj": 32,
+      "layer_17.mlp.up_proj": 32,
+      "layer_17.mlp.down_proj": 32,
+      "layer_17.mlp.gate_proj": 32,
+      "layer_17.self_attn.o_proj": 64,
+      "layer_17.self_attn.v_proj": 64,
+      "layer_17.self_attn.k_proj": 32,
+      "layer_17.self_attn.q_proj": 32,
+      "layer_18.mlp.up_proj": 32,
+      "layer_18.mlp.down_proj": 32,
+      "layer_18.mlp.gate_proj": 32,
+      "layer_18.self_attn.o_proj": 64,
+      "layer_18.self_attn.v_proj": 64,
+      "layer_18.self_attn.k_proj": 32,
+      "layer_18.self_attn.q_proj": 32,
+      "layer_19.mlp.up_proj": 64,
+      "layer_19.mlp.down_proj": 32,
+      "layer_19.mlp.gate_proj": 32,
+      "layer_19.self_attn.o_proj": 64,
+      "layer_19.self_attn.v_proj": 64,
+      "layer_19.self_attn.k_proj": 32,
+      "layer_19.self_attn.q_proj": 32,
+      "layer_20.mlp.up_proj": 32,
+      "layer_20.mlp.down_proj": 32,
+      "layer_20.mlp.gate_proj": 32,
+      "layer_20.self_attn.o_proj": 32,
+      "layer_20.self_attn.v_proj": 64,
+      "layer_20.self_attn.k_proj": 32,
+      "layer_20.self_attn.q_proj": 64,
+      "layer_21.mlp.up_proj": 32,
+      "layer_21.mlp.down_proj": 32,
+      "layer_21.mlp.gate_proj": 32,
+      "layer_21.self_attn.o_proj": 64,
+      "layer_21.self_attn.v_proj": 64,
+      "layer_21.self_attn.k_proj": 32,
+      "layer_21.self_attn.q_proj": 32,
+      "layer_22.mlp.up_proj": 64,
+      "layer_22.mlp.down_proj": 32,
+      "layer_22.mlp.gate_proj": 32,
+      "layer_22.self_attn.o_proj": 64,
+      "layer_22.self_attn.v_proj": 64,
+      "layer_22.self_attn.k_proj": 32,
+      "layer_22.self_attn.q_proj": 32,
+      "layer_23.mlp.up_proj": 64,
+      "layer_23.mlp.down_proj": 64,
+      "layer_23.mlp.gate_proj": 64,
+      "layer_23.self_attn.o_proj": 64,
+      "layer_23.self_attn.v_proj": 64,
+      "layer_23.self_attn.k_proj": 64,
+      "layer_23.self_attn.q_proj": 64
+    }
+  }
+}
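
The per-module ranks in `applied_ranks` are easier to read as a tally. The sketch below counts how many modules received each rank, overall and per projection type. It runs on an excerpt (the `layer_0` and `layer_1` entries copied from the config above); the helper names are hypothetical.

```python
from collections import Counter


def rank_histogram(applied_ranks):
    """Count how many modules were assigned each LoRA rank."""
    return Counter(applied_ranks.values())


def ranks_by_module(applied_ranks):
    """Group rank counts by projection name (the part after the last dot)."""
    grouped = {}
    for key, rank in applied_ranks.items():
        module = key.rsplit(".", 1)[-1]
        grouped.setdefault(module, Counter())[rank] += 1
    return grouped


# Excerpt: layer_0 and layer_1 entries from the config above.
excerpt = {
    "layer_0.mlp.up_proj": 64,
    "layer_0.mlp.down_proj": 64,
    "layer_0.mlp.gate_proj": 64,
    "layer_0.self_attn.o_proj": 64,
    "layer_0.self_attn.v_proj": 64,
    "layer_0.self_attn.k_proj": 64,
    "layer_0.self_attn.q_proj": 64,
    "layer_1.mlp.up_proj": 32,
    "layer_1.mlp.down_proj": 64,
    "layer_1.mlp.gate_proj": 32,
    "layer_1.self_attn.o_proj": 64,
    "layer_1.self_attn.v_proj": 64,
    "layer_1.self_attn.k_proj": 32,
    "layer_1.self_attn.q_proj": 32,
}

print(rank_histogram(excerpt))
print(ranks_by_module(excerpt)["o_proj"])
```

To run it on the full file, load the JSON and pass `config["optiq"]["applied_ranks"]` instead of the excerpt.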

adapters/humanizer-dpo/adapters.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:101a8d2833a7ad5f0610a112aadeedfc1571eacb9809464c1ce1f088ab2e269b
+size 119050013
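
The three lines above are a Git LFS pointer, not the weights themselves: the repository stores only the spec version, a SHA-256 object ID, and the byte size. A minimal parser for that format might look like this (the function name is hypothetical):

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:101a8d2833a7ad5f0610a112aadeedfc1571eacb9809464c1ce1f088ab2e269b
size 119050013
"""

info = parse_lfs_pointer(pointer)
print(info["algo"], f"{info['size'] / 1e6:.1f} MB")  # sha256 119.1 MB
```

The size field shows the DPO adapter file is about 119 MB.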

adapters/humanizer-dpo/optiq_lora_config.json ADDED

	@@ -0,0 +1,207 @@

+{
+  "rank": 32,
+  "scale": 1.0,
+  "dropout": 0.0,
+  "rank_scaling": "by_bits",
+  "method": "dpo",
+  "dpo_beta": 0.1,
+  "dpo_learning_rate": 5e-05,
+  "dpo_warmup_iters": null,
+  "dpo_lr_schedule": "cosine",
+  "target_modules": [
+    "q_proj",
+    "k_proj",
+    "v_proj",
+    "o_proj",
+    "gate_proj",
+    "up_proj",
+    "down_proj"
+  ],
+  "num_layers": -1,
+  "use_dora": false,
+  "mask_prompt": true,
+  "batch_size": 1,
+  "iters": 300,
+  "learning_rate": 5e-05,
+  "max_seq_length": 2048,
+  "grad_accumulation_steps": 1,
+  "grad_checkpoint": true,
+  "val_batches": 25,
+  "steps_per_report": 10,
+  "steps_per_eval": 200,
+  "steps_per_save": 100,
+  "adapter_path": "adapters/humanizer-dpo-minicpm5-1b",
+  "mount_adapter": "adapters/humanizer-minicpm5-1b",
+  "clear_cache_threshold": 0,
+  "applied_ranks": {
+    "layer_0.mlp.up_proj": 64,
+    "layer_0.mlp.down_proj": 64,
+    "layer_0.mlp.gate_proj": 64,
+    "layer_0.self_attn.o_proj": 64,
+    "layer_0.self_attn.v_proj": 64,
+    "layer_0.self_attn.k_proj": 64,
+    "layer_0.self_attn.q_proj": 64,
+    "layer_1.mlp.up_proj": 32,
+    "layer_1.mlp.down_proj": 64,
+    "layer_1.mlp.gate_proj": 32,
+    "layer_1.self_attn.o_proj": 64,
+    "layer_1.self_attn.v_proj": 64,
+    "layer_1.self_attn.k_proj": 32,
+    "layer_1.self_attn.q_proj": 32,
+    "layer_2.mlp.up_proj": 32,
+    "layer_2.mlp.down_proj": 32,
+    "layer_2.mlp.gate_proj": 32,
+    "layer_2.self_attn.o_proj": 64,
+    "layer_2.self_attn.v_proj": 64,
+    "layer_2.self_attn.k_proj": 32,
+    "layer_2.self_attn.q_proj": 32,
+    "layer_3.mlp.up_proj": 32,
+    "layer_3.mlp.down_proj": 32,
+    "layer_3.mlp.gate_proj": 32,
+    "layer_3.self_attn.o_proj": 64,
+    "layer_3.self_attn.v_proj": 64,
+    "layer_3.self_attn.k_proj": 32,
+    "layer_3.self_attn.q_proj": 32,
+    "layer_4.mlp.up_proj": 32,
+    "layer_4.mlp.down_proj": 64,
+    "layer_4.mlp.gate_proj": 32,
+    "layer_4.self_attn.o_proj": 64,
+    "layer_4.self_attn.v_proj": 64,
+    "layer_4.self_attn.k_proj": 32,
+    "layer_4.self_attn.q_proj": 32,
+    "layer_5.mlp.up_proj": 32,
+    "layer_5.mlp.down_proj": 32,
+    "layer_5.mlp.gate_proj": 32,
+    "layer_5.self_attn.o_proj": 32,
+    "layer_5.self_attn.v_proj": 64,
+    "layer_5.self_attn.k_proj": 32,
+    "layer_5.self_attn.q_proj": 64,
+    "layer_6.mlp.up_proj": 32,
+    "layer_6.mlp.down_proj": 32,
+    "layer_6.mlp.gate_proj": 32,
+    "layer_6.self_attn.o_proj": 64,
+    "layer_6.self_attn.v_proj": 64,
+    "layer_6.self_attn.k_proj": 32,
+    "layer_6.self_attn.q_proj": 32,
+    "layer_7.mlp.up_proj": 64,
+    "layer_7.mlp.down_proj": 32,
+    "layer_7.mlp.gate_proj": 32,
+    "layer_7.self_attn.o_proj": 64,
+    "layer_7.self_attn.v_proj": 32,
+    "layer_7.self_attn.k_proj": 32,
+    "layer_7.self_attn.q_proj": 64,
+    "layer_8.mlp.up_proj": 32,
+    "layer_8.mlp.down_proj": 32,
+    "layer_8.mlp.gate_proj": 32,
+    "layer_8.self_attn.o_proj": 64,
+    "layer_8.self_attn.v_proj": 32,
+    "layer_8.self_attn.k_proj": 32,
+    "layer_8.self_attn.q_proj": 64,
+    "layer_9.mlp.up_proj": 32,
+    "layer_9.mlp.down_proj": 32,
+    "layer_9.mlp.gate_proj": 32,
+    "layer_9.self_attn.o_proj": 64,
+    "layer_9.self_attn.v_proj": 64,
+    "layer_9.self_attn.k_proj": 32,
+    "layer_9.self_attn.q_proj": 32,
+    "layer_10.mlp.up_proj": 64,
+    "layer_10.mlp.down_proj": 32,
+    "layer_10.mlp.gate_proj": 32,
+    "layer_10.self_attn.o_proj": 64,
+    "layer_10.self_attn.v_proj": 64,
+    "layer_10.self_attn.k_proj": 32,
+    "layer_10.self_attn.q_proj": 32,
+    "layer_11.mlp.up_proj": 32,
+    "layer_11.mlp.down_proj": 32,
+    "layer_11.mlp.gate_proj": 32,
+    "layer_11.self_attn.o_proj": 64,
+    "layer_11.self_attn.v_proj": 64,
+    "layer_11.self_attn.k_proj": 32,
+    "layer_11.self_attn.q_proj": 32,
+    "layer_12.mlp.up_proj": 32,
+    "layer_12.mlp.down_proj": 32,
+    "layer_12.mlp.gate_proj": 32,
+    "layer_12.self_attn.o_proj": 64,
+    "layer_12.self_attn.v_proj": 64,
+    "layer_12.self_attn.k_proj": 32,
+    "layer_12.self_attn.q_proj": 32,
+    "layer_13.mlp.up_proj": 64,
+    "layer_13.mlp.down_proj": 32,
+    "layer_13.mlp.gate_proj": 32,
+    "layer_13.self_attn.o_proj": 64,
+    "layer_13.self_attn.v_proj": 64,
+    "layer_13.self_attn.k_proj": 32,
+    "layer_13.self_attn.q_proj": 32,
+    "layer_14.mlp.up_proj": 32,
+    "layer_14.mlp.down_proj": 32,
+    "layer_14.mlp.gate_proj": 32,
+    "layer_14.self_attn.o_proj": 64,
+    "layer_14.self_attn.v_proj": 64,
+    "layer_14.self_attn.k_proj": 32,
+    "layer_14.self_attn.q_proj": 32,
+    "layer_15.mlp.up_proj": 32,
+    "layer_15.mlp.down_proj": 32,
+    "layer_15.mlp.gate_proj": 32,
+    "layer_15.self_attn.o_proj": 64,
+    "layer_15.self_attn.v_proj": 64,
+    "layer_15.self_attn.k_proj": 32,
+    "layer_15.self_attn.q_proj": 32,
+    "layer_16.mlp.up_proj": 64,
+    "layer_16.mlp.down_proj": 32,
+    "layer_16.mlp.gate_proj": 32,
+    "layer_16.self_attn.o_proj": 64,
+    "layer_16.self_attn.v_proj": 64,
+    "layer_16.self_attn.k_proj": 32,
+    "layer_16.self_attn.q_proj": 32,
+    "layer_17.mlp.up_proj": 32,
+    "layer_17.mlp.down_proj": 32,
+    "layer_17.mlp.gate_proj": 32,
+    "layer_17.self_attn.o_proj": 64,
+    "layer_17.self_attn.v_proj": 64,
+    "layer_17.self_attn.k_proj": 32,
+    "layer_17.self_attn.q_proj": 32,
+    "layer_18.mlp.up_proj": 32,
+    "layer_18.mlp.down_proj": 32,
+    "layer_18.mlp.gate_proj": 32,
+    "layer_18.self_attn.o_proj": 64,
+    "layer_18.self_attn.v_proj": 64,
+    "layer_18.self_attn.k_proj": 32,
+    "layer_18.self_attn.q_proj": 32,
+    "layer_19.mlp.up_proj": 64,
+    "layer_19.mlp.down_proj": 32,
+    "layer_19.mlp.gate_proj": 32,
+    "layer_19.self_attn.o_proj": 64,
+    "layer_19.self_attn.v_proj": 64,
+    "layer_19.self_attn.k_proj": 32,
+    "layer_19.self_attn.q_proj": 32,
+    "layer_20.mlp.up_proj": 32,
+    "layer_20.mlp.down_proj": 32,
+    "layer_20.mlp.gate_proj": 32,
+    "layer_20.self_attn.o_proj": 32,
+    "layer_20.self_attn.v_proj": 64,
+    "layer_20.self_attn.k_proj": 32,
+    "layer_20.self_attn.q_proj": 64,
+    "layer_21.mlp.up_proj": 32,
+    "layer_21.mlp.down_proj": 32,
+    "layer_21.mlp.gate_proj": 32,
+    "layer_21.self_attn.o_proj": 64,
+    "layer_21.self_attn.v_proj": 64,
+    "layer_21.self_attn.k_proj": 32,
+    "layer_21.self_attn.q_proj": 32,
+    "layer_22.mlp.up_proj": 64,
+    "layer_22.mlp.down_proj": 32,
+    "layer_22.mlp.gate_proj": 32,
+    "layer_22.self_attn.o_proj": 64,
+    "layer_22.self_attn.v_proj": 64,
+    "layer_22.self_attn.k_proj": 32,
+    "layer_22.self_attn.q_proj": 32,
+    "layer_23.mlp.up_proj": 64,
+    "layer_23.mlp.down_proj": 64,
+    "layer_23.mlp.gate_proj": 64,
+    "layer_23.self_attn.o_proj": 64,
+    "layer_23.self_attn.v_proj": 64,
+    "layer_23.self_attn.k_proj": 64,
+    "layer_23.self_attn.q_proj": 64
+  },
+  "source_model": "optiq_output/openbmb_MiniCPM5-1B/optiq_mixed"
+}
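
The config above records a peak learning rate of 5e-05, 300 iterations, and a `cosine` schedule, and the README describes the schedule as linear warmup followed by cosine decay. The sketch below shows the general shape of such a schedule. The warmup length (30 steps) and the decay floor (zero) are assumptions, since the config leaves `dpo_warmup_iters` as `null` and does not state a floor; this is not the mlx-optiq implementation.

```python
import math


def lr_at(step, total_iters=300, peak_lr=5e-5, warmup_iters=30):
    """Linear warmup to peak_lr, then cosine decay to zero.

    warmup_iters=30 and the zero floor are illustrative assumptions.
    """
    if step < warmup_iters:
        return peak_lr * (step + 1) / warmup_iters
    progress = (step - warmup_iters) / max(1, total_iters - warmup_iters)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


for step in (0, 29, 30, 165, 300):
    print(step, f"{lr_at(step):.2e}")
```

The rate climbs to the peak over the warmup steps, sits at half the peak midway through the decay phase, and reaches zero at the final iteration.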

adapters/humanizer-sft/adapter_config.json ADDED

	@@ -0,0 +1,206 @@

+{
+  "fine_tune_type": "lora",
+  "num_layers": -1,
+  "lora_parameters": {
+    "rank": 32,
+    "scale": 1.0,
+    "dropout": 0.0,
+    "keys": null
+  },
+  "base_model_name_or_path": "optiq_output/openbmb_MiniCPM5-1B/optiq_mixed",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": false,
+  "init_lora_weights": true,
+  "layers_to_transform": null,
+  "layers_pattern": null,
+  "lora_alpha": 32,
+  "lora_dropout": 0.0,
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 32,
+  "revision": null,
+  "target_modules": [
+    "q_proj",
+    "k_proj",
+    "v_proj",
+    "o_proj",
+    "gate_proj",
+    "up_proj",
+    "down_proj"
+  ],
+  "task_type": "CAUSAL_LM",
+  "optiq": {
+    "rank_scaling": "by_bits",
+    "applied_ranks": {
+      "layer_0.mlp.up_proj": 64,
+      "layer_0.mlp.down_proj": 64,
+      "layer_0.mlp.gate_proj": 64,
+      "layer_0.self_attn.o_proj": 64,
+      "layer_0.self_attn.v_proj": 64,
+      "layer_0.self_attn.k_proj": 64,
+      "layer_0.self_attn.q_proj": 64,
+      "layer_1.mlp.up_proj": 32,
+      "layer_1.mlp.down_proj": 64,
+      "layer_1.mlp.gate_proj": 32,
+      "layer_1.self_attn.o_proj": 64,
+      "layer_1.self_attn.v_proj": 64,
+      "layer_1.self_attn.k_proj": 32,
+      "layer_1.self_attn.q_proj": 32,
+      "layer_2.mlp.up_proj": 32,
+      "layer_2.mlp.down_proj": 32,
+      "layer_2.mlp.gate_proj": 32,
+      "layer_2.self_attn.o_proj": 64,
+      "layer_2.self_attn.v_proj": 64,
+      "layer_2.self_attn.k_proj": 32,
+      "layer_2.self_attn.q_proj": 32,
+      "layer_3.mlp.up_proj": 32,
+      "layer_3.mlp.down_proj": 32,
+      "layer_3.mlp.gate_proj": 32,
+      "layer_3.self_attn.o_proj": 64,
+      "layer_3.self_attn.v_proj": 64,
+      "layer_3.self_attn.k_proj": 32,
+      "layer_3.self_attn.q_proj": 32,
+      "layer_4.mlp.up_proj": 32,
+      "layer_4.mlp.down_proj": 64,
+      "layer_4.mlp.gate_proj": 32,
+      "layer_4.self_attn.o_proj": 64,
+      "layer_4.self_attn.v_proj": 64,
+      "layer_4.self_attn.k_proj": 32,
+      "layer_4.self_attn.q_proj": 32,
+      "layer_5.mlp.up_proj": 32,
+      "layer_5.mlp.down_proj": 32,
+      "layer_5.mlp.gate_proj": 32,
+      "layer_5.self_attn.o_proj": 32,
+      "layer_5.self_attn.v_proj": 64,
+      "layer_5.self_attn.k_proj": 32,
+      "layer_5.self_attn.q_proj": 64,
+      "layer_6.mlp.up_proj": 32,
+      "layer_6.mlp.down_proj": 32,
+      "layer_6.mlp.gate_proj": 32,
+      "layer_6.self_attn.o_proj": 64,
+      "layer_6.self_attn.v_proj": 64,
+      "layer_6.self_attn.k_proj": 32,
+      "layer_6.self_attn.q_proj": 32,
+      "layer_7.mlp.up_proj": 64,
+      "layer_7.mlp.down_proj": 32,
+      "layer_7.mlp.gate_proj": 32,
+      "layer_7.self_attn.o_proj": 64,
+      "layer_7.self_attn.v_proj": 32,
+      "layer_7.self_attn.k_proj": 32,
+      "layer_7.self_attn.q_proj": 64,
+      "layer_8.mlp.up_proj": 32,
+      "layer_8.mlp.down_proj": 32,
+      "layer_8.mlp.gate_proj": 32,
+      "layer_8.self_attn.o_proj": 64,
+      "layer_8.self_attn.v_proj": 32,
+      "layer_8.self_attn.k_proj": 32,
+      "layer_8.self_attn.q_proj": 64,
+      "layer_9.mlp.up_proj": 32,
+      "layer_9.mlp.down_proj": 32,
+      "layer_9.mlp.gate_proj": 32,
+      "layer_9.self_attn.o_proj": 64,
+      "layer_9.self_attn.v_proj": 64,
+      "layer_9.self_attn.k_proj": 32,
+      "layer_9.self_attn.q_proj": 32,
+      "layer_10.mlp.up_proj": 64,
+      "layer_10.mlp.down_proj": 32,
+      "layer_10.mlp.gate_proj": 32,
+      "layer_10.self_attn.o_proj": 64,
+      "layer_10.self_attn.v_proj": 64,
+      "layer_10.self_attn.k_proj": 32,
+      "layer_10.self_attn.q_proj": 32,
+      "layer_11.mlp.up_proj": 32,
+      "layer_11.mlp.down_proj": 32,
+      "layer_11.mlp.gate_proj": 32,
+      "layer_11.self_attn.o_proj": 64,
+      "layer_11.self_attn.v_proj": 64,
+      "layer_11.self_attn.k_proj": 32,
+      "layer_11.self_attn.q_proj": 32,
+      "layer_12.mlp.up_proj": 32,
+      "layer_12.mlp.down_proj": 32,
+      "layer_12.mlp.gate_proj": 32,
+      "layer_12.self_attn.o_proj": 64,
+      "layer_12.self_attn.v_proj": 64,
+      "layer_12.self_attn.k_proj": 32,
+      "layer_12.self_attn.q_proj": 32,
+      "layer_13.mlp.up_proj": 64,
+      "layer_13.mlp.down_proj": 32,
+      "layer_13.mlp.gate_proj": 32,
+      "layer_13.self_attn.o_proj": 64,
+      "layer_13.self_attn.v_proj": 64,
+      "layer_13.self_attn.k_proj": 32,
+      "layer_13.self_attn.q_proj": 32,
+      "layer_14.mlp.up_proj": 32,
+      "layer_14.mlp.down_proj": 32,
+      "layer_14.mlp.gate_proj": 32,
+      "layer_14.self_attn.o_proj": 64,
+      "layer_14.self_attn.v_proj": 64,
+      "layer_14.self_attn.k_proj": 32,
+      "layer_14.self_attn.q_proj": 32,
+      "layer_15.mlp.up_proj": 32,
+      "layer_15.mlp.down_proj": 32,
+      "layer_15.mlp.gate_proj": 32,
+      "layer_15.self_attn.o_proj": 64,
+      "layer_15.self_attn.v_proj": 64,
+      "layer_15.self_attn.k_proj": 32,
+      "layer_15.self_attn.q_proj": 32,
+      "layer_16.mlp.up_proj": 64,
+      "layer_16.mlp.down_proj": 32,
+      "layer_16.mlp.gate_proj": 32,
+      "layer_16.self_attn.o_proj": 64,
+      "layer_16.self_attn.v_proj": 64,
+      "layer_16.self_attn.k_proj": 32,
+      "layer_16.self_attn.q_proj": 32,
+      "layer_17.mlp.up_proj": 32,
+      "layer_17.mlp.down_proj": 32,
+      "layer_17.mlp.gate_proj": 32,
+      "layer_17.self_attn.o_proj": 64,
+      "layer_17.self_attn.v_proj": 64,
+      "layer_17.self_attn.k_proj": 32,
+      "layer_17.self_attn.q_proj": 32,
+      "layer_18.mlp.up_proj": 32,
+      "layer_18.mlp.down_proj": 32,
+      "layer_18.mlp.gate_proj": 32,
+      "layer_18.self_attn.o_proj": 64,
+      "layer_18.self_attn.v_proj": 64,
+      "layer_18.self_attn.k_proj": 32,
+      "layer_18.self_attn.q_proj": 32,
+      "layer_19.mlp.up_proj": 64,
+      "layer_19.mlp.down_proj": 32,
+      "layer_19.mlp.gate_proj": 32,
+      "layer_19.self_attn.o_proj": 64,
+      "layer_19.self_attn.v_proj": 64,
+      "layer_19.self_attn.k_proj": 32,
+      "layer_19.self_attn.q_proj": 32,
+      "layer_20.mlp.up_proj": 32,
+      "layer_20.mlp.down_proj": 32,
+      "layer_20.mlp.gate_proj": 32,
+      "layer_20.self_attn.o_proj": 32,
+      "layer_20.self_attn.v_proj": 64,
+      "layer_20.self_attn.k_proj": 32,
+      "layer_20.self_attn.q_proj": 64,
+      "layer_21.mlp.up_proj": 32,
+      "layer_21.mlp.down_proj": 32,
+      "layer_21.mlp.gate_proj": 32,
+      "layer_21.self_attn.o_proj": 64,
+      "layer_21.self_attn.v_proj": 64,
+      "layer_21.self_attn.k_proj": 32,
+      "layer_21.self_attn.q_proj": 32,
+      "layer_22.mlp.up_proj": 64,
+      "layer_22.mlp.down_proj": 32,
+      "layer_22.mlp.gate_proj": 32,
+      "layer_22.self_attn.o_proj": 64,
+      "layer_22.self_attn.v_proj": 64,
+      "layer_22.self_attn.k_proj": 32,
+      "layer_22.self_attn.q_proj": 32,
+      "layer_23.mlp.up_proj": 64,
+      "layer_23.mlp.down_proj": 64,
+      "layer_23.mlp.gate_proj": 64,
+      "layer_23.self_attn.o_proj": 64,
+      "layer_23.self_attn.v_proj": 64,
+      "layer_23.self_attn.k_proj": 64,
+      "layer_23.self_attn.q_proj": 64
+    }
+  }
+}

adapters/humanizer-sft/adapters.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2dd9685946a90b65dcd123a83a1f401b5b47ccbd6a68c09b19d8b6d2fe98e650
+size 119206006

adapters/humanizer-sft/optiq_lora_config.json ADDED

	@@ -0,0 +1,206 @@

+{
+  "rank": 32,
+  "scale": 1.0,
+  "dropout": 0.0,
+  "rank_scaling": "by_bits",
+  "method": "sft",
+  "dpo_beta": 0.1,
+  "dpo_learning_rate": 5e-05,
+  "dpo_warmup_iters": null,
+  "dpo_lr_schedule": "cosine",
+  "target_modules": [
+    "q_proj",
+    "k_proj",
+    "v_proj",
+    "o_proj",
+    "gate_proj",
+    "up_proj",
+    "down_proj"
+  ],
+  "num_layers": -1,
+  "use_dora": false,
+  "mask_prompt": true,
+  "batch_size": 1,
+  "iters": 600,
+  "learning_rate": 0.0002,
+  "max_seq_length": 2048,
+  "grad_accumulation_steps": 1,
+  "grad_checkpoint": true,
+  "val_batches": 25,
+  "steps_per_report": 10,
+  "steps_per_eval": 200,
+  "steps_per_save": 100,
+  "adapter_path": "adapters/humanizer-minicpm5-1b",
+  "clear_cache_threshold": 0,
+  "applied_ranks": {
+    "layer_0.mlp.up_proj": 64,
+    "layer_0.mlp.down_proj": 64,
+    "layer_0.mlp.gate_proj": 64,
+    "layer_0.self_attn.o_proj": 64,
+    "layer_0.self_attn.v_proj": 64,
+    "layer_0.self_attn.k_proj": 64,
+    "layer_0.self_attn.q_proj": 64,
+    "layer_1.mlp.up_proj": 32,
+    "layer_1.mlp.down_proj": 64,
+    "layer_1.mlp.gate_proj": 32,
+    "layer_1.self_attn.o_proj": 64,
+    "layer_1.self_attn.v_proj": 64,
+    "layer_1.self_attn.k_proj": 32,
+    "layer_1.self_attn.q_proj": 32,
+    "layer_2.mlp.up_proj": 32,
+    "layer_2.mlp.down_proj": 32,
+    "layer_2.mlp.gate_proj": 32,
+    "layer_2.self_attn.o_proj": 64,
+    "layer_2.self_attn.v_proj": 64,
+    "layer_2.self_attn.k_proj": 32,
+    "layer_2.self_attn.q_proj": 32,
+    "layer_3.mlp.up_proj": 32,
+    "layer_3.mlp.down_proj": 32,
+    "layer_3.mlp.gate_proj": 32,
+    "layer_3.self_attn.o_proj": 64,
+    "layer_3.self_attn.v_proj": 64,
+    "layer_3.self_attn.k_proj": 32,
+    "layer_3.self_attn.q_proj": 32,
+    "layer_4.mlp.up_proj": 32,
+    "layer_4.mlp.down_proj": 64,
+    "layer_4.mlp.gate_proj": 32,
+    "layer_4.self_attn.o_proj": 64,
+    "layer_4.self_attn.v_proj": 64,
+    "layer_4.self_attn.k_proj": 32,
+    "layer_4.self_attn.q_proj": 32,
+    "layer_5.mlp.up_proj": 32,
+    "layer_5.mlp.down_proj": 32,
+    "layer_5.mlp.gate_proj": 32,
+    "layer_5.self_attn.o_proj": 32,
+    "layer_5.self_attn.v_proj": 64,
+    "layer_5.self_attn.k_proj": 32,
+    "layer_5.self_attn.q_proj": 64,
+    "layer_6.mlp.up_proj": 32,
+    "layer_6.mlp.down_proj": 32,
+    "layer_6.mlp.gate_proj": 32,
+    "layer_6.self_attn.o_proj": 64,
+    "layer_6.self_attn.v_proj": 64,
+    "layer_6.self_attn.k_proj": 32,
+    "layer_6.self_attn.q_proj": 32,
+    "layer_7.mlp.up_proj": 64,
+    "layer_7.mlp.down_proj": 32,
+    "layer_7.mlp.gate_proj": 32,
+    "layer_7.self_attn.o_proj": 64,
+    "layer_7.self_attn.v_proj": 32,
+    "layer_7.self_attn.k_proj": 32,
+    "layer_7.self_attn.q_proj": 64,
+    "layer_8.mlp.up_proj": 32,
+    "layer_8.mlp.down_proj": 32,
+    "layer_8.mlp.gate_proj": 32,
+    "layer_8.self_attn.o_proj": 64,
+    "layer_8.self_attn.v_proj": 32,
+    "layer_8.self_attn.k_proj": 32,
+    "layer_8.self_attn.q_proj": 64,
+    "layer_9.mlp.up_proj": 32,
+    "layer_9.mlp.down_proj": 32,
+    "layer_9.mlp.gate_proj": 32,
+    "layer_9.self_attn.o_proj": 64,
+    "layer_9.self_attn.v_proj": 64,
+    "layer_9.self_attn.k_proj": 32,
+    "layer_9.self_attn.q_proj": 32,
+    "layer_10.mlp.up_proj": 64,
+    "layer_10.mlp.down_proj": 32,
+    "layer_10.mlp.gate_proj": 32,
+    "layer_10.self_attn.o_proj": 64,
+    "layer_10.self_attn.v_proj": 64,
+    "layer_10.self_attn.k_proj": 32,
+    "layer_10.self_attn.q_proj": 32,
+    "layer_11.mlp.up_proj": 32,
+    "layer_11.mlp.down_proj": 32,
+    "layer_11.mlp.gate_proj": 32,
+    "layer_11.self_attn.o_proj": 64,
+    "layer_11.self_attn.v_proj": 64,
+    "layer_11.self_attn.k_proj": 32,
+    "layer_11.self_attn.q_proj": 32,
+    "layer_12.mlp.up_proj": 32,
+    "layer_12.mlp.down_proj": 32,
+    "layer_12.mlp.gate_proj": 32,
+    "layer_12.self_attn.o_proj": 64,
+    "layer_12.self_attn.v_proj": 64,
+    "layer_12.self_attn.k_proj": 32,
+    "layer_12.self_attn.q_proj": 32,
+    "layer_13.mlp.up_proj": 64,
+    "layer_13.mlp.down_proj": 32,
+    "layer_13.mlp.gate_proj": 32,
+    "layer_13.self_attn.o_proj": 64,
+    "layer_13.self_attn.v_proj": 64,
+    "layer_13.self_attn.k_proj": 32,
+    "layer_13.self_attn.q_proj": 32,
+    "layer_14.mlp.up_proj": 32,
+    "layer_14.mlp.down_proj": 32,
+    "layer_14.mlp.gate_proj": 32,
+    "layer_14.self_attn.o_proj": 64,
+    "layer_14.self_attn.v_proj": 64,
+    "layer_14.self_attn.k_proj": 32,
+    "layer_14.self_attn.q_proj": 32,
+    "layer_15.mlp.up_proj": 32,
+    "layer_15.mlp.down_proj": 32,
+    "layer_15.mlp.gate_proj": 32,
+    "layer_15.self_attn.o_proj": 64,
+    "layer_15.self_attn.v_proj": 64,
+    "layer_15.self_attn.k_proj": 32,
+    "layer_15.self_attn.q_proj": 32,
+    "layer_16.mlp.up_proj": 64,
+    "layer_16.mlp.down_proj": 32,
+    "layer_16.mlp.gate_proj": 32,
+    "layer_16.self_attn.o_proj": 64,
+    "layer_16.self_attn.v_proj": 64,
+    "layer_16.self_attn.k_proj": 32,
+    "layer_16.self_attn.q_proj": 32,
+    "layer_17.mlp.up_proj": 32,
+    "layer_17.mlp.down_proj": 32,
+    "layer_17.mlp.gate_proj": 32,
+    "layer_17.self_attn.o_proj": 64,
+    "layer_17.self_attn.v_proj": 64,
+    "layer_17.self_attn.k_proj": 32,
+    "layer_17.self_attn.q_proj": 32,
+    "layer_18.mlp.up_proj": 32,
+    "layer_18.mlp.down_proj": 32,
+    "layer_18.mlp.gate_proj": 32,
+    "layer_18.self_attn.o_proj": 64,
+    "layer_18.self_attn.v_proj": 64,
+    "layer_18.self_attn.k_proj": 32,
+    "layer_18.self_attn.q_proj": 32,
+    "layer_19.mlp.up_proj": 64,
+    "layer_19.mlp.down_proj": 32,
+    "layer_19.mlp.gate_proj": 32,
+    "layer_19.self_attn.o_proj": 64,
+    "layer_19.self_attn.v_proj": 64,
+    "layer_19.self_attn.k_proj": 32,
+    "layer_19.self_attn.q_proj": 32,
+    "layer_20.mlp.up_proj": 32,
+    "layer_20.mlp.down_proj": 32,
+    "layer_20.mlp.gate_proj": 32,
+    "layer_20.self_attn.o_proj": 32,
+    "layer_20.self_attn.v_proj": 64,
+    "layer_20.self_attn.k_proj": 32,
+    "layer_20.self_attn.q_proj": 64,
+    "layer_21.mlp.up_proj": 32,
+    "layer_21.mlp.down_proj": 32,
+    "layer_21.mlp.gate_proj": 32,
+    "layer_21.self_attn.o_proj": 64,
+    "layer_21.self_attn.v_proj": 64,
+    "layer_21.self_attn.k_proj": 32,
+    "layer_21.self_attn.q_proj": 32,
+    "layer_22.mlp.up_proj": 64,
+    "layer_22.mlp.down_proj": 32,
+    "layer_22.mlp.gate_proj": 32,
+    "layer_22.self_attn.o_proj": 64,
+    "layer_22.self_attn.v_proj": 64,
+    "layer_22.self_attn.k_proj": 32,
+    "layer_22.self_attn.q_proj": 32,
+    "layer_23.mlp.up_proj": 64,
+    "layer_23.mlp.down_proj": 64,
+    "layer_23.mlp.gate_proj": 64,
+    "layer_23.self_attn.o_proj": 64,
+    "layer_23.self_attn.v_proj": 64,
+    "layer_23.self_attn.k_proj": 64,
+    "layer_23.self_attn.q_proj": 64
+  },
+  "source_model": "optiq_output/openbmb_MiniCPM5-1B/optiq_mixed"
+}
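
Note on `"rank_scaling": "by_bits"`: comparing `applied_ranks` with the per-layer `bits` values in the `quantization` block of config.json, the entries shown are consistent with rank = 32 × bits / 4, so 4-bit projections get rank 32 and 8-bit projections get rank 64. The sketch below checks that reading against layer 1. The formula is inferred from the two files, not taken from the OptIQ source, and `rank_for_bits` and `quant_key` are hypothetical helpers.

```python
BASE_RANK = 32  # "rank" in optiq_lora_config.json
BASE_BITS = 4   # default "bits" in config.json's quantization block


def rank_for_bits(bits: int, base_rank: int = BASE_RANK, base_bits: int = BASE_BITS) -> int:
    """Assumed scaling rule: LoRA rank grows linearly with the layer's bit width."""
    return base_rank * bits // base_bits


def quant_key(adapter_key: str) -> str:
    """Map 'layer_1.self_attn.q_proj' to 'model.layers.1.self_attn.q_proj'."""
    layer, _, rest = adapter_key.partition(".")
    index = layer.split("_", 1)[1]
    return f"model.layers.{index}.{rest}"


# Layer 1 values copied from the two files in this diff.
quant_bits = {
    "model.layers.1.self_attn.q_proj": 4,
    "model.layers.1.self_attn.k_proj": 4,
    "model.layers.1.self_attn.v_proj": 8,
    "model.layers.1.self_attn.o_proj": 8,
    "model.layers.1.mlp.gate_proj": 4,
    "model.layers.1.mlp.down_proj": 8,
    "model.layers.1.mlp.up_proj": 4,
}
applied_ranks = {
    "layer_1.mlp.up_proj": 32,
    "layer_1.mlp.down_proj": 64,
    "layer_1.mlp.gate_proj": 32,
    "layer_1.self_attn.o_proj": 64,
    "layer_1.self_attn.v_proj": 64,
    "layer_1.self_attn.k_proj": 32,
    "layer_1.self_attn.q_proj": 32,
}

mismatches = [
    key for key, rank in applied_ranks.items()
    if rank_for_bits(quant_bits[quant_key(key)]) != rank
]
print("mismatches:", mismatches)
```

Running the same check over the full files would confirm whether the rule holds for every layer listed.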

chat_template.jinja ADDED

	@@ -0,0 +1,179 @@

+{{- bos_token }}{%- if tools %}
+    {%- set tool_definitions %}
+        {{- "# Tools\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
+        {%- for tool in tools %}
+            {{- "\n" }}
+            {{- tool | tojson(ensure_ascii=False) }}
+        {%- endfor %}
+        {{- '\n</tools>\n\nTool usage guidelines:\n- You may call zero or more functions. If no function calls are needed, just answer normally and do not include any <function ... </function>.\n- When calling a function, return an XML object within <function ... </function> using:\n<function name="function-name"><param name="param-name">param-value</param></function>\n- param-value may be multi-line. If it contains <, & or newline characters, wrap it in a CDATA block: <param name="param-name"><![CDATA[...multi-line value...]]></param>' }}
+    {%- endset %}
+    {{- '<|im_start|>system\n' }}
+    {%- if messages[0].role == 'system' %}
+        {%- if '<tool_def_sep>' in messages[0].content %}
+            {{- messages[0].content.replace('<tool_def_sep>', tool_definitions) }}
+        {%- else %}
+            {{- messages[0].content + '\n\n' + tool_definitions }}
+        {%- endif %}
+    {%- else %}
+        {{- tool_definitions.lstrip() }}
+    {%- endif %}
+    {{- '<|im_end|>\n' }}
+{%- else %}
+    {%- if messages[0].role == 'system' %}
+        {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
+    {%- endif %}
+{%- endif %}
+{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
+{%- for message in messages[::-1] %}
+    {%- set index = (messages|length - 1) - loop.index0 %}
+    {%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
+        {%- set ns.multi_step_tool = false %}
+        {%- set ns.last_query_index = index %}
+    {%- endif %}
+{%- endfor %}
+{%- for message in messages %}
+    {%- if message.content is string %}
+        {%- set content = message.content %}
+    {%- else %}
+        {%- set content = '' %}
+    {%- endif %}
+    {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
+        {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
+    {%- elif message.role == "assistant" %}
+        {%- set reasoning_content = '' %}
+        {%- if message.reasoning_content is string %}
+            {%- set reasoning_content = message.reasoning_content %}
+        {%- else %}
+            {%- if '</think>' in content %}
+                {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
+                {%- set content = content.split('</think>')[-1].lstrip('\n') %}
+            {%- endif %}
+        {%- endif %}
+        {%- if message.tool_calls %}
+            {%- set content_parts = content.split('<tool_sep>') %}
+            {%- set processed_content = content_parts[0] %}
+            {%- set tool_calls_count = message.tool_calls|length %}
+            {%- set tool_sep_count = content_parts|length - 1 %}
+            {%- set min_count = [tool_calls_count, tool_sep_count]|min %}
+            {%- for i in range(1, content_parts|length) %}
+                {%- set tool_index = i - 1 %}
+                {%- if tool_index < tool_calls_count %}
+                    {%- set tool_call = message.tool_calls[tool_index] %}
+                    {%- if tool_call.function %}
+                        {%- set tool_call = tool_call.function %}
+                    {%- endif %}
+                    {%- set single_tool_xml %}
+                        {{- '<function name="' ~ tool_call.name ~ '">' }}
+                        {%- if tool_call.arguments %}
+                            {%- set args_dict = tool_call.arguments %}
+                            {%- for param_name, param_value in args_dict.items() %}
+                                {{- '<param name="' ~ param_name ~ '">' }}
+                                {%- if param_value is string and ('<' in param_value or '&' in param_value or '\n' in param_value) %}
+                                    {{- '<![CDATA[' + param_value + ']]>' }}
+                                {%- else %}
+                                    {{- param_value }}
+                                {%- endif %}
+                                {{- '</param>' }}
+                            {%- endfor %}
+                        {%- endif %}
+                        {{- '</function>' }}
+                    {%- endset %}
+                    {%- set processed_content = processed_content + single_tool_xml + content_parts[i] %}
+                {%- else %}
+                    {%- set processed_content = processed_content + content_parts[i] %}
+                {%- endif %}
+            {%- endfor %}
+            {%- if tool_calls_count > tool_sep_count %}
+                {%- for remaining_index in range(tool_sep_count, tool_calls_count) %}
+                    {%- set tool_call = message.tool_calls[remaining_index] %}
+                    {%- if tool_call.function %}
+                        {%- set tool_call = tool_call.function %}
+                    {%- endif %}
+                    {%- set remaining_tool_xml %}
+                        {{- '<function name="' ~ tool_call.name ~ '">' }}
+                        {%- if tool_call.arguments %}
+                            {%- set args_dict = tool_call.arguments %}
+                            {%- for param_name, param_value in args_dict.items() %}
+                                {{- '<param name="' ~ param_name ~ '">' }}
+                                {%- if param_value is string and ('<' in param_value or '&' in param_value or '\n' in param_value) %}
+                                    {{- '<![CDATA[' + param_value + ']]>' }}
+                                {%- else %}
+                                    {{- param_value }}
+                                {%- endif %}
+                                {{- '</param>' }}
+                            {%- endfor %}
+                        {%- endif %}
+                        {{- '</function>' }}
+                    {%- endset %}
+                    {%- set processed_content = processed_content + remaining_tool_xml %}
+                {%- endfor %}
+            {%- endif %}
+            {%- set content = processed_content %}
+        {%- endif %}
+        {%- if loop.index0 > ns.last_query_index %}
+            {%- if reasoning_content %}
+                {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
+            {%- else %}
+                {{- '<|im_start|>' + message.role + '\n' + content }}
+            {%- endif %}
+        {%- else %}
+            {{- '<|im_start|>' + message.role + '\n' + content }}
+        {%- endif %}
+        {%- if message.tool_calls and not has_tool_sep %}
+            {%- for tool_call in message.tool_calls %}
+                {%- if (loop.first and content) or (not loop.first) %}
+                    {{- '\n' }}
+                {%- endif %}
+                {%- if tool_call.function %}
+                    {%- set tool_call = tool_call.function %}
+                {%- endif %}
+                {{- '<function name="' ~ tool_call.name ~ '">' }}
+                {%- if tool_call.arguments %}
+                    {%- set args_dict = tool_call.arguments %}
+                    {%- for param_name, param_value in args_dict.items() %}
+                        {{- '<param name="' ~ param_name ~ '">' }}
+                        {%- if param_value is string and ('<' in param_value or '&' in param_value or '\n' in param_value) %}
+                            {{- '<![CDATA[' + param_value + ']]>' }}
+                        {%- else %}
+                            {{- param_value }}
+                        {%- endif %}
+                        {{- '</param>' }}
+                    {%- endfor %}
+                {%- endif %}
+                {{- '</function>' }}
+            {%- endfor %}
+        {%- endif %}
+        {{- '<|im_end|>\n' }}
+    {%- elif message.role == "tool" %}
+        {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
+            {{- '<|im_start|>user' }}
+        {%- endif %}
+        {{- '\n<tool_response>\n' }}
+        {%- if message.content is string %}
+            {{- content }}
+        {%- else %}
+            {{- message.content | tojson(ensure_ascii=False) }}
+        {%- endif %}
+        {{- '\n</tool_response>' }}
+        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+            {{- '<|im_end|>\n' }}
+        {%- endif %}
+    {%- endif %}
+{%- endfor %}
+{%- if add_generation_prompt %}
+    {{- '<|im_start|>assistant\n' }}
+    {%- if enable_thinking is defined %}
+        {%- if enable_thinking is false %}
+            {{- '<think>\n\n</think>\n\n' }}
+        {%- elif enable_thinking is true %}
+            {{- '<think>\n' }}
+        {%- endif %}
+    {%- endif %}
+{%- endif %}
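
Note on the template above: two pieces are easy to misread in Jinja. One is the generation prompt at the end, which depends on `enable_thinking`. The other is the rule for wrapping tool-call parameter values in CDATA. The sketch below restates both in plain Python. The function names are hypothetical, and this is a reading of the template logic, not code shipped with the model.

```python
def generation_suffix(enable_thinking=None) -> str:
    """Mirror the final add_generation_prompt block of the template.

    None stands in for "enable_thinking not passed": the template then emits
    only the assistant header and adds no think tag.
    """
    out = "<|im_start|>assistant\n"
    if enable_thinking is False:
        # An empty think block is pre-filled when thinking is disabled.
        out += "<think>\n\n</think>\n\n"
    elif enable_thinking is True:
        # An open think tag is pre-filled when thinking is enabled.
        out += "<think>\n"
    return out


def render_param(name: str, value) -> str:
    """Mirror the <param> rendering: strings containing <, & or a newline get CDATA."""
    if isinstance(value, str) and any(ch in value for ch in ("<", "&", "\n")):
        body = "<![CDATA[" + value + "]]>"
    else:
        # Non-string values are rendered via their string form.
        body = str(value)
    return f'<param name="{name}">{body}</param>'


print(repr(generation_suffix()))
print(repr(generation_suffix(enable_thinking=False)))
print(render_param("query", "a < b && c"))
print(render_param("limit", 3))
```

With `enable_thinking` left unset, the model decides for itself whether to open a think block. The two explicit settings pre-fill the start of the assistant turn instead.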

config.json ADDED

	@@ -0,0 +1,1399 @@

+{
+    "architectures": [
+        "LlamaForCausalLM"
+    ],
+    "bos_token_id": 0,
+    "eos_token_id": [
+        1,
+        130073
+    ],
+    "head_dim": 128,
+    "hidden_act": "silu",
+    "hidden_size": 1536,
+    "initializer_range": 0.02,
+    "intermediate_size": 4608,
+    "max_position_embeddings": 131072,
+    "model_type": "llama",
+    "num_attention_heads": 16,
+    "num_hidden_layers": 24,
+    "num_key_value_heads": 2,
+    "pad_token_id": 1,
+    "quantization": {
+        "group_size": 64,
+        "bits": 4,
+        "mode": "affine",
+        "model.embed_tokens": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.0.self_attn.q_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.0.self_attn.k_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.0.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.0.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.0.mlp.gate_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.0.mlp.down_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.0.mlp.up_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.1.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.1.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.1.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.1.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.1.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.1.mlp.down_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.1.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.2.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.2.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.2.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.2.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.2.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.2.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.2.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.3.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.3.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.3.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.3.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.3.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.3.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.3.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.4.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.4.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.4.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.4.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.4.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.4.mlp.down_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.4.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.5.self_attn.q_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.5.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.5.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.5.self_attn.o_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.5.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.5.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.5.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.6.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.6.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.6.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.6.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.6.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.6.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.6.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.7.self_attn.q_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.7.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.7.self_attn.v_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.7.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.7.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.7.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.7.mlp.up_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.8.self_attn.q_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.8.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.8.self_attn.v_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.8.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.8.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.8.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.8.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.9.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.9.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.9.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.9.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.9.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.9.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.9.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.10.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.10.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.10.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.10.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.10.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.10.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.10.mlp.up_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.11.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.11.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.11.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.11.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.11.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.11.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.11.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.12.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.12.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.12.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.12.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.12.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.12.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.12.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.13.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.13.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.13.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.13.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.13.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.13.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.13.mlp.up_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.14.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.14.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.14.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.14.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.14.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.14.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.14.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.15.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.15.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.15.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.15.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.15.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.15.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.15.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.16.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.16.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.16.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.16.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.16.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.16.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.16.mlp.up_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.17.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.17.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.17.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.17.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.17.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.17.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.17.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.18.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.18.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.18.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.18.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.18.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.18.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.18.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.19.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.19.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.19.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.19.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.19.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.19.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.19.mlp.up_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.20.self_attn.q_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.20.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.20.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.20.self_attn.o_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.20.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.20.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.20.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.21.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.21.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.21.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.21.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.21.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.21.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.21.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.22.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.22.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.22.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.22.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.22.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.22.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.22.mlp.up_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.23.self_attn.q_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.23.self_attn.k_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.23.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.23.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.23.mlp.gate_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.23.mlp.down_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.23.mlp.up_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "lm_head": {
+            "bits": 8,
+            "group_size": 64
+        }
+    },
+    "quantization_config": {
+        "group_size": 64,
+        "bits": 4,
+        "mode": "affine",
+        "model.embed_tokens": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.0.self_attn.q_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.0.self_attn.k_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.0.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.0.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.0.mlp.gate_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.0.mlp.down_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.0.mlp.up_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.1.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.1.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.1.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.1.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.1.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.1.mlp.down_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.1.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.2.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.2.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.2.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.2.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.2.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.2.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.2.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.3.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.3.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.3.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.3.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.3.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.3.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.3.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.4.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.4.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.4.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.4.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.4.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.4.mlp.down_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.4.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.5.self_attn.q_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.5.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.5.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.5.self_attn.o_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.5.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.5.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.5.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.6.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.6.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.6.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.6.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.6.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.6.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.6.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.7.self_attn.q_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.7.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.7.self_attn.v_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.7.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.7.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.7.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.7.mlp.up_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.8.self_attn.q_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.8.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.8.self_attn.v_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.8.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.8.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.8.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.8.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.9.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.9.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.9.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.9.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.9.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.9.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.9.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.10.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.10.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.10.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.10.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.10.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.10.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.10.mlp.up_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.11.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.11.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.11.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.11.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.11.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.11.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.11.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.12.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.12.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.12.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.12.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.12.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.12.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.12.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.13.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.13.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.13.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.13.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.13.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.13.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.13.mlp.up_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.14.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.14.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.14.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.14.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.14.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.14.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.14.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.15.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.15.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.15.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.15.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.15.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.15.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.15.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.16.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.16.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.16.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.16.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.16.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.16.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.16.mlp.up_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.17.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.17.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.17.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.17.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.17.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.17.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.17.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.18.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.18.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.18.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.18.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.18.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.18.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.18.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.19.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.19.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.19.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.19.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.19.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.19.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.19.mlp.up_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.20.self_attn.q_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.20.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.20.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.20.self_attn.o_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.20.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.20.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.20.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.21.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.21.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.21.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.21.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.21.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.21.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.21.mlp.up_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.22.self_attn.q_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.22.self_attn.k_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.22.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.22.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.22.mlp.gate_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.22.mlp.down_proj": {
+            "bits": 4,
+            "group_size": 64
+        },
+        "model.layers.22.mlp.up_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.23.self_attn.q_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.23.self_attn.k_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.23.self_attn.v_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.23.self_attn.o_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.23.mlp.gate_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.23.mlp.down_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "model.layers.23.mlp.up_proj": {
+            "bits": 8,
+            "group_size": 64
+        },
+        "lm_head": {
+            "bits": 8,
+            "group_size": 64
+        }
+    },
+    "rms_norm_eps": 1e-06,
+    "rope_scaling": null,
+    "rope_theta": 5000000,
+    "tie_word_embeddings": false,
+    "torch_dtype": "bfloat16",
+    "transformers_version": "5.6.2",
+    "use_cache": true,
+    "vocab_size": 130560
+}
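
The quantization_config block above sets a global default (4 bits, group size 64, affine mode) and then lists per-module overrides, some at 8 bits. A minimal sketch of how one might tally those overrides by bit width, assuming the config has already been loaded into a dict; the helper name summarize_quantization and the small sample dict are illustrative and not part of the repository, though the sample entries are copied from the config above:

```python
from collections import Counter


def summarize_quantization(qcfg):
    """Return the default bit width and a count of per-module overrides by bit width.

    Hypothetical helper: top-level scalar keys (bits, group_size, mode) are
    treated as defaults, and every dict-valued key as a per-module override.
    """
    default_bits = qcfg.get("bits")
    overrides = {name: spec for name, spec in qcfg.items() if isinstance(spec, dict)}
    counts = Counter(spec.get("bits", default_bits) for spec in overrides.values())
    return default_bits, counts


# A few entries copied from the quantization_config shown above.
sample = {
    "group_size": 64,
    "bits": 4,
    "mode": "affine",
    "model.embed_tokens": {"bits": 8, "group_size": 64},
    "model.layers.1.self_attn.q_proj": {"bits": 4, "group_size": 64},
    "model.layers.1.self_attn.v_proj": {"bits": 8, "group_size": 64},
    "lm_head": {"bits": 8, "group_size": 64},
}

default_bits, counts = summarize_quantization(sample)
for bits in sorted(counts):
    print(f"{bits}-bit modules: {counts[bits]}")
```

Run against the full config.json (for example via json.load), the same function would show how many projections were kept at 8 bits versus the 4-bit default.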

generation_config.json ADDED

	@@ -0,0 +1,13 @@

+{
+  "_from_model_config": true,
+  "bos_token_id": 0,
+  "eos_token_id": [
+    1,
+    130073
+  ],
+  "pad_token_id": 1,
+  "do_sample": true,
+  "temperature": 0.9,
+  "top_p": 0.95,
+  "transformers_version": "5.6.2"
+}
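
The generation config above enables sampling with temperature 0.9 and top_p 0.95. As a rough illustration of what those two settings do to a next-token distribution, here is a plain-Python sketch of temperature scaling followed by nucleus (top-p) filtering. The function name top_p_filter and the example logits are assumptions made for illustration; this is not the code path used by mlx-lm or transformers.

```python
import math


def top_p_filter(logits, temperature=0.9, top_p=0.95):
    """Apply temperature, then keep the smallest set of tokens whose
    cumulative probability reaches top_p, and renormalize.

    Returns a dict mapping token index -> renormalized probability.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Visit tokens from most to least likely until the nucleus is full.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


# Hypothetical logits for a four-token vocabulary.
dist = top_p_filter([2.0, 1.0, 0.0, -5.0])
print(dist)
```

With these example logits the least likely token falls outside the 0.95 nucleus and is dropped; a sampler would then draw from the remaining three.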

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:88bb686ed4a28f7c2065e27aabef7669f84961ac47c83efe2d003436c179e2e4
+size 906870787
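
The three lines above are a Git LFS pointer rather than the weights themselves: a spec version, a SHA-256 object id, and the size in bytes of the real file. A small sketch of reading such a pointer into structured fields follows; parse_lfs_pointer is a hypothetical helper written for this example, and the pointer text is copied from the diff above.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:88bb686ed4a28f7c2065e27aabef7669f84961ac47c83efe2d003436c179e2e4
size 906870787
"""

info = parse_lfs_pointer(pointer)
print(f"{info['hash_algo']} digest, {info['size'] / 1e6:.1f} MB")
```

After downloading model.safetensors, the digest and size can be compared against hashlib.sha256 of the file contents and its on-disk length to confirm the download is intact.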

model.safetensors.index.json ADDED

	@@ -0,0 +1,567 @@

+{
+    "metadata": {
+        "total_size": 906808320,
+        "total_parameters": 1080632832
+    },
+    "weight_map": {
+        "lm_head.biases": "model.safetensors",
+        "lm_head.scales": "model.safetensors",
+        "lm_head.weight": "model.safetensors",
+        "model.embed_tokens.biases": "model.safetensors",
+        "model.embed_tokens.scales": "model.safetensors",
+        "model.embed_tokens.weight": "model.safetensors",
+        "model.layers.0.input_layernorm.weight": "model.safetensors",
+        "model.layers.0.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.0.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.0.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.0.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.0.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.0.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.0.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.0.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.0.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.0.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.0.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.0.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.0.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.0.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.0.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.0.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.0.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.0.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.0.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.0.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.0.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.0.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.1.input_layernorm.weight": "model.safetensors",
+        "model.layers.1.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.1.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.1.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.1.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.1.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.1.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.1.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.1.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.1.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.1.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.1.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.1.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.1.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.1.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.1.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.1.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.1.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.1.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.1.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.1.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.1.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.1.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.10.input_layernorm.weight": "model.safetensors",
+        "model.layers.10.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.10.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.10.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.10.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.10.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.10.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.10.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.10.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.10.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.10.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.10.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.10.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.10.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.10.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.10.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.10.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.10.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.10.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.10.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.10.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.10.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.10.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.11.input_layernorm.weight": "model.safetensors",
+        "model.layers.11.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.11.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.11.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.11.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.11.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.11.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.11.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.11.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.11.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.11.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.11.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.11.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.11.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.11.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.11.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.11.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.11.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.11.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.11.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.11.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.11.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.11.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.12.input_layernorm.weight": "model.safetensors",
+        "model.layers.12.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.12.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.12.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.12.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.12.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.12.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.12.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.12.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.12.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.12.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.12.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.12.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.12.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.12.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.12.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.12.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.12.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.12.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.12.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.12.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.12.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.12.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.13.input_layernorm.weight": "model.safetensors",
+        "model.layers.13.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.13.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.13.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.13.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.13.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.13.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.13.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.13.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.13.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.13.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.13.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.13.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.13.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.13.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.13.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.13.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.13.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.13.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.13.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.13.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.13.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.13.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.14.input_layernorm.weight": "model.safetensors",
+        "model.layers.14.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.14.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.14.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.14.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.14.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.14.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.14.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.14.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.14.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.14.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.14.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.14.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.14.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.14.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.14.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.14.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.14.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.14.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.14.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.14.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.14.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.14.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.15.input_layernorm.weight": "model.safetensors",
+        "model.layers.15.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.15.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.15.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.15.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.15.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.15.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.15.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.15.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.15.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.15.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.15.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.15.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.15.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.15.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.15.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.15.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.15.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.15.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.15.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.15.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.15.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.15.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.16.input_layernorm.weight": "model.safetensors",
+        "model.layers.16.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.16.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.16.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.16.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.16.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.16.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.16.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.16.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.16.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.16.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.16.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.16.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.16.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.16.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.16.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.16.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.16.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.16.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.16.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.16.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.16.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.16.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.17.input_layernorm.weight": "model.safetensors",
+        "model.layers.17.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.17.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.17.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.17.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.17.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.17.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.17.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.17.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.17.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.17.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.17.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.17.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.17.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.17.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.17.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.17.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.17.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.17.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.17.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.17.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.17.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.17.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.18.input_layernorm.weight": "model.safetensors",
+        "model.layers.18.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.18.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.18.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.18.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.18.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.18.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.18.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.18.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.18.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.18.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.18.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.18.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.18.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.18.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.18.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.18.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.18.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.18.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.18.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.18.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.18.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.18.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.19.input_layernorm.weight": "model.safetensors",
+        "model.layers.19.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.19.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.19.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.19.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.19.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.19.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.19.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.19.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.19.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.19.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.19.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.19.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.19.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.19.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.19.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.19.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.19.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.19.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.19.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.19.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.19.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.19.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.2.input_layernorm.weight": "model.safetensors",
+        "model.layers.2.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.2.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.2.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.2.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.2.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.2.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.2.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.2.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.2.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.2.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.2.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.2.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.2.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.2.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.2.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.2.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.2.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.2.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.2.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.2.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.2.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.2.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.20.input_layernorm.weight": "model.safetensors",
+        "model.layers.20.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.20.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.20.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.20.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.20.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.20.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.20.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.20.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.20.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.20.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.20.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.20.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.20.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.20.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.20.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.20.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.20.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.20.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.20.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.20.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.20.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.20.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.21.input_layernorm.weight": "model.safetensors",
+        "model.layers.21.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.21.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.21.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.21.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.21.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.21.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.21.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.21.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.21.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.21.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.21.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.21.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.21.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.21.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.21.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.21.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.21.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.21.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.21.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.21.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.21.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.21.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.22.input_layernorm.weight": "model.safetensors",
+        "model.layers.22.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.22.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.22.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.22.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.22.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.22.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.22.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.22.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.22.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.22.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.22.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.22.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.22.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.22.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.22.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.22.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.22.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.22.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.22.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.22.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.22.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.22.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.23.input_layernorm.weight": "model.safetensors",
+        "model.layers.23.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.23.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.23.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.23.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.23.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.23.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.23.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.23.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.23.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.23.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.23.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.23.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.23.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.23.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.23.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.23.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.23.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.23.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.23.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.23.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.23.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.23.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.3.input_layernorm.weight": "model.safetensors",
+        "model.layers.3.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.3.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.3.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.3.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.3.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.3.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.3.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.3.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.3.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.3.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.3.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.3.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.3.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.3.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.3.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.3.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.3.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.3.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.3.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.3.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.3.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.3.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.4.input_layernorm.weight": "model.safetensors",
+        "model.layers.4.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.4.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.4.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.4.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.4.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.4.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.4.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.4.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.4.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.4.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.4.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.4.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.4.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.4.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.4.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.4.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.4.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.4.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.4.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.4.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.4.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.4.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.5.input_layernorm.weight": "model.safetensors",
+        "model.layers.5.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.5.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.5.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.5.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.5.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.5.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.5.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.5.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.5.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.5.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.5.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.5.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.5.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.5.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.5.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.5.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.5.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.5.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.5.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.5.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.5.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.5.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.6.input_layernorm.weight": "model.safetensors",
+        "model.layers.6.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.6.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.6.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.6.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.6.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.6.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.6.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.6.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.6.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.6.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.6.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.6.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.6.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.6.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.6.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.6.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.6.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.6.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.6.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.6.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.6.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.6.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.7.input_layernorm.weight": "model.safetensors",
+        "model.layers.7.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.7.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.7.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.7.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.7.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.7.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.7.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.7.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.7.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.7.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.7.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.7.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.7.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.7.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.7.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.7.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.7.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.7.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.7.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.7.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.7.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.7.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.8.input_layernorm.weight": "model.safetensors",
+        "model.layers.8.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.8.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.8.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.8.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.8.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.8.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.8.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.8.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.8.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.8.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.8.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.8.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.8.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.8.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.8.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.8.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.8.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.8.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.8.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.8.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.8.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.8.self_attn.v_proj.weight": "model.safetensors",
+        "model.layers.9.input_layernorm.weight": "model.safetensors",
+        "model.layers.9.mlp.down_proj.biases": "model.safetensors",
+        "model.layers.9.mlp.down_proj.scales": "model.safetensors",
+        "model.layers.9.mlp.down_proj.weight": "model.safetensors",
+        "model.layers.9.mlp.gate_proj.biases": "model.safetensors",
+        "model.layers.9.mlp.gate_proj.scales": "model.safetensors",
+        "model.layers.9.mlp.gate_proj.weight": "model.safetensors",
+        "model.layers.9.mlp.up_proj.biases": "model.safetensors",
+        "model.layers.9.mlp.up_proj.scales": "model.safetensors",
+        "model.layers.9.mlp.up_proj.weight": "model.safetensors",
+        "model.layers.9.post_attention_layernorm.weight": "model.safetensors",
+        "model.layers.9.self_attn.k_proj.biases": "model.safetensors",
+        "model.layers.9.self_attn.k_proj.scales": "model.safetensors",
+        "model.layers.9.self_attn.k_proj.weight": "model.safetensors",
+        "model.layers.9.self_attn.o_proj.biases": "model.safetensors",
+        "model.layers.9.self_attn.o_proj.scales": "model.safetensors",
+        "model.layers.9.self_attn.o_proj.weight": "model.safetensors",
+        "model.layers.9.self_attn.q_proj.biases": "model.safetensors",
+        "model.layers.9.self_attn.q_proj.scales": "model.safetensors",
+        "model.layers.9.self_attn.q_proj.weight": "model.safetensors",
+        "model.layers.9.self_attn.v_proj.biases": "model.safetensors",
+        "model.layers.9.self_attn.v_proj.scales": "model.safetensors",
+        "model.layers.9.self_attn.v_proj.weight": "model.safetensors",
+        "model.norm.weight": "model.safetensors"
+    }
+}

optiq_metadata.json ADDED

	@@ -0,0 +1,688 @@

+{
+  "method": "optiq_mixed_precision",
+  "base_model": "openbmb/MiniCPM5-1B",
+  "reference": "bf16",
+  "target_bpw": 5.0,
+  "achieved_bpw": 5.805183199285076,
+  "n_high_bits": 67,
+  "n_low_bits": 102,
+  "threshold": 0.0,
+  "per_layer": {
+    "lm_head": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.23.mlp.up_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.23.mlp.down_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.23.mlp.gate_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.23.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.23.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.23.self_attn.k_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.23.self_attn.q_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.22.mlp.up_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.22.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.22.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.22.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.22.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.22.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.22.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.21.mlp.up_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.21.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.21.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.21.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.21.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.21.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.21.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.20.mlp.up_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.20.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.20.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.20.self_attn.o_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.20.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.20.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.20.self_attn.q_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.19.mlp.up_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.19.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.19.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.19.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.19.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.19.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.19.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.18.mlp.up_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.18.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.18.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.18.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.18.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.18.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.18.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.17.mlp.up_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.17.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.17.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.17.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.17.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.17.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.17.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.16.mlp.up_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.16.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.16.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.16.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.16.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.16.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.16.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.15.mlp.up_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.15.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.15.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.15.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.15.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.15.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.15.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.14.mlp.up_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.14.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.14.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.14.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.14.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.14.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.14.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.13.mlp.up_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.13.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.13.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.13.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.13.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.13.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.13.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.12.mlp.up_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.12.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.12.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.12.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.12.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.12.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.12.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.11.mlp.up_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.11.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.11.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.11.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.11.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.11.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.11.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.10.mlp.up_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.10.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.10.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.10.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.10.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.10.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.10.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.9.mlp.up_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.9.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.9.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.9.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.9.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.9.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.9.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.8.mlp.up_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.8.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.8.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.8.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.8.self_attn.v_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.8.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.8.self_attn.q_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.7.mlp.up_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.7.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.7.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.7.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.7.self_attn.v_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.7.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.7.self_attn.q_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.6.mlp.up_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.6.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.6.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.6.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.6.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.6.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.6.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.5.mlp.up_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.5.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.5.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.5.self_attn.o_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.5.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.5.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.5.self_attn.q_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.4.mlp.up_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.4.mlp.down_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.4.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.4.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.4.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.4.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.4.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.3.mlp.up_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.3.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.3.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.3.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.3.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.3.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.3.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.2.mlp.up_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.2.mlp.down_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.2.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.2.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.2.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.2.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.2.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.1.mlp.up_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.1.mlp.down_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.1.mlp.gate_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.1.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.1.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.1.self_attn.k_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.1.self_attn.q_proj": {
+      "bits": 4,
+      "group_size": 64
+    },
+    "model.layers.0.mlp.up_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.0.mlp.down_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.0.mlp.gate_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.0.self_attn.o_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.0.self_attn.v_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.0.self_attn.k_proj": {
+      "bits": 8,
+      "group_size": 64
+    },
+    "model.layers.0.self_attn.q_proj": {
+      "bits": 8,
+      "group_size": 64
+    }
+  }
+}
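
The `per_layer` map above is easier to check against the `n_high_bits` and `n_low_bits` fields when it is tallied by bit width. The following is a minimal sketch, not part of the commit: the helper names `summarize_bits` and `unweighted_mean_bits` are hypothetical, and `sample` is a three-entry excerpt of the file rather than the full metadata.

```python
from collections import Counter


def summarize_bits(metadata: dict) -> Counter:
    """Count how many per_layer entries use each bit width."""
    return Counter(entry["bits"] for entry in metadata["per_layer"].values())


def unweighted_mean_bits(metadata: dict) -> float:
    """Average bit width over entries, ignoring how many weights each holds."""
    counts = summarize_bits(metadata)
    total_entries = sum(counts.values())
    return sum(bits * n for bits, n in counts.items()) / total_entries


# Three entries copied from optiq_metadata.json; the real file has many more.
sample = {
    "per_layer": {
        "lm_head": {"bits": 8, "group_size": 64},
        "model.layers.22.mlp.down_proj": {"bits": 4, "group_size": 64},
        "model.layers.22.mlp.gate_proj": {"bits": 4, "group_size": 64},
    }
}

counts = summarize_bits(sample)
print(dict(counts))
print(unweighted_mean_bits(sample))
```

To run this on the real file, load it with `json.load` and pass the resulting dict instead of `sample`. The per-entry average is not expected to equal `achieved_bpw`, which presumably weights each layer by its parameter count and may include the per-group scale and bias overhead.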

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer_config.json ADDED

	@@ -0,0 +1,17 @@

+{
+  "add_prefix_space": null,
+  "backend": "tokenizers",
+  "bos_token": "<s>",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "</s>",
+  "is_local": true,
+  "legacy": true,
+  "local_files_only": false,
+  "model_max_length": 1000000000000000019884624838656,
+  "pad_token": "</s>",
+  "sp_model_kwargs": {},
+  "spaces_between_special_tokens": false,
+  "tokenizer_class": "TokenizersBackend",
+  "unk_token": "<unk>",
+  "use_default_system_prompt": false
+}
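
The `model_max_length` value above equals `int(1e30)`, which is the placeholder Transformers writes when a tokenizer has no explicit length limit, so it should not be read as a usable context size. The sketch below shows one way to detect it. The helper `effective_max_length` and the `fallback` value of 4096 are hypothetical choices for illustration, not values taken from this repository.

```python
# int(1e30) is the "no limit set" placeholder seen in tokenizer_config.json.
NO_LIMIT_SENTINEL = int(1e30)


def effective_max_length(tokenizer_config: dict, fallback: int = 4096) -> int:
    """Return model_max_length, or `fallback` when it is missing or the sentinel."""
    value = tokenizer_config.get("model_max_length")
    if value is None or value >= NO_LIMIT_SENTINEL:
        return fallback
    return value


# Only the relevant field from the tokenizer_config.json above.
config = {"model_max_length": 1000000000000000019884624838656}

print(effective_max_length(config))
```

In practice the real context limit would come from the model's own configuration rather than a hard-coded fallback.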