Transformers · Safetensors · English

chengpingan committed cfa5c3d (verified) · Parent(s): b3878a7

Upload 4 files
README.md ADDED
---
license: mit
datasets:
- chengpingan/CoConflictQA
language:
- en
base_model:
- meta-llama/Meta-Llama-3-8B
library_name: transformers
---

# 🤖 ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation

This is the official model for **[ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation](https://arxiv.org/pdf/2502.15543)**.

We investigate the internal mechanisms behind unfaithful generation and identify a subset of **mid-to-deep FFNs (70%–90% relative depth)** that are disproportionately activated in such cases. Building on this insight, we propose Parametric Knowledge Muting through FFN Suppression (**ParamMute**), a framework that improves contextual faithfulness by suppressing the activation of these unfaithfulness-associated FFNs and calibrating the model toward retrieved knowledge. Experimental results on CoConflictQA and ConFiQA show that ParamMute significantly reduces knowledge conflicts and improves context fidelity.

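To make the depth band concrete, here is a small illustrative sketch (my own arithmetic, not the authors' code): it lists which decoder layers of a 32-layer model such as LLaMA3-8B fall in the 70%–90% relative-depth range, assuming layer `i` (0-indexed) sits at relative depth `(i + 1) / L`. The exact indexing convention used in the paper may differ.

```python
# Hypothetical sketch: map the paper's 70%-90% relative-depth band to concrete
# layer indices, assuming 0-indexed layer i has relative depth (i + 1) / L.
def ffn_band(num_layers: int, lo: float = 0.7, hi: float = 0.9) -> list[int]:
    return [i for i in range(num_layers) if lo <= (i + 1) / num_layers <= hi]

# For a 32-layer model (LLaMA3-8B) this selects layers 22 through 27.
print(ffn_band(32))  # [22, 23, 24, 25, 26, 27]
```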
---

## 📚 Paper
For a detailed explanation of the methodology and experiments, please refer to our paper:
[**ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation**](https://arxiv.org/abs/2502.15543)

---

## 📊 Reproduce the Results
To reproduce the experiments and benchmarks from the paper, follow the instructions in the official GitHub repository:
[👉 GitHub: OpenBMB/ParamMute](https://github.com/OpenBMB/ParamMute)

## 📝 Model Details
- Model Name: ParamMute-8B-KTO
- Architecture: LLaMA3-8B-Instruct fine-tuned with KTO
- Training Data: [CoConflictQA](https://huggingface.co/datasets/chengpingan/CoConflictQA) dataset
- Tasks: Knowledge-Augmented Generation, Contextual Faithfulness Evaluation

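For readers unfamiliar with KTO (Kahneman-Tversky Optimization, Ethayarajh et al., 2024), the sketch below shows the schematic shape of its per-example objective; it is not the ParamMute training code, and the names `kto_loss`, `log_ratio`, and `ref_point` are my own. Desirable outputs are pushed to have a higher policy-vs-reference log-ratio than a reference point, undesirable outputs a lower one.

```python
# Schematic KTO objective (not the authors' implementation):
# log_ratio = log pi_theta(y|x) - log pi_ref(y|x); ref_point approximates the
# KL reference term; lambda_d / lambda_u weight desirable vs. undesirable data.
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def kto_loss(log_ratio: float, ref_point: float, desirable: bool,
             beta: float = 0.1, lambda_d: float = 1.0, lambda_u: float = 1.0) -> float:
    if desirable:
        # low loss when the policy already prefers the desirable answer
        return lambda_d * (1.0 - sigmoid(beta * (log_ratio - ref_point)))
    # low loss when the policy already disprefers the undesirable answer
    return lambda_u * (1.0 - sigmoid(beta * (ref_point - log_ratio)))

# sanity check: a desirable answer the policy prefers is penalized less
assert kto_loss(2.0, 0.0, True) < kto_loss(-2.0, 0.0, True)
```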
## 🔖 Citation
If you use ParamMute in your work, please consider citing our paper:
```
@misc{huang2025parammutesuppressingknowledgecriticalffns,
      title={ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation},
      author={Pengcheng Huang and Zhenghao Liu and Yukun Yan and Haiyan Zhao and Xiaoyuan Yi and Hao Chen and Zhiyuan Liu and Maosong Sun and Tong Xiao and Ge Yu and Chenyan Xiong},
      year={2025},
      eprint={2502.15543},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.15543},
}
```
adapter_config.json ADDED
{
  "alpha_pattern": {},
  "auto_mapping": {
    "base_model_class": "LlamaForInputContrastivew_act_inhibit",
    "parent_library": "transformers.models.llama.modeling_llama"
  },
  "base_model_name_or_path": "Models/llama3-8b-instruct",
  "bias": "none",
  "eva_config": null,
  "exclude_modules": null,
  "fan_in_fan_out": false,
  "inference_mode": true,
  "init_lora_weights": true,
  "layer_replication": null,
  "layers_pattern": null,
  "layers_to_transform": null,
  "loftq_config": {},
  "lora_alpha": 64,
  "lora_bias": false,
  "lora_dropout": 0.0,
  "megatron_config": null,
  "megatron_core": "megatron.core",
  "modules_to_save": null,
  "peft_type": "LORA",
  "r": 64,
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
    "q_proj",
    "k_proj",
    "v_proj",
    "down_proj",
    "up_proj",
    "o_proj",
    "gate_proj"
  ],
  "task_type": null,
  "use_dora": false,
  "use_rslora": false
}
adapter_model.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:403ffcdc5d9ffef165bf4133ce8c94eb7d8341eba696bd15e41a5923dc33f2f1
size 671149168
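As a back-of-the-envelope consistency check (my own arithmetic, assuming the public LLaMA3-8B dimensions: hidden size 4096, MLP size 14336, 32 layers, grouped-query attention giving k/v projections an output dimension of 1024): a rank-64 LoRA over all seven target projections from adapter_config.json yields a parameter count whose fp32 storage lands very close to the 671,149,168-byte adapter_model.safetensors above.

```python
# Estimate the LoRA adapter size implied by adapter_config.json (r=64, seven
# target modules) under assumed LLaMA3-8B shapes; each LoRA pair adds
# A (r x in_features) plus B (out_features x r) parameters.
R = 64
HIDDEN, MLP, LAYERS, KV = 4096, 14336, 32, 1024

shapes = {  # (in_features, out_features) per target module
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV),
    "v_proj": (HIDDEN, KV),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}
params = LAYERS * sum(R * (i + o) for i, o in shapes.values())
print(params)      # 167772160 adapter parameters
print(params * 4)  # 671088640 bytes in fp32, close to the 671149168-byte file
```

The small remainder over 671,088,640 bytes is consistent with safetensors header metadata.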
training_args.bin ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:614039b6ad6a6fd2c3c20bf02f849bc69de0bc9d00d2a023e716c2b33803d119
size 5880