# Structural FFN Decomposition Guides Cross-Model Compression and Quantization

Artifacts for the paper by Yeonseong Cynn (River Lab, May 2026).

## Summary

Decomposes transformer FFN layers into structural (format-preserving) and classification-relevant components across BERT and GPT-2.

Key findings:

- Early-layer FFN is 90-200x more structural than classification-relevant; late layers approach 1:1
- **Structural pruning**: head + FFN neuron removal with layer-wise retraining achieves 19.1% parameter reduction on BERT (SST-2) and 9.1% on GPT-2 with no accuracy loss
- **Neuron pruning**: removing 8% rarely-active FFN neurons *improves* BERT accuracy by 0.3%
- **Mixed-precision quantization**: INT4 on structurally-dominant layers (L1-L3) with STE retraining recovers to -2.1% loss

## Files

### Weights

- `bert_sst2_int4_ste.pt`: BERT SST-2 with L1-L3 INT4 quantization + STE retraining. Standard BERT state_dict, loadable directly. Accuracy: 90.1% (original FP32: 92.4%).

### Results: BERT (`results/bert/`)

- `bert_structural_prune.json`: Per-layer structural pruning results (head/FFN reduction, accuracy)
- `bert_sst2_all_prune.json`: All-layer simultaneous FFN pruning results
- `bert_l8_prune_results.json`: L8 FFN correction + pruning (multi-seed)
- `bert_quantize_results.json`: INT4/INT8 post-training quantization results
- `bert_quantize_retrain.json`: INT4 STE retraining results

### Results: GPT-2 (`results/gpt2/`)

- `gpt2_structural_prune.json`: Per-layer structural pruning (head + FFN)
- `gpt2_each_layer_prune.json`: Individual layer compression results
- `gpt2_prune_validate.json`: Pruning validation (PPL, accuracy)

### Figures

- `figures/fig1_ratio.png`: FFN dual role ratio: BERT vs GPT-2 (log scale)
- `figures/fig2_compression.png`: Per-layer compression rates comparison
- `figures/fig3_pruning.png`: BERT SST-2 FFN neuron pruning curve
- `figures/fig4_quantization.png`: INT4 quantization results (PTQ vs STE)

## Base Models

- BERT: [textattack/bert-base-uncased-SST-2](https://huggingface.co/textattack/bert-base-uncased-SST-2)
- GPT-2: [gpt2](https://huggingface.co/gpt2) (124M, pre-trained)

## License

MIT
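
## Appendix: INT4 quantization sketch

The Summary mentions INT4 quantization with STE retraining. As a rough illustration of what the forward pass of such a scheme typically computes, the sketch below quantizes a list of weights to signed 4-bit integers and maps them back to floats ("fake quantization"). The function name `fake_quantize_int4`, the symmetric per-tensor scaling, and the `[-8, 7]` range are assumptions made for illustration; they are not taken from the paper's code. A straight-through estimator would additionally treat the rounding step as the identity during backpropagation, which this pure-Python sketch does not cover.

```python
def fake_quantize_int4(weights):
    """Quantize floats to signed INT4 and dequantize them again.

    Assumption for this sketch: symmetric per-tensor quantization,
    i.e. one scale for the whole list and a zero point of 0.
    Returns (dequantized_weights, scale).
    """
    qmin, qmax = -8, 7  # value range of a signed 4-bit integer
    max_abs = max((abs(w) for w in weights), default=0.0)
    if max_abs == 0.0:
        # All-zero (or empty) input: nothing to scale.
        return [0.0 for _ in weights], 1.0

    # Map the largest magnitude onto the largest positive INT4 level.
    scale = max_abs / qmax
    dequantized = []
    for w in weights:
        q = round(w / scale)           # nearest integer level
        q = max(qmin, min(qmax, q))    # clamp into the INT4 range
        dequantized.append(q * scale)  # back to float for the forward pass
    return dequantized, scale


if __name__ == "__main__":
    values, scale = fake_quantize_int4([7.0, -3.2, 0.4, 0.0])
    print(values, scale)
```

With this scheme every weight lands on one of at most 16 levels, and the rounding error per weight is bounded by half the scale. Per-channel scales or asymmetric ranges are common alternatives and may differ from what was used to produce `bert_sst2_int4_ste.pt`.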