	---
	license: apache-2.0
	base_model: Qwen/Qwen3.5-4B
	tags: [lora, federated-distillation, code, golang, fdn]
	---

	# FDN Adapters — Federated Distillation Network (subject: programming/go/concurrency)

	LoRA adapters produced by FDN's federated distillation loop: **many independent
	trainers distill open teachers into one student per subject, and only verified
	improvement is accepted.** Every training sample must pass machine verification
	(`go build && go test -race`) before it can be used for training. Every adapter
	is graded on held-out exams, with contamination excluded structurally by
	solution-class splits. Teachers are restricted to open models whose licenses
	permit distillation (DeepSeek V4 Flash, MIT); the restriction is enforced in
	code, and the license texts are verified.
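
The verification gate described above can be sketched as a small helper. This is a minimal illustration, not FDN's actual implementation: the function name `sample_is_verified`, the assumption that each sample is a directory holding a Go module, the `./...` package pattern, and the timeout are all hypothetical. Only the two commands come from the card.

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence

# Assumption: the "./..." package pattern; the card only names the two commands.
COMMANDS: Sequence[Sequence[str]] = (
    ("go", "build", "./..."),
    ("go", "test", "-race", "./..."),
)


def sample_is_verified(
    sample_dir: Path,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    timeout_s: int = 300,
) -> bool:
    """Return True only if every command exits 0, mirroring `a && b`."""
    for cmd in COMMANDS:
        try:
            result = run(list(cmd), cwd=sample_dir, capture_output=True, timeout=timeout_s)
        except (OSError, subprocess.TimeoutExpired):
            return False  # a missing toolchain or a hung test counts as a failure
        if result.returncode != 0:
            return False  # short-circuit: later commands are skipped
    return True
```

The `run` parameter exists only so the gate can be exercised without a Go toolchain; in normal use the default `subprocess.run` is enough.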

	Code, methodology and all raw evidence: https://github.com/chainswarm/fdn-subnet

	## Student progression (held-out, contamination-free, bucket/200)

	| Round | Student | Score | Notes |
	|---|---|---|---|
	| R0 | base Qwen3.5-4B | 36/200 (7 families) | concurrency-bug fixing baseline |
	| R1 | `r1/` adapter (1017 verified samples) | 200/200 (7 families) | full miner loop, ≈$1.50 |
	| R2 | `r2/`: 3 independent trainers | every unsolved target 0→200 (3 new harder families) | composed router-student: no regression anywhere |
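
The "contamination-free" qualifier rests on the solution-class splits mentioned in the introduction. One common way to get that property is a group-wise split, where whole classes rather than individual samples are assigned to one side. The sketch below assumes each sample carries a solution-class label; the function name, the `(sample_id, solution_class)` pair layout, and the hash-based assignment are hypothetical and may differ from what FDN does.

```python
import hashlib
from typing import Iterable


def split_by_class(
    samples: Iterable[tuple[str, str]], held_out_pct: int = 20
) -> tuple[list[str], list[str]]:
    """Assign whole solution classes to train or held-out.

    `samples` yields (sample_id, solution_class) pairs. Hashing the class,
    not the sample, keeps every member of a class on the same side.
    """
    train: list[str] = []
    held_out: list[str] = []
    for sample_id, solution_class in samples:
        digest = hashlib.sha256(solution_class.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") % 100
        (held_out if bucket < held_out_pct else train).append(sample_id)
    return train, held_out
```

Because the assignment depends only on the class label, no solution class can appear on both sides, and independent trainers would reproduce the same split without coordinating.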

	## Round R2 — federation in action

	Three trainers ran on separate cloud machines against the same public
	curriculum, each building its own verified dataset (194/169/197 samples).
	Miners a and b were accepted at full reward. Miner c deliberately duplicated
	miner a's assignment: the MinHash manifest overlap (41%) was detected and
	miner c's reward was discounted by 30% (novelty economics).
	Full verdicts: `r2/round-report-v2.json`; per-miner dataset provenance:
	`r2/miner-*/manifest.json`.
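
For readers unfamiliar with MinHash, the overlap check can be illustrated with a standard-library sketch that estimates the Jaccard similarity of two sets of manifest entries. The card does not say what a manifest entry is, whether overlap means Jaccard similarity or containment, or how overlap maps to a discount, so the function names, the 128 hash functions, and the use of salted BLAKE2b below are all assumptions.

```python
import hashlib


def minhash_signature(items: set[str], num_perm: int = 128) -> list[int]:
    """Keep the minimum hash value per seeded hash function."""
    if not items:
        raise ValueError("items must be non-empty")
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(4, "big")  # a different salt acts as a different hash function
        signature.append(
            min(
                int.from_bytes(
                    hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                    "big",
                )
                for item in items
            )
        )
    return signature


def estimated_jaccard(sig_a: list[int], sig_b: list[int]) -> float:
    """Fraction of positions where the two signatures agree."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

Only the fixed-size signatures need to be compared, which is why a manifest can reveal duplication without exchanging the full datasets.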

	## Usage

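	```python
	from peft import PeftModel
	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Load the base model and its tokenizer, then attach the R1 adapter.
	tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-4B")
	base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3.5-4B", dtype="bfloat16", device_map="auto")
	model = PeftModel.from_pretrained(base, "aphex5/fdn-adapters", subfolder="r1")
	model.eval()  # inference only
	```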

	Adapters compose router-style, one per subject family. Use greedy decoding for
	reproducible evals; the prompt format (go-fix-v1) is defined in
	`fdn/eval/harness/prompts.py`.

	## External anchor: HumanEval (pass@1, greedy)

	| variant | pass@1 |
	|---|---|
	| base Qwen3.5-4B | 81.1% |
	| R1 student | 79.9% |

	Subject mastery (36→200/200) kept general coding ability within binomial noise.
	A TIES-merged snapshot was also built, but FDN's own Pareto gate rejected it
	because merge interference destroyed one of the merged-in skills. Merges that
	fail the anchor are not promoted, so router composition remains the student.
	Quality only ratchets up.
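
The "within binomial noise" claim can be checked with a short calculation. HumanEval has 164 problems, so the two pass@1 figures correspond to 133 and 131 solved problems. The sketch below treats the two runs as independent binomial samples, which is a simplification (both runs share the same problems), and the helper names are illustrative.

```python
import math

N_PROBLEMS = 164  # number of problems in HumanEval


def solved(pass_at_1_pct: float, n: int = N_PROBLEMS) -> int:
    """Recover the solved-problem count from a rounded percentage."""
    return round(pass_at_1_pct / 100 * n)


def standard_error(k: int, n: int = N_PROBLEMS) -> float:
    """Binomial standard error of the pass rate k/n."""
    p = k / n
    return math.sqrt(p * (1 - p) / n)


base_solved = solved(81.1)  # 133
r1_solved = solved(79.9)    # 131

gap_pts = (base_solved - r1_solved) / N_PROBLEMS * 100
# Simplification: treat the two runs as independent samples.
se_diff_pts = math.hypot(standard_error(base_solved), standard_error(r1_solved)) * 100

print(f"gap = {gap_pts:.2f} pts, SE of difference = {se_diff_pts:.2f} pts")
```

The gap is two problems, about 1.2 percentage points, against a standard error of roughly 4 points, so the difference is well inside sampling noise under these assumptions.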

	## Round R3 — second subject (`programming/go/errors`), the report-card loop

	Base calibration showed 3 of 4 error-handling families already mastered;
	`nil_map_write` (20%) was the one weakness, so all three independent trainers
	targeted it, exactly as the weakness→multiplier loop directs. All three took
	it 60→200 (identical quality), and rewards priced their originality:
	**289 (first, novel) → 198 (−31%, 40% dataset overlap) → 91 (−69%, 91% overlap,
	near-duplicate)**. Same machinery, new subject, zero code changes.
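
The percentages in the reward chain are each reward's change relative to the first, novel submission. The following sketch reproduces that arithmetic from the three reward values; it does not model how FDN derives a reward from an overlap figure, which the card does not specify, and the function name is illustrative.

```python
def change_vs_first_pct(reward: float, first_reward: float) -> float:
    """Percentage change of a reward relative to the first (novel) submission."""
    return (reward - first_reward) / first_reward * 100


rewards = [289, 198, 91]  # values from the card, in the order listed
for reward in rewards:
    print(reward, f"{change_vs_first_pct(reward, rewards[0]):+.0f}%")
```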