Exa-distilled model: nDCG=0.9999, HN-AUC=0.9731, trained on 93K pairs 555d737 verified siddhm11 committed on 13 days ago
FINAL production model: intent-based labels, nDCG=0.8725, HN-AUC=0.8325 c8a8dc3 verified siddhm11 committed on 14 days ago
Add final extraction script (tested, works, needs stable compute for full run) 36ec200 verified siddhm11 committed on 15 days ago
Add V6 deployment guide: production-ready model, zero code changes needed bcd1b92 verified siddhm11 committed on 15 days ago
Add V6 PRODUCTION MODEL: 37-feature schema, drops into app, Hard Neg AUC=0.758 f94f80c verified siddhm11 committed on 15 days ago
Add V6: drop-in replacement, 37 features, cross-survey labels, HN-AUC=0.964 a3f9c6f verified siddhm11 committed on 15 days ago
Update CHANGELOG: add V4 and V5 results. V5 achieves Hard Neg AUC=0.837 9856611 verified siddhm11 committed on 19 days ago
Add V5: graph+metadata, Hard Neg AUC=0.837 (+7.6% over V4) a4dc0e5 verified siddhm11 committed on 19 days ago
Add V4 LightGBM: 25 graph features, hard_neg_auc=0.778 (+5.4% over V3) 4808c83 verified siddhm11 committed on 20 days ago
Update CHANGELOG: add V3 model with new eval framework e0169cb verified siddhm11 committed on 20 days ago
Add V3 eval metrics: nDCG@10=0.9494, hard_neg_auc=0.7380 6f159f9 verified siddhm11 committed on 20 days ago
Add V3 LightGBM: trained on cross-survey authority labels, hard_neg_auc=0.738 247d191 verified siddhm11 committed on 20 days ago
Add eval v2 design document: explains why old eval was weak and how new one works 4c07339 verified siddhm11 committed on 20 days ago
Add eval v2: evaluation script with proper metrics for survey reading lists 38004cb verified siddhm11 committed on 20 days ago
Add eval v2: extract survey paper reading lists from unarXive 2024 2628be1 verified siddhm11 committed on 20 days ago
docs: comprehensive README with production results and integration guide 0ef3e5f verified siddhm11 committed on Apr 27