# Model Card for aedupuga/multioutput-regression-models
### Model Description
This model card describes multi-output regression models trained on the aedupuga/2025-scaffold-strucutres dataset. The models predict structural properties of DNA sequences from the one-hot encoded sequence and summary features such as length and GC content.
- **Model developed by:** Anuhya Edupuganti
- **Model type:** Multi-output regression models (Ridge, Lasso, Elastic Net, Gradient Boosting, Hist Gradient Boosting, LGBM, and MLP regressors)
### Model Sources
- **Dataset:** https://huggingface.co/datasets/aedupuga/2025-scaffold-strucutres
### Direct Use
- These models can be used to predict structural properties of new DNA sequences. The inputs are the one-hot encoded sequence, length_bp, GC_content, and AT_content, in the same format as the training data.
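
As an illustration of the inputs listed above, here is a minimal sketch of how they could be derived from a raw sequence. The helper names (`one_hot_encode`, `sequence_features`), the `ACGT` column order, and the use of fractions rather than percentages for GC_content and AT_content are assumptions made for this example; the dataset's own preprocessing may differ and should take precedence.

```python
from typing import Dict, List

BASES = "ACGT"  # assumed column order; the training pipeline may use another


def one_hot_encode(sequence: str) -> List[List[int]]:
    """Return one row per base with a 1 in the column of that base.

    Characters outside ACGT (for example N) become an all-zero row.
    """
    return [[int(base == b) for b in BASES] for base in sequence.upper()]


def sequence_features(sequence: str) -> Dict[str, float]:
    """Compute length_bp, GC_content, and AT_content for one sequence."""
    seq = sequence.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("sequence must not be empty")
    gc_count = seq.count("G") + seq.count("C")
    at_count = seq.count("A") + seq.count("T")
    return {
        "length_bp": length,
        # Assumed to be fractions in [0, 1]; the dataset may store percentages.
        "GC_content": gc_count / length,
        "AT_content": at_count / length,
    }


example = "ATGCGC"
encoded = one_hot_encode(example)
features = sequence_features(example)
```

Before calling a trained model, the one-hot rows would still need to be flattened or padded to whatever fixed shape was used during training, which this card does not specify.
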
## Bias, Risks, and Limitations
- The models are trained on a specific dataset and may not generalize well to sequences with significantly different characteristics.
## Training Data
The models were trained on the original split of the aedupuga/2025-scaffold-strucutres dataset. It contains input features such as sequence, length_bp, and GC_content, and the target variables mfe_energy, num_pairs, stem_len_mean, num_stems, num_hairpins, and num_internal_loops.
## Evaluation Data
The models were evaluated on a test set using Mean Absolute Error (MAE) per target variable, overall Mean Squared Error (MSE), and overall R2 score. Training and prediction times were also recorded. The results are below:

MAE per target (values rounded):

|Model|mfe_energy|num_pairs|stem_len_mean|num_stems|num_hairpins|num_internal_loops|
|---|---|---|---|---|---|---|
|Elastic Net Regression|52.246|26.310|0.1252|11.825|6.363|10.423|
|Gradient Boosting Regressor|93.860|62.129|0.1196|19.522|8.171|13.709|
|Hist Gradient Boosting Regressor|92.795|119.051|0.0946|38.938|14.539|17.869|
|LGBM Regressor|101.993|118.431|0.0983|40.144|14.649|17.487|
|Ridge Regression|53.307|25.654|0.0840|11.394|5.680|9.261|
|Lasso Regression|67.277|31.487|0.1252|13.159|6.855|11.139|
|MLP Regressor|113.600|76.111|1.7845|19.920|9.226|13.795|

Overall metrics and timing (values rounded):

|Model|Overall MSE|Overall R2|Training Time (s)|Prediction Time (s)|
|---|---|---|---|---|
|Elastic Net Regression|1106.224|0.8269|37.895|0.134|
|Gradient Boosting Regressor|8056.466|0.6355|1064.145|0.144|
|Hist Gradient Boosting Regressor|22401.159|0.8354|2276.772|0.056|
|LGBM Regressor|23866.947|0.8261|110.615|2.587|
|Ridge Regression|1260.762|0.9157|7.064|0.123|
|Lasso Regression|1823.627|0.8248|51.869|0.127|
|MLP Regressor|5507.495|-34.3923|68.656|0.136|

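
The metrics reported above can be sketched in plain Python as follows. This card does not state how the overall MSE and R2 were aggregated across targets, so the sketch assumes a uniform average over targets, which is the default behaviour of scikit-learn's `mean_squared_error` and `r2_score`. The function names and the toy data are hypothetical and only illustrate the calculation.

```python
from typing import Dict, Sequence

Matrix = Sequence[Sequence[float]]  # rows are samples, columns are targets


def mae_per_target(y_true: Matrix, y_pred: Matrix,
                   names: Sequence[str]) -> Dict[str, float]:
    """Mean absolute error computed separately for each target column."""
    n = len(y_true)
    return {
        name: sum(abs(t[j] - p[j]) for t, p in zip(y_true, y_pred)) / n
        for j, name in enumerate(names)
    }


def overall_mse(y_true: Matrix, y_pred: Matrix) -> float:
    """Squared error averaged over all samples and all targets."""
    total = sum(
        (t_j - p_j) ** 2
        for t, p in zip(y_true, y_pred)
        for t_j, p_j in zip(t, p)
    )
    return total / (len(y_true) * len(y_true[0]))


def overall_r2(y_true: Matrix, y_pred: Matrix) -> float:
    """R2 computed per target, then averaged with equal weights (assumed)."""
    n_targets = len(y_true[0])
    scores = []
    for j in range(n_targets):
        col_true = [row[j] for row in y_true]
        col_pred = [row[j] for row in y_pred]
        mean = sum(col_true) / len(col_true)
        ss_res = sum((t - p) ** 2 for t, p in zip(col_true, col_pred))
        ss_tot = sum((t - mean) ** 2 for t in col_true)
        if ss_tot == 0:
            raise ValueError("R2 is undefined for a constant target column")
        scores.append(1 - ss_res / ss_tot)
    return sum(scores) / n_targets


# Toy data with two hypothetical targets, not taken from the dataset.
y_true = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
y_pred = [[1.0, 12.0], [2.0, 20.0], [4.0, 30.0]]

mae = mae_per_target(y_true, y_pred, ["target_a", "target_b"])
mse = overall_mse(y_true, y_pred)
r2 = overall_r2(y_true, y_pred)
```

Under this assumed aggregation, the overall MSE is dominated by targets with large numeric ranges, while the averaged R2 weights every target equally, so the two metrics can rank models differently.
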
## Model Card Contact
Anuhya Edupuganti (Carnegie Mellon University), aedupuga@andrew.cmu.edu