---
language: rna
library_name: multimolecule
license: agpl-3.0
pipeline: rna-secondary-structure
pipeline_tag: other
tags:
- Biology
- RNA
widget:
- example_title: microRNA 21
output:
text: '......................'
pipeline_tag: rna-secondary-structure
sequence_type: ncRNA
task: rna-secondary-structure
text: UAGCUUAUCAGACUGAUGUUGA
- example_title: microRNA 146a
output:
text: '......................'
pipeline_tag: rna-secondary-structure
sequence_type: ncRNA
task: rna-secondary-structure
text: UGAGAACUGAAUUCCAUGGGUU
- example_title: microRNA 155
output:
text: '........................'
pipeline_tag: rna-secondary-structure
sequence_type: ncRNA
task: rna-secondary-structure
text: UUAAUGCUAAUCGUGAUAGGGGUU
- example_title: RNA component of mitochondrial RNA processing endoribonuclease
output:
text: '..............................([(..........(...{..([[.....<<A{<{A<BB<A<B{CCAD<DDAABEDCEA)EDFBG])HG](ECCF[H}DCGG}HFDGH)IJK]>LMNabABE>FGLOJ))}](a(c(>[[MMbcdNaO>a}be{N{Oad{OLcaO>{cKad)](>[>bfgc)e}f)g(fgh}}gdiejdjh)kkcelb)f]}balgmmngdel.]nhom.o.oo.n.odh............................'
pipeline_tag: rna-secondary-structure
sequence_type: ncRNA
task: rna-secondary-structure
text: GGUUCGUGCUGAAGGCCUGUAUCCUAGGCUACACACUGAGGACUCUGUUCCUCCCCUUUCCGCCUAGGGGAAAGUCCCCGGACCUCGGGCAGAGAGUGCCACGUGCAUACGCACGUAGACAUUCCCCGCUUCCCACUCCAAAGUCCGCCAAGAAGCGUAUCCCGCUGAGCGGCGUGGCGCGGGGGCGUCAUCCGUCAGCUCCCUCUAGUUACGCAGGCAGUGCGUGUCCGCGCACCAACCACACGGGGCUCAUUCUCAGCGCGGCUGUAAAAAAAAA
- example_title: 7SK small nuclear RNA
output:
text: '.....................([{{.<....A..B.A.B..A....A..C......C.C...D..DEDEFECDDE)]}}(([>a{aa[<<EFbGa{A[GHA[bB)(B<<IHcHAB]AFBEI]FFFJBdGGDEI]HIGIJKFFHGIHKGHIK.HIK)}II>LHaMJKL][b[bcCK}d{{MNOPJ([{[BLCe))(]O.]>a><fd}ghefh(if(hg(bb<]gh}aa(i[)hgc[efAdAbOPei)]A)cjLjgc>bkihfdfce)ie{fikgjk)d]h>}}heaf()llkl]km>iianoaopoklgmpijhi..................'
pipeline_tag: rna-secondary-structure
sequence_type: ncRNA
task: rna-secondary-structure
text: GGAUGUGAGGGCGAUCUGGCUGCGACAUCUGUCACCCCAUUGAUCGCCAGGGUUGAUUCGGCUGAUCUGGCUGGCUAGGCGGGUGUCCCCUUCCUCCCUCACCGCUCCAUGUGCGUCCCUCCCGAAGCUGCGCGCUCGGUCGAAGAGGACGACCAUCCCCGAUAGAGGAGGACCGGUCUUCGGUCAAGGGUAUACGAGUAGCUGCGCUCCCCUGCUAGAACCUCCAAACAAGCUCUCAAGGUCCAUUUGUAGGAGAACGUAGGGUAGUCAAGCUUCCAAGACUCCAGACACAUCCAAAUGAGGCGCUGCAUGUGGCAGUCUGCCUUUCUUUU
- example_title: telomerase RNA component
output:
text: '.............................................((([.{<A[.B[{B<<([AB{{[((<CAA<DDB[[EBE(CF[{(GHAG<)HDBIECJFDE(EBJG)AI{EBJHKECBA)IFHIGKJDLJAK]J}HMFGCE)K}KLIM]NFEOJCEHKIMB)JCKDE)F]>IJaKaD]MJNIK]FEGH}LMN))b}(>KaNObMbHLNOPNcIaOcPbcOb]PdPMQ[dRc[BecMSeOdeR)>R(feTUaKNgMUd}effe]befRbP]hV{gh>{]<bShTTi)UfPeViiVjackgRkUdjalmkeWmnmheA)]l}i(}bi[k[Xmnojfg}>chpdnm{pohka)ipj(qek>rjjnkr]Cnstop>aulf)mro]i}gmhmjgrirobpisudbecjjnkkktvljtwvuump()xvk..noh..................'
pipeline_tag: rna-secondary-structure
sequence_type: ncRNA
task: rna-secondary-structure
text: GGGUUGCGGAGGGUGGGCCUGGGAGGGGUGGUGGCCAUUUUUUGUCUAACCCUAACUGAGAAGGGCGUAGGCGCCGUGCUUUUGCUCCCCGCGCGCUGUUUUUCUCGCUGACUUUCAGCGGGCGGAAAAGCCUCGGCCUGCCGCCUUCCACCGUUCAUUCUAGAGCAAACAAAAAAUGUCAGCUGCUGGCCCGUUCGCCCCUCCCGGGGACCUGCGGCGGGUCGCCUGCCCAGCCCCCGAACCCCGCCUGGAGGCCGCGGUCGGCCCGGGGCUUCUCCGGAGGCACCCACUGCCACCGCGAAGAGUUGGGCUCUGUCAGCCGCGGGUCUCUCGGGGGCGAGGGCGAGGUUCAGGCCUUUCAGGCCGCAGGAAGAGGAACGGAGCGAGUCCCCGCGCGCGGCGCGAUUCCCUGAGCUGUGGGACGUGCACCCAGGACUCGGCUCACACAUGC
- example_title: vault RNA 2-1
output:
text: '..........................([{<AB..B..C.........)]}>abb([c()[)(({][<A)})(]])>...a............................'
pipeline_tag: rna-secondary-structure
sequence_type: ncRNA
task: rna-secondary-structure
text: CGGGUCGGAGUUAGCUCAAGCGGUUACCUCCUCAUGCCGGACUUUCUAUCUGUCCAUCUCUGUGCUGGGGUUCGAGACCCGCGGGUGCUUACUGACCCUUUUAUGCAA
- example_title: brain cytoplasmic RNA 1
output:
text: '............................(((...[...........................................{<<<{A{AA{B[{{<<BA)))]((].....}.}..............>....)..)}>}.....}a}>>.a>ba.b....a.........................................'
pipeline_tag: rna-secondary-structure
sequence_type: ncRNA
task: rna-secondary-structure
text: GGCCGGGCGCGGUGGCUCACGCCUGUAAUCCCAGCUCUCAGGGAGGCUAAGAGGCGGGAGGAUAGCUUGAGCCCAGGAGUUCGAGACCUGCCUGGGCAAUAUAGCGAGACCCCGUUCUCCAGAAAAAGGAAAAAAAAAAACAAAAGACAAAAAAAAAAUAAGCGUAACUUCCCUCAAAGCAACAACCCCCCCCCCCCUUU
- example_title: HIV-1 TAR-WT
output:
text: '.........................................................'
pipeline_tag: rna-secondary-structure
sequence_type: ncRNA
task: rna-secondary-structure
text: GGUCUCUCUGGUUAGACCAGAUCUGAGCCUGGGAGCUCUCUGGCUAACUAGGGAACC
- example_title: prion protein (Kanno blood group)
output:
text: '..................................................................'
pipeline_tag: rna-secondary-structure
sequence_type: mRNA
task: rna-secondary-structure
text: AUGGCGAACCUUGGCUGCUGGAUGCUGGUUCUCUUUGUGGCCACAUGGAGUGACCUGGGCCUCUGC
- example_title: interleukin 10
output:
text: '......................................................'
pipeline_tag: rna-secondary-structure
sequence_type: mRNA
task: rna-secondary-structure
text: AUGCACAGCUCAGCACUGCUCUGUUGCCUGGUCCUCCUGACUGGGGUGAGGGCC
- example_title: Zaire ebolavirus
output:
text: '..............................(........................[[[.{[<.{[[{{[[<{{AAA<{A{<[B{<<A[C)((DD(ED(ED(EB((AB]CC])EDDDFEF]GEH}I]F}GGH>aH]}IHJJKLK}>].[LLLa>Lb)LML}NNc)OdMO..OP)QedNQ)O.>Pfg).d}RaSghhi>ITbjchjJb(e]cdBaf]kl}deCl]dB)}lgl]m)lfn]lo()am}enk>ohoppbiqbeqo(elcnrastd)([j)]i.()..........................'
pipeline_tag: rna-secondary-structure
sequence_type: mRNA
task: rna-secondary-structure
text: AAUGUUCAAACACUUUGUGAAGCUCUGUUAGCUGAUGGUCUUGCUAAAGCAUUUCCUAGCAAUAUGAUGGUAGUCACAGAGCGUGAGCAAAAAGAAAGCUUAUUGCAUCAAGCAUCAUGGCACCACACAAGUGAUGAUUUUGGUGAGCAUGCCACAGUUAGAGGGAGUAGCUUUGUAACUGAUUUAGAGAAAUACAAUCUUGCAUUUAGAUAUGAGUUUACAGCACCUUUUAUAGAAUAUUGUAACCGUUGCUAUGGUGUUAAGAAUGUUUUUAAUUGGAUGCAUUAUACAAUCCCACAGUGUUAU
- example_title: SARS coronavirus
output:
text: '..................................................((..[.....{......{(<([[{.[{[[A[{{<.A(ABA(AB{([(CCC.DE[A{[<F.{{{<......EEF.........GBA.CD..BEGC.D.ABB(C[D{<.)A].}>HECFGHHA<HIF.C.HI.[<DF<AIJAKHKL<KE.JL)M(EGILBAIHKMJ]KN]FLCE}>)K)F>JGMKOIHGJaF]N}JNLHMG>MOPI>QabL><RM}JaKNLOacN}aLabcb)OccadP])ebef>B>gCdeQ)]RfgSRhTiSCg(eUfjhfDSTf}fEgk}ghi>GGal<jlhai})aFaU]ajjAh]kUl]ljhki]k}mni]j}b)comcmcbddp>cqe))](rkdnels(sbrcdfjom}cncke}koteliaefspl>gggfbui)gh)hmunnhaoq()([m)]rtklu..............................'
pipeline_tag: rna-secondary-structure
sequence_type: mRNA
task: rna-secondary-structure
text: AUGUUUAUUUUCUUAUUAUUUCUUACUCUCACUAGUGGUAGUGACCUUGACCGGUGCACCACUUUUGAUGAUGUUCAAGCUCCUAAUUACACUCAACAUACUUCAUCUAUGAGGGGGGUUUACUAUCCUGAUGAAAUUUUUAGAUCAGACACUCUUUAUUUAACUCAGGAUUUAUUUCUUCCAUUUUAUUCUAAUGUUACAGGGUUUCAUACUAUUAAUCAUACGUUUGACAACCCUGUCAUACCUUUUAAGGAUGGUAUUUAUUUUGCUGCCACAGAGAAAUCAAAUGUUGUCCGUGGUUGGGUUUUUGGUUCUACCAUGAACAACAAGUCACAGUCGGUGAUUAUUAUUAACAAUUCUACUAAUGUUGUUAUACGAGCAUGUAACUUUGAAUUGUGUGACAACCCUUUCUUUGCUGUUUCUAAACCCAUGGGUACACAGACACAUACUAUGAUAUUCGAUAAUGCAUUUAAAUGCACUUUCGAGUACAUAUCU
- example_title: insulin
output:
text: '.....................([{...<A...<.....BA.BB.......B.C.BDBDAADCDEABFECDFDEGGGC)(](}([>>{<{[([{[{{<F[G<EH<EGaDaHAF)(FbFcIFBG]A]EGHH(IE[)JBJJGDE]}{I>)KcLa([I){LKaKKbadLd]eaA]M)bK}>ALMe}e].Kde)bNfLgMbNgGN>)(dcb<h(acid}(bb(hii<d<j)A}]>kd)[fCeCagOefakOlOkPg(ek}jffOfj))gmhed>}k]klmlgiab)ncgn>loomglpn()f(g(oh)c>[o())]......................'
pipeline_tag: rna-secondary-structure
sequence_type: mRNA
task: rna-secondary-structure
text: AUGGCCCUGUGGAUGCGCCUCCUGCCCCUGCUGGCGCUGCUGGCCCUCUGGGGACCUGACCCAGCCGCAGCCUUUGUGAACCAACACCUGUGCGGCUCACACCUGGUGGAAGCUCUCUACCUAGUGUGCGGGGAACGAGGCUUCUUCUACACACCCAAGACCCGCCGGGAGGCAGAGGACCUGCAGGUGGGGCAGGUGGAGCUGGGCGGGGGCCCUGGUGCAGGCAGCCUGCAGCCCUUGGCCCUGGAGGGGUCCCUGCAGAAGCGUGGCAUUGUGGAACAAUGCUGUACCAGCAUCUGCUCCCUCUACCAGCUGGAGAACUACUGCAACUAG
- example_title: cyclin dependent kinase inhibitor 2A
output:
text: '.......(.[...{{{..<.A..B...<)(](((.[..([[CDD(C[[DDCC[DAE(FAEGBBGFDCB}}FC}{GH{EH{I><I<DJAKL[GJEJFECIFJDGHFE{IGFKGKHGLKJLHKFCLLJHIIM)]KG}ME))LEM]F]})>IaJN(aL(aIJKMOAJAOKLbbPcMM[KcLdNePQL(ce>LRMNOePRf[{))MfP]SLIMJQSgJbORT]OPUfgcN}dh}])SO>{a<aV<eTg]ORVeiQjafAUbBBc)A)BAiiAjjBhfkllSgdUcTci]dReBAmUSRe)k(jm}Vih>]fl}gjfnhmc[lagd[)(o{loaf(kain{g(boebo{]h[jjd[kCaompn)plfmkmadqijliljmqrdgjbrrng)b.rk.}ssoppo}lk]s>l)}ptuctjv.qrms].]r.kimhs.kuul.tlu...vbv...........................'
pipeline_tag: rna-secondary-structure
sequence_type: mRNA
task: rna-secondary-structure
text: AUGGAGCCGGCGGCGGGGAGCAGCAUGGAGCCUUCGGCUGACUGGCUGGCCACGGCCGCGGCCCGGGGUCGGGUAGAGGAGGUGCGGGCGCUGCUGGAGGCGGGGGCGCUGCCCAACGCACCGAAUAGUUACGGUCGGAGGCCGAUCCAGGUCAUGAUGAUGGGCAGCGCCCGAGUGGCGGAGCUGCUGCUGCUCCACGGCGCGGAGCCCAACUGCGCCGACCCCGCCACUCUCACCCGACCCGUGCACGACGCUGCCCGGGAGGGCUUCCUGGACACGCUGGUGGUGCUGCACCGGGCCGGGGCGCGGCUGGACGUGCGCGAUGCCUGGGGCCGUCUGCCCGUGGACCUGGCUGAGGAGCUGGGCCAUCGCGAUGUCGCACGGUACCUGCGCGCGGCUGCGGGGGGCACCAGAGGCAGUAACCAUGCCCGCAUAGAUGCCGCGGAAGGUCCCUCAGACAUCCCCGAUUGA
- example_title: human papillomavirus type 16 E6
output:
text: '........................................(([(......{....<...A((.[{([<(AB([{(A<({[{[A(BB{[[[[CD<{({<EDDAE<A(E[)FF]GHIJC<FAGHH<B)BIBB)HCAKIDEF]G)HIJKL)]]MNOPQRSTUVWXYZ...YI(EGQMN}UFS.(BHOTHPZQOJXPVIGWKYHQW.X.>YaQQ>}TRY]{U.]R)>b)XaVJSb>Nbbcbdaee[[B..ffYg)W])ghhZg].hZ)DEi.]F}}jh.iG}j...ik..]jkl}}Im>nnamooppqnJqqrro)(b]J>pKabLqqqsccdCd][rsteee}[>sfutvf{gEwtauvg<hwbuwhx)BidfvaxMNdyz]wyy.Bjx.)xyzyAz].i)y))j]gk}..(zc>..}>e)...ea.ahil.(..b.bfmh.bin[)j]k..............................'
pipeline_tag: rna-secondary-structure
sequence_type: mRNA
task: rna-secondary-structure
text: AUGCACCAAAAGAGAACUGCAAUGUUUCAGGACCCACAGGAGCGACCCAGAAAGUUACCACAGUUAUGCACAGAGCUGCAAACAACUAUACAUGAUAUAAUAUUAGAAUGUGUGUACUGCAAGCAACAGUUACUGCGACGUGAGGUAUAUGACUUUGCUUUUCGGGAUUUAUGCAUAGUAUAUAGAGAUGGGAAUCCAUAUGCUGUAUGUGAUAAAUGUUUAAAGUUUUAUUCUAAAAUUAGUGAGUAUAGACAUUAUUGUUAUAGUUUGUAUGGAACAACAUUAGAACAGCAAUACAACAAACCGUUGUGUGAUUUGUUAAUUAGGUGUAUUAACUGUCAAAAGCCACUGUGUCCUGAAGAAAAGCAAAGACAUCUGGACAAAAAGCAAAGAUUCCAUAAUAUAAGGGGUCGGUGGACCGGUCGAUGUAUGUCUUGUUGCAGAUCAUCAAGAACACGUAGAGAAACCCAGCUGUAA
- example_title: NRAS proto-oncogene
output:
text: '..................(([({<<<{[ABAAAC.CC.DEBDFE.FGHIJK......))]])(((}}[[>>>[)(aaaa]bb]))c])deccfdghi(jefk)()()()......................'
pipeline_tag: rna-secondary-structure
sequence_type: 5' UTR
task: rna-secondary-structure
text: GGGGCCGGAAGUGCCGCUCCUUGGUGGGGGCUGUUCAUGGCGGUUCCGGGGUCUCCAACAUUUUUCCCGGCUGUGGUCCUAAAUCUGUCCAAAGCAGAGGCAGUGGAGCUUGAGGUUCUUGCUGGUGUGAA
- example_title: amyloid beta precursor protein
output:
text: '.......................((....[{<....................A<BCBCBD.EEFGCEGG((HI.(EJ)F)G)AI))]}>>(aabcbc[bcdeef(e{g))](ef}ghgi)g([)]ij.......................'
pipeline_tag: rna-secondary-structure
sequence_type: 5' UTR
task: rna-secondary-structure
text: GUCAGUUUCCUCGGCAGCGGUAGGCGAGAGCACGCGGAGGAGCGUGCGCGGGGGCCCCGGGAGACGGCGGCGGUGGCGGCGCGGGCAGAGCAAGGACGCGGCGGAUCCCACUCGCACAGCAGCGCACUCGGUGCCCCGCGCAGGGUCGCG
- example_title: RUNX family transcription factor 1
output:
text: '.............................................................((.[..[{..(<.[A{.{A[B{.<.ABB{C(C[)D(EE](E<E)F)GCHA)()]ID}HJ}{>I))a>]ab}(c}](J)JdKc](db)efe})}g>aaehih(eib(cjjjk)[)()]................'
pipeline_tag: rna-secondary-structure
sequence_type: 5' UTR
task: rna-secondary-structure
text: ACUUCUUUGGGCCUCAUAAACAACCACAGAACCACAAGUUGGGUAGCCUGGCAGUGUCAGAAGUCUGAACCCAGCAUAGUGGUCAGCAGGCAGGACGAAUCACACUGAAUGCAAACCACAGGGUUUCGCAGCGUGGUAAAAGAAAUCAUUGAGUCCCCCGCCUUCAGAAGAGGGUGCAUUUUCAGGAGGAAGCG
- example_title: fragile X messenger ribonucleoprotein 1
output:
text: '.......(.[(.{..({(<([{A.AB.CA..D.D.EF.B.DE.CEFE.DEG.)))])})(]}}[>a[a{a<AA[[<H({({b{b{c{d{{eB<cCAIB)CBIJI{Ad)JCJCI)(](](KIeeeJC[}K}}C}L)d(>e]}d(D]aE}(fD<faEF}bL)F])(c(}[gFc>bGGGc}>{M>a{c<M<d)]h)cei)}bf)jdkfig>}i.cjgf>il.)kjamglimj..e.............................'
pipeline_tag: rna-secondary-structure
sequence_type: 5' UTR
task: rna-secondary-structure
text: CUCAGUCAGGCGCUCAGCUCCGUUUCGGUUUCACUUCCGGUGGAGGGCCGCCUCUGAGCGGGCGGCGGGCCGACGGCGAGCGCGGGCGGCGGCGGUGACGGAGGCGCCGCUGCCAGGGGGCGUGCGGCAGCGCGGCGGCGGCGGCGGCGGCGGCGGCGGCGGAGGCGGCGGCGGCGGCGGCGGCGGCGGCGGCUGGGCCUCGAGCGCCCGCAGCCCACCUCUCGGGGGCGGGCUCCCGGCGCUAGCAGGGCUGAAGAGAAG
- example_title: MYC proto-oncogene
output:
text: '............................(([((.{{<.A{BCD{<{A{<{ABECBA<<BD(E[<AACDFGHBIGC(B[IDC(JJJJJJI)IECKLMD)]{<ABDMFHKD<BEAF)CBEGH]HIAGJIGBFHKJA}GK))D))](>a[a([>}[JLK>L}(KaMJL[MbbHGaLGHIJKLaNbO>OaPOJPcMdbbL>cKQb))Od(]>]N}efaPNd(aePcc[.dbQ}MgbhbfNc}gNijg>h}>k)ajgg]g<deahdbkc)]}dfcdhilijlj[kem[fem)](gjkknjl>hhngnlmn(knol]p(oohm)))iiijpjpqjkklqj.]ljmiopj.m..................'
pipeline_tag: rna-secondary-structure
sequence_type: 5' UTR
task: rna-secondary-structure
text: AACUCGCUGUAGUAAUUCCAGCGAGAGGCAGAGGGAGCGAGCGGGCGGCCGGCUAGGGUGGAAGAGCCGGGCGAGCAGAGCUGCGCUGCGGGCGUCCUGGGAAGGGAGAUCCGGAGCGAAUAGGGGGCUUCGCCUCUGGCCCAGCCCUCCCGCUGAUCCCCCAGCCAGCGGUCCGCAACCCUUGCCGCAUCCACGAAACUUUGCCCAUAGCAGCGGGCGGGCACUUUGCACUGGAACUUACAACACCCGAGCAAGGACGCGACUCUCCCGACGCGGGGAGGCUAUUCUGCCCAUUUGGGGACACUUCCCCGCCGCUGCCAGGACCCGCUUCUCUGAAAGGCUCUCCUUGCAGCUGCUUAGACG
- example_title: activating transcription factor 4
output:
text: '.............((.[...{.<<.<.ABCDEFC{FBGGBBCCBHHGHDIHJ)E)((I(][[KIGLJG}I[KI[J}{M{>KaAAN(Kb(AAL>bL{AMMM[McLLLNLbMNc)O)PbQdRSTeQ)PP(OO]I]U}MNbBOSPcQ>]<cd][}{e)<f(gCaggCh()fiDbBg<agh<]ihjaa)AkAk)llBilEmnlmBlijnAooBlD}ohmp]>>)>(iopikcqaapbcbpjnmbk)qbq}.ldrsstmm>.aadeum.....n.............'
pipeline_tag: rna-secondary-structure
sequence_type: 5' UTR
task: rna-secondary-structure
text: CAUUUCUACUUUGCCCGCCCACAGAUGUAGUUUUCUCUGCGCGUGUGCGUUUUCCCUCCUCCCCGCCCUCAGGGUCCACGGCCACCAUGGCGUAUUAGGGGCAGCAGUGCCUGCGGCAGCAUUGGCCUUUGCAGCGGCGGCAGCAGCACCAGGCUCUGCAGCGGCAACCCCCAGCGGCUUAAGCCAUGGCGCUUCUCACGGCAUUCAGCAGCAGCGUUGCUGUAACCGACAAAGACACCUUCGAAUUAAGCACAUUCCUCGAUUCCAGCAAAGCACCGCAAC
- example_title: Human GPI protein p137
output:
text: '..........................................................([((...(((.(.{([{({.(<<[A{B<<.A.BCBABBBCBDCBDCE.ED.FEGF.BFFF)GADGECHHCGIJCDKJKHEIKHLFFGGH(LHBG]))}[{[>[aab)b)bc]d]bef]cc}cfdg)][ecghdghaebf}gM<g(hgN<fEGNhb[cibOh))ijbk])dId)b)}]Dl>>)>]l>>(ajkeffce)}gbmfghennkdi.o..............................'
pipeline_tag: rna-secondary-structure
sequence_type: 3' UTR
task: rna-secondary-structure
text: UUUUUAAAAGGAAAAGAUACCAAAUGCCUGCUGCUACCACCCUUUUCAAUUGCUAUGUUUUGAAAGGCACCAGUAUGUGUUUUAGAUUGAUUUAAAUGUUUCAUUUAAAUCACGGACAGUAGUUUCAGUUCUGAUGGUAUAAGCAAAACAAAUAAAACGUUUAUAAAAGUUGUAUCUUGAAACACUGGUGUUCAACAGCUAGCAGCUUAUGUGAUUCACCCCAUGCCACGUUAGUGUCACAAAUUUUAUGGUUUAUCUCCAGCAACAUUUCUCUAGUACUUGCACUUAUUAUCUGAAUUC
- example_title: nucleophosmin 1
output:
text: '..............................(([.........{.{<(.AAB...[C{D<<AD.[A<{.(CE({.A<D(((.BF[{<A<FBAB)EBBCDDEFD[B{EF)<EFG(A<H)GCGH)]HIDG}D]>[}>{HJEaHIJIJKFK><JIFKGJKFLGHa[))bK(MNKNcbAO]M]L}LdK>I)efI)J]gdL]a)M][b)(b}}[cd[>Mfg>CN)d(d}(O>]ah[eaff>gghgijFabOiij[baekilDk}e)>lci>}ha]b)kdalcmcj]dkjfmlghdj]mdmneffo()oeni]nfhkfkokjh...................'
pipeline_tag: rna-secondary-structure
sequence_type: 3' UTR
task: rna-secondary-structure
text: GAAAAUAGUUUAAACAAUUUGUUAAAAAAUUUUCCGUCUUAUUUCAUUUCUGUAACAGUUGAUAUCUGGCUGUCCUUUUUAUAAUGCAGAGUGAGAACUUUCCCUACCGUGUUUGAUAAAUGUUGUCCAGGUUCUAUUGCCAAGAAUGUGUUGUCCAAAAUGCCUGUUUAGUUUUUAAAGAUGGAACUCCACCCUUUGCUUGGUUUUAAGUAUGUAUGGAAUGUUAUGAUAGGACAUAGUAGUAGCGGUGGUCAGACAUGGAAAUGGUGGGGAGACAAAAAUAUACAUGUGAAAUAAAACUCAGUAUUUUAAUAAAGUAGCACGGUUUCUAUUGA
- example_title: superoxide dismutase 1
output:
text: '............................................(([{<A<ABABCCDDD{D{EBF{<{{FD{GE<AE)G)(]([[F[[B[A[FGA<DHGBAHHGGHIIJKLEMCFMFF}LMK..N)MM.MN]..O.]P.P.O...}>.].a.].......P.....)..(........QR}SQ>T(}..}(a>P}.a.R}>b{a.b]abbcc.{..d.Q.defef.f.eQ]..fgg.h...gh>aic...baj........kkdgdhlmdmeld)noo}pm[)qpp)mqffhg}pqrm]qrngimstf............................................'
pipeline_tag: rna-secondary-structure
sequence_type: 3' UTR
task: rna-secondary-structure
text: ACAUUCCCUUGGAUGUAGUCUGAGGCCCCUUAACUCAUCUGUUAUCCUGCUAGCUGUAGAAAUGUAUCCUGAUAAACAUUAAACACUGUAAUCUUAAAAGUGUAAUUGUGUGACUUUUUCAGAGUUGCUUUAAAGUACCUGUAGUGAGAAACUGAUUUAUGAUCACUUGGAAGAUUUGUAUAGUUUUAUAAAACUCAGUUAAAAUGUCUGUUUCAAUGACCUGUAUUUUGCCAGACUUAAAUCACAGAUGGGUAUUAAACUUGUCAGAAUUUCUUUGUCAUUCAAGCCUGUGAAUAAAAACCCUGUAUGGCACUUAUUAUGAGGCUAUUAAAAGAAUCCAAAUUCAAACUAAA
- example_title: hemoglobin subunit alpha 2
output:
text: '..............(((....[.....(.[{[{<A..A.AA.BCD)EE]}]>abc)d]}[))aa(a)ee...]....................................'
pipeline_tag: rna-secondary-structure
sequence_type: 3' UTR
task: rna-secondary-structure
text: CUGGAGCCUCGGUAGCCGUUCCUCCUGCCCGCUGGGCCUCCCAACGGGCCCUCCUCCCCUCCUUGCACCGGCCCUUCCUGGUCUUUGAAUAAAGUCUGAGUGGGCAGCA
- example_title: BRAF proto-oncogene
output:
text: '.................................................([...)]..(.((([{..<([(A{<ABC{D{{(<[<A(BACAA{<<<(<<EFA<BC{DEFGHIJB[AHDACKBCD{EBEFBIBFGFEB)JGCDE)(JFGHI)J]JKLMN})]GJK)M}}{)HLO)F>]L<][))NO(Ma([bb[cPdbe}L[LNbbLcd[DOQMN{OPf{DQfDRgahDM>g)S(OG]{]NhD(THGG<))iHj}eT]>I[>>H[e)a}UUb>cddfS>(}b}kd>dg(B)cUl]C)>(l]mhj]dadgl}bP>gdlafkhge)}mnfeEh]lanablj[}fmiF>oamcg)}g>ompcbnach.h]pnqri(djoeinssootjt)ufjef(kupuq())...................'
pipeline_tag: rna-secondary-structure
sequence_type: 3' UTR
task: rna-secondary-structure
text: AACAAAUGAGUGAGAGAGUUCAGGAGAGUAGCAACAAAAGGAAAAUAAAUGAACAUAUGUUUGCUUAUAUGUUAAAUUGAAUAAAAUACUCUCUUUUUUUUUAAGGUGAACCAAAGAACACUUGUGUGGUUAAAGACUAGAUAUAAUUUUUCCCCAAACUAAAAUUUAUACUUAACAUUGGAUUUUUAACAUCCAAGGGUUAAAAUACAUAGACAUUGCUAAAAAUUGGCAGAGCCUCUUCUAGAGGCUUUACUUUCUGUUCCGGGUUUGUAUCAUUCACUUGGUUAUUUUAAGUAGUAAACUUCAGUUUCUCAUGCAACUUUUGUUGCCAGCUAUCACAUGUCCACUAGGGACUCCAGAAGAAGACCCUACCUAUGCCUGUGUUUGCAGGUGAGAAGUUGGCAGUCGGUUAGCCUGGG
- example_title: H3 clustered histone 1
output:
text: '..........................................................'
pipeline_tag: rna-secondary-structure
sequence_type: 3' UTR
task: rna-secondary-structure
text: UUACUGUGGUCUCUCUGACGGUCCAAGCAAAGGCUCUUUUCAGAGCCACCACCUUUUC
---
# UFold
Pre-trained model for RNA secondary structure prediction using an image-like sequence representation and a U-Net.
## Disclaimer
This is an UNOFFICIAL implementation of [UFold: fast and accurate RNA secondary structure prediction with deep learning](https://doi.org/10.1093/nar/gkab1074) by Laiyi Fu, Yingxin Cao, Jie Wu, Qinke Peng, Qing Nie, and Xiaohui Xie.
The OFFICIAL repository of UFold is at [uci-cbcl/UFold](https://github.com/uci-cbcl/UFold).
> [!TIP]
> The MultiMolecule implementation is a direct PyTorch port of the original U-Net architecture and feature construction.

**The team releasing UFold did not write a model card for this model, so this model card was written by the MultiMolecule team.**
## Model Details
UFold predicts RNA base-pair contact maps from single RNA sequences. It represents a sequence as a 17-channel image: 16 channels are outer products of one-hot nucleotide indicators and one channel is a hand-crafted canonical/wobble pairing score. A U-Net predicts a symmetric contact score matrix, and the original constrained post-processing routine can be enabled to enforce base-pairing constraints.
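The 16 outer-product channels can be illustrated with a minimal pure-Python sketch. The base order `ACGU`, the function names, and the nested-list layout are assumptions made for illustration; the MultiMolecule implementation follows its own tokenizer order and also adds the 17th pairing-score channel, which is omitted here.

```python
BASES = "ACGU"  # assumed channel order, for illustration only


def one_hot(sequence):
    """Return an L x 4 one-hot encoding of an RNA sequence."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in sequence]


def outer_product_channels(sequence):
    """Build 16 L x L channels, one per ordered nucleotide pair (a, b).

    Channel 4 * a + b is 1.0 at (i, j) when position i holds base a
    and position j holds base b, and 0.0 elsewhere.
    """
    x = one_hot(sequence)
    n = len(sequence)
    return [
        [[x[i][a] * x[j][b] for j in range(n)] for i in range(n)]
        for a in range(4)
        for b in range(4)
    ]


channels = outer_product_channels("GAUC")
print(len(channels), len(channels[0]), len(channels[0][0]))  # 16 4 4
```

Each position pair `(i, j)` activates exactly one of the 16 channels, so the stack can be read as an image whose "colour" at `(i, j)` names the two nucleotides that would pair there.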
### Model Specification
| Num Parameters (M) | FLOPs (G) | MACs (G) |
| ------------------ | --------- | -------- |
| 8.64 | 188.29 | 93.81 |

FLOPs and MACs are computed with `multimolecule.utils` for one 600 nt sequence.
### Links
- **Code**: [multimolecule.ufold](https://github.com/DLS5-Omics/multimolecule/tree/master/multimolecule/models/ufold)
- **Weights**: [multimolecule/ufold](https://huggingface.co/multimolecule/ufold)
- **Paper**: [UFold: fast and accurate RNA secondary structure prediction with deep learning](https://doi.org/10.1093/nar/gkab1074)
- **Developed by**: Laiyi Fu, Yingxin Cao, Jie Wu, Qinke Peng, Qing Nie, Xiaohui Xie
- **Original Repository**: [uci-cbcl/UFold](https://github.com/uci-cbcl/UFold)
## Usage
The model file depends on the [`multimolecule`](https://multimolecule.danling.org) library. You can install it using pip:
```bash
pip install multimolecule
```
### RNA Secondary Structure Pipeline
```python
import multimolecule  # importing multimolecule registers its models and pipelines with transformers
from transformers import pipeline
predictor = pipeline("rna-secondary-structure", model="multimolecule/ufold")
output = predictor("GGGCUAUUAGCUCAGUUGGUUAGAGCGCACCCCUGAUAAGGGUGAGGUCGCUGAUUCGAAUUCAGCAUAGCUCA")
```
### PyTorch Inference
```python
from multimolecule import RnaTokenizer, UfoldModel

# Load the tokenizer and the pre-trained weights from the Hugging Face Hub
tokenizer = RnaTokenizer.from_pretrained("multimolecule/ufold")
model = UfoldModel.from_pretrained("multimolecule/ufold")

sequence = "GGGCUAUUAGCUCAGUUGGUUAGAGCGCACCCCUGAUAAGGGUGAGGUCGCUGAUUCGAAUUCAGCAUAGCUCA"
inputs = tokenizer(sequence, return_tensors="pt")

# Forward pass; `contact_map` holds the predicted base-pair contact scores
output = model(**inputs)
contact_map = output.contact_map
```
To run the original constrained post-processing loop:
```python
output = model(**inputs, use_postprocessing=True)
contact_map = output.postprocessed_contact_map
```
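A predicted contact map is usually turned into a list of base pairs before further analysis. The sketch below assumes the map has already been converted to a square nested list of scores in [0, 1] (for example with `.tolist()` on a 2-D tensor, after dropping any batch dimension). The helper names are hypothetical, and the 0.5 threshold follows the post-processing setting listed under Training Procedure. The dot-bracket helper only handles nested structures, so pseudoknotted pairs would need additional bracket types.

```python
def contact_map_to_pairs(scores, threshold=0.5):
    """Return sorted (i, j) pairs with i < j whose score exceeds the threshold."""
    n = len(scores)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if scores[i][j] > threshold
    ]


def pairs_to_dot_bracket(pairs, length):
    """Render non-crossing pairs as a dot-bracket string."""
    structure = ["."] * length
    for i, j in pairs:
        structure[i], structure[j] = "(", ")"
    return "".join(structure)


# Toy 6 x 6 symmetric map with two nested pairs: (0, 5) and (1, 4)
toy = [[0.0] * 6 for _ in range(6)]
for i, j in [(0, 5), (1, 4)]:
    toy[i][j] = toy[j][i] = 0.9

pairs = contact_map_to_pairs(toy)
print(pairs)                           # [(0, 5), (1, 4)]
print(pairs_to_dot_bracket(pairs, 6))  # ((..))
```

Only the upper triangle is scanned because the contact map is symmetric, so each pair is reported once.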
## Training Details
UFold was trained to predict RNA secondary structure from single sequences, using annotated contact maps as supervision and base-pairing rules in the input features and post-processing.
### Training Data
- RNAStrAlign: 30,451 unique RNAs from eight RNA families; the paper reports a random split with 24,895 training RNAs and 2,854 test RNAs after redundancy filtering.
- bpRNA-1m: 102,318 RNAs from 2,588 families; CD-HIT was used to remove redundant sequences before splitting the data into TR0 and TS0.
- augmented data: synthetic training examples were generated from bpRNA-new sequences by random mutation and structure prediction.
- PDB training data: high-resolution RNA structures from bpRNA and the PDB were used for fine-tuning/evaluation experiments; test sets TS1, TS2, and TS3 were filtered at 80% sequence identity.
- evaluation data: ArchiveII, TS0, bpRNA-new, and PDB test data were used for benchmark evaluation.
### Training Procedure
- input representation: 16 outer-product channels following the MultiMolecule tokenizer order plus one hand-crafted pairing-score channel.
- objective: weighted binary cross entropy over base-pair contact maps.
- optimizer: Adam.
- training epochs: 100.
- batch size: 1.
- positive-class weight: 300.
- post-processing: constrained optimization with canonical/wobble pairing rules, sparsity shrinkage, and a 0.5 threshold.
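To make the objective concrete, here is a minimal pure-Python sketch of a positive-weighted binary cross entropy over a contact map. It is an illustration under stated assumptions rather than the training code: the function name, the clamping constant `eps`, and the mean reduction are choices made here, and only the positive-class weight of 300 comes from the list above.

```python
import math


def weighted_bce(probs, targets, pos_weight=300.0, eps=1e-7):
    """Mean binary cross entropy with paired positions up-weighted.

    probs and targets are equally shaped nested lists (L x L);
    targets hold 1.0 for paired positions and 0.0 otherwise.
    """
    total, count = 0.0, 0
    for prob_row, target_row in zip(probs, targets):
        for p, t in zip(prob_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= pos_weight * t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
            count += 1
    return total / count


# A missed pair costs 300 times more than an equally confident false positive.
missed_pair = weighted_bce([[0.1]], [[1.0]])
false_positive = weighted_bce([[0.9]], [[0.0]])
print(round(missed_pair / false_positive))  # 300
```

Contact maps are sparse, because each nucleotide has at most one partner among all candidate positions, which is a common motivation for weighting the positive class heavily. In PyTorch, `torch.nn.BCEWithLogitsLoss` accepts a `pos_weight` argument that plays the same role when the model outputs logits.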
## Citation
```bibtex
@article{fu2022ufold,
author = {Fu, Laiyi and Cao, Yingxin and Wu, Jie and Peng, Qinke and Nie, Qing and Xie, Xiaohui},
title = {UFold: fast and accurate RNA secondary structure prediction with deep learning},
journal = {Nucleic Acids Research},
volume = {50},
number = {3},
pages = {e14},
year = {2022},
doi = {10.1093/nar/gkab1074}
}
```
> [!NOTE]
> The artifacts distributed in this repository are part of the MultiMolecule project.
> If you use MultiMolecule in your research, you must cite the MultiMolecule project.
## License
This model is licensed under the [GNU Affero General Public License](license.md).
For additional terms and clarifications, please refer to our [License FAQ](license-faq.md).
```spdx
SPDX-License-Identifier: AGPL-3.0-or-later
```