| | --- |
| | tags: |
| | - mms |
| | language: |
| | - ab |
| | - af |
| | - ak |
| | - am |
| | - ar |
| | - as |
| | - av |
| | - ay |
| | - az |
| | - ba |
| | - bm |
| | - be |
| | - bn |
| | - bi |
| | - bo |
| | - sh |
| | - br |
| | - bg |
| | - ca |
| | - cs |
| | - ce |
| | - cv |
| | - ku |
| | - cy |
| | - da |
| | - de |
| | - dv |
| | - dz |
| | - el |
| | - en |
| | - eo |
| | - et |
| | - eu |
| | - ee |
| | - fo |
| | - fa |
| | - fj |
| | - fi |
| | - fr |
| | - fy |
| | - ff |
| | - ga |
| | - gl |
| | - gn |
| | - gu |
| | - zh |
| | - ht |
| | - ha |
| | - he |
| | - hi |
| | - hu |
| | - hy |
| | - ig |
| | - ia |
| | - ms |
| | - is |
| | - it |
| | - jv |
| | - ja |
| | - kn |
| | - ka |
| | - kk |
| | - kr |
| | - km |
| | - ki |
| | - rw |
| | - ky |
| | - ko |
| | - kv |
| | - lo |
| | - la |
| | - lv |
| | - ln |
| | - lt |
| | - lb |
| | - lg |
| | - mh |
| | - ml |
| | - mr |
| | - mk |
| | - mg |
| | - mt |
| | - mn |
| | - mi |
| | - my |
| | - nl |
| | - 'no' |
| | - ne |
| | - ny |
| | - oc |
| | - om |
| | - or |
| | - os |
| | - pa |
| | - pl |
| | - pt |
| | - ps |
| | - qu |
| | - ro |
| | - rn |
| | - ru |
| | - sg |
| | - sk |
| | - sl |
| | - sm |
| | - sn |
| | - sd |
| | - so |
| | - es |
| | - sq |
| | - su |
| | - sv |
| | - sw |
| | - ta |
| | - tt |
| | - te |
| | - tg |
| | - tl |
| | - th |
| | - ti |
| | - ts |
| | - tr |
| | - uk |
| | - vi |
| | - wo |
| | - xh |
| | - yo |
| | - zu |
| | - za |
| | license: cc-by-nc-4.0 |
| | datasets: |
| | - google/fleurs |
| | metrics: |
| | - wer |
| | --- |
| | |
| | # Massively Multilingual Speech (MMS) - Finetuned ASR - ALL |
| |
|
| | This checkpoint is a model fine-tuned for multi-lingual ASR and is part of Facebook's [Massively Multilingual Speech project](https://research.facebook.com/publications/scaling-speech-technology-to-1000-languages/). |
| | This checkpoint is based on the [Wav2Vec2 architecture](https://huggingface.co/docs/transformers/model_doc/wav2vec2) and makes use of adapter models to transcribe 1000+ languages. |
| | The checkpoint consists of **1 billion parameters** and has been fine-tuned from [facebook/mms-1b](https://huggingface.co/facebook/mms-1b) on 1162 languages. |
| |
|
| | ## Table of Contents |
| |
|
| | - [Example](#example) |
| | - [Supported Languages](#supported-languages) |
| | - [Model details](#model-details) |
| | - [Additional links](#additional-links) |
| |
|
| | ## Example |
| |
|
| | This MMS checkpoint can be used with [Transformers](https://github.com/huggingface/transformers) to transcribe audio in 1162 different |
| | languages. Let's look at a simple example. |
| |
|
| | First, we install `transformers` and some other libraries: |
| | ``` |
| | pip install torch accelerate torchaudio datasets |
| | pip install --upgrade transformers |
| | ``` |
| |
|
| | **Note**: In order to use MMS you need to have at least `transformers >= 4.30` installed. If the `4.30` version |
| | is not yet available [on PyPI](https://pypi.org/project/transformers/) make sure to install `transformers` from |
| | source: |
| | ``` |
| | pip install git+https://github.com/huggingface/transformers.git |
| | ``` |
| |
|
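The version requirement above can be checked programmatically. The following is a minimal sketch, not part of the MMS or Transformers API: `meets_minimum` is a hypothetical helper that compares only the leading `major.minor` part of a version string, so a source install such as `4.30.0.dev0` also passes.

```python
import re


def meets_minimum(version: str, minimum: tuple = (4, 30)) -> bool:
    """Return True if a dotted version string is at least `minimum` (major, minor).

    Hypothetical helper for illustration; only the leading "major.minor"
    digits are compared, so suffixes such as ".dev0" are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


# Typical use (requires transformers to be installed):
#   import transformers
#   assert meets_minimum(transformers.__version__), "MMS needs transformers >= 4.30"
```
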
| | Next, we load a couple of audio samples via `datasets`. Make sure that the audio data is sampled at 16,000 Hz (16 kHz). |
| |
|
| | ```py |
| | from datasets import load_dataset, Audio |
| | |
| | # English |
| | stream_data = load_dataset("mozilla-foundation/common_voice_13_0", "en", split="test", streaming=True) |
| | stream_data = stream_data.cast_column("audio", Audio(sampling_rate=16000)) |
| | en_sample = next(iter(stream_data))["audio"]["array"] |
| | |
| | # French |
| | stream_data = load_dataset("mozilla-foundation/common_voice_13_0", "fr", split="test", streaming=True) |
| | stream_data = stream_data.cast_column("audio", Audio(sampling_rate=16000)) |
| | fr_sample = next(iter(stream_data))["audio"]["array"] |
| | ``` |
| |
|
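Because the model expects 16 kHz input, it can help to validate each decoded sample before passing it on. The sketch below assumes the decoded audio is a dict with `"array"` and `"sampling_rate"` keys, as produced by the `Audio` feature used above; `check_sampling_rate` is a hypothetical helper name, not a `datasets` function.

```python
EXPECTED_RATE = 16_000  # Hz, the rate this checkpoint expects


def check_sampling_rate(audio: dict, expected: int = EXPECTED_RATE) -> float:
    """Validate a decoded audio dict and return its duration in seconds.

    Hypothetical helper: `audio` is assumed to hold the waveform under
    "array" and its rate in Hz under "sampling_rate".
    """
    rate = audio["sampling_rate"]
    if rate != expected:
        raise ValueError(f"expected {expected} Hz audio, got {rate} Hz")
    return len(audio["array"]) / rate
```

For example, `check_sampling_rate(next(iter(stream_data))["audio"])` would raise if the `cast_column` step had been skipped and the clip were still at its original rate.
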
| | Next, we load the model and processor: |
| |
|
| | ```py |
| | from transformers import Wav2Vec2ForCTC, AutoProcessor |
| | import torch |
| | |
| | model_id = "facebook/mms-1b-all" |
| | |
| | processor = AutoProcessor.from_pretrained(model_id) |
| | model = Wav2Vec2ForCTC.from_pretrained(model_id) |
| | ``` |
| |
|
| | Now we process the audio data, pass it to the model, and decode the model output, just as we usually do for Wav2Vec2 models such as [facebook/wav2vec2-base-960h](https://huggingface.co/facebook/wav2vec2-base-960h): |
| |
|
| | ```py |
| | inputs = processor(en_sample, sampling_rate=16_000, return_tensors="pt") |
| | |
| | with torch.no_grad(): |
| |     outputs = model(**inputs).logits |
| | |
| | ids = torch.argmax(outputs, dim=-1)[0] |
| | transcription = processor.decode(ids) |
| | # 'joe keton disapproved of films and buster also had reservations about the media' |
| | ``` |
| |
|
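The `argmax` followed by `processor.decode` step above is greedy CTC decoding. The sketch below illustrates the idea on a toy vocabulary; it is not the Transformers implementation. The blank token `<pad>`, the word delimiter `|`, and the function name `ctc_greedy_collapse` are assumptions made for this example.

```python
from itertools import groupby


def ctc_greedy_collapse(ids, id_to_token, blank="<pad>", word_delimiter="|"):
    """Turn per-frame argmax ids into text with the usual CTC rules.

    Illustrative only: `blank` and `word_delimiter` are assumed token names.
    """
    # 1. Merge runs of identical consecutive ids (one symbol spans many frames).
    collapsed = [token_id for token_id, _ in groupby(ids)]
    # 2. Map ids to tokens and drop the blank token.
    tokens = [id_to_token[i] for i in collapsed]
    chars = [t for t in tokens if t != blank]
    # 3. Replace the word delimiter with a space.
    return "".join(" " if c == word_delimiter else c for c in chars).strip()


# Toy vocabulary for demonstration purposes.
toy_vocab = {0: "<pad>", 1: "|", 2: "h", 3: "e", 4: "l", 5: "o"}
frames = [2, 2, 3, 4, 4, 0, 4, 5, 5]  # the blank between the two "l" runs keeps both
text = ctc_greedy_collapse(frames, toy_vocab)
```

Note that the blank token is what allows a doubled letter to survive: without the `0` between the two runs of `4`, both runs would merge into a single "l".
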
| | We can now keep the same model in memory and simply switch out the language adapters by calling the convenient `load_adapter()` function for the model and `set_target_lang()` for the tokenizer. We pass the target language as an input, for example "fra" for French. |
| |
|
| | ```py |
| | processor.tokenizer.set_target_lang("fra") |
| | model.load_adapter("fra") |
| | |
| | inputs = processor(fr_sample, sampling_rate=16_000, return_tensors="pt") |
| | |
| | with torch.no_grad(): |
| |     outputs = model(**inputs).logits |
| | |
| | ids = torch.argmax(outputs, dim=-1)[0] |
| | transcription = processor.decode(ids) |
| | # "ce dernier est volé tout au long de l'histoire romaine" |
| | ``` |
| |
|
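When transcribing clips in several languages, the two calls above can be wrapped so that an adapter is only reloaded when the language actually changes. This is a minimal sketch: `LanguageSwitcher` is a hypothetical class, and the initial language `"eng"` is an assumption based on the English example working without an explicit switch.

```python
class LanguageSwitcher:
    """Track the active language and reload the adapter only when it changes.

    Hypothetical wrapper; `model` must provide `load_adapter(lang)` and
    `tokenizer` must provide `set_target_lang(lang)`, as in the example above.
    """

    def __init__(self, model, tokenizer, current_lang="eng"):
        self.model = model
        self.tokenizer = tokenizer
        # Assumption: the checkpoint starts with this language active.
        self.current_lang = current_lang

    def switch(self, lang: str) -> bool:
        """Activate `lang`; return True if an adapter was actually loaded."""
        if lang == self.current_lang:
            return False
        # Same two calls, in the same order, as in the French example.
        self.tokenizer.set_target_lang(lang)
        self.model.load_adapter(lang)
        self.current_lang = lang
        return True
```

With the objects from the example, `LanguageSwitcher(model, processor.tokenizer).switch("fra")` would perform the same switch as the two explicit calls.
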
| | In the same way, the language can be switched for all other supported languages. To list the available language codes, run: |
| | ```py |
| | processor.tokenizer.vocab.keys() |
| | ``` |
| |
|
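Most language codes are plain ISO 639-3 codes such as `fra`, but some carry a script or dialect suffix, for example `urd-script_arabic` or `cak-dialect_central`. The sketch below splits such a code into its parts. The `LangCode` type and `parse_lang_code` function are hypothetical names, and the `base-kind_variant` layout is inferred from the codes listed in this card rather than from a documented specification.

```python
from typing import NamedTuple, Optional


class LangCode(NamedTuple):
    base: str               # ISO 639-3 code, e.g. "urd"
    kind: Optional[str]     # "script" or "dialect", if present
    variant: Optional[str]  # e.g. "arabic" or "central", if present


def parse_lang_code(code: str) -> LangCode:
    """Split a code such as 'urd-script_arabic' into (base, kind, variant)."""
    base, sep, suffix = code.partition("-")
    if not sep:
        # Plain code without a suffix, e.g. "fra".
        return LangCode(base, None, None)
    kind, _, variant = suffix.partition("_")
    return LangCode(base, kind, variant or None)
```

This makes it easy, for instance, to group all script variants of one language by their `base` code.
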
| | For more details, please have a look at [the official docs](https://huggingface.co/docs/transformers/main/en/model_doc/mms). |
| |
|
| | ## Supported Languages |
| |
|
| | This model supports 1162 languages. Click the following to toggle all supported languages of this checkpoint in [ISO 639-3 code](https://en.wikipedia.org/wiki/ISO_639-3). |
| | You can find more details about the languages and their ISO 639-3 codes in the [MMS Language Coverage Overview](https://dl.fbaipublicfiles.com/mms/misc/language_coverage_mms.html). |
| | <details> |
| | <summary>Click to toggle</summary> |
| |
|
| | - abi |
| | - abk |
| | - abp |
| | - aca |
| | - acd |
| | - ace |
| | - acf |
| | - ach |
| | - acn |
| | - acr |
| | - acu |
| | - ade |
| | - adh |
| | - adj |
| | - adx |
| | - aeu |
| | - afr |
| | - agd |
| | - agg |
| | - agn |
| | - agr |
| | - agu |
| | - agx |
| | - aha |
| | - ahk |
| | - aia |
| | - aka |
| | - akb |
| | - ake |
| | - akp |
| | - alj |
| | - alp |
| | - alt |
| | - alz |
| | - ame |
| | - amf |
| | - amh |
| | - ami |
| | - amk |
| | - ann |
| | - any |
| | - aoz |
| | - apb |
| | - apr |
| | - ara |
| | - arl |
| | - asa |
| | - asg |
| | - asm |
| | - ast |
| | - ata |
| | - atb |
| | - atg |
| | - ati |
| | - atq |
| | - ava |
| | - avn |
| | - avu |
| | - awa |
| | - awb |
| | - ayo |
| | - ayr |
| | - ayz |
| | - azb |
| | - azg |
| | - azj-script_cyrillic |
| | - azj-script_latin |
| | - azz |
| | - bak |
| | - bam |
| | - ban |
| | - bao |
| | - bas |
| | - bav |
| | - bba |
| | - bbb |
| | - bbc |
| | - bbo |
| | - bcc-script_arabic |
| | - bcc-script_latin |
| | - bcl |
| | - bcw |
| | - bdg |
| | - bdh |
| | - bdq |
| | - bdu |
| | - bdv |
| | - beh |
| | - bel |
| | - bem |
| | - ben |
| | - bep |
| | - bex |
| | - bfa |
| | - bfo |
| | - bfy |
| | - bfz |
| | - bgc |
| | - bgq |
| | - bgr |
| | - bgt |
| | - bgw |
| | - bha |
| | - bht |
| | - bhz |
| | - bib |
| | - bim |
| | - bis |
| | - biv |
| | - bjr |
| | - bjv |
| | - bjw |
| | - bjz |
| | - bkd |
| | - bkv |
| | - blh |
| | - blt |
| | - blx |
| | - blz |
| | - bmq |
| | - bmr |
| | - bmu |
| | - bmv |
| | - bng |
| | - bno |
| | - bnp |
| | - boa |
| | - bod |
| | - boj |
| | - bom |
| | - bor |
| | - bos |
| | - bov |
| | - box |
| | - bpr |
| | - bps |
| | - bqc |
| | - bqi |
| | - bqj |
| | - bqp |
| | - bre |
| | - bru |
| | - bsc |
| | - bsq |
| | - bss |
| | - btd |
| | - bts |
| | - btt |
| | - btx |
| | - bud |
| | - bul |
| | - bus |
| | - bvc |
| | - bvz |
| | - bwq |
| | - bwu |
| | - byr |
| | - bzh |
| | - bzi |
| | - bzj |
| | - caa |
| | - cab |
| | - cac-dialect_sanmateoixtatan |
| | - cac-dialect_sansebastiancoatan |
| | - cak-dialect_central |
| | - cak-dialect_santamariadejesus |
| | - cak-dialect_santodomingoxenacoj |
| | - cak-dialect_southcentral |
| | - cak-dialect_western |
| | - cak-dialect_yepocapa |
| | - cap |
| | - car |
| | - cas |
| | - cat |
| | - cax |
| | - cbc |
| | - cbi |
| | - cbr |
| | - cbs |
| | - cbt |
| | - cbu |
| | - cbv |
| | - cce |
| | - cco |
| | - cdj |
| | - ceb |
| | - ceg |
| | - cek |
| | - ces |
| | - cfm |
| | - cgc |
| | - che |
| | - chf |
| | - chv |
| | - chz |
| | - cjo |
| | - cjp |
| | - cjs |
| | - ckb |
| | - cko |
| | - ckt |
| | - cla |
| | - cle |
| | - cly |
| | - cme |
| | - cmn-script_simplified |
| | - cmo-script_khmer |
| | - cmo-script_latin |
| | - cmr |
| | - cnh |
| | - cni |
| | - cnl |
| | - cnt |
| | - coe |
| | - cof |
| | - cok |
| | - con |
| | - cot |
| | - cou |
| | - cpa |
| | - cpb |
| | - cpu |
| | - crh |
| | - crk-script_latin |
| | - crk-script_syllabics |
| | - crn |
| | - crq |
| | - crs |
| | - crt |
| | - csk |
| | - cso |
| | - ctd |
| | - ctg |
| | - cto |
| | - ctu |
| | - cuc |
| | - cui |
| | - cuk |
| | - cul |
| | - cwa |
| | - cwe |
| | - cwt |
| | - cya |
| | - cym |
| | - daa |
| | - dah |
| | - dan |
| | - dar |
| | - dbj |
| | - dbq |
| | - ddn |
| | - ded |
| | - des |
| | - deu |
| | - dga |
| | - dgi |
| | - dgk |
| | - dgo |
| | - dgr |
| | - dhi |
| | - did |
| | - dig |
| | - dik |
| | - dip |
| | - div |
| | - djk |
| | - dnj-dialect_blowowest |
| | - dnj-dialect_gweetaawueast |
| | - dnt |
| | - dnw |
| | - dop |
| | - dos |
| | - dsh |
| | - dso |
| | - dtp |
| | - dts |
| | - dug |
| | - dwr |
| | - dyi |
| | - dyo |
| | - dyu |
| | - dzo |
| | - eip |
| | - eka |
| | - ell |
| | - emp |
| | - enb |
| | - eng |
| | - enx |
| | - epo |
| | - ese |
| | - ess |
| | - est |
| | - eus |
| | - evn |
| | - ewe |
| | - eza |
| | - fal |
| | - fao |
| | - far |
| | - fas |
| | - fij |
| | - fin |
| | - flr |
| | - fmu |
| | - fon |
| | - fra |
| | - frd |
| | - fry |
| | - ful |
| | - gag-script_cyrillic |
| | - gag-script_latin |
| | - gai |
| | - gam |
| | - gau |
| | - gbi |
| | - gbk |
| | - gbm |
| | - gbo |
| | - gde |
| | - geb |
| | - gej |
| | - gil |
| | - gjn |
| | - gkn |
| | - gld |
| | - gle |
| | - glg |
| | - glk |
| | - gmv |
| | - gna |
| | - gnd |
| | - gng |
| | - gof-script_latin |
| | - gog |
| | - gor |
| | - gqr |
| | - grc |
| | - gri |
| | - grn |
| | - grt |
| | - gso |
| | - gub |
| | - guc |
| | - gud |
| | - guh |
| | - guj |
| | - guk |
| | - gum |
| | - guo |
| | - guq |
| | - guu |
| | - gux |
| | - gvc |
| | - gvl |
| | - gwi |
| | - gwr |
| | - gym |
| | - gyr |
| | - had |
| | - hag |
| | - hak |
| | - hap |
| | - hat |
| | - hau |
| | - hay |
| | - heb |
| | - heh |
| | - hif |
| | - hig |
| | - hil |
| | - hin |
| | - hlb |
| | - hlt |
| | - hne |
| | - hnn |
| | - hns |
| | - hoc |
| | - hoy |
| | - hrv |
| | - hsb |
| | - hto |
| | - hub |
| | - hui |
| | - hun |
| | - hus-dialect_centralveracruz |
| | - hus-dialect_westernpotosino |
| | - huu |
| | - huv |
| | - hvn |
| | - hwc |
| | - hye |
| | - hyw |
| | - iba |
| | - ibo |
| | - icr |
| | - idd |
| | - ifa |
| | - ifb |
| | - ife |
| | - ifk |
| | - ifu |
| | - ify |
| | - ign |
| | - ikk |
| | - ilb |
| | - ilo |
| | - imo |
| | - ina |
| | - inb |
| | - ind |
| | - iou |
| | - ipi |
| | - iqw |
| | - iri |
| | - irk |
| | - isl |
| | - ita |
| | - itl |
| | - itv |
| | - ixl-dialect_sangasparchajul |
| | - ixl-dialect_sanjuancotzal |
| | - ixl-dialect_santamarianebaj |
| | - izr |
| | - izz |
| | - jac |
| | - jam |
| | - jav |
| | - jbu |
| | - jen |
| | - jic |
| | - jiv |
| | - jmc |
| | - jmd |
| | - jpn |
| | - jun |
| | - juy |
| | - jvn |
| | - kaa |
| | - kab |
| | - kac |
| | - kak |
| | - kam |
| | - kan |
| | - kao |
| | - kaq |
| | - kat |
| | - kay |
| | - kaz |
| | - kbo |
| | - kbp |
| | - kbq |
| | - kbr |
| | - kby |
| | - kca |
| | - kcg |
| | - kdc |
| | - kde |
| | - kdh |
| | - kdi |
| | - kdj |
| | - kdl |
| | - kdn |
| | - kdt |
| | - kea |
| | - kek |
| | - ken |
| | - keo |
| | - ker |
| | - key |
| | - kez |
| | - kfb |
| | - kff-script_telugu |
| | - kfw |
| | - kfx |
| | - khg |
| | - khm |
| | - khq |
| | - kia |
| | - kij |
| | - kik |
| | - kin |
| | - kir |
| | - kjb |
| | - kje |
| | - kjg |
| | - kjh |
| | - kki |
| | - kkj |
| | - kle |
| | - klu |
| | - klv |
| | - klw |
| | - kma |
| | - kmd |
| | - kml |
| | - kmr-script_arabic |
| | - kmr-script_cyrillic |
| | - kmr-script_latin |
| | - kmu |
| | - knb |
| | - kne |
| | - knf |
| | - knj |
| | - knk |
| | - kno |
| | - kog |
| | - kor |
| | - kpq |
| | - kps |
| | - kpv |
| | - kpy |
| | - kpz |
| | - kqe |
| | - kqp |
| | - kqr |
| | - kqy |
| | - krc |
| | - kri |
| | - krj |
| | - krl |
| | - krr |
| | - krs |
| | - kru |
| | - ksb |
| | - ksr |
| | - kss |
| | - ktb |
| | - ktj |
| | - kub |
| | - kue |
| | - kum |
| | - kus |
| | - kvn |
| | - kvw |
| | - kwd |
| | - kwf |
| | - kwi |
| | - kxc |
| | - kxf |
| | - kxm |
| | - kxv |
| | - kyb |
| | - kyc |
| | - kyf |
| | - kyg |
| | - kyo |
| | - kyq |
| | - kyu |
| | - kyz |
| | - kzf |
| | - lac |
| | - laj |
| | - lam |
| | - lao |
| | - las |
| | - lat |
| | - lav |
| | - law |
| | - lbj |
| | - lbw |
| | - lcp |
| | - lee |
| | - lef |
| | - lem |
| | - lew |
| | - lex |
| | - lgg |
| | - lgl |
| | - lhu |
| | - lia |
| | - lid |
| | - lif |
| | - lin |
| | - lip |
| | - lis |
| | - lit |
| | - lje |
| | - ljp |
| | - llg |
| | - lln |
| | - lme |
| | - lnd |
| | - lns |
| | - lob |
| | - lok |
| | - lom |
| | - lon |
| | - loq |
| | - lsi |
| | - lsm |
| | - ltz |
| | - luc |
| | - lug |
| | - luo |
| | - lwo |
| | - lww |
| | - lzz |
| | - maa-dialect_sanantonio |
| | - maa-dialect_sanjeronimo |
| | - mad |
| | - mag |
| | - mah |
| | - mai |
| | - maj |
| | - mak |
| | - mal |
| | - mam-dialect_central |
| | - mam-dialect_northern |
| | - mam-dialect_southern |
| | - mam-dialect_western |
| | - maq |
| | - mar |
| | - maw |
| | - maz |
| | - mbb |
| | - mbc |
| | - mbh |
| | - mbj |
| | - mbt |
| | - mbu |
| | - mbz |
| | - mca |
| | - mcb |
| | - mcd |
| | - mco |
| | - mcp |
| | - mcq |
| | - mcu |
| | - mda |
| | - mdf |
| | - mdv |
| | - mdy |
| | - med |
| | - mee |
| | - mej |
| | - men |
| | - meq |
| | - met |
| | - mev |
| | - mfe |
| | - mfh |
| | - mfi |
| | - mfk |
| | - mfq |
| | - mfy |
| | - mfz |
| | - mgd |
| | - mge |
| | - mgh |
| | - mgo |
| | - mhi |
| | - mhr |
| | - mhu |
| | - mhx |
| | - mhy |
| | - mib |
| | - mie |
| | - mif |
| | - mih |
| | - mil |
| | - mim |
| | - min |
| | - mio |
| | - mip |
| | - miq |
| | - mit |
| | - miy |
| | - miz |
| | - mjl |
| | - mjv |
| | - mkd |
| | - mkl |
| | - mkn |
| | - mlg |
| | - mlt |
| | - mmg |
| | - mnb |
| | - mnf |
| | - mnk |
| | - mnw |
| | - mnx |
| | - moa |
| | - mog |
| | - mon |
| | - mop |
| | - mor |
| | - mos |
| | - mox |
| | - moz |
| | - mpg |
| | - mpm |
| | - mpp |
| | - mpx |
| | - mqb |
| | - mqf |
| | - mqj |
| | - mqn |
| | - mri |
| | - mrw |
| | - msy |
| | - mtd |
| | - mtj |
| | - mto |
| | - muh |
| | - mup |
| | - mur |
| | - muv |
| | - muy |
| | - mvp |
| | - mwq |
| | - mwv |
| | - mxb |
| | - mxq |
| | - mxt |
| | - mxv |
| | - mya |
| | - myb |
| | - myk |
| | - myl |
| | - myv |
| | - myx |
| | - myy |
| | - mza |
| | - mzi |
| | - mzj |
| | - mzk |
| | - mzm |
| | - mzw |
| | - nab |
| | - nag |
| | - nan |
| | - nas |
| | - naw |
| | - nca |
| | - nch |
| | - ncj |
| | - ncl |
| | - ncu |
| | - ndj |
| | - ndp |
| | - ndv |
| | - ndy |
| | - ndz |
| | - neb |
| | - new |
| | - nfa |
| | - nfr |
| | - nga |
| | - ngl |
| | - ngp |
| | - ngu |
| | - nhe |
| | - nhi |
| | - nhu |
| | - nhw |
| | - nhx |
| | - nhy |
| | - nia |
| | - nij |
| | - nim |
| | - nin |
| | - nko |
| | - nlc |
| | - nld |
| | - nlg |
| | - nlk |
| | - nmz |
| | - nnb |
| | - nno |
| | - nnq |
| | - nnw |
| | - noa |
| | - nob |
| | - nod |
| | - nog |
| | - not |
| | - npi |
| | - npl |
| | - npy |
| | - nso |
| | - nst |
| | - nsu |
| | - ntm |
| | - ntr |
| | - nuj |
| | - nus |
| | - nuz |
| | - nwb |
| | - nxq |
| | - nya |
| | - nyf |
| | - nyn |
| | - nyo |
| | - nyy |
| | - nzi |
| | - obo |
| | - oci |
| | - ojb-script_latin |
| | - ojb-script_syllabics |
| | - oku |
| | - old |
| | - omw |
| | - onb |
| | - ood |
| | - orm |
| | - ory |
| | - oss |
| | - ote |
| | - otq |
| | - ozm |
| | - pab |
| | - pad |
| | - pag |
| | - pam |
| | - pan |
| | - pao |
| | - pap |
| | - pau |
| | - pbb |
| | - pbc |
| | - pbi |
| | - pce |
| | - pcm |
| | - peg |
| | - pez |
| | - pib |
| | - pil |
| | - pir |
| | - pis |
| | - pjt |
| | - pkb |
| | - pls |
| | - plw |
| | - pmf |
| | - pny |
| | - poh-dialect_eastern |
| | - poh-dialect_western |
| | - poi |
| | - pol |
| | - por |
| | - poy |
| | - ppk |
| | - pps |
| | - prf |
| | - prk |
| | - prt |
| | - pse |
| | - pss |
| | - ptu |
| | - pui |
| | - pus |
| | - pwg |
| | - pww |
| | - pxm |
| | - qub |
| | - quc-dialect_central |
| | - quc-dialect_east |
| | - quc-dialect_north |
| | - quf |
| | - quh |
| | - qul |
| | - quw |
| | - quy |
| | - quz |
| | - qvc |
| | - qve |
| | - qvh |
| | - qvm |
| | - qvn |
| | - qvo |
| | - qvs |
| | - qvw |
| | - qvz |
| | - qwh |
| | - qxh |
| | - qxl |
| | - qxn |
| | - qxo |
| | - qxr |
| | - rah |
| | - rai |
| | - rap |
| | - rav |
| | - raw |
| | - rej |
| | - rel |
| | - rgu |
| | - rhg |
| | - rif-script_arabic |
| | - rif-script_latin |
| | - ril |
| | - rim |
| | - rjs |
| | - rkt |
| | - rmc-script_cyrillic |
| | - rmc-script_latin |
| | - rmo |
| | - rmy-script_cyrillic |
| | - rmy-script_latin |
| | - rng |
| | - rnl |
| | - roh-dialect_sursilv |
| | - roh-dialect_vallader |
| | - rol |
| | - ron |
| | - rop |
| | - rro |
| | - rub |
| | - ruf |
| | - rug |
| | - run |
| | - rus |
| | - sab |
| | - sag |
| | - sah |
| | - saj |
| | - saq |
| | - sas |
| | - sat |
| | - sba |
| | - sbd |
| | - sbl |
| | - sbp |
| | - sch |
| | - sck |
| | - sda |
| | - sea |
| | - seh |
| | - ses |
| | - sey |
| | - sgb |
| | - sgj |
| | - sgw |
| | - shi |
| | - shk |
| | - shn |
| | - sho |
| | - shp |
| | - sid |
| | - sig |
| | - sil |
| | - sja |
| | - sjm |
| | - sld |
| | - slk |
| | - slu |
| | - slv |
| | - sml |
| | - smo |
| | - sna |
| | - snd |
| | - sne |
| | - snn |
| | - snp |
| | - snw |
| | - som |
| | - soy |
| | - spa |
| | - spp |
| | - spy |
| | - sqi |
| | - sri |
| | - srm |
| | - srn |
| | - srp-script_cyrillic |
| | - srp-script_latin |
| | - srx |
| | - stn |
| | - stp |
| | - suc |
| | - suk |
| | - sun |
| | - sur |
| | - sus |
| | - suv |
| | - suz |
| | - swe |
| | - swh |
| | - sxb |
| | - sxn |
| | - sya |
| | - syl |
| | - sza |
| | - tac |
| | - taj |
| | - tam |
| | - tao |
| | - tap |
| | - taq |
| | - tat |
| | - tav |
| | - tbc |
| | - tbg |
| | - tbk |
| | - tbl |
| | - tby |
| | - tbz |
| | - tca |
| | - tcc |
| | - tcs |
| | - tcz |
| | - tdj |
| | - ted |
| | - tee |
| | - tel |
| | - tem |
| | - teo |
| | - ter |
| | - tes |
| | - tew |
| | - tex |
| | - tfr |
| | - tgj |
| | - tgk |
| | - tgl |
| | - tgo |
| | - tgp |
| | - tha |
| | - thk |
| | - thl |
| | - tih |
| | - tik |
| | - tir |
| | - tkr |
| | - tlb |
| | - tlj |
| | - tly |
| | - tmc |
| | - tmf |
| | - tna |
| | - tng |
| | - tnk |
| | - tnn |
| | - tnp |
| | - tnr |
| | - tnt |
| | - tob |
| | - toc |
| | - toh |
| | - tom |
| | - tos |
| | - tpi |
| | - tpm |
| | - tpp |
| | - tpt |
| | - trc |
| | - tri |
| | - trn |
| | - trs |
| | - tso |
| | - tsz |
| | - ttc |
| | - tte |
| | - ttq-script_tifinagh |
| | - tue |
| | - tuf |
| | - tuk-script_arabic |
| | - tuk-script_latin |
| | - tuo |
| | - tur |
| | - tvw |
| | - twb |
| | - twe |
| | - twu |
| | - txa |
| | - txq |
| | - txu |
| | - tye |
| | - tzh-dialect_bachajon |
| | - tzh-dialect_tenejapa |
| | - tzj-dialect_eastern |
| | - tzj-dialect_western |
| | - tzo-dialect_chamula |
| | - tzo-dialect_chenalho |
| | - ubl |
| | - ubu |
| | - udm |
| | - udu |
| | - uig-script_arabic |
| | - uig-script_cyrillic |
| | - ukr |
| | - umb |
| | - unr |
| | - upv |
| | - ura |
| | - urb |
| | - urd-script_arabic |
| | - urd-script_devanagari |
| | - urd-script_latin |
| | - urk |
| | - urt |
| | - ury |
| | - usp |
| | - uzb-script_cyrillic |
| | - uzb-script_latin |
| | - vag |
| | - vid |
| | - vie |
| | - vif |
| | - vmw |
| | - vmy |
| | - vot |
| | - vun |
| | - vut |
| | - wal-script_ethiopic |
| | - wal-script_latin |
| | - wap |
| | - war |
| | - waw |
| | - way |
| | - wba |
| | - wlo |
| | - wlx |
| | - wmw |
| | - wob |
| | - wol |
| | - wsg |
| | - wwa |
| | - xal |
| | - xdy |
| | - xed |
| | - xer |
| | - xho |
| | - xmm |
| | - xnj |
| | - xnr |
| | - xog |
| | - xon |
| | - xrb |
| | - xsb |
| | - xsm |
| | - xsr |
| | - xsu |
| | - xta |
| | - xtd |
| | - xte |
| | - xtm |
| | - xtn |
| | - xua |
| | - xuo |
| | - yaa |
| | - yad |
| | - yal |
| | - yam |
| | - yao |
| | - yas |
| | - yat |
| | - yaz |
| | - yba |
| | - ybb |
| | - ycl |
| | - ycn |
| | - yea |
| | - yka |
| | - yli |
| | - yor |
| | - yre |
| | - yua |
| | - yue-script_traditional |
| | - yuz |
| | - yva |
| | - zaa |
| | - zab |
| | - zac |
| | - zad |
| | - zae |
| | - zai |
| | - zam |
| | - zao |
| | - zaq |
| | - zar |
| | - zas |
| | - zav |
| | - zaw |
| | - zca |
| | - zga |
| | - zim |
| | - ziw |
| | - zlm |
| | - zmz |
| | - zne |
| | - zos |
| | - zpc |
| | - zpg |
| | - zpi |
| | - zpl |
| | - zpm |
| | - zpo |
| | - zpt |
| | - zpu |
| | - zpz |
| | - ztq |
| | - zty |
| | - zul |
| | - zyb |
| | - zyp |
| | - zza |
| | |
| | </details> |
| | |
| | ## Model details |
| | |
| | - **Developed by:** Vineel Pratap et al. |
| | - **Model type:** Multi-Lingual Automatic Speech Recognition model |
| | - **Language(s):** 1000+ languages, see [supported languages](#supported-languages) |
| | - **License:** CC-BY-NC 4.0 license |
| | - **Num parameters**: 1 billion |
| | - **Audio sampling rate**: 16,000 Hz (16 kHz) |
| | - **Cite as:** |
| | |
| |       @article{pratap2023mms, |
| |         title={Scaling Speech Technology to 1,000+ Languages}, |
| |         author={Vineel Pratap and Andros Tjandra and Bowen Shi and Paden Tomasello and Arun Babu and Sayani Kundu and Ali Elkahky and Zhaoheng Ni and Apoorv Vyas and Maryam Fazel-Zarandi and Alexei Baevski and Yossi Adi and Xiaohui Zhang and Wei-Ning Hsu and Alexis Conneau and Michael Auli}, |
| |         journal={arXiv}, |
| |         year={2023} |
| |       } |
| | |
| | ## Additional Links |
| | |
| | - [Blog post](https://ai.facebook.com/blog/multilingual-model-speech-recognition/) |
| | - [Transformers documentation](https://huggingface.co/docs/transformers/main/en/model_doc/mms) |
| | - [Paper](https://arxiv.org/abs/2305.13516) |
| | - [GitHub Repository](https://github.com/facebookresearch/fairseq/tree/main/examples/mms#asr) |
| | - [Other **MMS** checkpoints](https://huggingface.co/models?other=mms) |
| | - MMS base checkpoints: |
| |   - [facebook/mms-1b](https://huggingface.co/facebook/mms-1b) |
| |   - [facebook/mms-300m](https://huggingface.co/facebook/mms-300m) |
| | - [Official Space](https://huggingface.co/spaces/facebook/MMS) |