| --- |
| model-index: |
| - name: Sociovestix/lenu_PL |
| results: |
| - task: |
| type: text-classification |
| name: Text Classification |
| dataset: |
| name: lenu |
| type: Sociovestix/lenu |
| config: PL |
| split: test |
| revision: f4d57b8d77a49ec5c62d899c9a213d23cd9f9428 |
| metrics: |
| - type: f1 |
| value: 0.9930020993701889 |
| name: f1 |
| - type: f1 |
| value: 0.6630198501925706 |
| name: f1 macro |
| args: |
| average: macro |
| widget: |
| - text: "INSTYTUT DIABETOLOGII SPÓŁKA Z OGRANICZONĄ ODPOWIEDZIALNOŚCIĄ" |
| - text: '"METAL-SYSTEM" OGRODZENIA - SCHODY SŁAWOMIR BINKOWSKI' |
| - text: "GERLACH S.A." |
| - text: "EMU SPÓŁKA Z OGRANICZONĄ ODPOWIEDZIALNOŚCIĄ SPÓŁKA KOMANDYTOWA" |
| - text: "JEREMIE SEED CAPITAL WOJEWÓDZTWA POMORSKIEGO FUNDUSZ INWESTYCYJNY ZAMKNIĘTY W LIKWIDACJI" |
| - text: "MIASTO BIELSKO-BIAŁA" |
| - text: 'MARKETING" KRYSTIAN GDOWKA, ARTUR OSTRĘGA SPÓŁKA JAWNA' |
| - text: "Bank Spółdzielczy w Poddębicach" |
| - text: 'Fundacja Dzieciom "POMAGAJ"' |
| - text: "KANCELARIA RADCÓW PRAWNYCH BRUDKIEWICZ, SUCHECKA SPÓŁKA KOMANDYTOWO-AKCYJNA" |
| - text: "AKADEMIA MARYNARKI WOJENNEJ IM. BOHATERÓW WESTERPLATTE" |
| - text: "ZGROMADZENIE SIÓSTR URSZULANEK UNII RZYMSKIEJ DOM ZAKONNY" |
| - text: "STOWARZYSZENIE AUTORÓW ZAIKS" |
| - text: "SKAT TRANSPORT PROSTA SPÓŁKA AKCYJNA" |
| - text: "Nationale-Nederlanden Dobrowolny Fundusz Emerytalny Nasze Jutro 2055" |
| - text: "STORY HOUSE EGMONT SPÓŁKA Z OGRANICZONĄ ODPOWIEDZIALNOŚCIĄ" |
| - text: "Narodowy Fundusz Ochrony Środowiska i Gospodarki Wodnej" |
| - text: 'ORGANIZACJA ZAKŁADOWA NSZZ "SOLIDARNOŚĆ" NR 3395 W T-MOBILE POLSKA S.A.' |
| - text: "CI GAMES SPÓŁKA EUROPEJSKA" |
| - text: "PPK Pocztylion 2040 Dobrowolny Fundusz Emerytalny" |
| - text: "TOWARZYSTWO UBEZPIECZEŃ WZAJEMNYCH POLSKI ZAKŁAD UBEZPIECZEŃ WZAJEMNYCH" |
| - text: "KABANEK JANINA POTORSKA ROBERT POTORSKI" |
| - text: "SPÓŁDZIELCZA KASA OSZCZĘDNOŚCIOWO-KREDYTOWA ENERGIA" |
| - text: "SZOSTEK_BAR I PARTNERZY KANCELARIA PRAWNA" |
| - text: "MIEJSKI ZARZĄD BUDYNKÓW MIESZKALNYCH" |
| - text: "IZBA ADWOKACKA W KATOWICACH" |
| - text: '1. Niepubliczny Specjalistyczny Zakład Opieki Zdrowotnej "LUNG" Krzysztof Garbino 2. Drukarnia "GARBINO"' |
| --- |
| |
| # LENU - Legal Entity Name Understanding for Poland |
|
|
| A Polish BERT (uncased) model fine-tuned on Polish legal entity names (jurisdiction PL) from the Global [Legal Entity Identifier](https://www.gleif.org/en/about-lei/introducing-the-legal-entity-identifier-lei)
| (LEI) System to detect [Entity Legal Form (ELF) Codes](https://www.gleif.org/en/about-lei/code-lists/iso-20275-entity-legal-forms-code-list).
|
|
| --------------- |
|
|
| <h1 align="center"> |
| <a href="https://gleif.org"> |
| <img src="https://www.gleif.org/assets/build/img/logo/gleif-logo-new.svg" width="220px" style="display: inherit"> |
| </a> |
| </h1><br> |
| <h3 align="center">in collaboration with</h3> |
| <h1 align="center"> |
| <a href="https://sociovestix.com"> |
| <img src="https://www.sociovestix.com/img/svl_logo.png" width="450px" style="width: 75%"> |
| </a> |
| </h1><br> |
|
|
| --------------- |
|
|
| ## Model Description |
|
|
|
|
| The model was created in a collaboration between the [Global Legal Entity Identifier Foundation](https://gleif.org) (GLEIF) and
| [Sociovestix Labs](https://sociovestix.com) to explore how machine learning can help detect the ELF Code based solely on an entity's legal name and legal jurisdiction.
| See also the open-source Python library [lenu](https://github.com/Sociovestix/lenu), which supports this task.
|
|
| The model was trained on the [lenu](https://huggingface.co/datasets/Sociovestix) dataset, with a focus on Polish legal entities and ELF Codes within the jurisdiction "PL".
|
|
| - **Developed by:** [GLEIF](https://gleif.org) and [Sociovestix Labs](https://huggingface.co/Sociovestix) |
| - **License:** Creative Commons (CC0) license |
| - **Finetuned from model:** dkleczek/bert-base-polish-uncased-v1
| - **Resources for more information:** [Press Release](https://www.gleif.org/en/newsroom/press-releases/machine-learning-new-open-source-tool-developed-by-gleif-and-sociovestix-labs-enables-organizations-everywhere-to-automatically-) |
|
|
| ## Uses
|
|
|
|
| An entity's legal form is a crucial component when verifying and screening organizational identity. |
| The wide variety of entity legal forms that exist within and between jurisdictions, however, has made it difficult for large organizations to capture legal form as structured data. |
| The jurisdiction-specific models of [lenu](https://github.com/Sociovestix/lenu), trained on entities from
| GLEIF’s Legal Entity Identifier (LEI) database of over two million records, will allow banks, |
| investment firms, corporations, governments, and other large organizations to retrospectively analyze |
| their master data, extract the legal form from the unstructured text of the legal name and |
| uniformly apply an ELF code to each entity type, according to the ISO 20275 standard. |
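
As a minimal sketch of that workflow, the snippet below sends legal names through a Hugging Face `transformers` text-classification pipeline and collects one suggestion per name. It assumes the model is published under the ID `Sociovestix/lenu_PL` (the name given in this card's metadata) and that the predicted labels are ELF codes. The helper names `load_classifier` and `suggest_elf_codes` are hypothetical and are not part of the lenu library.

```python
from typing import Callable, Dict, List, Sequence

# Model ID as given in this card's metadata.
MODEL_ID = "Sociovestix/lenu_PL"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (downloads the weights on first use)."""
    # Imported lazily so the rest of the module works without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def suggest_elf_codes(
    names: Sequence[str],
    classifier: Callable[[List[str]], List[Dict]],
) -> List[Dict]:
    """Return one suggestion per legal name: the name, top label, and its score.

    Assumption: each label produced by the classifier is an ELF code.
    """
    results = classifier(list(names))
    return [
        {"name": name, "elf_code": result["label"], "score": float(result["score"])}
        for name, result in zip(names, results)
    ]
```

A typical call would be `suggest_elf_codes(["GERLACH S.A."], load_classifier())`. Passing the classifier in as an argument keeps the helper easy to test with a stub and lets the pipeline be built once and reused.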
|
|
|
|
| ## Licensing Information
|
|
| This model, which is trained on LEI data, is available under a Creative Commons (CC0) license.
| See [gleif.org/en/about/open-data](https://gleif.org/en/about/open-data). |
|
|
| ## Recommendations
|
|
| Users should always consider the score returned with each suggested ELF Code. When the score is low, the affected entities may need manual review.
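
One way to act on this recommendation is to split predictions into an accepted set and a manual-review set. The sketch below assumes each prediction is a dict with `label` and `score` keys, the shape returned by a `transformers` text-classification pipeline. The cut-off `REVIEW_THRESHOLD` and the function `split_by_score` are hypothetical: the value 0.9 is a placeholder, not a figure recommended by the model authors, and should be tuned on your own data.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical cut-off; choose a value that fits your own review capacity and data.
REVIEW_THRESHOLD = 0.9


def split_by_score(
    predictions: Iterable[Dict],
    threshold: float = REVIEW_THRESHOLD,
) -> Tuple[List[Dict], List[Dict]]:
    """Split predictions into (accepted, needs_review) by their score."""
    accepted: List[Dict] = []
    needs_review: List[Dict] = []
    for prediction in predictions:
        if prediction["score"] >= threshold:
            accepted.append(prediction)
        else:
            # Low-confidence suggestions are queued for a human to check.
            needs_review.append(prediction)
    return accepted, needs_review
```

Predictions at or above the threshold are accepted as-is; everything else lands in the review list in its original order.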
|
|