| title: README | |
| emoji: 💠 | |
| colorFrom: yellow | |
| colorTo: indigo | |
| sdk: streamlit | |
| pinned: false | |
| Welcome to our space! 🎊 | |
| The [Unstructured.io](https://www.unstructured.io/) team provides libraries with open-source components for pre-processing text documents | |
| such as **PDFs**, **HTML**, and **Word** documents. These components are packaged as *bricks* 🧱, which give | |
| users the building blocks they need to build pipelines targeted at the documents they care | |
| about. Bricks in the library fall into three categories: | |
| - 🧩 ***Partitioning bricks*** that break raw documents down into standard, structured elements. | |
| - 🧹 ***Cleaning bricks*** that remove unwanted text from documents, such as boilerplate and sentence fragments. | |
| - 🎭 ***Staging bricks*** that format data for downstream tasks, such as ML inference and data labeling. | |
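
The three brick categories can be illustrated with a small, dependency-free sketch. It does not use the `unstructured` library's API: the `Element` class and the `partition`, `clean`, and `stage` functions below are hypothetical stand-ins, and the title heuristic is a toy assumption chosen only to show how the stages chain together.

```python
import re
from dataclasses import dataclass, asdict
from typing import Dict, List


@dataclass
class Element:
    """Hypothetical structured element: a category label plus its text."""
    category: str
    text: str


def partition(raw: str) -> List[Element]:
    """Partitioning: split raw text on blank lines into typed elements."""
    elements = []
    for block in re.split(r"\n\s*\n", raw.strip()):
        stripped = block.strip()
        if not stripped:
            continue
        # Toy heuristic (an assumption): a short block with no closing
        # punctuation is treated as a title, everything else as narrative.
        is_title = len(stripped.split()) <= 6 and not stripped.endswith((".", "!", "?"))
        elements.append(Element("Title" if is_title else "NarrativeText", block))
    return elements


def clean(element: Element) -> Element:
    """Cleaning: collapse runs of whitespace, including stray line breaks."""
    return Element(element.category, re.sub(r"\s+", " ", element.text).strip())


def stage(elements: List[Element]) -> List[Dict[str, str]]:
    """Staging: convert elements to plain dicts for a downstream task."""
    return [asdict(e) for e in elements]


raw = "Quarterly Report\n\nRevenue   grew in\nthe third quarter.\n\nCosts were flat."
records = stage([clean(e) for e in partition(raw)])
for record in records:
    print(record)
```

Keeping each stage as a separate function mirrors the brick idea: any one step can be swapped out or skipped without touching the others.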
| In this space we explore different settings of deep-learning models fine-tuned on several datasets, each containing a | |
| specific document type and its corresponding annotations. | |
| Main GitHub repository: [Unstructured-IO/unstructured](https://github.com/Unstructured-IO/unstructured) | |