---
license: mit
language:
- en
tags:
- lstm
- text-segmentation
- lightweight
- client-side
- web
- onnxruntime-web
- speech-to-text
- low-memory-footprint
---

This [NPM package](https://github.com/orgs/the-vedantic-coder/packages/npm/stst/580080323) (built for the speech-to-text use case) implements the setup and inference for this model. It provides a [React app demo](https://sentence-splitter-poc.vercel.app/) and a `processDirectText` method for trying direct inference on text.

The sentence splitter model is a modification of the LSTM model with around 500 B input size taken from the [NNSplit](https://github.com/kornelski/nnsplit) repository.

The size of the model used here is **~4 MB**.

|                              | NNSplit  | Spacy (Tagger) | Spacy (Sentencizer) |
|------------------------------|----------|----------------|---------------------|
| Clean                        | 0.754371 | 0.853603       | 0.820934            |
| Partial punctuation          | 0.485907 | 0.517829       | 0.249753            |
| Partial case                 | 0.761754 | 0.825119       | 0.819679            |
| Partial punctuation and case | 0.443704 | 0.458619       | 0.249873            |
| No punctuation and case      | 0.166273 | 0.180859       | 0.00463281          |

### Example

No punctuation and no case (~17% accuracy)

**Input:**

```text
the difference between rest and graphql is explained as follows rest is an architectural style that exposes resources via endpoints typically following crud operations each endpoint returns a fixed data structure graphql on the other hand allows clients to specify exactly what data they need in a single query often reducing overfetching and underfetching issues
```

**Result: 28.90ms ✅**

![image](https://cdn-uploads.huggingface.co/production/uploads/687852a7f13fbe6c3d4c9974/LrASgi-BmWROVZIUK-36-.png)
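
The input above is lowercase text with all punctuation removed, which corresponds to the "No punctuation and case" row of the table. To try the same condition on your own text, a minimal sketch of such a normalization step could look like the following. The helper name `stripPunctuationAndCase` is hypothetical and not part of the NPM package or NNSplit, and the ASCII-only character class is an assumption that fits English input only.

```typescript
// Hypothetical helper for illustration; not part of the NPM package or NNSplit.
// Assumes English (ASCII) input: non-ASCII letters would be removed as well.
function stripPunctuationAndCase(text: string): string {
  return text
    .toLowerCase()
    // Drop every character that is not a lowercase letter, digit, or whitespace.
    .replace(/[^a-z0-9\s]/g, "")
    // Collapse whitespace runs left behind by removed punctuation.
    .replace(/\s+/g, " ")
    .trim();
}

const sample =
  "REST is an architectural style. GraphQL, on the other hand, lets clients choose.";

console.log(stripPunctuationAndCase(sample));
// prints "rest is an architectural style graphql on the other hand lets clients choose"
```

The resulting string can then be given to the splitter in the same way as the example input above.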