| --- |
| title: NLP App |
| emoji: ⚡ |
| colorFrom: indigo |
| colorTo: indigo |
| sdk: streamlit |
| sdk_version: 1.31.0 |
| app_file: app.py |
| pinned: false |
| --- |
| |
| Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference |
|
|
| ## NLP App |
| # Streamlit app for natural language processing 💡 |
| Elbrus Bootcamp | Phase-2 | Team Project |
|
|
| ## Team 🧑🏻‍💻 |
| 1. [Awlly](https://github.com/Awlly) |
| 2. [sakoser](https://github.com/sakoser) |
| 3. [whoisida](https://github.com/whoisida) |
|
|
| ## Task 📌 |
| Create three services: one that classifies movie reviews as good, neutral, or bad; one that classifies user input as toxic or non-toxic; and a GPT-2-based text generation service trained to emulate a particular author's writing. |
|
|
| ## Contents 📝 |
| 1. Movie review classification using LSTM, ruBert, and BOW models 💨 [Dataset](https://drive.google.com/file/d/1c92sz81bEfOw-rutglKpmKGm6rySmYbt/view?usp=sharing) |
| 2. Classification of user input as toxic or non-toxic using ruBert-tiny-toxicity 📑 [Dataset](https://drive.google.com/file/d/1O7orH9CrNEhnbnA5KjXji8sgrn6iD5n-/view?usp=drive_link) |
| 3. GPT-2-based text generation service |
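
As a rough illustration of the bag-of-words (BOW) idea named in item 1, the sketch below builds a vocabulary from a few toy reviews and turns each review into a vector of word counts. It is a minimal, standard-library-only sketch and not the team's implementation: the function names `tokenize`, `build_vocab`, and `vectorize` and the sample reviews are hypothetical. In a real pipeline, vectors like these would typically be passed to a classifier.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (Latin or Cyrillic)."""
    return re.findall(r"[a-zа-яё0-9]+", text.lower())


def build_vocab(texts):
    """Map every distinct token in the corpus to a column index (sorted for stability)."""
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def vectorize(text, vocab):
    """Return the bag-of-words count vector of `text`; unknown words are ignored."""
    counts = Counter(tokenize(text))
    return [counts.get(tok, 0) for tok in vocab]


# Hypothetical toy reviews, for illustration only.
reviews = ["Great movie, great cast", "Boring movie"]
vocab = build_vocab(reviews)
vectors = [vectorize(review, vocab) for review in reviews]
```

Word order is discarded by design: each review is reduced to token counts over a fixed vocabulary, which is what makes BOW a simple baseline next to the LSTM and ruBert models.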
|
|
| ## Deployment 🎈 |
| The service is deployed on [Hugging Face Spaces](https://huggingface.co/spaces/Awlly/NLP_app) |
|
|
| ## Libraries 📖 |
| ```python |
| # Standard library |
| import os |
| import re |
| import string |
| import unicodedata |
| from dataclasses import dataclass |
| from typing import Tuple |
| |
| # Third-party |
| import joblib |
| import matplotlib.pyplot as plt |
| import nltk |
| import numpy as np |
| import pydensecrf.densecrf as dcrf |
| import pydensecrf.utils as dcrf_utils |
| import streamlit as st |
| import torch |
| import torch.nn as nn |
| import torch.nn.functional as F |
| import torch.optim as optim |
| import torchutils as tu |
| from sklearn.linear_model import LogisticRegression |
| from torch.utils.data import DataLoader, Dataset, TensorDataset, random_split |
| from torchvision import datasets |
| from torchvision import transforms as T |
| from torchvision.datasets import ImageFolder |
| from torchvision.io import read_image |
| from tqdm import tqdm |
| from transformers import AutoModel, AutoModelForSequenceClassification, AutoTokenizer |
| from transformers import GPT2LMHeadModel, GPT2Tokenizer |
| |
| # Local modules |
| from preprocessing import data_preprocessing |
| from preprocessing import preprocess_single_string |
| ``` |
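
The `preprocessing` module imported above is local to the project and is not shown in this README. Purely as an assumption about what such a helper might do, here is a standard-library-only sketch of a text-cleaning function. The name `clean_text` and every step in it are hypothetical and do not describe the actual `data_preprocessing` or `preprocess_single_string`.

```python
import re
import string
import unicodedata


def clean_text(text):
    """Hypothetical cleaning step: normalize, lowercase, drop HTML-like tags and punctuation."""
    # Normalize Unicode so visually identical characters compare equal, then lowercase.
    text = unicodedata.normalize("NFKC", text).lower()
    # Replace HTML-like tags with a space.
    text = re.sub(r"<[^>]+>", " ", text)
    # Remove ASCII punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Collapse runs of whitespace and trim the ends.
    return re.sub(r"\s+", " ", text).strip()
```

For example, `clean_text("Hello, World!")` returns `"hello world"`.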
|
| ## Guide 📜 |
| #### How to run locally? |
| |
| 1. To create a Python virtual environment for running the code, enter: |
| |
| ``python3 -m venv my-env`` |
| |
| 2. Activate the new environment: |
| |
| * Windows: ``my-env\Scripts\activate.bat`` |
| * macOS and Linux: ``source my-env/bin/activate`` |
| |
| |