Huzaifa367 committed
Commit d0ca212 · verified · 1 Parent(s): f8f0c53

Upload 8 files

Files changed (8)
  1. Dockerfile +61 -0
  2. README (1).md +11 -0
  3. README.Docker.md +65 -0
  4. compose.yaml +49 -0
  5. dockerignore +34 -0
  6. main.py +106 -0
  7. requirements.txt +8 -0
  8. test_main.py +70 -0
Dockerfile ADDED
@@ -0,0 +1,61 @@
+ # Comments are provided throughout this file to help you get started.
+ # If you need more help, visit the Dockerfile reference guide at
+ # https://docs.docker.com/go/dockerfile-reference/
+
+ # Want to help us make this template better? Share your feedback here: https://forms.gle/ybq9Krt8jtBL3iCk7
+
+ ARG PYTHON_VERSION=3.11.9
+ FROM python:${PYTHON_VERSION}-slim AS base
+
+ # Prevents Python from writing .pyc files.
+ ENV PYTHONDONTWRITEBYTECODE=1
+
+ # Keeps Python from buffering stdout and stderr to avoid situations where
+ # the application crashes without emitting any logs due to buffering.
+ ENV PYTHONUNBUFFERED=1
+
+ WORKDIR /app
+
+ # Create a non-privileged user that the app will run under.
+ # See https://docs.docker.com/go/dockerfile-user-best-practices/
+ ARG UID=10001
+ RUN adduser \
+     --disabled-password \
+     --gecos "" \
+     --home "/nonexistent" \
+     --shell "/sbin/nologin" \
+     --no-create-home \
+     --uid "${UID}" \
+     appuser
+
+ # Download dependencies as a separate step to take advantage of Docker's caching.
+ # Leverage a cache mount to /root/.cache/pip to speed up subsequent builds.
+ # Leverage a bind mount to requirements.txt to avoid having to copy it
+ # into this layer.
+ RUN --mount=type=cache,target=/root/.cache/pip \
+     --mount=type=bind,source=requirements.txt,target=requirements.txt \
+     python -m pip install -r requirements.txt
+
+ # Switch to the non-privileged user to run the application.
+ USER appuser
+
+ # Set the TRANSFORMERS_CACHE environment variable.
+ ENV TRANSFORMERS_CACHE=/tmp/.cache/huggingface
+
+ # Create the cache folder with appropriate permissions.
+ RUN mkdir -p $TRANSFORMERS_CACHE && chmod -R 777 $TRANSFORMERS_CACHE
+
+ # Set the NLTK data directory.
+ ENV NLTK_DATA=/tmp/nltk_data
+
+ # Create the NLTK data directory with appropriate permissions.
+ RUN mkdir -p $NLTK_DATA && chmod -R 777 $NLTK_DATA
+
+ # Copy the source code into the container.
+ COPY . .
+
+ # Expose the port that the application listens on.
+ EXPOSE 7860
+
+ # Run the application.
+ CMD uvicorn main:app --host 0.0.0.0 --port 7860
README (1).md ADDED
@@ -0,0 +1,11 @@
+ ---
+ title: Text Classification API
+ emoji: 🐢
+ colorFrom: red
+ colorTo: gray
+ sdk: docker
+ pinned: false
+ license: apache-2.0
+ ---
+
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
README.Docker.md ADDED
@@ -0,0 +1,65 @@
+ # Text Classification API
+ **FastAPI, Docker, and Hugging Face Transformers**\
+ This API provides text classification capabilities using a pre-trained model for sentiment analysis. It allows users to analyze the sentiment of text inputs and obtain the corresponding sentiment labels.
+ - The API is built using the Hugging Face `transformers` library.
+ - It uses the following pre-trained transformer model from Hugging Face:
+   - `cardiffnlp/twitter-roberta-base-sentiment-latest`
+ - It classifies text as `positive`, `negative`, or `neutral`.
+
+ ## Table of Contents
+ - [Text Classification API](#text-classification-api)
+   - [Table of Contents](#table-of-contents)
+   - [Introduction](#introduction)
+   - [Installation](#installation)
+   - [Usage](#usage)
+   - [Documentation](#documentation)
+   - [Building and Running the Docker Container](#building-and-running-the-docker-container)
+   - [Interacting with the API](#interacting-with-the-api)
+   - [Acknowledgments](#acknowledgments)
+   - [License](#license)
+
+ ## Introduction
+ This API is built using FastAPI and leverages a pre-trained sentiment analysis model from the Hugging Face model hub. It preprocesses the input text and passes it through the model to classify the sentiment as positive, negative, or neutral.
+
+ ## Installation
+ To install and run the API locally, follow these steps:
+
+ 1. Clone this repository to your local machine.
+ 2. Ensure you have Docker installed.
+ 3. Change the port to 8000 in the Dockerfile.
+ 4. Build the Docker container using the provided Dockerfile.
+ 5. Run the Docker container.
+
+ ## Usage
+ To use the API, send HTTP requests to the appropriate endpoints. The API provides the following endpoints:
+
+ - `GET /`: Welcome endpoint; returns a greeting message.
+ - `POST /analyze/{text}`: Analyze endpoint; classifies the sentiment of the provided text.
+
+ ## Documentation
+ The API is documented using FastAPI's automatic documentation features. You can access the documentation through the Swagger UI or ReDoc interface; simply navigate to the appropriate URL after starting the API server.
+
+ - **Swagger UI**: `http://localhost:8000/docs`
+ - **ReDoc**: `http://localhost:8000/redoc`
+
+ ## Building and Running the Docker Container
+ To build and run the Docker container, follow these steps:
+ 1. Navigate to the folder in which your FastAPI app resides.
+ 2. Build a Docker image using the following command:
+ ```
+ docker build -t text-classification-api .
+ ```
+ 3. Create a Docker container from the built image:
+ ```
+ docker run -d -p 8000:8000 text-classification-api
+ ```
+ 4. The API will be available at `http://localhost:8000`.
+ 5. The API documentation will be available at `http://localhost:8000/docs` or `http://localhost:8000/redoc`.
+
+ ## Interacting with the API
+ Once the API is running, you can interact with it using HTTP requests.
+
+ ## Acknowledgments
+ This API was built with inspiration from various open-source projects and libraries. Special thanks to the developers and contributors of FastAPI, Hugging Face Transformers, and NLTK.
+
+ ## License
+ This project is licensed under the [Apache license version 2.0](LICENSE).
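Note that the analyze endpoint expects its input as a JSON body matching the app's `TextInput` model, even though the route path literally contains `{text}`. A minimal client sketch using only the standard library; the helper name, base URL, and sample text are illustrative, not part of this repository:

```python
import json
import urllib.request

def build_analyze_request(base_url: str, text: str) -> urllib.request.Request:
    """Build a POST request for the /analyze/{text} endpoint.

    The route keeps the literal "{text}" path segment; the actual input
    travels in the JSON body, matching the TextInput model.
    """
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/analyze/{{text}}",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_analyze_request("http://localhost:8000", "I love this product!")
print(req.get_method())  # POST
print(req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` (or any HTTP client) returns the pipeline's JSON output once the server is running.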
compose.yaml ADDED
@@ -0,0 +1,49 @@
+ # Comments are provided throughout this file to help you get started.
+ # If you need more help, visit the Docker Compose reference guide at
+ # https://docs.docker.com/go/compose-spec-reference/
+
+ # Here the instructions define your application as a service called "server".
+ # This service is built from the Dockerfile in the current directory.
+ # You can add other services your application may depend on here, such as a
+ # database or a cache. For examples, see the Awesome Compose repository:
+ # https://github.com/docker/awesome-compose
+ services:
+   server:
+     build:
+       context: .
+     ports:
+       - 8000:8000
+
+ # The commented out section below is an example of how to define a PostgreSQL
+ # database that your application can use. `depends_on` tells Docker Compose to
+ # start the database before your application. The `db-data` volume persists the
+ # database data between container restarts. The `db-password` secret is used
+ # to set the database password. You must create `db/password.txt` and add
+ # a password of your choosing to it before running `docker compose up`.
+ #     depends_on:
+ #       db:
+ #         condition: service_healthy
+ #   db:
+ #     image: postgres
+ #     restart: always
+ #     user: postgres
+ #     secrets:
+ #       - db-password
+ #     volumes:
+ #       - db-data:/var/lib/postgresql/data
+ #     environment:
+ #       - POSTGRES_DB=example
+ #       - POSTGRES_PASSWORD_FILE=/run/secrets/db-password
+ #     expose:
+ #       - 5432
+ #     healthcheck:
+ #       test: [ "CMD", "pg_isready" ]
+ #       interval: 10s
+ #       timeout: 5s
+ #       retries: 5
+ # volumes:
+ #   db-data:
+ # secrets:
+ #   db-password:
+ #     file: db/password.txt
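As committed, the compose file publishes host port 8000 to container port 8000, while the Dockerfile's CMD starts uvicorn on 7860 (the README's installation steps have you change that to 8000). If you instead keep the Dockerfile as-is, a ports mapping along these lines, shown here as an illustrative tweak rather than part of this commit, would connect the two:

```yaml
services:
  server:
    build:
      context: .
    ports:
      # host:container; the container side must match uvicorn's --port
      - 8000:7860
```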
dockerignore ADDED
@@ -0,0 +1,34 @@
+ # Include any files or directories that you don't want to be copied to your
+ # container here (e.g., local build artifacts, temporary files, etc.).
+ #
+ # For more help, visit the .dockerignore file reference guide at
+ # https://docs.docker.com/go/build-context-dockerignore/
+
+ **/.DS_Store
+ **/__pycache__
+ **/.venv
+ **/.classpath
+ **/.dockerignore
+ **/.env
+ **/.git
+ **/.gitignore
+ **/.project
+ **/.settings
+ **/.toolstarget
+ **/.vs
+ **/.vscode
+ **/*.*proj.user
+ **/*.dbmdl
+ **/*.jfm
+ **/bin
+ **/charts
+ **/docker-compose*
+ **/compose*
+ **/Dockerfile*
+ **/node_modules
+ **/npm-debug.log
+ **/obj
+ **/secrets.dev.yaml
+ **/values.dev.yaml
+ LICENSE
+ README.md
main.py ADDED
@@ -0,0 +1,106 @@
+ from contextlib import asynccontextmanager
+ from fastapi import FastAPI, HTTPException
+ from pydantic import BaseModel, ValidationError
+ from fastapi.encoders import jsonable_encoder
+
+ # TEXT PREPROCESSING
+ # --------------------------------------------------------------------
+ import re
+ import string
+ import nltk
+ nltk.download('punkt')
+ nltk.download('wordnet')
+ nltk.download('omw-1.4')
+ from nltk.stem import WordNetLemmatizer
+
+ # Function to remove URLs from text
+ def remove_urls(text):
+     return re.sub(r'http[s]?://\S+', '', text)
+
+ # Function to remove punctuation from text
+ def remove_punctuation(text):
+     regular_punct = string.punctuation
+     return str(re.sub(r'['+regular_punct+']', '', str(text)))
+
+ # Function to convert the text into lower case
+ def lower_case(text):
+     return text.lower()
+
+ # Function to lemmatize text
+ def lemmatize(text):
+     wordnet_lemmatizer = WordNetLemmatizer()
+     tokens = nltk.word_tokenize(text)
+     lemma_txt = ''
+     for w in tokens:
+         lemma_txt = lemma_txt + wordnet_lemmatizer.lemmatize(w) + ' '
+     return lemma_txt
+
+ def preprocess_text(text):
+     # Preprocess the input text
+     text = remove_urls(text)
+     text = remove_punctuation(text)
+     text = lower_case(text)
+     text = lemmatize(text)
+     return text
+
+ # Load the model using the FastAPI lifespan event so that the model is loaded once at startup for efficiency
+ @asynccontextmanager
+ async def lifespan(app: FastAPI):
+     # Load the model from the Hugging Face transformers library
+     from transformers import pipeline
+     global sentiment_task
+     sentiment_task = pipeline("sentiment-analysis", model="cardiffnlp/twitter-roberta-base-sentiment-latest", tokenizer="cardiffnlp/twitter-roberta-base-sentiment-latest")
+     yield
+     # Clean up the model and release the resources
+     del sentiment_task
+
+ description = """
+ ## Text Classification API
+ This app shows the sentiment of the text (positive, negative, or neutral).
+ Check out the docs for the `/analyze/{text}` endpoint below to try it out!
+ """
+
+ # Initialize the FastAPI app
+ app = FastAPI(lifespan=lifespan, docs_url="/", description=description)
+
+ # Define the input data model
+ class TextInput(BaseModel):
+     text: str
+
+ # Define the welcome endpoint
+ @app.get('/')
+ async def welcome():
+     return "Welcome to our Text Classification API"
+
+ # Maximum allowed input text length
+ MAX_TEXT_LENGTH = 1000
+
+ # Define the sentiment analysis endpoint
+ @app.post('/analyze/{text}')
+ async def classify_text(text_input: TextInput):
+     try:
+         # Convert input data to a JSON-serializable dictionary
+         text_input_dict = jsonable_encoder(text_input)
+         # Validate input data using the Pydantic model
+         text_data = TextInput(**text_input_dict)  # Convert to Pydantic model
+
+         # Validate input text length
+         if len(text_input.text) > MAX_TEXT_LENGTH:
+             raise HTTPException(status_code=400, detail="Text length exceeds maximum allowed length")
+         elif len(text_input.text) == 0:
+             raise HTTPException(status_code=400, detail="Text cannot be empty")
+     except ValidationError as e:
+         # Handle validation errors
+         raise HTTPException(status_code=422, detail=str(e))
+
+     try:
+         # Perform text classification
+         return sentiment_task(preprocess_text(text_input.text))
+     except ValueError as ve:
+         # Handle value errors
+         raise HTTPException(status_code=400, detail=str(ve))
+     except Exception as e:
+         # Handle other server errors
+         raise HTTPException(status_code=500, detail=str(e))
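The preprocessing chain in main.py can be exercised in isolation. Below is a stripped-down sketch of the same regex-based cleanup steps (URL removal, punctuation removal, lower-casing) that runs without NLTK; the lemmatization step is deliberately omitted because it needs the downloaded WordNet data, and the function name and sample sentence are illustrative:

```python
import re
import string

def preprocess_text_lite(text: str) -> str:
    """Mirror main.py's cleanup steps: strip URLs, punctuation, lower-case.

    main.py additionally runs WordNet lemmatization via NLTK, which is
    skipped here to keep the sketch dependency-free.
    """
    text = re.sub(r'http[s]?://\S+', '', text)                # remove URLs
    text = re.sub(r'[' + string.punctuation + ']', '', text)  # remove punctuation
    return text.lower()                                       # lower-case

print(preprocess_text_lite("I LOVE this product! See https://example.com"))
```

Note the punctuation regex works only because `string.punctuation` happens to contain the sequence `\]`, which escapes the closing bracket inside the character class; `re.escape(string.punctuation)` would be the more defensive spelling.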
requirements.txt ADDED
@@ -0,0 +1,8 @@
+ fastapi==0.110.1
+ nest_asyncio==1.6.0
+ nltk==3.8.1
+ pydantic==2.7.0
+ transformers==4.38.2
+ uvicorn==0.29.0
+ torch==2.2.2
+ pytest
test_main.py ADDED
@@ -0,0 +1,70 @@
+ from fastapi.testclient import TestClient
+ from main import app
+ from main import TextInput
+ from fastapi.encoders import jsonable_encoder
+
+ client = TestClient(app)
+
+ # Test the welcome endpoint
+ def test_welcome():
+     response = client.get("/")
+     assert response.status_code == 200
+     assert response.json() == "Welcome to our Text Classification API"
+
+ # Test the sentiment analysis endpoint for positive sentiment
+ def test_positive_sentiment():
+     with client:
+         # Initialize the payload as a TextInput object
+         payload = TextInput(text="I love this product! It's amazing!")
+
+         # Convert the TextInput object to a JSON-serializable dictionary
+         payload_dict = jsonable_encoder(payload)
+
+         # Send a POST request to the sentiment analysis endpoint
+         response = client.post("/analyze/{text}", json=payload_dict)
+
+         # Assert that the response status code is 200 OK
+         assert response.status_code == 200
+
+         # Assert that the sentiment returned is positive
+         assert response.json()[0]['label'] == "positive"
+
+ # Test the sentiment analysis endpoint for negative sentiment
+ def test_negative_sentiment():
+     with client:
+         # Initialize the payload as a TextInput object
+         payload = TextInput(text="I'm really disappointed with this service. It's terrible.")
+
+         # Convert the TextInput object to a JSON-serializable dictionary
+         payload_dict = jsonable_encoder(payload)
+
+         # Send a POST request to the sentiment analysis endpoint
+         response = client.post("/analyze/{text}", json=payload_dict)
+
+         # Assert that the response status code is 200 OK
+         assert response.status_code == 200
+
+         # Assert that the sentiment returned is negative
+         assert response.json()[0]['label'] == "negative"
+
+ # Test the sentiment analysis endpoint for neutral sentiment
+ def test_neutral_sentiment():
+     with client:
+         # Initialize the payload as a TextInput object
+         payload = TextInput(text="This is a neutral statement.")
+
+         # Convert the TextInput object to a JSON-serializable dictionary
+         payload_dict = jsonable_encoder(payload)
+
+         # Send a POST request to the sentiment analysis endpoint
+         response = client.post("/analyze/{text}", json=payload_dict)
+
+         # Assert that the response status code is 200 OK
+         assert response.status_code == 200
+
+         # Assert that the sentiment returned is neutral
+         assert response.json()[0]['label'] == "neutral"
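The assertions above index into the endpoint's JSON, which mirrors what the `transformers` sentiment pipeline returns: a list with one `{'label': ..., 'score': ...}` dict per input. A small helper making that shape explicit; the function name is mine and the score value is made up for illustration:

```python
def top_label(pipeline_output):
    """Return the label of the first result, as the tests do with
    response.json()[0]['label']."""
    return pipeline_output[0]['label']

# Shape returned by the sentiment pipeline; the score here is illustrative.
sample = [{'label': 'positive', 'score': 0.99}]
print(top_label(sample))  # positive
```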