Space: Tech-di/WallTD-v.1

Feriel080 committed on Apr 6, 2025

Commit 700265c (verified) · 1 Parent(s): f8fc09b

Update README.md

README.md +118 -4

@@ -1,11 +1,125 @@
 ---
 title: WallTD V.1
-emoji: 📊
-colorFrom: green
-colorTo: yellow
 sdk: docker
 pinned: false
 license: afl-3.0
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
 title: WallTD V.1
+emoji: 💻
+colorFrom: purple
+colorTo: purple
 sdk: docker
 pinned: false
 license: afl-3.0
 ---
+# WallTD-v.1
+A FastAPI-based application for ***document summarization***, ***image interpretation***, ***data visualization***, and ***text translation*** using state-of-the-art machine learning models from Hugging Face.
+## Overview
+This project provides a web API that allows users to:
+* **Summarize** documents (DOCX, XLSX, PPTX, PDF, TXT) into concise, factual summaries.
+* **Interpret** images (PNG, JPG, JPEG, WEBP) by generating detailed descriptions.
+* **Generate Visualizations** from Excel data using AI-generated plotting code.
+* **Translate** text from documents into various languages.
+* **Answer Questions** about the content of documents and images.
+The application uses `BART` for summarization, `Kosmos-2` for image interpretation, `StarCoder` for code generation, and `M2M100` for translation, and it also includes a question-answering capability. All of these are powered by Hugging Face's `Inference` API and the `Transformers` library.
+## Features
+* **Document Summarization:** Extracts key points from large documents.
+* **Image Interpretation:** Describes image content, including any visible text.
+* **Data Visualization:** Generates Python plotting code for Excel data using `pandas`, `matplotlib`, and `seaborn`.
+* **Text Translation:** Translates document text into supported languages.
+* **Question Answering:** Answers user questions about document content or image details.
+* **File Management:** Uploads files, processes them, and provides downloadable results.
+## Requirements
+The app requires Python 3.9.11 (available from the [Python 3.9.11 release page](https://www.python.org/downloads/release/python-3911/)).
+All dependencies are listed in `requirements.txt`.
+## Installation
+1. **Clone the Repository:**
+   ```
+   git clone https://github.com/yourusername/docsumm-vision-api.git
+   cd docsumm-vision-api
+   ```
+2. **Install Dependencies:**
+   ```
+   pip install -r requirements.txt
+   ```
+3. **Set Environment Variables:**
+   * Create a `.env` file or set the `HF_TOKEN` environment variable to your Hugging Face API token:
+     * **On Linux:** `export HF_TOKEN="your-huggingface-api-token"`
+     * **On Windows (Command Prompt):** `set HF_TOKEN=your-huggingface-api-token`
+4. **Run the Application:**
+   `uvicorn main:app --reload`
+## Usage
+### Endpoints
+**1. Document Summarization & Image Interpretation (`/docsum_imginter`):**
+* **Method:** POST
+* **Form Data:**
+  * `file`: Upload a file (DOCX, XLSX, PPTX, PDF, TXT, PNG, JPG, JPEG, WEBP)
+  * `task`: `"summarize"` (for documents) or `"interpret"` (for images)
+* **Response:**
+  * For documents: A summarized file download
+  * For images: JSON with a `caption` field (e.g., `{"caption": "A tiger in a forest"}`)
+**2. Data Visualization (`/generate-visualization`):**
+* **Method:** POST
+* **Form Data:**
+  * `file`: Upload an Excel file (XLSX)
+  * `task`: Description of the desired plot (e.g., "A bar chart of sales by region")
+* **Response:** The generated Python code and a PNG image of the resulting plot.
+**3. Text Translation (`/translate`):**
+* **Method:** POST
+* **Form Data:**
+  * `file`: Upload a file (DOCX, XLSX, PPTX, PDF, TXT)
+  * `task`: Target language (e.g., French)
+* **Response:** A translated file download
+**4. Question Answering (`/ask`):**
+* **Method:** POST
+* **Form Data:**
+  * `file`: Upload a file (DOCX, XLSX, PPTX, PDF, TXT, PNG, JPG, JPEG, WEBP)
+  * `task`: A question about the file content
+* **Response:** JSON with an answer field
+**5. List Processed Files (`/processed_files`):**
+* **Method:** GET
+* **Response:** JSON list of processed file names
+**6. Download Processed File (`/download/{filename}`):**
+* **Method:** GET
+* **Response:** File download
+### Frontend
+* Access the basic frontend at `http://localhost:8000/` (serves `frontend/index.html`).
+  ![1743848842014](image/README/1743848842014.png)
+## Notes
+* **API Token:** You must have a valid Hugging Face API token (`HF_TOKEN`) to use the InferenceClient.
+* **File Cleanup:** Processed files are stored in the `processed/` directory; temporary uploads are in `updates/` and deleted after image interpretation.
+* **Limitations:**
+  * Visualization supports only Excel files.
+  * Summarization supports only documents written in English.
+  * Image interpretation works only on images that contain no text.
+  * Translation supports the following languages: *French*, *English*, *Spanish*, *German*, *Arabic*, *Chinese (Mandarin)*, *Japanese*, and *Russian*.
+Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
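
The four POST endpoints documented in the README above take the same two form fields, `file` and `task`, so a client could share one request builder across them. The following is a minimal sketch under stated assumptions, not the project's own client code: it assumes the server listens at `http://localhost:8000` (the address the README gives for the frontend) and that uploads are accepted as `multipart/form-data`. It uses only the Python standard library, and the helper names `build_upload_request` and `download_url` are hypothetical.

```python
import mimetypes
import uuid
from urllib.parse import quote
from urllib.request import Request

# Assumption: local uvicorn server, matching the README's frontend URL.
BASE_URL = "http://localhost:8000"


def build_upload_request(endpoint, filename, content, task, base_url=BASE_URL):
    """Build (but do not send) a multipart POST with `task` and `file` fields."""
    boundary = uuid.uuid4().hex
    # Fall back to a generic type when the extension is not recognised.
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="task"\r\n\r\n'
        f"{task}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return Request(
        url=f"{base_url}{endpoint}",
        data=head + content + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def download_url(filename, base_url=BASE_URL):
    """URL for GET /download/{filename}, with the name percent-encoded."""
    return f"{base_url}/download/{quote(filename)}"


# Example: ask /translate for a French version of a small text file.
req = build_upload_request("/translate", "report.txt", b"Hello", task="French")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. The sketch stops short of that so it can run without a server, and the same builder would apply to `/docsum_imginter`, `/generate-visualization`, and `/ask` by changing the endpoint and the `task` value.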