---
title: Multilingual Text Summarizer
emoji: π
colorFrom: blue
colorTo: pink
sdk: gradio
sdk_version: 5.29.0
app_file: app.py
pinned: false
license: apache-2.0
---

# Multilingual Text Summarizer with Transformers

This project is a web application that summarizes English or French text. It accepts direct text input as well as `.txt` and `.pdf` files, and detects the input language automatically.

Summaries are generated by pre-trained **Large Language Models (LLMs)** such as **BART** or **T5**, served through a simple, interactive **Gradio** interface.

## Objectives

- Automate the **summarization of long texts** (emails, reports, news articles, etc.).
- Apply **automatic summarization techniques with LLMs**.
- Provide a **simple and responsive user interface**.
- Demonstrate a **real-world case of NLP model industrialization**.

## Technical stack

- [Transformers](https://huggingface.co/docs/transformers/index) - Pre-trained models (BART, T5...)
- [Streamlit](https://streamlit.io) - Web interface
- [Gradio](https://www.gradio.app/) - Web interface
- [Python](https://www.python.org) - Processing & pipeline
- [CNN/DailyMail](https://huggingface.co/datasets/abisee/cnn_dailymail/viewer/2.0.0?views%5B%5D=_200_train) - Dataset (`abisee/cnn_dailymail`)
- (Bonus) Docker, FastAPI, GitHub Actions - MLOps

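Models such as BART and T5 accept a limited number of input tokens, so long documents are usually summarized piece by piece. The following is a minimal sketch of that idea, not the project's actual code: the helper names (`split_into_chunks`, `summarize_long_text`, `make_bart_summarizer`), the 400-word chunk size, and the `facebook/bart-large-cnn` checkpoint are all assumptions made for illustration. It also assumes a Transformers release that provides the `summarization` pipeline task.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_words: int = 400) -> List[str]:
    """Split text into consecutive chunks of at most `max_words` words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def summarize_long_text(
    text: str,
    summarize_fn: Callable[[str], str],
    max_words: int = 400,
) -> str:
    """Summarize each chunk with `summarize_fn` and join the partial summaries."""
    chunks = split_into_chunks(text, max_words)
    return " ".join(summarize_fn(chunk) for chunk in chunks)


def make_bart_summarizer(
    model_name: str = "facebook/bart-large-cnn",  # assumed checkpoint
) -> Callable[[str], str]:
    """Build a summarizer backed by a Hugging Face pipeline (downloads the model)."""
    from transformers import pipeline  # imported lazily: heavy dependency

    summarizer = pipeline("summarization", model=model_name)

    def run(chunk: str) -> str:
        # Length limits here are illustrative values, not tuned settings.
        result = summarizer(chunk, max_length=130, min_length=30, truncation=True)
        return result[0]["summary_text"]

    return run
```

With the model installed, `summarize_long_text(article, make_bart_summarizer())` would return the concatenated chunk summaries. Passing the summarizer in as a function keeps the chunking logic easy to test without loading a model.
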
## Features

- Automatic language detection (English or French)
- Summarization using state-of-the-art models
- Gradio-based web interface
- Supports direct text, `.txt`, and `.pdf` inputs

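The input handling listed above could look roughly like the sketch below. It is a simplified illustration under stated assumptions: the stopword-counting heuristic is a stand-in for whatever language-detection method the project actually uses, `pypdf` is an assumed PDF backend, and the function names `detect_language` and `load_text` are hypothetical.

```python
from pathlib import Path

# Tiny stopword lists used only for this illustration (not taken from the project).
EN_STOPWORDS = {"the", "and", "is", "of", "to", "in", "that", "it", "with", "for"}
FR_STOPWORDS = {"le", "la", "les", "et", "est", "de", "des", "un", "une", "dans"}


def detect_language(text: str) -> str:
    """Return "en" or "fr" by counting common stopwords (ties default to "en")."""
    words = [w.strip(".,;:!?\"'()") for w in text.lower().split()]
    en_hits = sum(w in EN_STOPWORDS for w in words)
    fr_hits = sum(w in FR_STOPWORDS for w in words)
    return "fr" if fr_hits > en_hits else "en"


def load_text(path: str) -> str:
    """Read a .txt or .pdf file and return its text content."""
    file_path = Path(path)
    suffix = file_path.suffix.lower()
    if suffix == ".txt":
        return file_path.read_text(encoding="utf-8")
    if suffix == ".pdf":
        from pypdf import PdfReader  # third-party; assumed PDF backend

        reader = PdfReader(str(file_path))
        # extract_text() may yield nothing for image-only pages
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    raise ValueError(f"Unsupported file type: {suffix!r}")
```

A stopword count is crude and only suits a two-language setting like this one; a dedicated language-identification library would be more robust on short or mixed-language input.
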
## Run the App

```bash
git clone https://github.com/issa-kabore/SmartSummarizer.git
cd SmartSummarizer
pip install -r requirements.txt
python app_gradio.py
```

## Demo

[Link to deployed app](https://...)

See screenshots below

## Project structure

```text
SmartSummarizer/
│
├── app_gradio.py          # Gradio main script (user interface)
├── summarizer/
│   ├── __init__.py
│   ├── models.py          # Loading models and pipelines
│   ├── utils.py           # .txt/.pdf import functions and language detection
│   └── summarize.py       # Main summary function
│
├── assets/                # (Optional) static files: images, logos, etc.
│
├── requirements.txt       # Dependencies to install
├── README.md              # Project presentation
└── .gitignore             # Files to be ignored by Git
```
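
To illustrate how a script like `app_gradio.py` might connect a handler function to the web interface, here is a hedged sketch. It does not reproduce the project's code: `summarize_input` is a hypothetical placeholder that returns the first sentence instead of calling a model, and `build_demo` is an assumed helper name. Only `gr.Interface`, `gr.Textbox`, and `launch()` are actual Gradio APIs.

```python
def summarize_input(text: str) -> str:
    """Placeholder handler: a real app would call the summarizer package here."""
    text = (text or "").strip()
    if not text:
        return "Please provide some text to summarize."
    # Stand-in for the model call: keep only the first sentence.
    first_sentence = text.split(". ")[0].rstrip(".")
    return first_sentence + "."


def build_demo():
    """Create the Gradio interface (requires `pip install gradio`)."""
    import gradio as gr  # imported lazily so the handler can be tested alone

    return gr.Interface(
        fn=summarize_input,
        inputs=gr.Textbox(lines=10, label="Text to summarize"),
        outputs=gr.Textbox(label="Summary"),
        title="Multilingual Text Summarizer",
    )
```

Calling `build_demo().launch()` would start a local web server for the interface. Keeping the handler free of Gradio imports makes it straightforward to unit test.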