---
title: Controlled Text Summarization
emoji: π
colorFrom: yellow
colorTo: purple
sdk: gradio # VERY IMPORTANT: Specifies the framework (gradio, streamlit, docker, static)
sdk_version: 5.20.1 # Optional but recommended: Specify the SDK version
app_file: main.py # VERY IMPORTANT: The main Python file to run
pinned: false # Optional: Whether to pin the Space in your profile
license: mit # Optional: The license of your project (e.g., mit, apache-2.0, agpl-3.0)
---
# Creative Text Summarization with Style Control
A machine learning system that summarizes text in a chosen style (formal, informal, humorous, or poetic) while preserving the content.
## Overview
This project creates an AI-powered text summarization system that not only condenses text but also adapts its output to a requested style. It uses transformer models fine-tuned on style-specific summaries to generate output that matches the requested style while staying faithful to the source text.
## Features
- **Multiple Summary Styles**: Generate summaries in formal, informal, humorous, or poetic styles
- **Pre-trained Models**: Based on BART and other transformer architectures
- **User-friendly Interface**: Simple Gradio UI for interactive summary generation
- **Evaluation Metrics**: ROUGE and BLEU scores to evaluate summary quality
## Installation
1. Clone this repository:
```bash
git clone https://github.com/AriachAmine/controlled-text-summarization.git
cd controlled-text-summarization
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Set the Gemini API key (used to generate style-specific training data):
```bash
# Linux/MacOS
export GEMINI_API_KEY="your-api-key-here"
# Windows (Command Prompt; omit the quotes, or they become part of the value)
set GEMINI_API_KEY=your-api-key-here
```
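On the Python side, the key can be read from the environment at startup. The sketch below assumes the application looks up `GEMINI_API_KEY` through `os.environ`; the helper name `get_gemini_api_key` is hypothetical and is not taken from the project code.

```python
import os


def get_gemini_api_key(env=None):
    """Return the Gemini API key, or raise a clear error if it is unset.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; export it before running main.py"
        )
    return key
```

Failing early with a descriptive message is usually easier to debug than an authentication error raised later during data preparation.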
## Usage
### Running the Application
```bash
python main.py
```
This will:
1. Load the base model or a fine-tuned model (if available)
2. Prepare a dataset with stylized summaries (if needed)
3. Fine-tune the model on the prepared dataset (if no fine-tuned model exists)
4. Launch the Gradio interface for interactive summarization
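The startup decision above can be sketched as follows. This is an illustration only: it assumes that a non-empty `summarization_model/` directory signals an existing fine-tuned model, and the function name `plan_startup` and the step labels are hypothetical, not names from `main.py`.

```python
from pathlib import Path


def plan_startup(model_dir="summarization_model"):
    """Return the ordered startup steps, depending on whether a fine-tuned model exists."""
    path = Path(model_dir)
    # Assumption: a non-empty model directory means fine-tuning already happened.
    has_fine_tuned = path.is_dir() and any(path.iterdir())
    if has_fine_tuned:
        return ["load_fine_tuned_model", "launch_ui"]
    return ["load_base_model", "prepare_dataset", "fine_tune", "launch_ui"]
```

On a first run the directory does not exist yet, so the longer path (dataset preparation and fine-tuning) is taken before the interface starts.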
### Using the Gradio Interface
1. Enter the text you want to summarize in the text box
2. Select your desired summary style from the dropdown (formal, informal, humorous, poetic)
3. Click "Submit" to generate the stylized summary
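For orientation, an interface with these three elements could be wired up roughly as below. This is a minimal sketch, not the contents of `ui.py`: `summarize` is a hypothetical placeholder that echoes the first sentence instead of calling a model, and `build_demo` is an assumed helper name.

```python
STYLES = ["formal", "informal", "humorous", "poetic"]


def summarize(text, style):
    """Placeholder for the model call; the real function would live in model.py."""
    if style not in STYLES:
        raise ValueError(f"Unknown style: {style!r}")
    # Hypothetical stand-in: echo the first sentence so the UI works without a model.
    first_sentence = text.strip().split(". ")[0].rstrip(".")
    return f"[{style}] {first_sentence}."


def build_demo():
    """Build the Gradio interface: a text box, a style dropdown, and a text output."""
    import gradio as gr  # imported lazily so summarize() is usable without gradio

    return gr.Interface(
        fn=summarize,
        inputs=[
            gr.Textbox(lines=8, label="Text to summarize"),
            gr.Dropdown(choices=STYLES, value="formal", label="Summary style"),
        ],
        outputs=gr.Textbox(label="Summary"),
        title="Controlled Text Summarization",
    )
```

Calling `build_demo().launch()` would start the local server; `gr.Interface` supplies the "Submit" button itself.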
## How It Works
1. **Base Model**: Starts with a pre-trained text summarization model (BART)
2. **Style Training**: Fine-tunes the model on summaries with specific styles
3. **Style Control**: Uses style tokens to control output style during generation
4. **Evaluation**: Measures quality using ROUGE and BLEU metrics
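Style control through tokens commonly works by prepending a control token to the source text, so the model learns to associate each token with a style during fine-tuning. The sketch below assumes that approach; the token strings and the helper name `add_style_token` are hypothetical and may differ from what `model.py` uses.

```python
# Hypothetical control tokens, one per supported style.
STYLE_TOKENS = {
    "formal": "<formal>",
    "informal": "<informal>",
    "humorous": "<humorous>",
    "poetic": "<poetic>",
}


def add_style_token(text, style):
    """Prepend the control token for `style` to the source text."""
    try:
        token = STYLE_TOKENS[style]
    except KeyError:
        raise ValueError(f"Unknown style: {style!r}") from None
    return f"{token} {text.strip()}"
```

With a Hugging Face tokenizer, such tokens are typically registered through `tokenizer.add_special_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`, so that each token maps to a single learned embedding rather than being split into subwords.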
## Project Structure
```
controlled-text-summarization/
├── main.py                # Main script to run the application
├── model.py               # Model loading and summarization functions
├── data.py                # Data preparation and processing
├── evaluation.py          # Metrics for evaluating summaries
├── ui.py                  # Gradio interface
├── requirements.txt       # Project dependencies
└── summarization_model/   # Directory for fine-tuned models (created after training)
```
```
## Dependencies
- torch, transformers: For model loading and fine-tuning
- gradio: For the user interface
- google-generativeai: For generating style-specific training data
- datasets, rouge_score, nltk: For data handling and evaluation
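
To make the evaluation dependency concrete, here is a simplified sketch of what ROUGE-1 F1 measures: clipped unigram overlap between a reference and a candidate summary. It assumes lowercase whitespace tokenization with no stemming, so its scores can differ from those of the `rouge_score` package; it is not the project's `evaluation.py`.

```python
from collections import Counter


def rouge1_f1(reference, candidate):
    """Unigram-overlap F1 between a reference and a candidate summary."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a word counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

Because stylized summaries deliberately change wording, overlap metrics like this one tend to score humorous or poetic outputs lower than formal ones even when the content is preserved.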
## Example
Input:
```
Scientists have discovered a new species of deep-sea fish that can withstand extreme pressure. The fish, found at depths of over 8,000 meters, has unique adaptations including specialized cell membranes and pressure-resistant proteins. This discovery may lead to new applications in biotechnology and materials science.
```
Output (Formal Style):
```
Researchers have identified a novel deep-sea fish species capable of surviving extreme pressures at depths exceeding 8,000 meters. The species exhibits specialized adaptations in cell membrane structure and pressure-resistant proteins, potentially offering valuable insights for biotechnology and materials science applications.
```
Output (Humorous Style):
```
Talk about a fish out of water... or rather, a fish VERY deep IN water! Scientists just found a super fish that laughs in the face of crushing ocean pressure. This deep-sea champion, chilling at 8,000 meters down, has fancy cell membranes and proteins that basically say "pressure, what pressure?" Scientists are already dreaming up ways to copy these deep-sea survival tricks for cool new tech!
```
## Future Improvements
- Add more styles (technical, narrative, etc.)
- Implement user feedback collection to improve models
- Add style strength control (slightly humorous vs. very humorous)
- Create a web API for integration with other applications
## License
[MIT License](LICENSE)