# Bayan - Arabic Text Summarization Setup Guide
## Overview
Bayan is an Arabic text summarization application with a web interface. This guide explains how to set it up and run it.
## Prerequisites
- Python 3.8 or higher
- pip (Python package manager)
- At least 4 GB of RAM (8 GB or more recommended for better performance)
- Model files in the expected location (see step 2, "Verify Model Location")
## Installation Steps
### 1. Install Dependencies
```bash
pip install -r requirements.txt
```
**Note:** If you encounter issues installing PyTorch, you may need to install it separately:
- For CPU: `pip install torch --index-url https://download.pytorch.org/whl/cpu`
- For CUDA: Visit https://pytorch.org/get-started/locally/ for the appropriate command
### 2. Verify Model Location
The model should be located at:
```
models/arabic_summarization_model/content/drive/MyDrive/arabic_summarization_model/
```
Required files:
- `config.json`
- `tokenizer.json`
- `model.safetensors`
- `sentencepiece.bpe.model`
- Other tokenizer/model files
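
The check in this step can be scripted. The following is a minimal sketch, not part of the Bayan code base: the helper name `missing_model_files` is hypothetical, and it only covers the four files named explicitly above, not the unspecified "other" tokenizer/model files. It assumes the script is run from the project root.

```python
from pathlib import Path

# Files named explicitly in the "Required files" list.
REQUIRED_FILES = [
    "config.json",
    "tokenizer.json",
    "model.safetensors",
    "sentencepiece.bpe.model",
]

# Path from step 2, assumed to be relative to the project root.
MODEL_DIR = Path(
    "models/arabic_summarization_model/content/drive/MyDrive/arabic_summarization_model"
)


def missing_model_files(model_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in required if not (model_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files(MODEL_DIR)
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("All required model files found in", MODEL_DIR)
```

An empty result means every listed file is present; any names printed are the ones to copy into the model directory before starting the application.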
### 3. Run the Application
#### Option A: Using the run script (Recommended)
```bash
python run_app.py
```
#### Option B: Direct Flask run
```bash
cd src
python app.py
```
#### Option C: Using Flask CLI
```bash
cd src
export FLASK_APP=app.py
flask run
```
### 4. Access the Application
Open your browser and navigate to:
```
http://localhost:5000
```
## Configuration
### Environment Variables
- `PORT`: Server port (default: 5000)
- `DEBUG`: Enable debug mode (default: False)

For example, to enable debug mode and serve on port 8080:
```bash
export DEBUG=True
export PORT=8080
```
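
To illustrate how these two variables and their defaults fit together, here is a minimal sketch of reading them in Python. The function name `read_settings` and the set of accepted truthy spellings are assumptions made for illustration; the actual parsing in `src/app.py` may differ.

```python
import os


def read_settings(environ=os.environ):
    """Read PORT and DEBUG, falling back to the documented defaults."""
    port = int(environ.get("PORT", "5000"))
    # Assumption: common truthy spellings enable debug mode; anything else is False.
    debug = environ.get("DEBUG", "False").strip().lower() in ("1", "true", "yes")
    return {"port": port, "debug": debug}


if __name__ == "__main__":
    print(read_settings())
```

With neither variable set, the sketch yields port 5000 with debug mode off, matching the defaults listed above.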
### Supabase Authentication (Phase 5)
See `.env.example` and `PHASE_5_IMPLEMENTATION_PLAN.md`.
1. Create a Supabase project and enable **Anonymous** + **Google** auth.
2. Run `supabase/migrations/001_profiles.sql` in the SQL Editor.
3. Set meta tags in `src/index.html`:
```html
```
4. Add the redirect URL `http://localhost:5000/**`.

If Supabase is not configured, the editor still works in offline auth mode.
## Troubleshooting
### Model Not Found Error
If you see a "Model not found" error:
1. Verify the model path exists
2. Check that all required files are present
3. The application will search multiple possible paths automatically
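
The automatic path search mentioned in step 3 can be pictured with the sketch below. It is an illustration only: `find_model_dir`, the use of `config.json` as the marker file, and the candidate list are all assumptions, and the real search order is defined in the application code.

```python
from pathlib import Path


def find_model_dir(candidates, marker="config.json"):
    """Return the first candidate directory that contains marker, else None."""
    for candidate in candidates:
        path = Path(candidate)
        if (path / marker).is_file():
            return path
    return None


if __name__ == "__main__":
    # Hypothetical search order; only the first entry comes from this guide.
    candidates = [
        "models/arabic_summarization_model/content/drive/MyDrive/arabic_summarization_model",
        "models/arabic_summarization_model",
    ]
    found = find_model_dir(candidates)
    print(found if found else "Model not found in any candidate path")
```

Running a check like this from the project root shows quickly whether the failure is a wrong working directory or a model directory that is genuinely missing.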
### Out of Memory Error
If you encounter memory issues:
1. Close other applications
2. Run on CPU (the application falls back to CPU automatically when CUDA is not available)
3. Lower `MAX_TEXT_LENGTH` in `src/app.py` if needed
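
To see which device the CPU fallback in step 2 would select on a given machine, a small check along these lines can help. This is a sketch: `pick_device` is a hypothetical helper, not a function from the application, and it reports `"cpu"` when PyTorch is not installed at all.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment; report the CPU default.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Inference device:", pick_device())
```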
### Port Already in Use
If port 5000 is already in use, set `PORT` to a free port before starting the application:
```bash
export PORT=5001
python run_app.py
```
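
If it is unclear which port is free, the standard library can probe for one. The sketch below uses hypothetical helpers (`port_is_free`, `first_free_port`) and checks only the loopback address; it tries to bind each port and treats a bind failure as "in use".

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start=5000, attempts=20):
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    return None


if __name__ == "__main__":
    print("Suggested PORT:", first_free_port(5000))
```

The printed value can then be exported as `PORT` before running `python run_app.py`.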
### Slow Performance
- The first run is slower because the model has to be loaded
- Subsequent requests are faster once the model is in memory
- Running on a GPU (CUDA) significantly improves performance
## API Endpoints
### Health Check
```
GET /api/health
```
Returns server status and model loading state.
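
A script can call this endpoint to confirm the server is up before sending text. The following is a minimal sketch using only the standard library; `check_health` is a hypothetical helper, and because the response fields are not listed in this guide, it simply returns whatever JSON the server sends.

```python
import json
import urllib.error
import urllib.request


def check_health(base_url="http://localhost:5000", timeout=2.0):
    """GET /api/health and return the decoded JSON body, or None on failure."""
    url = base_url.rstrip("/") + "/api/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Server unreachable, request timed out, or body was not valid JSON.
        return None


if __name__ == "__main__":
    status = check_health()
    print(status if status is not None else "Server not reachable")
```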
### Summarize Text
```
POST /api/summarize
Content-Type: application/json

{
  "text": "النص العربي المراد تلخيصه...",
  "length": 2,
  "full_text": true
}
```
`length` selects the summary size: 1 = short, 2 = medium, 3 = long.
Response:
```json
{
  "status": "success",
  "summary": "الملخص المولد...",
  "original_length": 500,
  "summary_length": 150
}
```
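
The request and response shapes above can be exercised from Python as well. The sketch below is an illustration under stated assumptions: the helper names (`build_summarize_request`, `compression_ratio`, `summarize`) are hypothetical, the server is assumed to run at `http://localhost:5000`, and the unit of `original_length` and `summary_length` is not specified in this guide, so the ratio is only meaningful if both use the same unit.

```python
import json
import urllib.request


def build_summarize_request(text, length=2, full_text=True,
                            base_url="http://localhost:5000"):
    """Build a POST /api/summarize request with a UTF-8 JSON body."""
    payload = {"text": text, "length": length, "full_text": full_text}
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/summarize",
        # ensure_ascii=False keeps the Arabic text readable in the body.
        data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compression_ratio(response_body):
    """Return summary_length / original_length from a decoded response."""
    return response_body["summary_length"] / response_body["original_length"]


def summarize(text, length=2, timeout=120.0):
    """Send the request and return the decoded JSON response."""
    request = build_summarize_request(text, length)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `summarize("نص تجريبي للاختبار")` would return a dictionary shaped like the response above. For that example response, `compression_ratio` gives 150 / 500, so the summary is 30% of the original length.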
## Security Features
- Input validation (text length limits)
- CORS enabled for web interface
- Error handling and logging
- Path validation for model files
- Safe model loading with fallbacks
## Development
### Running in Debug Mode
```bash
export DEBUG=True
python run_app.py
```
### Testing the API
```bash
curl -X POST http://localhost:5000/api/summarize \
  -H "Content-Type: application/json" \
  -d '{"text": "نص تجريبي للاختبار", "length": 2, "full_text": true}'
```
## Support
For issues or questions:
1. Check the logs in the terminal
2. Verify model files are correct
3. Ensure all dependencies are installed
4. Check Python version compatibility