Bayan - Arabic Text Summarization Setup Guide
Overview
Bayan is an Arabic text summarization application with a web interface. This guide will help you set up and run the application.
Prerequisites
- Python 3.8 or higher
- pip (Python package manager)
- At least 4GB RAM (8GB+ recommended for better performance)
- Model files in the correct location (see below)
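The Python version requirement can be checked before installing anything. The snippet below is a minimal sketch; the helper name `python_is_supported` is made up for illustration and is not part of Bayan.

```python
import sys

# Minimum version taken from the prerequisites list above.
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


current = ".".join(str(part) for part in sys.version_info[:3])
if python_is_supported():
    print(f"Python {current} is supported")
else:
    print(f"Python {current} is too old; 3.8 or higher is required")
```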
Installation Steps
1. Install Dependencies
pip install -r requirements.txt
Note: If you encounter issues installing PyTorch, you may need to install it separately:
- For CPU:
pip install torch --index-url https://download.pytorch.org/whl/cpu
- For CUDA: Visit https://pytorch.org/get-started/locally/ for the appropriate command
2. Verify Model Location
The model should be located at:
models/arabic_summarization_model/content/drive/MyDrive/arabic_summarization_model/
Required files:
- config.json
- tokenizer.json
- model.safetensors
- sentencepiece.bpe.model
- Other tokenizer/model files
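To confirm the model files are in place before starting the server, a short script along these lines may help. It is a sketch only: the path and file names are copied from this guide, the helper `missing_model_files` is hypothetical, and the application performs its own path search independently of it.

```python
from pathlib import Path

# Path and file names copied from this guide; adjust if your layout differs.
MODEL_DIR = Path(
    "models/arabic_summarization_model/content/drive/MyDrive/arabic_summarization_model"
)
REQUIRED_FILES = [
    "config.json",
    "tokenizer.json",
    "model.safetensors",
    "sentencepiece.bpe.model",
]


def missing_model_files(model_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in required if not (model_dir / name).is_file()]


missing = missing_model_files(MODEL_DIR)
if missing:
    print("Missing model files:", ", ".join(missing))
else:
    print("All required model files found in", MODEL_DIR)
```

Run it from the project root so that the relative path resolves correctly.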
3. Run the Application
Option A: Using the run script (Recommended)
python run_app.py
Option B: Direct Flask run
cd src
python app.py
Option C: Using Flask CLI
cd src
export FLASK_APP=app.py
flask run
4. Access the Application
Open your browser and navigate to:
http://localhost:5000
Configuration
Environment Variables
- PORT: Server port (default: 5000)
- DEBUG: Enable debug mode (default: False)
export DEBUG=True
export PORT=8080
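For reference, here is one way a Flask entry point could read these two variables. This is an illustrative sketch, not the code in run_app.py or src/app.py: the function `load_settings` and the accepted spellings of "true" are assumptions.

```python
import os


def load_settings(environ=os.environ):
    """Read PORT and DEBUG, falling back to the defaults listed above."""
    port = int(environ.get("PORT", "5000"))
    # Assumption: these spellings count as "on"; the real app may parse differently.
    debug = environ.get("DEBUG", "False").strip().lower() in ("1", "true", "yes")
    return {"port": port, "debug": debug}


print(load_settings())
```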
Supabase Authentication (Phase 5)
See .env.example and PHASE_5_IMPLEMENTATION_PLAN.md.
- Create a Supabase project and enable Anonymous + Google auth.
- Run supabase/migrations/001_profiles.sql in the SQL Editor.
- Set meta tags in src/index.html:
<meta name="supabase-url" content="https://YOUR_PROJECT.supabase.co">
<meta name="supabase-anon-key" content="YOUR_ANON_KEY">
- Add redirect URL: http://localhost:5000/**
If Supabase is not configured, the editor still works in offline auth mode.
Model Not Found Error
If you see a "Model not found" error:
- Verify the model path exists
- Check that all required files are present
- The application will search multiple possible paths automatically
Out of Memory Error
If you encounter memory issues:
- Close other applications
- Use CPU mode (it will automatically use CPU if CUDA is not available)
- Reduce MAX_TEXT_LENGTH in src/app.py if needed
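The automatic CPU fallback mentioned above usually comes down to a check like the one below. This is a sketch of the common PyTorch pattern, not Bayan's actual loading code; `pick_device` is a hypothetical helper, and it also falls back to CPU when PyTorch is not installed.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch missing: nothing can run on a GPU, so report CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print("Selected device:", pick_device())
```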
Port Already in Use
If port 5000 is already in use:
export PORT=5001
python run_app.py
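To find out which port is actually free before exporting PORT, a small standard-library check can be used. The helpers `port_is_free` and `first_free_port` are illustrative and not part of the application.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds, i.e. the port is taken.
        return sock.connect_ex((host, port)) != 0


def first_free_port(start=5000, attempts=20):
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    return None


print("Suggested PORT:", first_free_port())
```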
Slow Performance
- First run will be slower as the model loads
- Subsequent requests will be faster
- Using GPU (CUDA) significantly improves performance
API Endpoints
Health Check
GET /api/health
Returns server status and model loading state.
Summarize Text
POST /api/summarize
Content-Type: application/json
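{
"text": "النص العربي المراد تلخيصه...",
"length": 2,
"full_text": true
}
The length field accepts 1 (short), 2 (medium), or 3 (long). JSON does not allow comments, so do not include one in the request body.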
Response:
{
"status": "success",
"summary": "الملخص المولد...",
"original_length": 500,
"summary_length": 150
}
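The same request can be sent from Python using only the standard library. The sketch below assumes the server is running at the default address from this guide; the function names are made up for illustration, and the generous timeout is a guess to allow for slow first requests while the model loads.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # default address from this guide


def build_summarize_request(text, length=2, full_text=True, base_url=BASE_URL):
    """Build a POST request for /api/summarize with a UTF-8 JSON body."""
    payload = {"text": text, "length": length, "full_text": full_text}
    # ensure_ascii=False keeps the Arabic text readable in the encoded body.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/summarize",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(text, length=2, full_text=True, timeout=120):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_summarize_request(text, length, full_text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `summarize("نص تجريبي للاختبار")` should return a dict shaped like the response above, from which the `summary` field can be read.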
Security Features
- Input validation (text length limits)
- CORS enabled for web interface
- Error handling and logging
- Path validation for model files
- Safe model loading with fallbacks
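As an illustration of the input validation listed above, a request check might look like the following. It is a sketch under stated assumptions: the limit value is a placeholder (the real MAX_TEXT_LENGTH is defined in src/app.py), and `validate_summarize_payload` is a hypothetical helper rather than the application's own validator.

```python
MAX_TEXT_LENGTH = 10_000  # placeholder value; the real limit lives in src/app.py


def validate_summarize_payload(payload, max_length=MAX_TEXT_LENGTH):
    """Return an error message for a bad payload, or None if it looks valid."""
    if not isinstance(payload, dict):
        return "body must be a JSON object"
    text = payload.get("text")
    if not isinstance(text, str) or not text.strip():
        return "text must be a non-empty string"
    if len(text) > max_length:
        return f"text exceeds {max_length} characters"
    length = payload.get("length", 2)
    # bool is a subclass of int in Python, so reject True/False explicitly.
    if isinstance(length, bool) or length not in (1, 2, 3):
        return "length must be 1, 2, or 3"
    return None
```

Returning a message instead of raising keeps the check easy to call from a request handler, which can turn a non-None result into a 400 response.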
Development
Running in Debug Mode
export DEBUG=True
python run_app.py
Testing the API
curl -X POST http://localhost:5000/api/summarize \
-H "Content-Type: application/json" \
-d '{"text": "نص تجريبي للاختبار", "length": 2, "full_text": true}'
Support
For issues or questions:
- Check the logs in the terminal
- Verify model files are correct
- Ensure all dependencies are installed
- Check Python version compatibility