metadata
title: Rephrasia - AI-Powered Text Processing API
emoji: π
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
Rephrasia - AI-Powered Text Processing API
A Flask-based API for text paraphrasing, translation, chat, and more.
Features
- Paraphrasing: Generate multiple paraphrased versions using PEGASUS
- Translation: Bidirectional English ↔ Urdu translation using NLLB-200
- Chat: Conversational AI with DialoGPT maintaining session history
- OCR (Image to Text): Extract text from images and process it
- Batch Processing: Process multiple texts in one request
- Text-to-Speech: Convert text to audio (Urdu/English)
- Export: Save results as PDF or DOCX
System Requirements
- Python 3.9+
- 8GB+ RAM recommended (for models)
- 5GB+ disk space for model weights
- Tesseract OCR installed (for image-to-text)
  - Windows: Download from UB-Mannheim
  - Install to C:\Program Files\Tesseract-OCR\
  - Or adjust the path in ocr.py
Installation
# Clone the repository
cd Rephrasia
# Install dependencies
pip install -r requirements.txt
Quick Start
# Run the Flask server
python app.py
# Server starts at http://127.0.0.1:5000
API Endpoints
1. Paraphrasing
Single Text:
POST /api/rephrase
Content-Type: application/json
{
"text": "Education shapes society."
}
Batch Processing:
POST /api/batch/rephrase
Content-Type: application/json
{
"texts": ["Text 1", "Text 2", "Text 3"]
}
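For Python callers, the two paraphrasing requests above can be built with the standard library alone. This is a minimal sketch, not part of the project: the helper name `build_json_request` and the `BASE_URL` constant are assumptions, with the address taken from the Quick Start section. The response format is not documented here, so the sketch stops at building the request.

```python
import json
import urllib.request

# Assumed local address from the Quick Start section; adjust for your deployment.
BASE_URL = "http://127.0.0.1:5000"


def build_json_request(path, payload):
    """Build (but do not send) a JSON POST request for the given API path."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


single = build_json_request("/api/rephrase", {"text": "Education shapes society."})
batch = build_json_request("/api/batch/rephrase", {"texts": ["Text 1", "Text 2", "Text 3"]})

# To send a request once the server is running:
# with urllib.request.urlopen(single) as response:
#     print(response.read().decode("utf-8"))
```

The same helper should work for the other JSON endpoints in this document by changing the path and payload.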
2. Translation
Single Translation:
POST /api/translate
Content-Type: application/json
{
"text": "Hello world",
"language": "urdu" # or "english"
}
Batch Translation:
POST /api/batch/translate
Content-Type: application/json
{
"texts": ["Hello", "Goodbye"],
"language": "urdu"
}
3. Chat
POST /chat
Content-Type: application/json
{
"text": "Hello, how are you?",
"session_id": "optional-session-id"
}
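The optional session_id suggests that conversation history is keyed per session. The following is a minimal sketch of how such an in-memory store might work. The class name `SessionStore`, its methods, and the `max_turns` limit are hypothetical and are not taken from chat.py.

```python
import uuid
from collections import defaultdict, deque


class SessionStore:
    """Hypothetical in-memory store: session_id -> recent (user, bot) turns."""

    def __init__(self, max_turns=5):
        # Each session keeps only its most recent `max_turns` exchanges.
        self._history = defaultdict(lambda: deque(maxlen=max_turns))

    def resolve_id(self, session_id=None):
        """Return the given session_id, or create a new one if it is missing."""
        return session_id or uuid.uuid4().hex

    def add_turn(self, session_id, user_text, bot_text):
        """Record one exchange; the oldest one is dropped when the deque is full."""
        self._history[session_id].append((user_text, bot_text))

    def history(self, session_id):
        # .get avoids creating an empty entry for unknown sessions.
        return list(self._history.get(session_id, ()))


store = SessionStore(max_turns=2)
sid = store.resolve_id()
store.add_turn(sid, "Hello, how are you?", "I am fine.")
store.add_turn(sid, "What can you do?", "I can chat.")
store.add_turn(sid, "Thanks", "You are welcome.")
print(store.history(sid))  # only the two most recent turns remain
```

Capping the history per session keeps memory use bounded, which matters when every session lives in the server process.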
4. Text-to-Speech
POST /api/tts
Content-Type: application/json
{
"text": "یہ ایک ٹیسٹ ہے",
"language": "urdu" # or "english"
}
5. Export
Export to PDF:
POST /api/export/pdf
Content-Type: application/json
{
"original_text": "Original sentence",
"results": ["Paraphrase 1", "Paraphrase 2"],
"result_type": "Paraphrased"
}
Export to DOCX:
POST /api/export/docx
Content-Type: application/json
{
"original_text": "Original sentence",
"results": ["Translation"],
"result_type": "Translated"
}
6. OCR (Image to Text)
Extract Text from Image:
POST /api/ocr
Content-Type: multipart/form-data
image: [image file]
language: eng # or 'urd', 'eng+urd'
preprocess: true # optional, enhances image quality
OCR + Rephrase (Combined):
POST /api/ocr-rephrase
Content-Type: multipart/form-data
image: [image file]
language: eng
preprocess: true
OCR + Translate (Combined):
POST /api/ocr-translate
Content-Type: multipart/form-data
image: [image file]
ocr_language: eng
target_language: urdu
preprocess: true
Get Supported OCR Languages:
GET /api/ocr/languages
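The multipart requests above can also be assembled in Python without extra dependencies. The sketch below encodes form fields and one file by hand. The function name `encode_multipart` is an assumption made for illustration, and the placeholder bytes stand in for a real image.

```python
import uuid


def encode_multipart(fields, file_field, filename, file_bytes, mime="image/png"):
    """Encode text fields plus one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}".encode())
        lines.append(f'Content-Disposition: form-data; name="{name}"'.encode())
        lines.append(b"")
        lines.append(str(value).encode("utf-8"))
    lines.append(f"--{boundary}".encode())
    lines.append(
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"'.encode()
    )
    lines.append(f"Content-Type: {mime}".encode())
    lines.append(b"")
    lines.append(file_bytes)
    lines.append(f"--{boundary}--".encode())  # closing delimiter
    lines.append(b"")
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


# Placeholder bytes stand in for a real image read with open(path, "rb").
body, content_type = encode_multipart(
    {"language": "eng", "preprocess": "true"},
    file_field="image",
    filename="image.png",
    file_bytes=b"\x89PNG placeholder",
)
```

The resulting body can be sent with `urllib.request.Request(url, data=body, headers={"Content-Type": content_type}, method="POST")`. If the third-party requests library is available, its `files=` and `data=` parameters build the same kind of body with less code.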
Testing
# Run all tests
pytest
# Run with verbose output
pytest -v
# Run specific test file
pytest tests/test_api.py
PowerShell Examples
# Rephrase
Invoke-RestMethod -Method Post -Uri http://127.0.0.1:5000/api/rephrase `
-Body (@{text="Education shapes society."} | ConvertTo-Json) `
-ContentType "application/json"
# Translate
Invoke-RestMethod -Method Post -Uri http://127.0.0.1:5000/api/translate `
-Body (@{text="Hello"; language="urdu"} | ConvertTo-Json) `
-ContentType "application/json"
# Batch rephrase
Invoke-RestMethod -Method Post -Uri http://127.0.0.1:5000/api/batch/rephrase `
-Body (@{texts=@("Text 1", "Text 2")} | ConvertTo-Json) `
-ContentType "application/json"
# OCR from image
$imagePath = "C:\path\to\image.png"
$form = @{
image = Get-Item -Path $imagePath
language = "eng"
preprocess = "true"
}
Invoke-RestMethod -Method Post -Uri http://127.0.0.1:5000/api/ocr -Form $form
# OCR + Rephrase
Invoke-RestMethod -Method Post -Uri http://127.0.0.1:5000/api/ocr-rephrase -Form $form
Model Information
| Feature | Model | Size |
|---|---|---|
| Paraphrasing | tuner007/pegasus_paraphrase | ~2GB |
| Translation | facebook/nllb-200-distilled-600M | ~2.5GB |
| Chat | microsoft/DialoGPT-medium | ~800MB |
| TTS | Google Text-to-Speech (gTTS) | API-based |
| OCR | Tesseract OCR | ~100MB (separate install) |
Project Structure
Rephrasia/
├── app.py               # Main Flask application
├── model.py             # Paraphrasing logic
├── translation.py       # Translation logic
├── chat.py              # Chat session management
├── batch_processor.py   # Bulk processing
├── tts.py               # Text-to-speech
├── export_utils.py      # PDF/DOCX export
├── ocr.py               # Image to text (OCR)
├── requirements.txt     # Dependencies
├── tests/               # Test suite
│   ├── test_api.py
│   └── test_chat_manager.py
└── static/              # Generated files
    ├── audio/           # TTS audio files
    └── exports/         # PDF/DOCX exports
Performance Tips
- Models are lazy-loaded on first request
- Use batch endpoints for multiple texts
- Audio/export files auto-cleanup after 24 hours
- Chat sessions are stored in-memory (consider Redis for production)
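The 24-hour cleanup of audio and export files noted above could be implemented along these lines. This is a sketch only: the function name `remove_old_files` and the use of file modification time are assumptions, and the project's actual cleanup logic may differ.

```python
import os
import tempfile
import time
from pathlib import Path

MAX_AGE_SECONDS = 24 * 60 * 60  # the 24-hour retention mentioned above


def remove_old_files(directory, max_age=MAX_AGE_SECONDS, now=None):
    """Delete regular files in `directory` last modified more than `max_age` seconds ago."""
    now = time.time() if now is None else now
    removed = []
    for path in Path(directory).iterdir():
        if path.is_file() and now - path.stat().st_mtime > max_age:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)


# Demonstration in a temporary directory rather than static/audio or static/exports.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp, "old.mp3")
    new = Path(tmp, "new.mp3")
    old.write_bytes(b"")
    new.write_bytes(b"")
    two_days_ago = time.time() - 2 * MAX_AGE_SECONDS
    os.utime(old, (two_days_ago, two_days_ago))  # backdate the modification time
    removed = remove_old_files(tmp)
    remaining = sorted(p.name for p in Path(tmp).iterdir())
```

In the demonstration, only the backdated file is deleted and the fresh one is left in place.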
Troubleshooting
Tesseract Not Found:
- Download and install Tesseract (on Windows, from UB-Mannheim)
- Install to the default location or update the path in ocr.py
- Restart the terminal after installation
OCR Not Detecting Text:
- Try with the preprocess: true parameter
- Ensure the image has clear, readable text
- Check image format (PNG, JPG supported)
- Use correct language code (eng, urd, etc.)
Out of Memory:
- Reduce num_beams in model generation
- Use smaller model variants (DialoGPT-small)
- Process fewer items in batch requests
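One way to process fewer items per batch request is to split a long list on the client side and send several smaller payloads. The helper name `chunked` and the chunk size of 2 are illustrative assumptions, and the best size depends on available memory.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


texts = ["Text 1", "Text 2", "Text 3", "Text 4", "Text 5"]

# Each payload matches the body shape of the batch endpoints.
payloads = [{"texts": chunk} for chunk in chunked(texts, 2)]
```

Each payload can then be posted to a batch endpoint in turn, so the server never handles more than `size` texts at once.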
Slow First Request:
- Models download and load on first use (~5-10 mins)
- Subsequent requests are fast
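The slow-first-request behavior follows from lazy loading: the expensive load runs once and later calls reuse the result. A minimal sketch of that pattern is below. The function `get_model` and the dictionary it returns are stand-ins, not the project's loader.

```python
from functools import lru_cache

load_count = 0  # counts how often the expensive load actually runs


@lru_cache(maxsize=None)
def get_model(name):
    """Load a model on first use and reuse it afterwards.

    The dict below is a stand-in; a real loader would build the model here.
    """
    global load_count
    load_count += 1
    return {"name": name}


first = get_model("paraphraser")   # slow path: runs the loader
second = get_model("paraphraser")  # fast path: served from the cache
```

Calling such a loader once at startup is a common way to move the delay from the first request to server start.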
Import Errors:
- Ensure all requirements are installed: pip install -r requirements.txt
- Python 3.9+ is required
License
Open source - free to use and modify
Support
For issues or questions, check the code comments or test files for usage examples.