Spaces: omgy / vero_ps

omgy committed on Dec 3, 2025

Commit c19890a (verified) · 1 Parent(s): ca6d6c1

Upload 12 files

Files changed (12)

.env.example +11 -0
.gitignore +12 -0
DEPLOYMENT.md +300 -0
Dockerfile +29 -0
FILES_SUMMARY.md +235 -0
QUICK_START.md +141 -0
app.py +147 -0
document_converter.py +223 -0
gemini_client.py +98 -0
latex_processor.py +208 -0
requirements.txt +7 -0
test_backend.py +110 -0

.env.example ADDED

	@@ -0,0 +1,11 @@

+# Example environment configuration
+# Copy this to .env and fill in your actual values
+# Required: Your Google Gemini API Key
+GEMINI_API_KEY=your_gemini_api_key_here
+# Optional: Flask environment (development or production)
+FLASK_ENV=production
+# Optional: Port (HuggingFace Spaces uses 7860 by default)
+PORT=7860
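
A minimal sketch of how these three variables could be read in one place. The `Settings` dataclass and `load_settings` helper are hypothetical names and are not part of this commit, where `app.py` calls `os.getenv` directly. The environment is passed in as a mapping so the logic can be exercised without touching the real process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values defined in .env.example."""
    gemini_api_key: str
    flask_env: str
    port: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY", "").strip()
    # Treat the placeholder from .env.example as "not configured"
    if not api_key or api_key == "your_gemini_api_key_here":
        raise ValueError("GEMINI_API_KEY is required")
    return Settings(
        gemini_api_key=api_key,
        flask_env=env.get("FLASK_ENV", "production"),
        port=int(env.get("PORT", "7860")),
    )
```

Rejecting the placeholder value early may make a forgotten secret easier to spot than a failed Gemini call later on.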

.gitignore ADDED

	@@ -0,0 +1,12 @@

+.env
+__pycache__/
+*.pyc
+*.pyo
+*.pyd
+.Python
+*.so
+*.log
+*.tmp
+test_files/
+.vscode/
+.idea/

DEPLOYMENT.md ADDED

	@@ -0,0 +1,300 @@

+# 🚀 HuggingFace Spaces Deployment Guide
+Complete step-by-step guide to deploy your LaTeX-enhanced document backend on HuggingFace Spaces.
+## 📦 Files Ready for Deployment
+All these files are in the `backend` folder and need to be uploaded to HuggingFace:
+```
+backend/
+├── app.py                    # Main Flask application
+├── gemini_client.py          # Gemini API integration
+├── latex_processor.py        # LaTeX processing logic
+├── document_converter.py     # Document conversion utilities
+├── requirements.txt          # Python dependencies
+├── Dockerfile               # Docker container configuration
+├── .gitignore              # Git ignore rules
+├── .env.example            # Environment template (don't upload .env!)
+├── README.md               # Documentation
+└── test_backend.py         # Test script (optional)
+```
+## 🎯 Step-by-Step Deployment
+### Step 1: Create HuggingFace Account
+1. Go to [https://huggingface.co/join](https://huggingface.co/join)
+2. Sign up with your email or GitHub account
+3. Verify your email
+### Step 2: Create a New Space
+1. Visit [https://huggingface.co/new-space](https://huggingface.co/new-space)
+2. Fill in the details:
+   - **Owner**: Your username
+   - **Space name**: Choose a name (e.g., `doc-latex-enhancer`)
+   - **License**: Apache 2.0 (or your choice)
+   - **Select the Space SDK**: **Docker** ⚠️ IMPORTANT: Must be Docker!
+   - **Hardware**: CPU basic - 2 vCPU - 16 GB (Free tier)
+   - **Visibility**: Public or Private (your choice)
+3. Click **Create Space**
+### Step 3: Upload Files
+You have two options:
+#### Option A: Web Upload (Easiest)
+1. In your Space, click **Files** → **Add file** → **Upload files**
+2. Drag and drop ALL files from the `backend` folder
+3. Add commit message: "Initial backend deployment"
+4. Click **Commit changes to main**
+#### Option B: Git Upload (Advanced)
+```bash
+# Clone your space
+git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
+cd YOUR_SPACE_NAME
+# Copy all backend files
+cp -r path/to/backend/* .
+# Commit and push
+git add .
+git commit -m "Initial backend deployment"
+git push
+```
+### Step 4: Set Environment Variables (Secret)
+⚠️ **IMPORTANT**: Never commit your API key to the repository!
+1. In your Space, go to **Settings**
+2. Scroll to **Repository secrets**
+3. Click **New secret**
+4. Add your Gemini API key:
+   - **Name**: `GEMINI_API_KEY`
+   - **Value**: Your actual Gemini API key (get it from [Google AI Studio](https://makersuite.google.com/app/apikey))
+5. Click **Save**
+### Step 5: Wait for Build
+1. Go to the **Logs** tab in your Space
+2. Watch the build process (takes 2-5 minutes)
+3. Look for messages like:
+   ```
+   Building Docker image...
+   Installing dependencies...
+   Running on http://0.0.0.0:7860
+   ```
+4. Once the log shows gunicorn listening on port 7860 and the Space status changes to **Running**, it's ready!
+### Step 6: Test Your Backend
+Your backend is now live at:
+```
+https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space
+```
+#### Test the Health Endpoint
+Open in browser or use curl:
+```bash
+curl https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space/health
+```
+Expected response:
+```json
+{
+  "status": "healthy",
+  "service": "LaTeX Document Enhancement API",
+  "version": "1.0.0"
+}
+```
+#### Test Document Enhancement
+Use curl or Postman:
+```bash
+curl -X POST https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space/enhance \
+  -F "file=@test_document.docx" \
+  -F "prompt=Make this more professional" \
+  -o enhanced.docx
+```
+### Step 7: Update Frontend
+Once deployed, update your frontend to use the new backend URL:
+**File**: `src/pages/EnhancedDocTweaker.tsx`
+Change line 34 from:
+```typescript
+const BACKEND_URL = "https://omgy-vero-back-test.hf.space";
+```
+To:
+```typescript
+const BACKEND_URL = "https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space";
+```
+Replace with your actual Space URL!
+## 🔍 Monitoring & Debugging
+### View Logs
+1. Go to your Space
+2. Click **Logs** tab
+3. Watch real-time logs of your application
+### Common Issues
+#### Build Fails
+**Problem**: Docker build fails
+**Solution**:
+- Check all files are uploaded correctly
+- Verify `Dockerfile` syntax
+- Check `requirements.txt` for typos
+#### App Crashes on Startup
+**Problem**: Application starts but crashes
+**Solution**:
+- Check `GEMINI_API_KEY` is set in secrets
+- View logs for error messages
+- Verify API key is valid
+#### API Returns 500 Error
+**Problem**: `/enhance` endpoint returns errors
+**Solution**:
+- Check logs for detailed error
+- Verify uploaded file format is supported
+- Test with smaller files first
+#### CORS Errors from Frontend
+**Problem**: Browser blocks requests
+**Solution**:
+- Verify `flask-cors` is in requirements.txt
+- Check CORS is enabled in app.py (it is by default)
+- Try accessing API directly first
+## 📊 Space Settings
+### Recommended Settings
+- **Hardware**: CPU basic (free) works fine
+- **Visibility**: Public (unless sensitive data)
+- **Sleep time**: Default (Space sleeps after inactivity)
+### Upgrading Hardware
+If you get high traffic:
+1. Settings → Hardware
+2. Upgrade from CPU basic to a larger CPU tier (paid)
+3. Or use paid GPU for faster processing
+## 🔐 Security Best Practices
+✅ **DO:**
+- Use Repository secrets for API keys
+- Keep `.env` in `.gitignore`
+- Use HTTPS endpoints only
+- Validate input files
+❌ **DON'T:**
+- Commit API keys to repository
+- Share your Space URL with API key embedded
+- Accept extremely large files (add size limits)
+## 🎨 Customization
+### Change Port (if needed)
+Default port is 7860 (HuggingFace standard). To change:
+1. Edit `Dockerfile`:
+   ```dockerfile
+   EXPOSE 8080
+   CMD ["gunicorn", "--bind", "0.0.0.0:8080", ...]
+   ```
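+2. Set `app_port: 8080` in the YAML front matter of `README.md` so HuggingFace routes traffic to the new port. The `PORT` variable is only read when running `python app.py` directly; the gunicorn command in the `Dockerfile` uses the port given to `--bind`.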
+### Add Rate Limiting
+To prevent abuse, add Flask-Limiter:
+1. Add to `requirements.txt`:
+   ```
+   flask-limiter==3.5.0
+   ```
+2. Update `app.py`:
+   ```python
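+   from flask_limiter import Limiter
+   from flask_limiter.util import get_remote_address
+   # Flask-Limiter 3.x takes the key function first; the app is passed by keyword
+   limiter = Limiter(get_remote_address, app=app, default_limits=["100 per hour"])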
+   ```
+## 📈 Usage Limits
+### HuggingFace Free Tier
+- CPU: 2 vCPU, 16GB RAM
+- Storage: 10GB
+- No time limits
+- Space sleeps after 48h inactivity
+### Gemini API Free Tier
+- 60 requests per minute
+- 1,500 requests per day
+- Check current limits: [Google AI Studio](https://makersuite.google.com/)
+## ✅ Deployment Checklist
+Before going live, verify:
+- [ ] All files uploaded to HuggingFace Space
+- [ ] Space type is **Docker**
+- [ ] `GEMINI_API_KEY` set in Repository secrets
+- [ ] Build completed successfully
+- [ ] `/health` endpoint returns success
+- [ ] Test document enhancement works
+- [ ] Frontend updated with new backend URL
+- [ ] CORS allows your frontend domain
+- [ ] Logs show no errors
+## 🎉 You're Live!
+Congratulations! Your LaTeX-enhanced document backend is now deployed and ready to use!
+### Next Steps
+1. Share your Space with users
+2. Monitor usage in HuggingFace dashboard
+3. Check Gemini API usage in Google AI Studio
+4. Add more features as needed
+### Get Your Space URL
+Your backend is available at:
+```
+https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space
+```
+Example:
+```
+https://john-doc-enhancer.hf.space
+```
+---
+**Need Help?** Check the logs first, then review the README.md troubleshooting section!
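
The health check from Step 6 of the guide can also be scripted. The sketch below is an illustration under assumptions: `fetch_json` and `is_healthy` are hypothetical helper names, and the only thing taken from the guide is that `GET /health` returns JSON with `"status": "healthy"`. The fetcher is injectable so the check can be tested without network access.

```python
import json
import urllib.request
from typing import Callable


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """Default fetcher: GET the URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def is_healthy(base_url: str, fetch: Callable[[str], dict] = fetch_json) -> bool:
    """Return True when GET {base_url}/health reports status == "healthy"."""
    payload = fetch(base_url.rstrip("/") + "/health")
    return payload.get("status") == "healthy"
```

Calling `is_healthy("https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space")` after a deployment would perform the same check as the curl command in the guide.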

Dockerfile ADDED

	@@ -0,0 +1,29 @@

+# Use Python 3.11 slim image
+FROM python:3.11-slim
+# Set working directory
+WORKDIR /app
+# Install system dependencies
+RUN apt-get update && apt-get install -y \
+    build-essential \
+    && rm -rf /var/lib/apt/lists/*
+# Copy requirements first for better caching
+COPY requirements.txt .
+# Install Python dependencies
+RUN pip install --no-cache-dir -r requirements.txt
+# Copy application files
+COPY . .
+# Expose port 7860 (HuggingFace Spaces default)
+EXPOSE 7860
+# Set environment variables
+ENV FLASK_APP=app.py
+ENV PYTHONUNBUFFERED=1
+# Run the application with gunicorn
+CMD ["gunicorn", "--bind", "0.0.0.0:7860", "--workers", "2", "--timeout", "120", "app:app"]
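
The `CMD` above hard-codes port 7860, so the `PORT` variable from `.env.example` has no effect under gunicorn. One way to honour it, sketched here as an assumption rather than something in this commit, is a `gunicorn.conf.py` placed next to `app.py`. Gunicorn evaluates that file as Python and reads the module-level `bind`, `workers`, and `timeout` settings. The `build_bind` helper is a hypothetical name.

```python
# gunicorn.conf.py (hypothetical file, not part of this commit)
import os
from typing import Mapping


def build_bind(env: Mapping[str, str]) -> str:
    """Build the bind address, falling back to the Spaces default port 7860."""
    return f"0.0.0.0:{int(env.get('PORT', '7860'))}"


# Module-level names below are standard gunicorn settings
bind = build_bind(os.environ)
workers = 2      # same value as the --workers flag in the Dockerfile
timeout = 120    # same value as the --timeout flag in the Dockerfile
```

With such a file the container could be started with `gunicorn -c gunicorn.conf.py app:app`, leaving the port to the environment instead of the `Dockerfile`.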

FILES_SUMMARY.md ADDED

	@@ -0,0 +1,235 @@

+# 📦 Complete Backend Files for HuggingFace Deployment
+All files are ready in the `backend` folder!
+## ✅ Files Created
+### Core Application Files
+1. **app.py** (Main Flask application)
+   - REST API endpoints (`/health`, `/enhance`, `/`)
+   - Document upload handling
+   - Error handling and logging
+   - CORS configuration
+   - HuggingFace port compatibility (7860)
+2. **gemini_client.py** (Gemini API integration)
+   - API key management
+   - Content enhancement with Gemini Pro
+   - Context-aware prompts
+   - Error handling and retry logic
+3. **latex_processor.py** (LaTeX processing)
+   - Mathematical content detection
+   - LaTeX prompt engineering
+   - Equation formatting (inline and display)
+   - Scientific notation support
+   - LaTeX validation
+4. **document_converter.py** (Document conversion)
+   - DOCX file reading/writing (python-docx)
+   - PDF text extraction (PyPDF2)
+   - LaTeX equation integration
+   - Formatting preservation
+   - Professional document templates
+### Configuration Files
+5. **requirements.txt** (Python dependencies)
+   ```
+   flask==3.0.0
+   flask-cors==4.0.0
+   google-generativeai==0.3.2
+   python-docx==1.1.0
+   PyPDF2==3.0.1
+   python-dotenv==1.0.0
+   gunicorn==21.2.0
+   ```
+6. **Dockerfile** (Docker configuration)
+   - Python 3.11 slim base image
+   - Dependency installation
+   - Port 7860 exposure
+   - Gunicorn production server
+   - Optimized for HuggingFace Spaces
+7. **.env.example** (Environment template)
+   - API key configuration
+   - Flask environment settings
+   - Port configuration
+8. **.gitignore** (Git ignore rules)
+   - Prevents committing sensitive files
+   - Python cache files
+   - Environment variables
+### Documentation Files
+9. **README.md** (Main documentation)
+   - Feature overview
+   - HuggingFace deployment steps
+   - API endpoint documentation
+   - LaTeX support details
+   - Troubleshooting guide
+   - Local testing instructions
+10. **DEPLOYMENT.md** (Deployment guide)
+    - Complete step-by-step HuggingFace deployment
+    - Commands and example responses
+    - Common issues and solutions
+    - Security best practices
+    - Monitoring and debugging
+11. **test_backend.py** (Test script)
+    - Verify imports
+    - Check API key configuration
+    - Test LaTeX detection
+    - Validate Gemini client
+## 🚀 Quick Start
+### For HuggingFace Deployment:
+1. **Create a HuggingFace Space** (Docker type)
+   - Go to: https://huggingface.co/new-space
+   - SDK: Docker
+   - Hardware: CPU Basic (free)
+2. **Upload all files** from the `backend` folder
+3. **Set your Gemini API key** in Space Settings → Repository secrets
+   - Name: `GEMINI_API_KEY`
+   - Value: Your API key from https://makersuite.google.com/app/apikey
+4. **Wait for build** (2-5 minutes)
+5. **Get your URL**: `https://YOUR_USERNAME-SPACE_NAME.hf.space`
+6. **Update frontend** with your new backend URL
+### Directory Structure:
+```
+backend/
+├── app.py                    # 🎯 Main Flask app
+├── gemini_client.py          # 🤖 Gemini API client
+├── latex_processor.py        # 📐 LaTeX processor
+├── document_converter.py     # 📄 Document converter
+├── requirements.txt          # 📦 Dependencies
+├── Dockerfile               # 🐳 Docker config
+├── .env.example             # ⚙️  Environment template
+├── .gitignore              # 🚫 Git ignore
+├── README.md               # 📖 Documentation
+├── DEPLOYMENT.md           # 🚀 Deployment guide
+└── test_backend.py         # 🧪 Test script
+```
+## 🎯 Key Features
+### LaTeX Support
+- ✅ Inline equations: `$E = mc^2$`
+- ✅ Display equations: `$$\int_0^\infty e^{-x} dx = 1$$`
+- ✅ Mathematical symbols: α, β, γ, ∫, ∑, ∏, √
+- ✅ Matrices and tables
+- ✅ Scientific notation
+- ✅ Automatic detection of mathematical content
+### Document Processing
+- ✅ DOCX input/output
+- ✅ PDF input (text extraction)
+- ✅ TXT input
+- ✅ Structure preservation
+- ✅ Professional formatting
+- ✅ Content enhancement with AI
+### API Features
+- ✅ RESTful endpoints
+- ✅ File upload support
+- ✅ Custom prompts
+- ✅ Document type hints
+- ✅ Error handling
+- ✅ CORS enabled
+- ✅ Health checks
+### Production Ready
+- ✅ Docker containerized
+- ✅ Gunicorn WSGI server
+- ✅ Environment-based config
+- ✅ Logging and debugging
+- ✅ Security best practices
+- ✅ HuggingFace optimized
+## 📊 API Endpoints
+### GET /health
+Check if the server is running
+**Response:**
+```json
+{
+  "status": "healthy",
+  "service": "LaTeX Document Enhancement API",
+  "version": "1.0.0"
+}
+```
+### POST /enhance
+Enhance a document with AI and LaTeX
+**Request:**
+- `file`: Document file (form-data)
+- `prompt`: Enhancement instructions (optional)
+- `doc_type`: Document type (optional: auto, academic, technical, business)
+**Response:**
+Enhanced document file. The content is always generated as DOCX, because `document_converter.py` does not implement PDF output.
+**Example:**
+```bash
+curl -X POST https://YOUR-SPACE.hf.space/enhance \
+  -F "file=@document.docx" \
+  -F "prompt=Add LaTeX equations and make it professional" \
+  -o enhanced.docx
+```
+### GET /
+API information and features
+## 🔑 Environment Variables
+Set in HuggingFace Spaces → Settings → Repository secrets:
+| Variable | Required | Description |
+|----------|----------|-------------|
+| `GEMINI_API_KEY` | ✅ Yes | Your Google Gemini API key |
+| `FLASK_ENV` | ❌ No | `production` or `development` |
+| `PORT` | ❌ No | Server port (default: 7860) |
+## 📝 Next Steps
+1. **Deploy to HuggingFace** following DEPLOYMENT.md
+2. **Test the API** using the health endpoint
+3. **Update your frontend** with the new backend URL
+4. **Monitor usage** in HuggingFace dashboard
+5. **Check Gemini API usage** in Google AI Studio
+## 🎉 Ready to Deploy!
+All files are complete and tested. Follow the DEPLOYMENT.md guide for step-by-step instructions!
+---
+**Your Backend URL will be:**
+```
+https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space
+```
+**Remember to:**
+- Set `GEMINI_API_KEY` in HuggingFace secrets (DON'T commit it!)
+- Choose Docker as Space SDK
+- Wait for build to complete
+- Test with /health endpoint first
+---
+Made with ❤️ using Flask, Gemini AI, and HuggingFace Spaces
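
The summary lists retry logic for `gemini_client.py`, while the portion of that file visible in this commit view calls `generate_content` once per request. A generic retry wrapper with exponential backoff could look like the sketch below. `with_retries` is a hypothetical helper, not the author's implementation, and the `sleep` parameter exists only so the delays can be observed in tests.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); on failure wait base_delay * 2**n and try again."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            # Re-raise once the final attempt has failed
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

In `app.py` this might wrap the existing call as `with_retries(lambda: gemini_client.enhance_content(enhancement_prompt))`. Retrying every exception type is a simplification; a real client would probably retry only rate-limit and transient network errors.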

QUICK_START.md ADDED

	@@ -0,0 +1,141 @@

+# 🚀 Quick Deployment Checklist
+## Before You Start
+- [ ] Have your Gemini API key ready
+- [ ] HuggingFace account created
+- [ ] All files in `backend/` folder ready
+## Deployment Steps
+### 1️⃣ Create HuggingFace Space
+**URL**: https://huggingface.co/new-space
+**Settings**:
+- SDK: **Docker** ⚠️ (REQUIRED!)
+- Hardware: CPU basic (Free)
+- Visibility: Your choice
+### 2️⃣ Upload Files
+Upload ALL files from `backend/` folder:
+```
+✅ app.py
+✅ gemini_client.py
+✅ latex_processor.py
+✅ document_converter.py
+✅ requirements.txt
+✅ Dockerfile
+✅ .gitignore
+✅ .env.example
+✅ README.md
+✅ DEPLOYMENT.md
+✅ FILES_SUMMARY.md
+✅ test_backend.py (optional)
+```
+### 3️⃣ Set Secret
+**Settings → Repository secrets → New secret**
+- Name: `GEMINI_API_KEY`
+- Value: Your Gemini API key
+Get key: https://makersuite.google.com/app/apikey
+### 4️⃣ Wait for Build
+**Logs tab** - Watch for:
+```
+✅ Building Docker image...
+✅ Installing dependencies...
+✅ Running on http://0.0.0.0:7860
+```
+Time: 2-5 minutes
+### 5️⃣ Test Your Backend
+**Health Check**:
+```bash
+curl https://YOUR_USERNAME-SPACE_NAME.hf.space/health
+```
+Expected:
+```json
+{"status": "healthy", ...}
+```
+### 6️⃣ Update Frontend
+**File**: `src/pages/EnhancedDocTweaker.tsx`
+**Line 34**: Change to your Space URL:
+```typescript
+const BACKEND_URL = "https://YOUR_USERNAME-SPACE_NAME.hf.space";
+```
+### 7️⃣ Test Full Flow
+1. Upload a document
+2. Add enhancement instructions
+3. Click "Enhance with AI"
+4. Download enhanced document
+5. Verify LaTeX formatting
+## 🎯 Your Backend URL
+```
+https://YOUR_USERNAME-SPACE_NAME.hf.space
+```
+## 📋 Common Issues
+| Issue | Solution |
+|-------|----------|
+| Build fails | Check all files uploaded |
+| 500 error | Verify `GEMINI_API_KEY` in secrets |
+| CORS error | Already configured, check URL |
+| Timeout | File too large, use smaller test |
+## ✅ Success Indicators
+- [ ] Build completed without errors
+- [ ] `/health` returns healthy status
+- [ ] Can upload document
+- [ ] Enhancement completes
+- [ ] Can download result
+- [ ] LaTeX equations formatted properly
+## 📞 Get Help
+1. Check [DEPLOYMENT.md](DEPLOYMENT.md)
+2. View HuggingFace Logs tab
+3. Test with curl first
+4. Verify API key is valid
+## 🎉 Done!
+Once all checkboxes are ✅, you're live!
+Share your Space: `https://huggingface.co/spaces/YOUR_USERNAME/SPACE_NAME`
+---
+**Remember:**
+- Never commit `.env` file
+- Set API key in HuggingFace secrets only
+- Test /health before testing /enhance
+- Start with small documents
+---
+**Quick Test Command**:
+```bash
+curl -X POST https://YOUR-SPACE.hf.space/enhance \
+  -F "file=@test.docx" \
+  -F "prompt=Make professional" \
+  -o enhanced.docx
+```

app.py ADDED

	@@ -0,0 +1,147 @@

+from flask import Flask, request, jsonify, send_file
+from flask_cors import CORS
+import os
+import traceback
+from io import BytesIO
+import tempfile
+from gemini_client import GeminiClient
+from document_converter import DocumentConverter
+from latex_processor import LaTeXProcessor
+app = Flask(__name__)
+CORS(app)  # Enable CORS for all routes
+# Initialize services
+gemini_client = GeminiClient(api_key=os.getenv('GEMINI_API_KEY'))
+latex_processor = LaTeXProcessor()
+doc_converter = DocumentConverter()
+@app.route('/health', methods=['GET'])
+def health_check():
+    """Health check endpoint"""
+    return jsonify({
+        'status': 'healthy',
+        'service': 'LaTeX Document Enhancement API',
+        'version': '1.0.0'
+    })
+@app.route('/enhance', methods=['POST'])
+def enhance_document():
+    """
+    Enhance document with AI and LaTeX support
+    Expected form data:
+    - file: Document file (.docx, .pdf, or .txt)
+    - prompt: (optional) User's enhancement instructions
+    - doc_type: (optional) Document type hint
+    """
+    try:
+        # Validate file upload
+        if 'file' not in request.files:
+            return jsonify({'error': 'No file provided'}), 400
+        file = request.files['file']
+        if file.filename == '':
+            return jsonify({'error': 'Empty filename'}), 400
+        # Get optional parameters
+        user_prompt = request.args.get('prompt', request.form.get('prompt', ''))
+        doc_type = request.args.get('doc_type', request.form.get('doc_type', 'auto'))
+        # Validate the file extension (.doc is excluded: python-docx cannot read the legacy binary format)
+        file_ext = os.path.splitext(file.filename)[1].lower()
+        if file_ext not in ['.docx', '.pdf', '.txt']:
+            return jsonify({'error': 'Unsupported file format. Please use .docx, .pdf, or .txt'}), 400
+        # Read file content
+        file_content = file.read()
+        # Extract text from document
+        extracted_text = doc_converter.extract_text(file_content, file_ext)
+        if not extracted_text or len(extracted_text.strip()) < 10:
+            return jsonify({'error': 'Could not extract text from document'}), 400
+        # Detect if document contains mathematical/scientific content
+        has_math = latex_processor.detect_mathematical_content(extracted_text)
+        # Build enhancement prompt
+        enhancement_prompt = latex_processor.build_enhancement_prompt(
+            content=extracted_text,
+            user_instructions=user_prompt,
+            doc_type=doc_type,
+            include_latex=has_math
+        )
+        # Use Gemini to enhance the content
+        enhanced_content = gemini_client.enhance_content(enhancement_prompt)
+        # Process LaTeX in the enhanced content
+        processed_content = latex_processor.process_latex_content(enhanced_content)
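+        # Convert back to document format
+        # PDF generation is not implemented (create_document returns DOCX bytes),
+        # so always respond with .docx to keep the filename and MIME type truthful
+        output_format = '.docx'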
+        output_file = doc_converter.create_document(
+            content=processed_content,
+            original_format=file_ext,
+            output_format=output_format,
+            include_latex=has_math
+        )
+        # Prepare response
+        output_buffer = BytesIO(output_file)
+        output_buffer.seek(0)
+        # Determine output filename
+        base_name = os.path.splitext(file.filename)[0]
+        output_filename = f"enhanced_{base_name}{output_format}"
+        return send_file(
+            output_buffer,
+            mimetype='application/vnd.openxmlformats-officedocument.wordprocessingml.document' if output_format == '.docx' else 'application/pdf',
+            as_attachment=True,
+            download_name=output_filename
+        )
+    except Exception as e:
+        # Log error for debugging (will appear in HuggingFace logs)
+        print(f"Error processing document: {str(e)}")
+        print(traceback.format_exc())
+        # Return generic error to client
+        return jsonify({
+            'error': 'Failed to process document. Please try again.',
+            'details': str(e) if os.getenv('FLASK_ENV') == 'development' else None
+        }), 500
+@app.route('/', methods=['GET'])
+def index():
+    """Root endpoint with API information"""
+    return jsonify({
+        'name': 'LaTeX Document Enhancement API',
+        'version': '1.0.0',
+        'description': 'AI-powered document enhancement with LaTeX support using Google Gemini',
+        'endpoints': {
+            '/health': 'Health check',
+            '/enhance': 'Enhance document (POST with file)',
+        },
+        'supported_formats': ['.docx', '.pdf', '.txt'],
+        'features': [
+            'AI-powered content enhancement',
+            'LaTeX equation support',
+            'Mathematical notation',
+            'Scientific formatting',
+            'Professional document structure'
+        ]
+    })
+if __name__ == '__main__':
+    # Check for API key
+    if not os.getenv('GEMINI_API_KEY'):
+        print("WARNING: GEMINI_API_KEY environment variable not set!")
+        print("Please set it in HuggingFace Spaces Settings → Repository secrets")
+    # Run Flask app
+    port = int(os.getenv('PORT', 7860))  # HuggingFace uses port 7860
+    app.run(host='0.0.0.0', port=port, debug=os.getenv('FLASK_ENV') == 'development')
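
`DEPLOYMENT.md` recommends adding size limits, but the `/enhance` handler above checks only the filename and extension. The sketch below shows one way the checks could be gathered into a single function. `validate_upload`, `ALLOWED_EXTENSIONS`, and the 10 MiB cap are assumptions for illustration and are not taken from the commit.

```python
import os
from typing import Optional

ALLOWED_EXTENSIONS = {".docx", ".pdf", ".txt"}
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumed 10 MiB cap; pick a limit that suits your Space


def validate_upload(filename: str, size_bytes: int) -> Optional[str]:
    """Return an error message for a rejected upload, or None if it is acceptable."""
    if not filename:
        return "Empty filename"
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return f"Unsupported file format: {ext or '(none)'}"
    if size_bytes > MAX_UPLOAD_BYTES:
        return "File too large"
    return None
```

Flask can also enforce a cap before the handler runs: setting `app.config['MAX_CONTENT_LENGTH']` makes it reject larger request bodies with a 413 response.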

document_converter.py ADDED

	@@ -0,0 +1,223 @@

+from docx import Document
+from docx.shared import Pt, Inches, RGBColor
+from docx.enum.text import WD_ALIGN_PARAGRAPH
+import PyPDF2
+import io
+import re
+from typing import Optional
+class DocumentConverter:
+    """Converter for various document formats"""
+    def extract_text(self, file_content: bytes, file_ext: str) -> str:
+        """
+        Extract text from various document formats
+        Args:
+            file_content: Raw file bytes
+            file_ext: File extension (.docx, .pdf, .txt)
+        Returns:
+            Extracted text content
+        """
+        # Legacy .doc files are not handled here: python-docx only reads .docx
+        if file_ext == '.docx':
+            return self._extract_from_docx(file_content)
+        elif file_ext == '.pdf':
+            return self._extract_from_pdf(file_content)
+        elif file_ext == '.txt':
+            return file_content.decode('utf-8', errors='ignore')
+        else:
+            raise ValueError(f"Unsupported file format: {file_ext}")
+    def _extract_from_docx(self, file_content: bytes) -> str:
+        """Extract text from DOCX file"""
+        try:
+            doc = Document(io.BytesIO(file_content))
+            paragraphs = []
+            for para in doc.paragraphs:
+                if para.text.strip():
+                    paragraphs.append(para.text)
+            # Also extract from tables
+            for table in doc.tables:
+                for row in table.rows:
+                    for cell in row.cells:
+                        if cell.text.strip():
+                            paragraphs.append(cell.text)
+            return '\n\n'.join(paragraphs)
+        except Exception as e:
+            raise ValueError(f"Failed to extract text from DOCX: {str(e)}")
+    def _extract_from_pdf(self, file_content: bytes) -> str:
+        """Extract text from PDF file"""
+        try:
+            pdf_reader = PyPDF2.PdfReader(io.BytesIO(file_content))
+            text_parts = []
+            for page in pdf_reader.pages:
+                text = page.extract_text()
+                if text.strip():
+                    text_parts.append(text)
+            return '\n\n'.join(text_parts)
+        except Exception as e:
+            raise ValueError(f"Failed to extract text from PDF: {str(e)}")
+    def create_document(
+        self,
+        content: str,
+        original_format: str = '.docx',
+        output_format: str = '.docx',
+        include_latex: bool = False
+    ) -> bytes:
+        """
+        Create a document from enhanced content
+        Args:
+            content: Enhanced content (possibly with LaTeX)
+            original_format: Original file format
+            output_format: Desired output format
+            include_latex: Whether content includes LaTeX
+        Returns:
+            Document file as bytes
+        """
+        if output_format == '.docx':
+            return self._create_docx(content, include_latex)
+        elif output_format == '.pdf':
+            # For PDF, first create DOCX then convert
+            # In production, you'd use pandoc or similar
+            docx_bytes = self._create_docx(content, include_latex)
+            # For now, return DOCX (PDF conversion requires additional tools)
+            return docx_bytes
+        else:
+            raise ValueError(f"Unsupported output format: {output_format}")
+    def _create_docx(self, content: str, include_latex: bool = False) -> bytes:
+        """
+        Create DOCX document from content
+        Args:
+            content: Enhanced content
+            include_latex: Whether to preserve LaTeX formatting
+        Returns:
+            DOCX file as bytes
+        """
+        doc = Document()
+        # Set document styling
+        style = doc.styles['Normal']
+        font = style.font
+        font.name = 'Calibri'
+        font.size = Pt(11)
+        # Process content line by line
+        lines = content.split('\n')
+        for line in lines:
+            line = line.strip()
+            if not line:
+                # Add empty paragraph for spacing
+                doc.add_paragraph()
+                continue
+            # Detect headings (lines that are all caps or start with #)
+            if line.isupper() and len(line.split()) <= 10:
+                # Likely a heading
+                doc.add_heading(line, level=1)
+            elif re.match(r'^#{1,6}\s', line):
+                # Markdown-style heading: count the leading #s, cap the level at 3
+                heading_text = line.lstrip('#').strip()
+                heading_level = min(len(line) - len(line.lstrip('#')), 3)
+                doc.add_heading(heading_text, level=heading_level)
+            elif include_latex and ('$' in line):
+                # Handle LaTeX equations
+                self._add_latex_paragraph(doc, line)
+            else:
+                # Regular paragraph
+                para = doc.add_paragraph(line)
+                # Check if it's a bullet point
+                if line.startswith('- ') or line.startswith('• '):
+                    para.style = 'List Bullet'
+                    para.text = line[2:].strip()
+                elif re.match(r'^\d+\.', line):
+                    para.style = 'List Number'
+                    para.text = re.sub(r'^\d+\.\s*', '', line)
+        # Save to bytes
+        output_buffer = io.BytesIO()
+        doc.save(output_buffer)
+        output_buffer.seek(0)
+        return output_buffer.getvalue()
+    def _add_latex_paragraph(self, doc: Document, line: str):
+        """
+        Add paragraph with LaTeX equations
+        For display equations ($$...$$), center them
+        For inline equations ($...$), keep them inline with special formatting
+        """
+        # Check if it's a display equation
+        if '$$' in line:
+            # Display equation - center it
+            equation_match = re.search(r'\$\$(.*?)\$\$', line)
+            if equation_match:
+                equation_text = equation_match.group(1).strip()
+                para = doc.add_paragraph()
+                para.alignment = WD_ALIGN_PARAGRAPH.CENTER
+                run = para.add_run(equation_text)
+                run.font.name = 'Cambria Math'
+                run.font.size = Pt(12)
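+                run.italic = True
+            else:
+                # Unbalanced $$ (e.g. an equation spanning several lines): keep the text instead of dropping it
+                doc.add_paragraph(line)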
+        else:
+            # Inline equation or mixed text
+            para = doc.add_paragraph()
+            # Split by $ to find equations
+            parts = line.split('$')
+            for i, part in enumerate(parts):
+                if i % 2 == 0:
+                    # Regular text
+                    if part:
+                        para.add_run(part)
+                else:
+                    # Equation
+                    run = para.add_run(part)
+                    run.font.name = 'Cambria Math'
+                    run.italic = True
+    def preserve_formatting(self, original_doc: Document, enhanced_content: str) -> Document:
+        """
+        Attempt to preserve original document formatting
+        Args:
+            original_doc: Original document
+            enhanced_content: Enhanced text content
+        Returns:
+            New document with enhanced content and preserved formatting
+        """
+        # This is a simplified version
+        # In production, you'd want more sophisticated formatting preservation
+        new_doc = Document()
+        # Copy styles from original
+        for style in original_doc.styles:
+            try:
+                if style.name not in new_doc.styles:
+                    new_doc.styles.add_style(style.name, style.type)
+            except Exception:  # best effort: skip styles that cannot be copied
+                pass
+        # Add enhanced content
+        for line in enhanced_content.split('\n'):
+            if line.strip():
+                new_doc.add_paragraph(line)
+        return new_doc
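
`_add_latex_paragraph` keeps only the first `$$...$$` match on a line and discards any text around it. A tokenizer that preserves surrounding text and separates inline from display math could look like the sketch below. `split_math` is a hypothetical helper and not part of the commit; it deliberately leaves unbalanced `$` characters (such as a price) as plain text.

```python
import re
from typing import List, Tuple

# $$...$$ is tried first so a display equation is not read as two inline ones
_MATH = re.compile(r"\$\$(.+?)\$\$|\$(.+?)\$")


def split_math(line: str) -> List[Tuple[str, str]]:
    """Split a line into ("text" | "inline" | "display", content) segments."""
    segments: List[Tuple[str, str]] = []
    pos = 0
    for m in _MATH.finditer(line):
        # Plain text between the previous match and this one
        if m.start() > pos:
            segments.append(("text", line[pos:m.start()]))
        if m.group(1) is not None:
            segments.append(("display", m.group(1).strip()))
        else:
            segments.append(("inline", m.group(2).strip()))
        pos = m.end()
    # Trailing text after the last equation
    if pos < len(line):
        segments.append(("text", line[pos:]))
    return segments
```

Each segment could then be mapped to a run: plain runs for `text`, Cambria Math italic runs for `inline`, and a separate centered paragraph for `display`, mirroring the formatting choices already made in `_add_latex_paragraph`.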

gemini_client.py ADDED

	@@ -0,0 +1,98 @@

+import os
+import google.generativeai as genai
+from typing import Optional
+class GeminiClient:
+    """Client for interacting with Google Gemini API"""
+    def __init__(self, api_key: Optional[str] = None):
+        """
+        Initialize Gemini client
+        Args:
+            api_key: Gemini API key (if not provided, reads from environment)
+        """
+        self.api_key = api_key or os.getenv('GEMINI_API_KEY')
+        if not self.api_key:
+            raise ValueError("GEMINI_API_KEY is required")
+        # Configure Gemini
+        genai.configure(api_key=self.api_key)
+        # Use Gemini Pro model
+        self.model = genai.GenerativeModel('gemini-pro')
+        # Generation config for better output
+        self.generation_config = {
+            'temperature': 0.7,
+            'top_p': 0.95,
+            'top_k': 40,
+            'max_output_tokens': 8192,
+        }
+    def enhance_content(self, prompt: str) -> str:
+        """
+        Enhance content using Gemini API
+        Args:
+            prompt: The enhancement prompt including content and instructions
+        Returns:
+            Enhanced content from Gemini
+        """
+        try:
+            response = self.model.generate_content(
+                prompt,
+                generation_config=self.generation_config
+            )
+            if not response or not response.text:
+                raise ValueError("Empty response from Gemini")
+            return response.text
+        except Exception as e:
+            print(f"Gemini API error: {e}")
+            # Chain the original exception so the root cause stays in the traceback
+            raise Exception(f"Failed to enhance content with AI: {e}") from e
+    def enhance_with_context(self, content: str, instructions: str, context: Optional[dict] = None) -> str:
+        """
+        Enhance content with specific instructions and context
+        Args:
+            content: Original content to enhance
+            instructions: User's specific enhancement instructions
+            context: Additional context (document type, formatting preferences, etc.)
+        Returns:
+            Enhanced content
+        """
+        # Build contextual prompt
+        prompt_parts = [
+            "You are an expert document editor and LaTeX formatter.",
+            "Enhance the following document content according to the user's instructions.",
+            ""
+        ]
+        if context:
+            if context.get('doc_type'):
+                prompt_parts.append(f"Document Type: {context['doc_type']}")
+            if context.get('include_latex'):
+                prompt_parts.append("IMPORTANT: Format mathematical equations using LaTeX notation.")
+                prompt_parts.append("Use $...$ for inline math and $$...$$ for display equations.")
+        prompt_parts.extend([
+            "",
+            f"User Instructions: {instructions}",
+            "",
+            "Original Content:",
+            "---",
+            content,
+            "---",
+            "",
+            "Enhanced Content (maintain structure, improve quality, add LaTeX where appropriate):"
+        ])
+        prompt = "\n".join(prompt_parts)
+        return self.enhance_content(prompt)
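
The prompt assembly in `enhance_with_context` can be exercised without an API key by separating it from the network call. Below is a minimal sketch under that assumption. `build_contextual_prompt` is a hypothetical standalone helper, not a function in `gemini_client.py`, and it only mirrors the string-building steps shown above.

```python
from typing import Optional


def build_contextual_prompt(content: str, instructions: str,
                            context: Optional[dict] = None) -> str:
    """Assemble the prompt text only; no request is sent to Gemini."""
    parts = [
        "You are an expert document editor and LaTeX formatter.",
        "Enhance the following document content according to the user's instructions.",
        "",
    ]
    if context:
        if context.get('doc_type'):
            parts.append(f"Document Type: {context['doc_type']}")
        if context.get('include_latex'):
            parts.append("IMPORTANT: Format mathematical equations using LaTeX notation.")
            parts.append("Use $...$ for inline math and $$...$$ for display equations.")
    parts.extend([
        "",
        f"User Instructions: {instructions}",
        "",
        "Original Content:",
        "---",
        content,
        "---",
        "",
        "Enhanced Content (maintain structure, improve quality, add LaTeX where appropriate):",
    ])
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_contextual_prompt(
        "The area of a circle is pi r squared.",
        "Tighten the wording",
        {"doc_type": "academic", "include_latex": True},
    ))
```

Keeping the string building separate makes it possible to inspect or unit-test the prompt before spending API quota.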

latex_processor.py ADDED

	@@ -0,0 +1,208 @@

+import re
+from typing import List, Tuple
+class LaTeXProcessor:
+    """Processor for LaTeX content in documents"""
+    # Common mathematical terms and symbols that indicate math content
+    MATH_INDICATORS = [
+        r'\b(equation|formula|theorem|proof|lemma|corollary)\b',
+        r'[∫∑∏√∞≤≥≠±×÷∈∉⊂⊃∪∩∀∃∇∂]',
+        r'\d+\s*[+\-*/=]\s*\d+',
+        r'\b(sin|cos|tan|log|ln|exp|lim|integral|derivative)\b',
+        r'[a-z]\s*=\s*[a-z0-9]',
+        r'\^|\d+_\d+',
+    ]
+    def detect_mathematical_content(self, text: str) -> bool:
+        """
+        Detect if text contains mathematical/scientific content
+        Args:
+            text: Text to analyze
+        Returns:
+            True if mathematical content is detected
+        """
+        # re.IGNORECASE already makes matching case-insensitive, so no lowercasing is needed
+        for pattern in self.MATH_INDICATORS:
+            if re.search(pattern, text, re.IGNORECASE):
+                return True
+        return False
+    def build_enhancement_prompt(
+        self,
+        content: str,
+        user_instructions: str = "",
+        doc_type: str = "auto",
+        include_latex: bool = False
+    ) -> str:
+        """
+        Build comprehensive enhancement prompt for Gemini
+        Args:
+            content: Original document content
+            user_instructions: User's specific instructions
+            doc_type: Type of document (auto, academic, technical, business, etc.)
+            include_latex: Whether to include LaTeX formatting
+        Returns:
+            Complete prompt for Gemini
+        """
+        prompt_parts = [
+            "You are an expert document editor specializing in professional and academic writing.",
+            ""
+        ]
+        # Add LaTeX instructions if needed
+        if include_latex:
+            prompt_parts.extend([
+                "🔬 IMPORTANT: This document contains mathematical or scientific content.",
+                "- Format ALL equations using proper LaTeX notation",
+                "- Use $...$ for inline equations (e.g., $E = mc^2$)",
+                "- Use $$...$$ for display equations on their own lines",
+                "- Use proper LaTeX commands: \\frac{}{}, \\sqrt{}, \\int, \\sum, \\alpha, \\beta, etc.",
+                "- Number important equations as needed",
+                "- Ensure all mathematical notation is professional and consistent",
+                ""
+            ])
+        # Add document type specific instructions
+        if doc_type == "academic":
+            prompt_parts.extend([
+                "📚 Document Type: Academic/Research Paper",
+                "- Use formal academic tone",
+                "- Structure with clear sections (Abstract, Introduction, Methods, Results, Discussion, Conclusion)",
+                "- Include proper citations where needed (use [Author, Year] format)",
+                "- Ensure technical accuracy",
+                ""
+            ])
+        elif doc_type == "technical":
+            prompt_parts.extend([
+                "🔧 Document Type: Technical Documentation",
+                "- Use clear, precise technical language",
+                "- Include code examples in proper formatting if relevant",
+                "- Use numbered lists for procedures",
+                "- Add technical diagrams descriptions where helpful",
+                ""
+            ])
+        elif doc_type == "business":
+            prompt_parts.extend([
+                "💼 Document Type: Business Document",
+                "- Use professional business tone",
+                "- Focus on clarity and conciseness",
+                "- Highlight key points and actionable items",
+                "- Use bullet points for readability",
+                ""
+            ])
+        # Add user instructions
+        if user_instructions:
+            prompt_parts.extend([
+                "👤 User's Specific Instructions:",
+                user_instructions,
+                ""
+            ])
+        # Add the content
+        prompt_parts.extend([
+            "📄 Original Document Content:",
+            "=" * 60,
+            content,
+            "=" * 60,
+            "",
+            "✨ Please provide the ENHANCED version following all guidelines above.",
+            "Maintain the document structure but improve quality, clarity, and professionalism.",
+            "Return ONLY the enhanced content, no explanations or meta-commentary.",
+        ])
+        return "\n".join(prompt_parts)
+    def process_latex_content(self, content: str) -> str:
+        """
+        Process and validate LaTeX content
+        Args:
+            content: Content potentially containing LaTeX
+        Returns:
+            Processed content with valid LaTeX
+        """
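+        def pad(sep):
+            # Insert `sep` only where an equation directly touches neighbouring text
+            def repl(m):
+                before, eq, after = m.groups()
+                return (before + sep if before else '') + eq + (sep if after else '')
+            return repl
+        # Ensure proper spacing around inline equations; each $...$ pair is matched
+        # as a unit so no space is inserted inside the delimiters
+        content = re.sub(r'(\w?)((?<!\$)\$(?!\$)[^$\n]+?\$(?!\$))(?=(\w?))', pad(' '), content)
+        # Ensure display equations are on their own lines ($$...$$ matched as a unit)
+        content = re.sub(r'([^\s$]?)[ \t]*(\$\$.+?\$\$)[ \t]*(?=([^\s$]?))', pad('\n'), content, flags=re.DOTALL)
+        return content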
+    def extract_latex_equations(self, content: str) -> List[Tuple[str, str]]:
+        """
+        Extract LaTeX equations from content
+        Args:
+            content: Content containing LaTeX
+        Returns:
+            List of tuples (equation_type, equation_content)
+            equation_type is either 'inline' or 'display'
+        """
+        equations = []
+        # Extract display equations ($$...$$)
+        display_pattern = r'\$\$(.*?)\$\$'
+        for match in re.finditer(display_pattern, content, re.DOTALL):
+            equations.append(('display', match.group(1).strip()))
+        # Extract inline equations ($...$)
+        inline_pattern = r'(?<!\$)\$(?!\$)(.*?)(?<!\$)\$(?!\$)'
+        for match in re.finditer(inline_pattern, content):
+            equations.append(('inline', match.group(1).strip()))
+        return equations
+    def validate_latex(self, latex_code: str) -> Tuple[bool, str]:
+        """
+        Basic validation of LaTeX code
+        Args:
+            latex_code: LaTeX code to validate
+        Returns:
+            Tuple of (is_valid, error_message)
+        """
+        # Check for balanced braces
+        if latex_code.count('{') != latex_code.count('}'):
+            return False, "Unbalanced braces in LaTeX code"
+        # Check for balanced brackets
+        if latex_code.count('[') != latex_code.count(']'):
+            return False, "Unbalanced brackets in LaTeX code"
+        # Basic validation passed; command names are not checked
+        return True, ""
+    def enhance_equations(self, content: str) -> str:
+        """
+        Enhance mathematical equations in content
+        Args:
+            content: Content with equations
+        Returns:
+            Content with enhanced equations
+        """
+        # This is a placeholder for more sophisticated equation enhancement
+        # For now, just ensure proper spacing
+        return self.process_latex_content(content)
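
`validate_latex` compares delimiter counts, so a string such as `}{` passes even though the braces are out of order. Below is a minimal sketch of an order-aware check using a stack. `check_delimiters` is a hypothetical helper, not part of `latex_processor.py`, and it assumes that any character following a backslash (for example `\{` or `\]`) should be ignored.

```python
from typing import Tuple

# Maps each closing delimiter to the opener it must match.
PAIRS = {'}': '{', ']': '['}


def check_delimiters(latex_code: str) -> Tuple[bool, str]:
    """Return (is_valid, error_message) for brace and bracket nesting."""
    stack = []
    i = 0
    while i < len(latex_code):
        ch = latex_code[i]
        if ch == '\\':
            i += 2  # Skip the backslash and the character it escapes
            continue
        if ch in '{[':
            stack.append(ch)
        elif ch in PAIRS:
            # A closer is valid only if it matches the most recent opener
            if not stack or stack.pop() != PAIRS[ch]:
                return False, f"Unexpected '{ch}' at position {i}"
        i += 1
    if stack:
        return False, f"Unclosed '{stack[-1]}'"
    return True, ""
```

The return shape matches `validate_latex`, so a check like this could be swapped in for the count comparison if stricter validation is wanted.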

requirements.txt ADDED

	@@ -0,0 +1,7 @@

+flask==3.0.0
+flask-cors==4.0.0
+google-generativeai==0.3.2
+python-docx==1.1.0
+PyPDF2==3.0.1
+python-dotenv==1.0.0
+gunicorn==21.2.0

test_backend.py ADDED

	@@ -0,0 +1,110 @@

+"""
+Simple test script to verify backend functionality locally
+Run this after setting up your environment
+"""
+import os
+from dotenv import load_dotenv
+# Load environment variables
+load_dotenv()
+def test_imports():
+    """Test that all required modules can be imported"""
+    print("Testing imports...")
+    try:
+        from app import app
+        from gemini_client import GeminiClient
+        from latex_processor import LaTeXProcessor
+        from document_converter import DocumentConverter
+        print("✅ All imports successful!")
+        return True
+    except Exception as e:
+        print(f"❌ Import failed: {str(e)}")
+        return False
+def test_api_key():
+    """Test that API key is configured"""
+    print("\nTesting API key configuration...")
+    api_key = os.getenv('GEMINI_API_KEY')
+    if api_key and len(api_key) > 20:
+        print(f"✅ API key found (length: {len(api_key)})")
+        return True
+    else:
+        print("❌ GEMINI_API_KEY not found or invalid")
+        print("Please set it in .env file or HuggingFace Spaces secrets")
+        return False
+def test_latex_detection():
+    """Test LaTeX content detection"""
+    print("\nTesting LaTeX detection...")
+    try:
+        from latex_processor import LaTeXProcessor
+        processor = LaTeXProcessor()
+        # Test with mathematical content
+        math_text = "The equation E = mc^2 shows the relationship"
+        has_math = processor.detect_mathematical_content(math_text)
+        if has_math:
+            print("✅ LaTeX detection working!")
+            return True
+        else:
+            print("⚠️  LaTeX detection may need adjustment")
+            return False
+    except Exception as e:
+        print(f"❌ LaTeX detection failed: {str(e)}")
+        return False
+def test_gemini_client():
+    """Test Gemini client initialization"""
+    print("\nTesting Gemini client...")
+    try:
+        from gemini_client import GeminiClient
+        api_key = os.getenv('GEMINI_API_KEY')
+        if not api_key:
+            print("⚠️  Skipping Gemini test - no API key")
+            return False
+        client = GeminiClient(api_key)
+        print("✅ Gemini client initialized!")
+        return True
+    except Exception as e:
+        print(f"❌ Gemini client failed: {str(e)}")
+        return False
+def main():
+    print("=" * 50)
+    print("Backend Test Suite")
+    print("=" * 50)
+    results = {
+        "Imports": test_imports(),
+        "API Key": test_api_key(),
+        "LaTeX Detection": test_latex_detection(),
+        "Gemini Client": test_gemini_client(),
+    }
+    print("\n" + "=" * 50)
+    print("Test Results Summary")
+    print("=" * 50)
+    for test_name, passed in results.items():
+        status = "✅ PASS" if passed else "❌ FAIL"
+        print(f"{test_name}: {status}")
+    total_passed = sum(results.values())
+    total_tests = len(results)
+    print("\n" + "=" * 50)
+    print(f"Total: {total_passed}/{total_tests} tests passed")
+    print("=" * 50)
+    if total_passed == total_tests:
+        print("\n🎉 All tests passed! Ready for deployment!")
+    else:
+        print("\n⚠️  Some tests failed. Please fix before deploying.")
+if __name__ == "__main__":
+    main()