jmisak committed on
Commit 2e32647 · verified · 1 Parent(s): dbb3036

Update README.md

Files changed (1): README.md (+336 −18)
README.md CHANGED

Removed (previous version):

```diff
@@ -7,39 +7,357 @@ sdk: gradio
  sdk_version: "4.0.0"
  app_file: app.py
  pinned: false
  ---

- # AI Writing Studio

- Production-grade AI writing assistant with real rubric-based scoring for educational use.

  ## Features

- - 🎯 Real rubric scoring (Clarity, Conciseness, Organization, Evidence, Grammar)
- - 🔄 AI-powered revision suggestions
- - 📊 Visual diff highlighting
- - 📝 5 specialized prompt packs (General, Literature, Tech Comm, Academic, Creative)

  ## Usage

- 1. Paste your draft text
- 2. Select a model (distilgpt2 recommended for free tier)
- 3. Choose a prompt pack
- 4. Click "Analyze & Compare"

  ## Models

- - **distilgpt2** (default) - Fast, works on free tier
- - **gpt2** - Better quality, slower
- - **gpt2-medium/large** - Best quality, requires upgraded hardware

  ## Performance

- First analysis: ~30-60 seconds (model loading)
- Subsequent: ~5-10 seconds (cached)

- ## Source

- GitHub: [AI Writing Studio](https://github.com/yourusername/writing-studio)

- Built with [Gradio](https://gradio.app/) and [HuggingFace Transformers](https://huggingface.co/transformers/)
```
Updated README.md (new version):

sdk_version: "4.0.0"
app_file: app.py
pinned: false
license: mit
short_description: AI writing revision with FLAN-T5 and rubric scoring
tags:
- education
- writing
- nlp
- text2text-generation
- instruction-following
- analysis
suggested_hardware: cpu-basic
suggested_storage: small
---

# Writing Studio - HuggingFace Spaces Edition

A production-grade AI Writing Studio powered by **FLAN-T5** for intelligent text revision.

## About

AI Writing Studio is an educational writing assistant that provides **real AI-powered text revision** using instruction-following models:

- **🤖 AI-Powered Revision** using FLAN-T5 (instruction-tuned for text revision)
- **📊 Real Rubric Scoring** across 5 criteria (Clarity, Conciseness, Organization, Evidence, Grammar)
- **🔍 Visual Diff Highlighting** to see exactly what changed
- **📝 5 Specialized Modes** (General, Literature, Tech Comm, Academic, Creative)

## 🆕 What's New: FLAN-T5 Integration

**Major Update**: Replaced GPT-2 with FLAN-T5 for **real AI-powered text revision**.

**What Changed**:
- ✅ **FLAN-T5** is now the default model (instruction-following, actually revises text)
- ❌ **GPT-2 removed** (it only continues text, it doesn't revise)
- 🎯 **Instruction-optimized prompts** for better revision quality
- 🚀 **Automatic model detection** (supports both T5 and GPT-2 pipelines)

**Why This Matters**:
GPT-2 couldn't revise text; it only continued it with unrelated content. FLAN-T5 understands revision instructions and produces genuine improvements to your writing.

**Trade-off**: First load is ~60s instead of ~30s, but you get actual AI revision instead of gibberish!

## Quick Start

1. Open the app on HuggingFace Spaces
2. Paste text (200-500 words recommended for a first try)
3. Choose a revision mode (try "General" first)
4. Click "✨ Revise & Analyze"
5. Wait ~60s for the first analysis (model loading)
6. Compare the original vs the AI-revised text
7. Review rubric scores and highlighted changes

## Features

### AI-Powered Revision with FLAN-T5

**Why FLAN-T5?**
FLAN-T5 is an **instruction-tuned model** trained to follow revision instructions. Unlike GPT-2 (which only continues text), FLAN-T5 actually understands and executes revision tasks such as:
- Improving clarity and readability
- Enhancing academic tone
- Strengthening evidence and support
- Refining technical precision
- Enriching creative imagery

**Real Text Revision**: The AI doesn't just continue your text; it genuinely revises it based on the selected mode.

### 📊 Real Rubric Analysis
Unlike simple prototypes, this version includes actual analysis algorithms:
- **Clarity**: Analyzes sentence length, complexity, and structure
- **Conciseness**: Detects wordy phrases and redundancy
- **Organization**: Checks paragraph structure and transitions
- **Evidence**: Looks for supporting examples and data
- **Grammar**: Basic error detection
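As a rough illustration of what rule-based scoring like this can look like, here is a minimal sketch of two of the criteria. The function names and thresholds are hypothetical, not the actual `rubric_service.py` implementation:

```python
import re

def score_conciseness(text: str) -> float:
    """Toy conciseness heuristic: subtract a point per common wordy phrase.

    Hypothetical stand-in for the app's rubric scoring; the real
    algorithms are more involved.
    """
    wordy = ["in order to", "due to the fact that", "at this point in time"]
    hits = sum(text.lower().count(phrase) for phrase in wordy)
    return max(0.0, 5.0 - hits)  # 5 = perfect score

def score_clarity(text: str) -> float:
    """Toy clarity heuristic: shorter average sentence length scores higher."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    avg_len = sum(len(s.split()) for s in sentences) / len(sentences)
    if avg_len <= 20:          # under ~20 words/sentence: full marks
        return 5.0
    return max(0.0, 5.0 - (avg_len - 20) / 10)  # degrade gradually above that
```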
### 📝 5 Specialized Revision Modes
Choose from instruction-tuned templates optimized for FLAN-T5:
- **General**: Improve clarity and readability for everyday writing
- **Literature**: Strengthen literary analysis with better evidence and terminology
- **Tech Comm**: Enhance technical precision and professional tone
- **Academic**: Improve formal tone, organization, and scholarly voice
- **Creative**: Enhance imagery, voice, and reader engagement

### 🔍 Visual Diff Highlighting
See exactly what the AI changed with a side-by-side comparison and highlighted differences.
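Diffs like this can be built with Python's standard `difflib`; here is a minimal word-level sketch (the app's actual `diff_service.py` likely renders HTML highlights instead of the bracket markers used here):

```python
import difflib

def highlight_changes(original: str, revised: str) -> list[str]:
    """Word-level diff: mark deletions with [-...-] and insertions with {+...+}.

    Illustrative only; a real diff service would render styled HTML.
    """
    a, b = original.split(), revised.split()
    out: list[str] = []
    for op, a1, a2, b1, b2 in difflib.SequenceMatcher(a=a, b=b).get_opcodes():
        if op == "equal":
            out.extend(a[a1:a2])
        if op in ("delete", "replace"):
            out.append("[-" + " ".join(a[a1:a2]) + "-]")
        if op in ("insert", "replace"):
            out.append("{+" + " ".join(b[b1:b2]) + "+}")
    return out
```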
### 🏭 Production Quality
- Comprehensive error handling
- Input validation and sanitization
- Structured logging
- Intelligent caching for faster responses
- Type-safe configuration with Pydantic
- Automatic model type detection
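The input-validation step mentioned above might reduce to something like the following sketch, using the 10,000-character limit documented later in this README (the exception type and function name are hypothetical):

```python
MAX_TEXT_LENGTH = 10_000  # mirrors the MAX_TEXT_LENGTH setting documented below

class ValidationError(ValueError):
    """Raised when user input fails validation (hypothetical exception type)."""

def validate_input(text: str) -> str:
    """Strip whitespace, then reject empty or oversized input."""
    cleaned = text.strip()
    if not cleaned:
        raise ValidationError("Input text is empty.")
    if len(cleaned) > MAX_TEXT_LENGTH:
        raise ValidationError(
            f"Input exceeds {MAX_TEXT_LENGTH} characters ({len(cleaned)})."
        )
    return cleaned
```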
## Usage

1. **Paste your text** in the input box (up to 10,000 characters)
2. **Choose a revision mode** matching your writing context (General, Literature, Tech Comm, Academic, Creative)
3. **Click "✨ Revise & Analyze"** to get an AI revision plus rubric feedback
4. **Review results**: Compare original vs revised text, check rubric scores, view highlighted changes

### Tips

- **The first analysis takes ~60 seconds** (FLAN-T5 model loading) - this is normal!
- **Subsequent analyses are much faster** (~5-10s) thanks to caching
- Start with shorter texts (200-500 words) for quicker results
- Try different revision modes to see how the AI adapts its approach
- Use the rubric feedback to understand what improved
- The diff view shows exactly what changed

## Models

### Default: google/flan-t5-base

**Why FLAN-T5?**
FLAN-T5 (Finetuned Language Net) is an **instruction-following model** from Google Research, designed to understand and execute text revision tasks. This is fundamentally different from GPT-2-style models:

| Feature | FLAN-T5 (Current) | GPT-2 (Previous) |
|---------|------------------|------------------|
| **Task Type** | Instruction following | Text continuation |
| **Can Revise Text?** | ✅ Yes | ❌ No (only continues) |
| **Understands Instructions?** | ✅ Yes | ❌ No |
| **Works with Revision Modes?** | ✅ Yes | ❌ No |
| **Model Size** | ~250M parameters | ~124M parameters |
| **First Load Time** | ~60s | ~30s |
| **Quality** | High (task-specific) | Low (off-task) |

**FLAN-T5 Advantages:**
- ✅ Actually revises text (not just continuation)
- ✅ Follows mode-specific instructions (General, Academic, etc.)
- ✅ Produces contextually appropriate output
- ✅ Understands the task at hand

**Why Not GPT-2?**
GPT-2 and distilgpt2 are **autoregressive text generators** trained only to continue text. When given revision instructions, they ignore them and generate unrelated continuations. FLAN-T5 was explicitly trained on instruction-following tasks, making it well suited to text revision.

### Alternative Models (Advanced)

You can change the model in the UI, but these require more resources:

**google/flan-t5-large** (780M params)
- Better revision quality
- Requires a CPU upgrade or GPU
- ~2-3 minutes first load

**google/flan-t5-xl** (3B params)
- Best quality revisions
- Requires a T4 GPU on HF Spaces
- ~5 minutes first load

## Performance

### Hardware Recommendations

**Free Tier (CPU Basic)** ⭐ Recommended
- Works well with **google/flan-t5-base**
- First load: ~60 seconds (model download + initialization)
- Subsequent analyses: ~5-10 seconds
- Perfect for educational use and demos

**CPU Upgrade**
- Handles **google/flan-t5-large** comfortably
- First load: ~2-3 minutes
- Subsequent: ~10-15 seconds
- Better revision quality

**T4 GPU** ⚡ Best Performance
- Runs **google/flan-t5-xl** smoothly
- First load: ~5 minutes
- Subsequent: ~3-5 seconds
- Highest quality revisions

### FLAN-T5 vs GPT-2 Performance

FLAN-T5 is somewhat larger than distilgpt2, but the quality difference is dramatic:
- FLAN-T5: Slower but **actually revises text correctly**
- GPT-2: Faster but **produces unusable output** (wrong task)

**The extra 30 seconds of load time is worth it for functional AI revision!**

### Optimization

The app includes production-grade optimizations:
- **Model caching**: Loaded once, reused for all requests
- **Result caching**: The same input returns an instant cached response
- **Intelligent pipeline selection**: Automatically uses the correct pipeline for the model type
- **Lazy loading**: Services are initialized only when needed
- **Efficient text processing**: Minimizes unnecessary operations
## Configuration

The app works out of the box with sensible defaults optimized for FLAN-T5. To customize, set environment variables in your HuggingFace Space settings.

### Available Environment Variables

```bash
# Model Configuration
DEFAULT_MODEL=google/flan-t5-base   # HuggingFace model ID (use FLAN-T5 variants)
MAX_MODEL_LENGTH=512                # Maximum model input/output length
DEFAULT_MAX_LENGTH=512              # Default generation length

# Application Settings
ENVIRONMENT=production              # Runtime environment (development/staging/production)
LOG_LEVEL=INFO                      # Logging level (DEBUG/INFO/WARNING/ERROR)
LOG_FORMAT=text                     # Log format (json/text) - text is easier to read on HF Spaces
MAX_TEXT_LENGTH=10000               # Maximum input text length

# Performance
ENABLE_CACHE=true                   # Enable result caching
CACHE_MAX_SIZE=100                  # Maximum cache entries
ENABLE_METRICS=false                # Disable the metrics server on HF Spaces

# Features
ENABLE_DIFF_HIGHLIGHTING=true       # Enable the visual diff view
ENABLE_RUBRIC_SCORING=true          # Enable rubric analysis
ENABLE_PROMPT_PACKS=true            # Enable revision mode selection
```
## Troubleshooting

### "Out of Memory" Error
**Problem**: The Space crashes or shows an OOM error
**Solutions**:
- ✅ Stick with `google/flan-t5-base` on the free tier (works well)
- ✅ Reduce input text length (try 200-500 words)
- ✅ Upgrade to the CPU upgrade tier for `flan-t5-large`
- ❌ Don't try `flan-t5-xl` without a GPU

### Slow First Load (~60 seconds)
**This is normal!** FLAN-T5-base is ~250M parameters.
- First analysis: ~60s (model download + initialization)
- Subsequent: ~5-10s (model cached in memory)
- If it times out: refresh and try again (an HF Spaces issue)

### "Model Loading Failed"
**Problem**: Error during model initialization
**Solutions**:
- Check the model name spelling (it must be an exact HuggingFace ID)
- Ensure internet connectivity for the model download
- Try the default: `google/flan-t5-base`
- Check the HF Spaces logs for the specific error

### AI Revision Doesn't Make Sense
**Problem**: Revision output is garbled or off-topic
**Solutions**:
- ✅ Make sure you're using FLAN-T5 (not GPT-2!)
- ✅ Try a different revision mode (General, Academic, etc.)
- ✅ Check that the input text is clear and well-formed
- ✅ Try shorter input text (the model has a 512-token limit)
- Remember: FLAN-T5 base is small; larger models (flan-t5-large) give better results
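Since the 512-token limit is a common cause of garbled output, one mitigation is pre-truncating long input before it reaches the model. The sketch below uses a rough words-per-token estimate; a real implementation would count tokens with the FLAN-T5 tokenizer instead (the function name and ratio are assumptions):

```python
def truncate_for_model(text: str, max_tokens: int = 512,
                       words_per_token: float = 0.75) -> str:
    """Rough pre-truncation to stay under the model's token limit.

    Illustrative only: tokens != words, so production code should count
    real tokens with the model's tokenizer rather than estimate.
    """
    max_words = int(max_tokens * words_per_token)  # ~384 words for 512 tokens
    words = text.split()
    if len(words) <= max_words:
        return text
    return " ".join(words[:max_words])
```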
### "Text Generation Failed"
**Problem**: Error during AI revision generation
**Solutions**:
- Input too long (try shorter text)
- Model timeout (refresh and retry)
- Check HF Spaces status (temporary service issue)

## Privacy

- Text is processed in memory only
- Results are cached temporarily for speed
- No long-term storage on HF Spaces
- No user tracking

## Technical Details

### How FLAN-T5 Integration Works

The app automatically detects the model type and uses the appropriate pipeline:

**For FLAN-T5 models** (text2text-generation):
```python
# Detects 't5' or 'flan' in the model name
pipeline("text2text-generation", model="google/flan-t5-base")
```

**For GPT-2 models** (text-generation):
```python
# Fallback for text continuation models
pipeline("text-generation", model="gpt2")
```
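The detection described above might reduce to a helper like this (the function name is hypothetical; the substring check mirrors the comment in the snippet above):

```python
def select_task(model_name: str) -> str:
    """Pick the transformers pipeline task from the model name.

    T5/FLAN models use text2text-generation; everything else
    falls back to plain text-generation.
    """
    name = model_name.lower()
    if "t5" in name or "flan" in name:
        return "text2text-generation"
    return "text-generation"
```

The model service would then call something like `pipeline(select_task(model_id), model=model_id)` so the right pipeline is chosen automatically.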
**Instruction-Following Prompts**:
FLAN-T5 expects a structured instruction format:
```
Revise the following text to improve clarity, conciseness, and readability.
Make it clear and easy to understand while maintaining the original meaning.

Text: [user input]

Revised text:
```
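Building that prompt programmatically might look like this for the "General" mode (the function name is hypothetical; the template wording is taken from the format above):

```python
def build_revision_prompt(text: str) -> str:
    """Wrap user text in the instruction format FLAN-T5 expects (General mode)."""
    return (
        "Revise the following text to improve clarity, conciseness, and readability. "
        "Make it clear and easy to understand while maintaining the original meaning."
        f"\n\nText: {text}\n\nRevised text:"
    )
```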
This format tells FLAN-T5 exactly what to do, producing actual revisions instead of text continuation.

### Architecture

**Production-Grade Layered Design**:
```
src/writing_studio/
├── core/
│   ├── analyzer.py        # Main orchestrator
│   ├── config.py          # Pydantic settings (FLAN-T5 defaults)
│   └── exceptions.py      # Custom error types
├── services/
│   ├── model_service.py   # FLAN-T5 pipeline management
│   ├── prompt_service.py  # Instruction-following prompts
│   ├── rubric_service.py  # Rule-based scoring algorithms
│   └── diff_service.py    # Visual diff generation
├── utils/
│   ├── logging.py         # Structured logging
│   ├── validation.py      # Input sanitization
│   └── metrics.py         # Prometheus metrics
└── app.py                 # HuggingFace Spaces entry point
```

## Source Code

Full source code is available at the [GitHub Repository](https://github.com/yourusername/writing-studio).

### Local Development

```bash
git clone https://github.com/yourusername/writing-studio
cd writing-studio
pip install -r requirements.txt
python app.py
```

## Contributing

Contributions welcome! See [GitHub](https://github.com/yourusername/writing-studio) for:
- Full documentation
- Development setup
- Testing guidelines
- Code quality standards

## License

MIT License - see the LICENSE file

## Acknowledgments

- **FLAN-T5**: [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) by Google Research
- Built with [Gradio](https://gradio.app/) - a Python web UI library for ML
- Powered by [HuggingFace Transformers](https://huggingface.co/transformers/) - state-of-the-art NLP
- Hosted on [HuggingFace Spaces](https://huggingface.co/spaces) - free ML app hosting
- Instruction-tuning research: [FLAN paper](https://arxiv.org/abs/2210.11416)

## Support

Need help?
- Issues: [GitHub Issues](https://github.com/yourusername/writing-studio/issues)
- Documentation: [GitHub Docs](https://github.com/yourusername/writing-studio/tree/main/docs)
- Questions: [GitHub Discussions](https://github.com/yourusername/writing-studio/discussions)