# Memo: Production-Grade Transformers + Safetensors Implementation

## Overview

**Memo** replaces earlier toy logic with production-grade machine learning infrastructure. It builds on **Transformers + Safetensors** for enterprise-level video generation, with attention to security, performance optimization, and scalability.

## 🎯 What This Guarantees

- ✅ **Transformers-based** - Real ML understanding, not toy logic
- ✅ **Safetensors-only** - Weights load without pickle, which avoids arbitrary code execution from model files
- ✅ **Production-ready** - Enterprise architecture with proper error handling
- ✅ **Memory optimized** - xFormers, attention slicing, CPU offload
- ✅ **Tier-based scaling** - Free/Pro/Enterprise configurations
- ✅ **Security compliant** - Audit trails and validation

## 🏗️ Architecture

### Core Components

1. **Bangla Text Parser** (`models/text/bangla_parser.py`)
   - Transformer-based scene extraction using `google/mt5-small`
   - Proper tokenization with memory optimization
   - Deterministic output with controlled parameters
2. **Scene Planner** (`core/scene_planner.py`)
   - ML-based scene planning (no more toy logic)
   - Intelligent timing and pacing calculations
   - Visual style determination
3. **Stable Diffusion Generator** (`models/image/sd_generator.py`)
   - **Safetensors-only model loading** (`use_safetensors=True`)
   - Memory optimizations (xFormers, attention slicing, CPU offload)
   - LoRA support with safetensors validation
   - LCM acceleration for faster inference
4. **Model Tier System** (`config/model_tiers.py`)
   - **Free Tier**: Basic 512×512, 15 steps, no LoRA
   - **Pro Tier**: 768×768, 25 steps, scene LoRA, LCM
   - **Enterprise Tier**: 1024×1024, 30 steps, custom LoRA
5. **Training Pipeline** (`scripts/train_scene_lora.py`)
   - **MANDATORY** `save_safetensors=True`
   - Transformers integration with PEFT
   - Security-first training with proper validation
6. **Production API** (`api/main.py`)
   - FastAPI endpoint with tier-based routing
   - Background processing for long-running tasks
   - Security validation endpoints

## Security Implementation

### Model Weight Security

- **ONLY .safetensors files allowed** - No .bin, .ckpt, or pickle files
- Model signature verification
- File format enforcement
- Memory-safe loading practices

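
The format rules above could be enforced with a check along the following lines. This is a minimal sketch, not the project's `validate_model_weights_security`: the function name `check_weight_file`, the blocked-suffix list, and the header check are assumptions made for illustration. It relies only on the documented safetensors layout (an 8-byte little-endian header length followed by a JSON header) and returns the `is_secure`, `format`, and `issues` keys used elsewhere in this README. Signature verification is not covered here.

```python
import json
import struct
from pathlib import Path

# Hypothetical list of pickle-capable formats to reject outright.
BLOCKED_SUFFIXES = {".bin", ".ckpt", ".pt", ".pth", ".pkl", ".pickle"}


def check_weight_file(path: str) -> dict:
    """Return a report with 'is_secure', 'format', and 'issues' keys."""
    p = Path(path)
    suffix = p.suffix.lower()
    issues = []

    if suffix in BLOCKED_SUFFIXES:
        issues.append(f"pickle-capable format not allowed: {suffix}")
    elif suffix != ".safetensors":
        issues.append(f"unexpected extension: {suffix or '(none)'}")
    elif not p.is_file():
        issues.append("file does not exist")
    else:
        try:
            with p.open("rb") as f:
                prefix = f.read(8)
                if len(prefix) != 8:
                    raise ValueError("file too short for a header length")
                # First 8 bytes: unsigned 64-bit little-endian header size.
                (header_len,) = struct.unpack("<Q", prefix)
                if header_len > p.stat().st_size - 8:
                    raise ValueError("header length exceeds file size")
                # The header itself must be a JSON object.
                header = json.loads(f.read(header_len))
                if not isinstance(header, dict):
                    raise ValueError("header is not a JSON object")
        except (ValueError, OSError) as exc:
            issues.append(f"invalid safetensors header: {exc}")

    return {
        "is_secure": not issues,
        "format": suffix.lstrip("."),
        "issues": issues,
    }
```

Checking the header as well as the extension means a renamed pickle file is rejected instead of being trusted on its name alone.
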
### LoRA Configuration (`data/lora/README.md`)

- **ONLY .safetensors files** - No .bin, .ckpt, or other formats allowed
- Model signatures required
- Version tracking and audit trails

## Usage Examples

### Basic Scene Planning

```python
from core.scene_planner import plan_scenes

scenes = plan_scenes(
    text_bn="আজকের দিনটি খুব সুন্দর ছিল।",
    duration=15
)
```

### Tier-Based Generation

```python
from config.model_tiers import get_tier_config
from models.image.sd_generator import get_generator

config = get_tier_config("pro")
generator = get_generator(lora_path=config.lora_path, use_lcm=config.lcm_enabled)
```

### Security Validation

```python
from config.model_tiers import validate_model_weights_security

result = validate_model_weights_security("data/lora/memo-scene-lora.safetensors")
```

## Model Tiers

| Tier | Resolution | Inference Steps | LoRA | LCM | Credits/min | Memory |
|------|------------|-----------------|------|-----|-------------|--------|
| Free | 512×512 | 15 | ❌ | unspecified | $5.0 | 4GB |
| Pro | 768×768 | 25 | ✅ | ✅ | $15.0 | 8GB |
| Enterprise | 1024×1024 | 30 | ✅ | unspecified | $50.0 | 16GB |

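
As a rough illustration of how the tier table could be represented in code, here is a minimal sketch. It is not the contents of `config/model_tiers.py`: the names `TierSketch`, `TIERS`, `lookup_tier`, and `estimate_credits` are hypothetical, the LCM column is left out, and the linear per-minute billing in `estimate_credits` is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierSketch:
    """Hypothetical container mirroring the columns of the tier table."""
    name: str
    width: int
    height: int
    steps: int
    lora: str               # "none", "scene", or "custom"
    credits_per_min: float  # value as shown in the Credits/min column
    memory_gb: int


# Values copied from the tier table; LCM is intentionally omitted.
TIERS = {
    "free": TierSketch("free", 512, 512, 15, "none", 5.0, 4),
    "pro": TierSketch("pro", 768, 768, 25, "scene", 15.0, 8),
    "enterprise": TierSketch("enterprise", 1024, 1024, 30, "custom", 50.0, 16),
}


def lookup_tier(name: str) -> TierSketch:
    """Return the tier for a case-insensitive name, or raise ValueError."""
    key = name.strip().lower()
    if key not in TIERS:
        raise ValueError(f"unknown tier {name!r}; expected one of {sorted(TIERS)}")
    return TIERS[key]


def estimate_credits(name: str, duration_seconds: float) -> float:
    """Assumed linear billing: per-minute rate scaled by clip duration."""
    return lookup_tier(name).credits_per_min * duration_seconds / 60.0
```

Under that billing assumption, a 15-second clip on the Pro tier would come to `estimate_credits("pro", 15)`, which is 3.75.
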
## 🛠️ Installation

```bash
# Clone the repository
git clone https://huggingface.co/likhonsheikh/memo

# Install dependencies
pip install -r requirements.txt

# Run the demonstration
python demo.py

# Start the API server
python api/main.py
```

## 🎬 API Usage

### Health Check

```bash
curl http://localhost:8000/health
```

### Generate Video

```bash
curl -X POST "http://localhost:8000/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "আজকের দিনটি খুব সুন্দর ছিল।",
    "duration": 15,
    "tier": "pro"
  }'
```

### Check Status

```bash
# Set REQUEST_ID to the ID of your generation request
curl "http://localhost:8000/status/$REQUEST_ID"
```

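
The generate-then-poll flow above implies some bookkeeping between a request ID and a background task. The following is a minimal, framework-free sketch of that idea using only the standard library. The class name `JobRegistry`, its method names, and the status strings are assumptions for illustration and do not describe what `api/main.py` actually does.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict


class JobRegistry:
    """Hypothetical in-memory map from request IDs to background jobs."""

    def __init__(self, max_workers: int = 2) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: Dict[str, Future] = {}

    def submit(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> str:
        """Start fn in the background and return a new request ID."""
        request_id = uuid.uuid4().hex
        self._jobs[request_id] = self._pool.submit(fn, *args, **kwargs)
        return request_id

    def status(self, request_id: str) -> dict:
        """Report the job state without blocking the caller."""
        future = self._jobs.get(request_id)
        if future is None:
            return {"request_id": request_id, "status": "not_found"}
        if not future.done():
            return {"request_id": request_id, "status": "processing"}
        error = future.exception()
        if error is not None:
            return {"request_id": request_id, "status": "failed", "error": str(error)}
        return {"request_id": request_id, "status": "completed", "result": future.result()}

    def wait(self, request_id: str, timeout: float = 5.0) -> None:
        """Block until the job finishes; job errors are left for status()."""
        future = self._jobs[request_id]
        try:
            future.result(timeout=timeout)
        except Exception:
            pass
```

A generate handler would call `submit(...)` and return the ID immediately, while a status handler would call `status(request_id)`. An in-memory registry like this is lost on restart, so a real deployment would likely need persistent storage.
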
## 🧪 Training Custom LoRA

```python
from scripts.train_scene_lora import SceneLoRATrainer, TrainingConfig

config = TrainingConfig(
    base_model="google/mt5-small",
    rank=32,
    alpha=64,
    save_safetensors=True  # MANDATORY
)

trainer = SceneLoRATrainer(config)
trainer.load_model()
trainer.setup_lora()
# training_data is your prepared dataset; it is not defined in this snippet
trainer.train(training_data)
```

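
To give a feel for what `rank=32` and `alpha=64` mean, the sketch below works through the standard LoRA arithmetic: the low-rank update is scaled by `alpha / rank`, and each adapted weight matrix of shape `d_out × d_in` gains `rank * (d_in + d_out)` trainable parameters. The helper names are hypothetical and the 512×512 matrix size is an assumed example, not a value taken from the training script.

```python
def lora_scale(rank: int, alpha: float) -> float:
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return alpha / rank


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# Values from the TrainingConfig example above.
rank, alpha = 32, 64
# Assumed example shape for one adapted projection matrix.
d_in, d_out = 512, 512

scale = lora_scale(rank, alpha)                      # 2.0
adapter = lora_trainable_params(d_in, d_out, rank)   # 32768
full = d_in * d_out                                  # 262144
print(f"scale={scale}, adapter params={adapter}, fraction of full matrix={adapter / full}")
```

For this assumed shape the adapter trains one eighth of the parameters of the full matrix. The saving grows as the matrices get larger relative to the rank.
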
## ⚡ Performance Features

- **Memory Optimization**: xFormers, attention slicing, CPU offload
- **FP16 Precision**: Roughly halves weight memory relative to FP32, typically with little visible quality loss
- **LCM Acceleration**: Faster inference when available
- **Device Mapping**: Optimal GPU/CPU utilization
- **Background Processing**: Async handling of long-running tasks

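
The memory optimizations listed above correspond to methods that diffusers pipelines expose. The helper below is a minimal sketch of applying them defensively. The function name `apply_memory_optimizations` and its flags are assumptions for illustration, and it is not taken from `models/image/sd_generator.py`. It accepts any object with the same method names, so it can be exercised without a GPU.

```python
from typing import Any, List


def apply_memory_optimizations(
    pipe: Any, use_xformers: bool = True, cpu_offload: bool = False
) -> List[str]:
    """Enable memory savers on a diffusers-style pipeline; return what was applied."""
    applied: List[str] = []

    # Attention slicing trades some speed for a lower peak memory footprint.
    pipe.enable_attention_slicing()
    applied.append("attention_slicing")

    if use_xformers:
        try:
            # Fails if xformers is missing or unsupported on this device.
            pipe.enable_xformers_memory_efficient_attention()
            applied.append("xformers")
        except Exception:
            pass  # fall back to the default attention implementation

    if cpu_offload:
        # Keeps submodules on the CPU and moves them to the GPU only when needed.
        pipe.enable_model_cpu_offload()
        applied.append("cpu_offload")

    return applied
```

With diffusers installed, a pipeline loaded through `StableDiffusionPipeline.from_pretrained(..., torch_dtype=torch.float16, use_safetensors=True)` could be passed in as `pipe`. The returned list makes it easy to log which optimizations were actually enabled.
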
## Security Validation

```python
from config.model_tiers import validate_model_weights_security

# Validate any model file
result = validate_model_weights_security("path/to/model.safetensors")
print(f"Secure: {result['is_secure']}")
print(f"Format: {result['format']}")
print(f"Issues: {result['issues']}")
```

## File Structure

```
Memo/
├── requirements.txt             # Production dependencies
├── models/
│   ├── text/
│   │   └── bangla_parser.py     # Transformer-based Bangla parser
│   └── image/
│       └── sd_generator.py      # Stable Diffusion + Safetensors
├── core/
│   └── scene_planner.py         # ML-based scene planning
├── data/
│   └── lora/
│       └── README.md            # LoRA configuration (safetensors only)
├── scripts/
│   └── train_scene_lora.py      # Training with safetensors output
├── config/
│   └── model_tiers.py           # Tier management system
├── api/
│   └── main.py                  # Production API endpoint
└── demo.py                      # Complete system demonstration
```

## 🎯 What This Doesn't Do

- ❌ Make GPUs cheap
- ❌ Fix bad prompts
- ❌ Read your mind
- ❌ Guarantee perfect results

## Production Readiness

This implementation is now:

- ✅ **Correct** - Uses proper ML frameworks (transformers, safetensors)
- ✅ **Modern** - 2025-grade architecture with security best practices
- ✅ **Secure** - Zero tolerance for unsafe model formats
- ✅ **Scalable** - Tier-based resource management
- ✅ **Defensible** - Production-grade security and validation

## License

This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## Support

For support, email support@memo.ai or join our [Discord community](https://discord.gg/memo).

---

**If your API claims "state-of-the-art" without these features, you're lying.** Memo now actually delivers on that promise with proper Transformers + Safetensors integration.