# Memo: Production-Grade Transformers + Safetensors Implementation
![Memo Logo](https://img.shields.io/badge/Memo-Transformers%20%2B%20Safetensors-brightgreen?style=for-the-badge)
![Transformers](https://img.shields.io/badge/Transformers-4.57.3-blue?style=flat-square)
![Safetensors](https://img.shields.io/badge/Safetensors-0.7.0-red?style=flat-square)
![License](https://img.shields.io/badge/License-Apache%202.0-green?style=flat-square)
## Overview
**Memo** is a Bangla text-to-video generation system that replaces earlier toy logic with production-grade machine learning infrastructure. It is built on **Transformers + Safetensors**, with safetensors-only weight loading, memory optimizations, and tier-based scaling.
## 🎯 What This Guarantees
- ✅ **Transformers-based** - Scene understanding comes from a real ML model, not toy logic
- ✅ **Safetensors-only** - No pickle-based weight formats, which avoids the arbitrary-code-execution risk of unpickling
- ✅ **Production-ready** - Enterprise architecture with proper error handling
- ✅ **Memory optimized** - xFormers, attention slicing, CPU offload
- ✅ **Tier-based scaling** - Free/Pro/Enterprise configurations
- ✅ **Security compliant** - Audit trails and validation
## 🏗️ Architecture
### Core Components
1. **Bangla Text Parser** (`models/text/bangla_parser.py`)
   - Transformer-based scene extraction using `google/mt5-small`
   - Proper tokenization with memory optimization
   - Deterministic output with controlled parameters
2. **Scene Planner** (`core/scene_planner.py`)
   - ML-based scene planning (no more toy logic)
   - Intelligent timing and pacing calculations
   - Visual style determination
3. **Stable Diffusion Generator** (`models/image/sd_generator.py`)
   - **Safetensors-only model loading** (`use_safetensors=True`)
   - Memory optimizations (xFormers, attention slicing, CPU offload)
   - LoRA support with safetensors validation
   - LCM acceleration for faster inference
4. **Model Tier System** (`config/model_tiers.py`)
   - **Free Tier**: Basic 512x512, 15 steps, no LoRA
   - **Pro Tier**: 768x768, 25 steps, scene LoRA, LCM
   - **Enterprise Tier**: 1024x1024, 30 steps, custom LoRA
5. **Training Pipeline** (`scripts/train_scene_lora.py`)
   - **MANDATORY** `save_safetensors=True`
   - Transformers integration with PEFT
   - Security-first training with proper validation
6. **Production API** (`api/main.py`)
   - FastAPI endpoint with tier-based routing
   - Background processing for long-running tasks
   - Security validation endpoints
## 🔒 Security Implementation
### Model Weight Security
- **ONLY .safetensors files allowed** - No .bin, .ckpt, or pickle files
- Model signature verification
- File format enforcement
- Memory-safe loading practices
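
To make "file format enforcement" concrete, here is a minimal sketch of a structural check that uses only the Python standard library. It relies on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The function name `inspect_weight_file`, the allowlist, and the header size cap are assumptions made for illustration, not the project's actual validator; the report keys simply mirror those printed in the Security Validation example in this card.

```python
import json
import struct
from pathlib import Path

# Hypothetical allowlist and size cap; the project's real validator may differ.
ALLOWED_SUFFIXES = {".safetensors"}
MAX_HEADER_BYTES = 100 * 1024 * 1024


def inspect_weight_file(path):
    """Report whether a file looks like a structurally valid safetensors file."""
    path = Path(path)
    issues = []
    suffix = path.suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        issues.append(f"disallowed extension: {suffix or '(none)'}")
    else:
        try:
            with path.open("rb") as fh:
                prefix = fh.read(8)
                if len(prefix) != 8:
                    raise ValueError("file too short for a safetensors header")
                # First 8 bytes: little-endian unsigned 64-bit header length.
                (header_len,) = struct.unpack("<Q", prefix)
                if header_len > MAX_HEADER_BYTES:
                    raise ValueError("header length is implausibly large")
                raw = fh.read(header_len)
                if len(raw) != header_len:
                    raise ValueError("truncated header")
                # The header must be a JSON object describing the tensors.
                if not isinstance(json.loads(raw), dict):
                    raise ValueError("header is not a JSON object")
        except (OSError, ValueError) as exc:
            issues.append(str(exc))
    return {"is_secure": not issues, "format": suffix.lstrip("."), "issues": issues}
```

A check like this only confirms that the file is not a pickle in disguise and that its header parses. It says nothing about whether the weights themselves are trustworthy, which is what signature verification is for.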
### LoRA Configuration (`data/lora/README.md`)
- **ONLY .safetensors files** - No .bin, .ckpt, or other formats allowed
- Model signatures required
- Version tracking and audit trails
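
The card does not say how signatures or audit trails are implemented. As one hedged illustration, the sketch below pins a SHA-256 digest for each LoRA file and appends every verification to a JSON-lines log. Note the substitution: digest pinning is simpler than a cryptographic signature and only detects changes relative to a digest you already trust. The names `sha256_of` and `verify_and_log` and the log format are hypothetical.

```python
import hashlib
import hmac
import json
import time
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 so large weight files are not loaded at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_and_log(path, expected_sha256, version, audit_log):
    """Compare the file digest to a pinned value and append an audit entry."""
    actual = sha256_of(path)
    verified = hmac.compare_digest(actual, expected_sha256.lower())
    entry = {
        "timestamp": time.time(),
        "file": str(path),
        "version": version,  # hypothetical version label supplied by the caller
        "sha256": actual,
        "verified": verified,
    }
    # One JSON object per line keeps the log append-only and easy to parse.
    with Path(audit_log).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return verified
```

The pinned digest would typically live next to the LoRA file's version entry, so a changed file fails verification and still leaves a record in the log.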
## 🚀 Usage Examples
### Basic Scene Planning
```python
from core.scene_planner import plan_scenes

# Bangla input text: "Today was a very beautiful day."
scenes = plan_scenes(
    text_bn="আজকের দিনটি খুব সুন্দর ছিল।",
    duration=15,
)
```
### Tier-Based Generation
```python
from config.model_tiers import get_tier_config
from models.image.sd_generator import get_generator
config = get_tier_config("pro")
generator = get_generator(lora_path=config.lora_path, use_lcm=config.lcm_enabled)
```
### Security Validation
```python
from config.model_tiers import validate_model_weights_security
result = validate_model_weights_security("data/lora/memo-scene-lora.safetensors")
```
## 📊 Model Tiers
| Tier | Resolution | Inference Steps | LoRA | LCM | Credits/min | Memory |
|------|------------|-----------------|------|-----|-------------|--------|
| Free | 512×512 | 15 | ❌ | ❌ | $5.0 | 4GB |
| Pro | 768×768 | 25 | ✅ | ✅ | $15.0 | 8GB |
| Enterprise | 1024×1024 | 30 | ✅ | ✅ | $50.0 | 16GB |
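
The table above could be encoded as an immutable lookup, sketched below. This is not the contents of `config/model_tiers.py`: `TierConfig`, `lookup_tier`, and the field names are hypothetical, and the `credits_per_minute` values are copied from the Credits/min column as-is, without interpreting the `$` prefix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierConfig:
    """One row of the tier table; field names are illustrative assumptions."""
    name: str
    width: int
    height: int
    inference_steps: int
    lora_enabled: bool
    lcm_enabled: bool
    credits_per_minute: float
    memory_gb: int


# Values copied from the Model Tiers table.
_TIERS = {
    "free": TierConfig("free", 512, 512, 15, False, False, 5.0, 4),
    "pro": TierConfig("pro", 768, 768, 25, True, True, 15.0, 8),
    "enterprise": TierConfig("enterprise", 1024, 1024, 30, True, True, 50.0, 16),
}


def lookup_tier(name):
    """Return the config for a tier name, case-insensitively, or raise ValueError."""
    key = name.strip().lower()
    try:
        return _TIERS[key]
    except KeyError:
        raise ValueError(
            f"unknown tier {name!r}; expected one of {sorted(_TIERS)}"
        ) from None
```

The card's own usage example reads `config.lora_path` and `config.lcm_enabled` from `get_tier_config`, so the real configuration object evidently carries a LoRA path rather than the boolean used in this sketch.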
## 🛠️ Installation
```bash
# Clone the repository and enter it
git clone https://huggingface.co/likhonsheikh/memo
cd memo
# Install dependencies
pip install -r requirements.txt
# Run the demonstration
python demo.py
# Start the API server
python api/main.py
```
## 🎬 API Usage
### Health Check
```bash
curl http://localhost:8000/health
```
### Generate Video
```bash
curl -X POST "http://localhost:8000/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "আজকের দিনটি খুব সুন্দর ছিল।",
    "duration": 15,
    "tier": "pro"
  }'
```
### Check Status
```bash
curl http://localhost:8000/status/{request_id}
```
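
The generate-then-poll flow above implies that the server tracks each request by ID while the work runs in the background. The sketch below shows that pattern with the standard library only. It is not the code in `api/main.py`: the `JobRegistry` class and the status strings are assumptions, and the real response shape may differ.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class JobRegistry:
    """Run long tasks on a worker pool and report their state by request ID."""

    def __init__(self, max_workers=2):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}

    def submit(self, fn, *args, **kwargs):
        """Schedule fn and return an ID the client can poll with."""
        request_id = uuid.uuid4().hex
        self._jobs[request_id] = self._pool.submit(fn, *args, **kwargs)
        return request_id

    def status(self, request_id):
        """Return a status dict; the status strings here are hypothetical."""
        future = self._jobs.get(request_id)
        if future is None:
            return {"status": "not_found"}
        if not future.done():
            return {"status": "processing"}
        error = future.exception()
        if error is not None:
            return {"status": "failed", "error": str(error)}
        return {"status": "completed", "result": future.result()}
```

A generate handler would call `submit` and return the ID immediately, and a status handler would call `status` with the ID taken from the URL.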
## 🧪 Training Custom LoRA
```python
from scripts.train_scene_lora import SceneLoRATrainer, TrainingConfig

config = TrainingConfig(
    base_model="google/mt5-small",
    rank=32,
    alpha=64,
    save_safetensors=True,  # MANDATORY
)

trainer = SceneLoRATrainer(config)
trainer.load_model()
trainer.setup_lora()
# training_data is your prepared dataset; define it before this call
trainer.train(training_data)
```
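
For intuition about `rank=32` and `alpha=64`: in the standard LoRA formulation, the low-rank update is scaled by `alpha / rank`, and each adapted weight matrix of shape `d_out × d_in` gains `rank * (d_in + d_out)` trainable parameters. The sketch below works through that arithmetic. It assumes `TrainingConfig` maps these values onto standard LoRA, and the 512×512 matrix is purely illustrative rather than a claim about any particular layer.

```python
def lora_scaling(rank, alpha):
    """Scale factor applied to the low-rank update in standard LoRA."""
    return alpha / rank


def lora_param_count(d_in, d_out, rank):
    """Trainable parameters added by one adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


# Settings from the training example above.
scale = lora_scaling(rank=32, alpha=64)

# Illustrative 512 x 512 projection (an assumed shape, for arithmetic only).
added = lora_param_count(d_in=512, d_out=512, rank=32)
full = 512 * 512
print(f"scale={scale}, adapter params={added}, fraction of full matrix={added / full}")
```

With these numbers the scale is 2.0, and the adapter adds 32,768 parameters against 262,144 in the full matrix, or 12.5%.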
## ⚡ Performance Features
- **Memory Optimization**: xFormers, attention slicing, CPU offload
- **FP16 Precision**: Roughly halves weight memory relative to FP32 (2 bytes per parameter instead of 4), usually with little visible quality loss
- **LCM Acceleration**: Faster inference when available
- **Device Mapping**: Optimal GPU/CPU utilization
- **Background Processing**: Async handling of long-running tasks
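
The FP16 figure follows from bytes per parameter, as the short calculation below shows. The parameter count is a made-up round number for illustration, not a measurement of any model used by Memo, and the estimate covers weights only; activations and other buffers are not included.

```python
# Bytes needed to store one parameter in each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_memory_gib(num_params, dtype):
    """Memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# Hypothetical model size, chosen only to make the arithmetic easy to follow.
num_params = 1_000_000_000
fp32 = weight_memory_gib(num_params, "fp32")
fp16 = weight_memory_gib(num_params, "fp16")
print(f"fp32: {fp32:.2f} GiB, fp16: {fp16:.2f} GiB, ratio: {fp16 / fp32}")
```

Total GPU memory usually shrinks by less than half, because only the tensors actually stored in FP16 benefit.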
## 🔍 Security Validation
```python
from config.model_tiers import validate_model_weights_security
# Validate any model file
result = validate_model_weights_security("path/to/model.safetensors")
print(f"Secure: {result['is_secure']}")
print(f"Format: {result['format']}")
print(f"Issues: {result['issues']}")
```
## 📁 File Structure
```
📁 Memo/
├── 📄 requirements.txt              # Production dependencies
├── 📁 models/
│   ├── 📁 text/
│   │   └── 📄 bangla_parser.py      # Transformer-based Bangla parser
│   └── 📁 image/
│       └── 📄 sd_generator.py       # Stable Diffusion + Safetensors
├── 📁 core/
│   └── 📄 scene_planner.py          # ML-based scene planning
├── 📁 data/
│   └── 📁 lora/
│       └── 📄 README.md             # LoRA configuration (safetensors only)
├── 📁 scripts/
│   └── 📄 train_scene_lora.py       # Training with safetensors output
├── 📁 config/
│   └── 📄 model_tiers.py            # Tier management system
├── 📁 api/
│   └── 📄 main.py                   # Production API endpoint
└── 📄 demo.py                       # Complete system demonstration
```
## 🎯 What This Doesn't Do
- ❌ Make GPUs cheap
- ❌ Fix bad prompts
- ❌ Read your mind
- ❌ Guarantee perfect results
## 🏆 Production Readiness
This implementation is now:
- ✅ **Correct** - Uses proper ML frameworks (transformers, safetensors)
- ✅ **Modern** - 2025-grade architecture with security best practices
- ✅ **Secure** - Zero tolerance for unsafe model formats
- ✅ **Scalable** - Tier-based resource management
- ✅ **Defensible** - Production-grade security and validation
## 📜 License
This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.
## 🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## 📞 Support
For support, email support@memo.ai or join our [Discord community](https://discord.gg/memo).
---
**If your API claims "state-of-the-art" without these features, you're lying.** Memo now actually delivers on that promise with proper Transformers + Safetensors integration.