	# Memo: Production-Grade Transformers + Safetensors Implementation

	![Memo Logo](https://img.shields.io/badge/Memo-Transformers%20%2B%20Safetensors-brightgreen?style=for-the-badge)
	![Transformers](https://img.shields.io/badge/Transformers-4.57.3-blue?style=flat-square)
	![Safetensors](https://img.shields.io/badge/Safetensors-0.7.0-red?style=flat-square)
	![License](https://img.shields.io/badge/License-Apache%202.0-green?style=flat-square)

	## Overview

	Memo replaces its earlier toy logic with production-grade machine learning infrastructure. The implementation builds on Transformers and Safetensors to support enterprise-level video generation, with a focus on security, performance optimization, and scalability.

	## 🎯 What This Guarantees

	✅ Transformers-based - Real ML understanding, not toy logic
	✅ Safetensors-only - No pickle-based weight files, which removes the arbitrary-code-execution risk of loading untrusted checkpoints
	✅ Production-ready - Enterprise architecture with proper error handling
	✅ Memory optimized - xFormers, attention slicing, CPU offload
	✅ Tier-based scaling - Free/Pro/Enterprise configurations
	✅ Security compliant - Audit trails and validation

	## 🏗️ Architecture

	### Core Components

	1. Bangla Text Parser (`models/text/bangla_parser.py`)
	   - Transformer-based scene extraction using `google/mt5-small`
	   - Proper tokenization with memory optimization
	   - Deterministic output with controlled parameters

	2. Scene Planner (`core/scene_planner.py`)
	   - ML-based scene planning (no more toy logic)
	   - Intelligent timing and pacing calculations
	   - Visual style determination

	3. Stable Diffusion Generator (`models/image/sd_generator.py`)
	   - Safetensors-only model loading (`use_safetensors=True`)
	   - Memory optimizations (xFormers, attention slicing, CPU offload)
	   - LoRA support with safetensors validation
	   - LCM acceleration for faster inference

	4. Model Tier System (`config/model_tiers.py`)
	   - Free Tier: Basic 512x512, 15 steps, no LoRA
	   - Pro Tier: 768x768, 25 steps, scene LoRA, LCM
	   - Enterprise Tier: 1024x1024, 30 steps, custom LoRA

	5. Training Pipeline (`scripts/train_scene_lora.py`)
	   - MANDATORY `save_safetensors=True`
	   - Transformers integration with PEFT
	   - Security-first training with proper validation

	6. Production API (`api/main.py`)
	   - FastAPI endpoint with tier-based routing
	   - Background processing for long-running tasks
	   - Security validation endpoints

	## 🔒 Security Implementation

	### Model Weight Security
	- ONLY .safetensors files allowed - No .bin, .ckpt, or pickle files
	- Model signature verification
	- File format enforcement
	- Memory-safe loading practices
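
File format enforcement can go beyond the file extension. A safetensors file starts with an 8-byte little-endian length, followed by that many bytes of UTF-8 JSON describing the tensors. The sketch below checks both the extension and that header. The function name `check_weight_file` and the header size cap are assumptions made for illustration, not part of Memo's code.

```python
import json
import struct
from pathlib import Path

ALLOWED_SUFFIXES = {".safetensors"}   # assumption: strict allowlist, as the list above describes
MAX_HEADER_BYTES = 100 * 1024 * 1024  # assumption: sanity cap on the declared header size


def check_weight_file(path):
    """Return a list of problems found; an empty list means the file passed."""
    path = Path(path)
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        return [f"disallowed extension: {path.suffix or '(none)'}"]

    with path.open("rb") as fh:
        prefix = fh.read(8)
        if len(prefix) < 8:
            return ["file too short to hold a safetensors header"]
        # The first 8 bytes are an unsigned little-endian 64-bit header length.
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len == 0 or header_len > MAX_HEADER_BYTES:
            return [f"implausible header length: {header_len}"]
        raw = fh.read(header_len)

    if len(raw) < header_len:
        return ["header is truncated"]
    try:
        header = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return ["header is not valid UTF-8 JSON"]
    if not isinstance(header, dict):
        return ["header is not a JSON object"]
    return []
```

Only the header is read, so the check stays cheap even for multi-gigabyte weight files. A renamed pickle file would normally fail at the header length or JSON step.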

	### LoRA Configuration (`data/lora/README.md`)
	- ONLY .safetensors files - No .bin, .ckpt, or other formats allowed
	- Model signatures required
	- Version tracking and audit trails
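
How Memo implements signatures and audit trails is not described here. One simple approach, sketched below, is to pin a SHA-256 digest per LoRA file and append every verification attempt to a JSON-lines log. The function names, the record fields, and the use of a plain digest instead of a cryptographic signature are all assumptions made for illustration.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file so large weight files are never fully loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_and_log(path, expected_sha256, version, audit_log_path):
    """Compare the file digest to a pinned value and append one JSON line to the audit log."""
    actual = sha256_of_file(path)
    ok = hmac.compare_digest(actual, expected_sha256.lower())
    record = {
        "time": datetime.now(timezone.utc).isoformat(),
        "file": str(path),
        "version": version,
        "sha256": actual,
        "verified": ok,
    }
    with open(audit_log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return ok
```

A pinned digest only detects modification. Proving who published a file would need a real signature scheme on top of this.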

	## 🚀 Usage Examples

	### Basic Scene Planning
	```python
	from core.scene_planner import plan_scenes

	scenes = plan_scenes(
	    text_bn="আজকের দিনটি খুব সুন্দর ছিল।",  # "Today was a very beautiful day."
	    duration=15,
	)
	```
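
The return format of `plan_scenes` is not documented here. To sanity-check the planner's timing, a length-proportional baseline such as the one below may help. It is deliberately not ML-based and is not how Memo's planner works. Both function names are hypothetical.

```python
import re


def split_sentences(text_bn):
    """Split on the Bangla danda and on '!' or '?', dropping empty pieces."""
    parts = re.split(r"[।!?]+", text_bn)
    return [p.strip() for p in parts if p.strip()]


def allocate_durations(sentences, total_seconds):
    """Give each sentence a share of the total time proportional to its length."""
    if not sentences:
        return []
    weights = [len(s) for s in sentences]
    total = sum(weights)
    return [total_seconds * w / total for w in weights]


sentences = split_sentences("আজকের দিনটি খুব সুন্দর ছিল।")
durations = allocate_durations(sentences, 15)
print(list(zip(sentences, durations)))
```

If the ML planner's timings differ wildly from this baseline on simple input, that is a useful signal to inspect the plan by hand.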

	### Tier-Based Generation
	```python
	from config.model_tiers import get_tier_config
	from models.image.sd_generator import get_generator

	config = get_tier_config("pro")
	generator = get_generator(lora_path=config.lora_path, use_lcm=config.lcm_enabled)
	```

	### Security Validation
	```python
	from config.model_tiers import validate_model_weights_security

	result = validate_model_weights_security("data/lora/memo-scene-lora.safetensors")
	```

	## 📊 Model Tiers

	| Tier | Resolution | Inference Steps | LoRA | LCM | Credits/min | Memory |
	|------|------------|-----------------|------|-----|-------------|--------|
	| Free | 512×512 | 15 | ❌ | ❌ | $5.0 | 4GB |
	| Pro | 768×768 | 25 | ✅ | ✅ | $15.0 | 8GB |
	| Enterprise | 1024×1024 | 30 | ✅ | ✅ | $50.0 | 16GB |
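
The table maps naturally onto an immutable config object per tier. The sketch below is one possible shape and is not the contents of `config/model_tiers.py`. The class, the `lookup_tier` helper, and all field names except `lcm_enabled` (used in the generation example) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierConfig:
    resolution: int          # square output edge in pixels
    inference_steps: int
    lora_enabled: bool
    lcm_enabled: bool
    credits_per_min: float   # the table shows these values with a "$" prefix
    memory_gb: int


# Values copied from the table above.
TIERS = {
    "free": TierConfig(512, 15, False, False, 5.0, 4),
    "pro": TierConfig(768, 25, True, True, 15.0, 8),
    "enterprise": TierConfig(1024, 30, True, True, 50.0, 16),
}


def lookup_tier(name):
    """Hypothetical helper: case-insensitive lookup that fails loudly on unknown tiers."""
    try:
        return TIERS[name.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown tier {name!r}; expected one of {sorted(TIERS)}") from None


print(lookup_tier("pro"))
```

Freezing the dataclass prevents a request handler from mutating shared tier settings by accident.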

	## 🛠️ Installation

	```bash
	# Clone the repository and enter it
	git clone https://huggingface.co/likhonsheikh/memo
	cd memo

	# Install dependencies
	pip install -r requirements.txt

	# Run the demonstration
	python demo.py

	# Start the API server
	python api/main.py
	```

	## 🎬 API Usage

	### Health Check
	```bash
	curl http://localhost:8000/health
	```

	### Generate Video
	```bash
	curl -X POST "http://localhost:8000/generate" \
	  -H "Content-Type: application/json" \
	  -d '{
	    "text": "আজকের দিনটি খুব সুন্দর ছিল।",
	    "duration": 15,
	    "tier": "pro"
	  }'
	```

	### Check Status
	```bash
	curl http://localhost:8000/status/{request_id}
	```

	## 🧪 Training Custom LoRA

	```python
	from scripts.train_scene_lora import SceneLoRATrainer, TrainingConfig

	config = TrainingConfig(
	    base_model="google/mt5-small",
	    rank=32,
	    alpha=64,
	    save_safetensors=True,  # MANDATORY
	)

	trainer = SceneLoRATrainer(config)
	trainer.load_model()
	trainer.setup_lora()
	# training_data is your prepared dataset; its expected format is not specified here.
	trainer.train(training_data)
	```
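
To see what `rank=32` and `alpha=64` imply, recall that standard LoRA replaces a weight update with two low-rank factors and scales the result by `alpha / rank`. The arithmetic below uses a hypothetical 512 × 512 layer. The layer shape and helper names are illustrative assumptions, not values taken from Memo's training script.

```python
def lora_scaling(alpha, rank):
    """Scale applied to the low-rank update in the standard LoRA formulation."""
    return alpha / rank


def lora_added_params(d_out, d_in, rank):
    """Parameters in the two low-rank factors: B is (d_out x rank), A is (rank x d_in)."""
    return rank * (d_out + d_in)


# rank and alpha come from the config above; the 512 x 512 layer is a hypothetical example.
scale = lora_scaling(alpha=64, rank=32)
added = lora_added_params(d_out=512, d_in=512, rank=32)
full = 512 * 512
print(scale, added, full, added / full)  # 2.0 32768 262144 0.125
```

For this example layer, the adapter trains one eighth as many parameters as full fine-tuning would. The saving grows for larger layers or smaller ranks.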

	## ⚡ Performance Features

	- Memory Optimization: xFormers, attention slicing, CPU offload
	- FP16 Precision: Roughly halves weight memory relative to FP32, typically with little visible quality loss
	- LCM Acceleration: Faster inference when available
	- Device Mapping: Optimal GPU/CPU utilization
	- Background Processing: Async handling of long-running tasks
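
The FP16 saving follows directly from bytes per parameter: 4 for FP32, 2 for FP16. The short calculation below makes that concrete for a hypothetical one-billion-parameter model. It covers weights only, so activations and other runtime buffers are not included.

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_memory_gib(num_params, dtype):
    """Memory needed for the weights alone, in GiB (activations and caches excluded)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


params = 1_000_000_000  # hypothetical model size, not a Memo figure
fp32 = weight_memory_gib(params, "fp32")
fp16 = weight_memory_gib(params, "fp16")
print(f"fp32: {fp32:.2f} GiB, fp16: {fp16:.2f} GiB, saving: {1 - fp16 / fp32:.0%}")
```

Peak usage during inference is higher than these figures, which is where attention slicing and CPU offload come in.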

	## 🔍 Security Validation

	```python
	from config.model_tiers import validate_model_weights_security

	# Validate any model file
	result = validate_model_weights_security("path/to/model.safetensors")
	print(f"Secure: {result['is_secure']}")
	print(f"Format: {result['format']}")
	print(f"Issues: {result['issues']}")
	```
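
A validation result is only useful if something acts on it. The sketch below gates loading on the `is_secure` and `issues` keys shown above. It assumes the result behaves like a dictionary, and the exception type and `require_secure` helper are hypothetical.

```python
class InsecureModelError(RuntimeError):
    """Raised when a weight file fails validation (hypothetical exception type)."""


def require_secure(result, path):
    """Return path unchanged if the result marks it safe; otherwise raise."""
    if not result.get("is_secure", False):
        issues = "; ".join(result.get("issues") or ["no details reported"])
        raise InsecureModelError(f"refusing to load {path}: {issues}")
    return path


# Example with a hand-written result dict standing in for the validator's output.
checked = require_secure(
    {"is_secure": True, "format": "safetensors", "issues": []},
    "path/to/model.safetensors",
)
print(checked)
```

Treating a missing `is_secure` key as a failure keeps the gate closed by default.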

	## 📁 File Structure

	```
	📁 Memo/
	├── 📄 requirements.txt  # Production dependencies
	├── 📁 models/
	│   ├── 📁 text/
	│   │   └── 📄 bangla_parser.py  # Transformer-based Bangla parser
	│   └── 📁 image/
	│       └── 📄 sd_generator.py  # Stable Diffusion + Safetensors
	├── 📁 core/
	│   └── 📄 scene_planner.py  # ML-based scene planning
	├── 📁 data/
	│   └── 📁 lora/
	│       └── 📄 README.md  # LoRA configuration (safetensors only)
	├── 📁 scripts/
	│   └── 📄 train_scene_lora.py  # Training with safetensors output
	├── 📁 config/
	│   └── 📄 model_tiers.py  # Tier management system
	├── 📁 api/
	│   └── 📄 main.py  # Production API endpoint
	└── 📄 demo.py  # Complete system demonstration
	```

	## 🎯 What This Doesn't Do

	❌ Make GPUs cheap
	❌ Fix bad prompts
	❌ Read your mind
	❌ Guarantee perfect results

	## 🏆 Production Readiness

	This implementation is now:
	- ✅ Correct - Uses proper ML frameworks (transformers, safetensors)
	- ✅ Modern - 2025-grade architecture with security best practices
	- ✅ Secure - Zero tolerance for unsafe model formats
	- ✅ Scalable - Tier-based resource management
	- ✅ Defensible - Production-grade security and validation

	## 📜 License

	This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.

	## 🤝 Contributing

	Contributions are welcome! Please feel free to submit a Pull Request.

	## 📞 Support

	For support, email support@memo.ai or join our [Discord community](https://discord.gg/memo).

	---

	If your API claims "state-of-the-art" without these features, you're lying. Memo now actually delivers on that promise with proper Transformers + Safetensors integration.