# Memo: Production-Grade Transformers + Safetensors Implementation

![Memo Logo](https://img.shields.io/badge/Memo-Transformers%20%2B%20Safetensors-brightgreen?style=for-the-badge)
![Transformers](https://img.shields.io/badge/Transformers-4.57.3-blue?style=flat-square)
![Safetensors](https://img.shields.io/badge/Safetensors-0.7.0-red?style=flat-square)
![License](https://img.shields.io/badge/License-Apache%202.0-green?style=flat-square)

## Overview

**Memo** is a complete transformation from toy logic to production-grade machine learning infrastructure. This implementation uses **Transformers + Safetensors** as the foundation for enterprise-level video generation with proper security, performance optimization, and scalability.

## 🎯 What This Guarantees

✅ **Transformers-based** - Real ML understanding, not toy logic  
✅ **Safetensors-only** - Eliminates unsafe pickle deserialization  
✅ **Production-ready** - Enterprise architecture with proper error handling  
✅ **Memory optimized** - xFormers, attention slicing, CPU offload  
✅ **Tier-based scaling** - Free/Pro/Enterprise configurations  
✅ **Security compliant** - Audit trails and validation  

## πŸ—οΈ Architecture

### Core Components

1. **Bangla Text Parser** (`models/text/bangla_parser.py`)
   - Transformer-based scene extraction using `google/mt5-small`
   - Proper tokenization with memory optimization
   - Deterministic output with controlled parameters

2. **Scene Planner** (`core/scene_planner.py`)
   - ML-based scene planning (no more toy logic)
   - Intelligent timing and pacing calculations
   - Visual style determination

3. **Stable Diffusion Generator** (`models/image/sd_generator.py`)
   - **Safetensors-only model loading** (`use_safetensors=True`)
   - Memory optimizations (xFormers, attention slicing, CPU offload)
   - LoRA support with safetensors validation
   - LCM acceleration for faster inference

4. **Model Tier System** (`config/model_tiers.py`)
   - **Free Tier**: Basic 512x512, 15 steps, no LoRA
   - **Pro Tier**: 768x768, 25 steps, scene LoRA, LCM
   - **Enterprise Tier**: 1024x1024, 30 steps, custom LoRA

5. **Training Pipeline** (`scripts/train_scene_lora.py`)
   - **MANDATORY** `save_safetensors=True`
   - Transformers integration with PEFT
   - Security-first training with proper validation

6. **Production API** (`api/main.py`)
   - FastAPI endpoint with tier-based routing
   - Background processing for long-running tasks
   - Security validation endpoints
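
The timing-and-pacing idea behind the scene planner can be sketched as follows. This is a minimal illustration with assumed names (`ScenePlan`, `allocate_durations`); the actual `core/scene_planner.py` logic is ML-based and more involved:

```python
from dataclasses import dataclass

@dataclass
class ScenePlan:
    text: str
    duration_s: float

def allocate_durations(scene_texts: list[str], total_duration: float) -> list[ScenePlan]:
    """Distribute a total video duration across scenes,
    weighting each scene by how much text it carries."""
    weights = [max(len(t), 1) for t in scene_texts]
    total_weight = sum(weights)
    return [
        ScenePlan(text=t, duration_s=total_duration * w / total_weight)
        for t, w in zip(scene_texts, weights)
    ]
```

Scenes with more text get proportionally more screen time, and the per-scene durations always sum back to the requested total.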

## 🔒 Security Implementation

### Model Weight Security
- **ONLY .safetensors files allowed** - No .bin, .ckpt, or pickle files
- Model signature verification
- File format enforcement
- Memory-safe loading practices

### LoRA Configuration (`data/lora/README.md`)
- **ONLY .safetensors files** - No .bin, .ckpt, or other formats allowed
- Model signatures required
- Version tracking and audit trails

## 🚀 Usage Examples

### Basic Scene Planning
```python
from core.scene_planner import plan_scenes

scenes = plan_scenes(
    text_bn="আজকের দিনটি খুব সুন্দর ছিল।",  # "Today was a very beautiful day."
    duration=15
)
```

### Tier-Based Generation
```python
from config.model_tiers import get_tier_config
from models.image.sd_generator import get_generator

config = get_tier_config("pro")
generator = get_generator(lora_path=config.lora_path, use_lcm=config.lcm_enabled)
```

### Security Validation
```python
from config.model_tiers import validate_model_weights_security

result = validate_model_weights_security("data/lora/memo-scene-lora.safetensors")
```

## 📊 Model Tiers

| Tier | Resolution | Inference Steps | LoRA | LCM | Credits/min | Memory |
|------|------------|-----------------|------|-----|-------------|--------|
| Free | 512×512 | 15 | ❌ | ❌ | $5.0 | 4GB |
| Pro | 768×768 | 25 | ✅ | ✅ | $15.0 | 8GB |
| Enterprise | 1024×1024 | 30 | ✅ | ✅ | $50.0 | 16GB |
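
The tier table above could be represented in code roughly as follows. This is a sketch only; the real `config/model_tiers.py` may use different field names:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TierConfig:
    resolution: int        # square output, pixels per side
    inference_steps: int
    lora_enabled: bool
    lcm_enabled: bool
    credits_per_min: float
    memory_gb: int

TIERS = {
    "free":       TierConfig(512, 15, False, False, 5.0, 4),
    "pro":        TierConfig(768, 25, True, True, 15.0, 8),
    "enterprise": TierConfig(1024, 30, True, True, 50.0, 16),
}

def get_tier_config(tier: str) -> TierConfig:
    """Look up a tier by name, failing loudly on typos."""
    try:
        return TIERS[tier]
    except KeyError:
        raise ValueError(f"Unknown tier: {tier!r}") from None
```

Keeping the configs frozen makes each tier an immutable contract that downstream code cannot mutate at request time.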

## 🛠️ Installation

```bash
# Clone the repository
git clone https://huggingface.co/likhonsheikh/memo

# Install dependencies
pip install -r requirements.txt

# Run the demonstration
python demo.py

# Start the API server
python api/main.py
```

## 🎬 API Usage

### Health Check
```bash
curl http://localhost:8000/health
```

### Generate Video
```bash
curl -X POST "http://localhost:8000/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "আজকের দিনটি খুব সুন্দর ছিল।",
    "duration": 15,
    "tier": "pro"
  }'
```

### Check Status
```bash
curl http://localhost:8000/status/{request_id}
```

## 🧪 Training Custom LoRA

```python
from scripts.train_scene_lora import SceneLoRATrainer, TrainingConfig

config = TrainingConfig(
    base_model="google/mt5-small",
    rank=32,
    alpha=64,
    save_safetensors=True  # MANDATORY
)

trainer = SceneLoRATrainer(config)
trainer.load_model()
trainer.setup_lora()
trainer.train(training_data)
```

## ⚡ Performance Features

- **Memory Optimization**: xFormers, attention slicing, CPU offload
- **FP16 Precision**: ~50% memory reduction with minimal quality loss
- **LCM Acceleration**: Faster inference when available
- **Device Mapping**: Optimal GPU/CPU utilization
- **Background Processing**: Async handling of long-running tasks

## 🔍 Security Validation

```python
from config.model_tiers import validate_model_weights_security

# Validate any model file
result = validate_model_weights_security("path/to/model.safetensors")
print(f"Secure: {result['is_secure']}")
print(f"Format: {result['format']}")
print(f"Issues: {result['issues']}")
```
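
A hypothetical sketch of what such a validator might check is shown below. This is not the actual `config/model_tiers.py` implementation; it only verifies the `.safetensors` extension and the format's 8-byte length-prefixed JSON header:

```python
import json
import struct
from pathlib import Path

def validate_model_weights_security(path: str) -> dict:
    """Sketch of a safetensors-only validator.

    A safetensors file starts with an 8-byte little-endian
    unsigned header length, followed by a JSON header of
    exactly that many bytes.
    """
    issues = []
    p = Path(path)
    if p.suffix != ".safetensors":
        issues.append(f"disallowed format: {p.suffix or 'no extension'}")
    try:
        with p.open("rb") as f:
            (header_len,) = struct.unpack("<Q", f.read(8))
            json.loads(f.read(header_len))  # header must be valid JSON
    except (OSError, struct.error, ValueError) as exc:
        issues.append(f"unreadable safetensors header: {exc}")
    return {
        "is_secure": not issues,
        "format": p.suffix.lstrip("."),
        "issues": issues,
    }
```

Because safetensors headers are plain JSON, this check never deserializes pickle bytecode, which is the whole point of the safetensors-only policy.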

## 📁 File Structure

```
📁 Memo/
├── 📄 requirements.txt                    # Production dependencies
├── 📁 models/
│   ├── 📁 text/
│   │   └── 📄 bangla_parser.py            # Transformer-based Bangla parser
│   └── 📁 image/
│       └── 📄 sd_generator.py             # Stable Diffusion + Safetensors
├── 📁 core/
│   └── 📄 scene_planner.py                # ML-based scene planning
├── 📁 data/
│   └── 📁 lora/
│       └── 📄 README.md                   # LoRA configuration (safetensors only)
├── 📁 scripts/
│   └── 📄 train_scene_lora.py             # Training with safetensors output
├── 📁 config/
│   └── 📄 model_tiers.py                  # Tier management system
├── 📁 api/
│   └── 📄 main.py                         # Production API endpoint
└── 📄 demo.py                             # Complete system demonstration
```

## 🎯 What This Doesn't Do

❌ Make GPUs cheap  
❌ Fix bad prompts  
❌ Read your mind  
❌ Guarantee perfect results  

## πŸ† Production Readiness

This implementation is now:
- βœ… **Correct** - Uses proper ML frameworks (transformers, safetensors)
- βœ… **Modern** - 2025-grade architecture with security best practices
- βœ… **Secure** - Zero tolerance for unsafe model formats
- βœ… **Scalable** - Tier-based resource management
- βœ… **Defensible** - Production-grade security and validation

## 📜 License

This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## 📞 Support

For support, email support@memo.ai or join our [Discord community](https://discord.gg/memo).

---

**If your API claims "state-of-the-art" without these features, you're lying.** Memo now actually delivers on that promise with proper Transformers + Safetensors integration.