# Orbis AI Media OS — Complete Documentation

> **Version:** 1.0 · **Last updated:** 2026-05-12  
> **Stack:** FastAPI · PostgreSQL · LangGraph · Pillow · APScheduler · React · Cloudinary · Railway + Netlify

---

## Table of Contents

1. [What This System Does](#1-what-this-system-does)
2. [Architecture Overview](#2-architecture-overview)
3. [Tech Stack](#3-tech-stack)
4. [Database Schema](#4-database-schema)
5. [Content Generation Pipeline](#5-content-generation-pipeline)
6. [Post Types](#6-post-types)
7. [text_image — PIL Card System](#7-text_image--pil-card-system)
8. [Scheduler — Automated Publishing](#8-scheduler--automated-publishing)
9. [LLM Agent & Model Fallbacks](#9-llm-agent--model-fallbacks)
10. [API Reference](#10-api-reference)
11. [WebSocket Real-Time Updates](#11-websocket-real-time-updates)
12. [Frontend Pages](#12-frontend-pages)
13. [Environment Variables](#13-environment-variables)
14. [Deployment Guide](#14-deployment-guide)
15. [Local Development](#15-local-development)
16. [Key Design Decisions](#16-key-design-decisions)
17. [Known Limitations & TODOs](#17-known-limitations--todos)

---

## 1. What This System Does

**Orbis AI Media OS** is a fully automated social media content engine. It:

1. Discovers trending topics (Reddit, manual seeds)
2. Generates platform-native posts using an LLM agent (OpenRouter / Groq)
3. Creates or sources images (Pollinations AI)
4. Burns text directly onto images for `text_image` posts using Pillow (quote-card style)
5. Uploads final media to Cloudinary CDN
6. Saves drafts for human review in a web dashboard
7. Publishes approved posts to Instagram (LinkedIn, Twitter wired but not yet active)

Everything runs autonomously on a schedule. A human moderator approves posts via the dashboard before they go live (unless `AUTO_APPROVE=true`).

---

## 2. Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                        SCHEDULER (APScheduler)                  │
│   Every 6h: Fetch Trends   │   Daily: Generate Posts (N/day)   │
└───────────────┬─────────────────────────┬───────────────────────┘
                │                         │
                ▼                         ▼
         ┌──────────┐            ┌─────────────────┐
         │  Trends  │            │  LLM Agent      │
         │  Table   │──────────▶ │  (LangGraph)    │
         └──────────┘            │  OpenRouter     │
                                 └────────┬────────┘
                                          │ caption + image_prompt
                                          ▼
                                 ┌─────────────────┐
                                 │ Image Generation │
                                 │ Pollinations AI  │
                                 └────────┬────────┘
                                          │ raw image URL
                                          ▼
                              ┌──────────────────────┐
                              │  PIL Card Generator  │ (text_image only)
                              │  typography card     │
                              │  editorial photo card│
                              └──────────┬───────────┘
                                         │ composited image
                                         ▼
                                ┌─────────────────┐
                                │   Cloudinary    │
                                │   CDN Upload    │
                                └────────┬────────┘
                                         │ permanent URL
                                         ▼
                               ┌──────────────────┐
                               │  Post Draft (DB) │
                               │  status=draft    │
                               │  approval=pending│
                               └────────┬─────────┘
                                        │
               ┌────────────────────────┼────────────────────────┐
               │                        │                        │
               ▼                        ▼                        ▼
    ┌─────────────────┐     ┌─────────────────────┐   ┌──────────────────┐
    │  React Dashboard│     │  PostgreSQL NOTIFY  │   │  Instagram API   │
    │  (Netlify/Vite) │◀────│  WebSocket push     │   │  on approval     │
    │  Human review   │     └─────────────────────┘   └──────────────────┘
    └─────────────────┘
```

### Request Flow (Generate via Dashboard)

```
User clicks "Generate" → GenerateModal (pick type + platform)
  → POST /api/workflows/trigger  {post_type, platform}
  → pipeline.generate_post_content()   [LLM]
  → pipeline.generate_images()         [Pollinations]
  → pipeline.store_media_to_cdn()      [Cloudinary]
  → pipeline.save_post_draft()         [Pillow overlay if text_image]
  → DB INSERT post (status=draft)
  → PG NOTIFY posts_changed
  → WebSocket push to browser
  → React Query cache updates
  → Post appears in dashboard grid
```

---

## 3. Tech Stack

### Backend

| Layer | Technology |
|-------|-----------|
| API server | FastAPI (Python 3.11+) |
| Database | PostgreSQL (Neon on Railway) |
| ORM | SQLAlchemy 2.x async |
| Scheduler | APScheduler (AsyncIOScheduler) |
| LLM framework | LangGraph + LangChain |
| LLM providers | OpenRouter (primary), Groq (fallback) |
| Image generation | Pollinations AI (free, no key needed) |
| Image processing | Pillow (PIL) |
| CDN | Cloudinary |
| Social publishing | Instagram Graph API |
| HTTP client | httpx (async) |
| Logging | Loguru |
| Config | Pydantic BaseSettings |

### Frontend

| Layer | Technology |
|-------|-----------|
| Framework | React 18 + Vite |
| State / data | TanStack Query (React Query) |
| Routing | React Router v6 |
| Charts | Recharts |
| Real-time | Native WebSocket |
| Styling | Custom CSS variables (dark-first) |
| Deployment | Netlify |

### Infrastructure

| Service | Provider |
|---------|---------|
| Backend hosting | Railway |
| Database | Neon PostgreSQL (via Railway) |
| Frontend hosting | Netlify |
| Media CDN | Cloudinary |
| Font delivery | GitHub rsms/inter (downloaded at runtime) |

---

## 4. Database Schema

### `agents`

| Column | Type | Notes |
|--------|------|-------|
| id | PK | |
| name | VARCHAR | e.g. "Default Agent" |
| description | TEXT | |
| is_active | BOOL | default true |

### `agent_versions`

| Column | Type | Notes |
|--------|------|-------|
| id | PK | |
| agent_id | FK → agents | |
| version | INT | |
| config | JSON | `{model, temperature, ...}` |

### `trends`

| Column | Type | Notes |
|--------|------|-------|
| id | PK | |
| topic | VARCHAR | e.g. "AI in healthcare 2025" |
| source | VARCHAR | reddit, manual, rss |
| score | FLOAT | relevance score |
| status | VARCHAR | pending / used / skipped |
| raw_data | JSON | source-specific metadata |
| created_at | TIMESTAMP | |

### `posts`

| Column | Type | Notes |
|--------|------|-------|
| id | PK | |
| trend_id | FK → trends | |
| agent_version_id | FK → agent_versions | |
| content | TEXT | final caption |
| post_type | VARCHAR | image / text_only / carousel / text_image / link |
| status | VARCHAR | draft / scheduled / published |
| approval_status | VARCHAR | pending / approved / rejected |
| platform | VARCHAR | instagram / twitter / linkedin |
| image_url | VARCHAR | Cloudinary URL |
| carousel_urls | JSON | list of URLs (carousel only) |
| video_url | VARCHAR | |
| link_url | VARCHAR | link posts |
| link_title | VARCHAR | |
| link_description | VARCHAR | |
| platform_post_id | VARCHAR | IG media ID after publish |
| published_at | TIMESTAMP | |
| token_count | INT | LLM tokens used |
| generation_cost | FLOAT | USD |
| hallucination_score | FLOAT | 0–1 |
| created_at | TIMESTAMP | auto |

### `media`

| Column | Type | Notes |
|--------|------|-------|
| id | PK | |
| post_id | FK → posts | |
| url | VARCHAR | CDN URL |
| media_type | VARCHAR | image / video |
| nsfw_score | FLOAT | AWS Rekognition |
| is_approved | BOOL | |

### Supporting tables

- **`workflow_executions`** — Temporal workflow IDs and statuses
- **`observability_traces`** — Langfuse trace IDs, token counts, costs
- **`api_usage`** — Per-service usage tracking for cost monitoring
- **`system_alerts`** — Severity-tagged alerts with resolved status
- **`users`** — Firebase-authenticated dashboard users
- **`platform_configs`** — Per-platform OAuth tokens / credentials
- **`model_configs`** — Primary + fallback model configuration per agent
- **`notifications`** — In-app notifications (approval needed, publish failed)

---

## 5. Content Generation Pipeline

A single end-to-end run (triggered by scheduler or manually):

### Step 1 — Pick a Trend

```python
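# scheduler.py
from sqlalchemy import select

# AsyncSession.execute() is a coroutine and returns a Result, not a Trend
result = await session.execute(
    select(Trend)
    .where(Trend.status == "pending")
    .order_by(Trend.score.desc())
    .limit(1)
)
trend = result.scalar_one_or_none()  # None when no pending trends remain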
```

### Step 2 — Generate Post Content (LLM)

```
pipeline.generate_post_content(GeneratePostInput)
  → post_generator.generate_post_with_agent(trend_dict, agent_version_id)
    → _pick_post_type()           weighted random: text_image 70%, image 10%, text_only 10%, carousel 5%, link 5%
    → _build_system_prompt(post_type, trend)
    → LLM.invoke([SystemMessage])
    → _parse_image_post() / _parse_link_post()
  → returns PostData(content, post_type, image_prompt, ...)
```

### Step 3 — Generate Image

```
pipeline.generate_images(GenerateImagesInput)
  → builds Pollinations AI URL with encoded prompt
  → GET https://image.pollinations.ai/prompt/{encoded}?width=1080&height=1080&model=flux
  → retries up to 3× with increasing timeout (60s, 90s, 120s)
  → returns {"image_1": url}
```
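
The URL construction and retry schedule above can be sketched with the standard library alone. This is an illustration, not the code in `pipeline.generate_images`: the helper names `build_pollinations_url` and `fetch_with_retries` are hypothetical, and the HTTP call is passed in as a `fetch` callable so the retry logic can be read (and exercised) without a network client.

```python
from urllib.parse import quote

POLLINATIONS_BASE = "https://image.pollinations.ai/prompt"


def build_pollinations_url(prompt, width=1080, height=1080, model="flux"):
    """Percent-encode the prompt and append the size/model query string."""
    encoded = quote(prompt, safe="")
    return f"{POLLINATIONS_BASE}/{encoded}?width={width}&height={height}&model={model}"


def fetch_with_retries(url, fetch, timeouts=(60, 90, 120)):
    """Call fetch(url, timeout) once per timeout value, longest last.

    `fetch` is a hypothetical callable supplied by the caller, for example
    a thin wrapper around an HTTP client. The last error is chained onto
    the RuntimeError raised when every attempt fails.
    """
    last_error = None
    for timeout in timeouts:
        try:
            return fetch(url, timeout)
        except Exception as exc:  # sketch: a real client would catch narrower errors
            last_error = exc
    raise RuntimeError(f"all {len(timeouts)} attempts failed") from last_error
```

Passing the timeouts as a tuple keeps the "3 attempts, growing timeout" policy in one place, so it can be tuned without touching the loop.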

### Step 4 — Upload to CDN

```
pipeline.store_media_to_cdn(StoreMediaInput)
  → cloudinary.uploader.upload(url, folder="ai_media_os/post_{id}")
  → returns {"image_1": cloudinary_secure_url}
```

### Step 5 — Save Draft

```
pipeline.save_post_draft(SavePostDraftInput)
  → if post_type == "text_image":
      apply_text_overlay(raw_url, caption)  ← PIL card generation
  → INSERT INTO posts (status="draft", approval_status="pending")
  → PG NOTIFY posts_changed
```

---

## 6. Post Types

| Type | Description | Has Image |
|------|-------------|-----------|
| `image` | Photo + caption | Yes (Pollinations) |
| `text_only` | Pure text, no image | No |
| `text_image` | Text burned onto image (PIL quote card) | Yes (composited) |
| `carousel` | Multi-slide swipeable post | Yes (multiple) |
| `link` | Link preview card + caption | No |

### Weighted Distribution (default)

```python
POST_TYPE_WEIGHTS = {
    "text_image":  70,   # dominant — best engagement
    "image":       10,
    "text_only":   10,
    "carousel":     5,
    "link":         5,
}
```
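
A weighted pick over this table can be done with `random.choices` from the standard library. The sketch below is one plausible shape for `_pick_post_type()`; the function name `pick_post_type` and the optional `rng` parameter are assumptions made for illustration, and the real implementation may differ.

```python
import random

POST_TYPE_WEIGHTS = {
    "text_image": 70,
    "image": 10,
    "text_only": 10,
    "carousel": 5,
    "link": 5,
}


def pick_post_type(weights=POST_TYPE_WEIGHTS, rng=random):
    """Return one post type, chosen with probability proportional to its weight."""
    types = list(weights)
    return rng.choices(types, weights=[weights[t] for t in types], k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible in tests, while the default uses the module-level generator.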

---

## 7. text_image — PIL Card System

File: `src/utils/image_overlay.py`

The text overlay system generates social cards that look hand-crafted, not AI-generated. It automatically chooses one of two layouts based on the word count of the hook (the caption's first line).

### Layout Selection

```python
hook_word_count = len(hook.split())
if hook_word_count <= 9:
    canvas = _make_typo_card(hook, body, tags)      # Layout A
else:
    canvas = await _make_photo_card(hook, body, tags, image_url)  # Layout B
```

### Layout A — Typography Card (short hooks ≤ 9 words)

- **Size:** 1080×1080 (square, ideal for quotes)
- **Background:** Vertical gradient (no photo)
- **6 colour palettes:** warm parchment, deep purple night, dark forest, dark amber, cool lavender, cream rust
- **Text:** Auto-sized hook fills ~60% of width; body text at 1/3 hook size
- **Accents:** Horizontal bar above hook, bottom line, brand `@orbis` bottom-right
- **Best for:** "Stop scrolling hooks", quotes, bold statements

### Layout B — Editorial Photo Card (longer hooks > 9 words)

- **Size:** 1080×1350 (portrait, 4:5 ratio)
- **Background:** Full-bleed Pollinations photo, fill-cropped
- **Scrims:** Top 22% dark scrim (title area) + bottom 50% dark scrim (text area)
- **Headline:** `hook.upper()`, auto-sized to fill bottom zone, white text with dark shadow + accent bar left
- **Body:** 2 sub-lines below headline at 40px
- **Top bar:** `ORBIS STUDIO` brand + accent line
- **Best for:** News stories, editorial content, longer captions

### Font Loading Chain

```
1. System fonts (Linux paths + Windows C:/Windows/Fonts/)
2. /tmp/orbis_fonts/Inter-Bold.ttf  ← downloaded from GitHub rsms/inter on first run
3. PIL default bitmap (tiny fallback — only if download fails)
```

Font loading is cached in `_FONT_CACHE` dict to avoid repeated disk reads.
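
The lookup order above can be illustrated as a path resolver. This is only a sketch of the chain: the name `resolve_font_path`, its parameters, and the path cache are hypothetical, the GitHub download step is left out, and the real module caches loaded font objects in `_FONT_CACHE` rather than paths.

```python
import os

_PATH_CACHE = {}  # hypothetical cache of resolved paths, keyed by file name


def resolve_font_path(name, system_dirs, download_dir="/tmp/orbis_fonts"):
    """Return the first existing path for `name`, or None if nothing is found.

    System directories are checked first, then the download directory.
    A None result tells the caller to fall back to PIL's default bitmap font.
    """
    if name in _PATH_CACHE:
        return _PATH_CACHE[name]
    candidates = [os.path.join(d, name) for d in system_dirs]
    candidates.append(os.path.join(download_dir, name))
    found = next((p for p in candidates if os.path.isfile(p)), None)
    if found is not None:
        # Only hits are cached, so a later download is still picked up.
        _PATH_CACHE[name] = found
    return found
```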

### Caption Parsing

Before rendering, the caption is cleaned of LLM structural labels:

```python
import re

clean = re.sub(
    r"(?im)^(HOOK|SLIDE\s*\d+[:\.\-]?[^|\n]*\|?|CAPTION|IMAGE_PROMPT)[:\s]*",
    "", caption
)
```

Then split into: `hook` (line 1), `body` (lines 2–3), `tags` (hashtag line).
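
A minimal sketch of the clean-then-split step, reusing the regex shown above. The function name `split_caption` is hypothetical, and two details are assumptions: the hashtag line is detected by a leading `#`, and the body lines are joined with a space.

```python
import re

LABEL_RE = re.compile(
    r"(?im)^(HOOK|SLIDE\s*\d+[:\.\-]?[^|\n]*\|?|CAPTION|IMAGE_PROMPT)[:\s]*"
)


def split_caption(caption):
    """Strip LLM structural labels, then return (hook, body, tags)."""
    clean = LABEL_RE.sub("", caption)
    lines = [ln.strip() for ln in clean.splitlines() if ln.strip()]
    tag_lines = [ln for ln in lines if ln.startswith("#")]    # assumed hashtag rule
    text_lines = [ln for ln in lines if not ln.startswith("#")]
    hook = text_lines[0] if text_lines else ""
    body = " ".join(text_lines[1:3])                          # lines 2 and 3
    tags = tag_lines[0] if tag_lines else ""
    return hook, body, tags
```

Because the pattern is case-insensitive and anchored at line starts, an ordinary sentence that begins with a word such as "Hook" would also lose that word, which is worth keeping in mind when writing prompts.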

---

## 8. Scheduler — Automated Publishing

File: `src/services/scheduler.py`

### Jobs

| Job | Schedule | What it does |
|-----|----------|--------------|
| `run_trend_fetch` | Every 6h (0, 6, 12, 18 UTC) | Fetches fresh trends from all sources |
| `run_post_generation` | N times/day (from `POSTS_PER_DAY`) | Full pipeline run for one trend |
| `run_auto_publish` | Triggered by generation if `AUTO_APPROVE=true` | Publishes draft immediately |

### Post Scheduling Times

```python
# POSTS_PER_DAY=4  → 8am, 12pm, 4pm, 8pm UTC
hour = 8 + (i * (16 / posts_per_day))
```
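
The formula returns a float, and it only lands on whole hours when 16 is divisible by `POSTS_PER_DAY`. The sketch below shows one way to turn it into `(hour, minute)` pairs; the function name `post_times` and the rounding to the nearest minute are assumptions, since the source does not say how the scheduler handles fractional hours.

```python
def post_times(posts_per_day, start_hour=8, window_hours=16):
    """Return evenly spaced (hour, minute) slots in UTC, starting at start_hour."""
    step = window_hours / posts_per_day
    slots = []
    for i in range(posts_per_day):
        total_minutes = round((start_hour + i * step) * 60)
        slots.append(divmod(total_minutes, 60))
    return slots


print(post_times(4))  # [(8, 0), (12, 0), (16, 0), (20, 0)]
print(post_times(3))  # [(8, 0), (13, 20), (18, 40)]
```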

### Auto-Approve vs Manual Review

| `AUTO_APPROVE` | Behaviour |
|----------------|-----------|
| `false` (default) | Posts saved as `draft`, wait in dashboard for human approval |
| `true` | Posts auto-published immediately after generation |

### Dev / Prod Safety

In non-production environments (`ENVIRONMENT != "production"`), all published posts get a `[TEST]` prefix so they're identifiable on real Instagram accounts:

```python
if settings.environment != "production":
    text = f"[TEST] {text}"
```

---

## 9. LLM Agent & Model Fallbacks

File: `src/agents/post_generator.py`

### Primary Model

Set via the `OPENAI_MODEL` env var, typically a Groq or OpenRouter model ID. The sample configuration in section 13 uses `meta-llama/llama-3.3-70b-instruct:free`.

API endpoint set via `OPENAI_API_BASE` (supports any OpenAI-compatible API).

### Automatic Fallback Chain

If the primary model returns a 429 (rate limit) or 404/400 error, the agent automatically tries these free OpenRouter models in order:

```python
FREE_MODEL_FALLBACKS = [
    "nousresearch/hermes-3-llama-3.1-405b:free",
    "arcee-ai/trinity-large-thinking:free",
    "meta-llama/llama-3.3-70b-instruct:free",
    "google/gemma-4-31b-it:free",
    # ... more
]
```
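
The fallback behaviour described above amounts to a loop over the primary model followed by the fallback list. The sketch below is an assumption about its shape, not the code in `post_generator.py`: `ModelError`, `invoke_with_fallbacks`, and the `call_model` callable are hypothetical names introduced for illustration.

```python
RETRYABLE_STATUS = {400, 404, 429}


class ModelError(Exception):
    """Hypothetical error type carrying the provider's HTTP status code."""

    def __init__(self, status, message=""):
        super().__init__(message or f"HTTP {status}")
        self.status = status


def invoke_with_fallbacks(prompt, primary, fallbacks, call_model):
    """Try the primary model, then each fallback in order.

    `call_model(model, prompt)` is supplied by the caller and should raise
    ModelError on provider failures. Returns (model_used, response).
    """
    last_error = None
    for model in [primary, *fallbacks]:
        try:
            return model, call_model(model, prompt)
        except ModelError as exc:
            if exc.status not in RETRYABLE_STATUS:
                raise  # e.g. 401 or 500: switching models is unlikely to help
            last_error = exc
    raise RuntimeError("all models failed") from last_error
```

Returning the model name alongside the response makes it possible to record which model actually produced a post.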

### LangGraph Pipeline

```
StateGraph:
  process_trend_node  → LLM invoke, parse output
  determine_images_node → set images_needed count
  finalize_node → set status="ready_for_moderation"
```

### Prompt Engineering

Each post type has a dedicated system prompt. Key rules enforced:
- `text_image`: "THIS IS A SINGLE-IMAGE POST. Do NOT write SLIDE 1/2/3 content."
- All image types: "No human faces. No text in image."
- LLM output parser strips echoed structural labels (`HOOK:`, `SLIDE N:`, `CAPTION:`, `IMAGE_PROMPT:`)

---

## 10. API Reference

Base URL: `https://<your-railway-domain>/api`

### Posts

| Method | Path | Description |
|--------|------|-------------|
| GET | `/posts` | List posts (filter: `status`, `platform`, `post_type`, pagination) |
| GET | `/posts/{id}` | Get single post |
| POST | `/posts/{id}/approve` | Approve → triggers Instagram publish |
| POST | `/posts/{id}/reject` | Reject draft |
| POST | `/posts/{id}/discard` | Hard-delete a draft post |

#### GET /posts params

| Param | Default | Notes |
|-------|---------|-------|
| `status` | (all) | draft / scheduled / published |
| `platform` | (all) | instagram / twitter / linkedin |
| `post_type` | (all) | image / text_image / etc. |
| `limit` | 20 | |
| `offset` | 0 | |

#### GET /posts response

```json
{
  "posts": [...],
  "total": 5,
  "all_total": 12
}
```

`total` = filtered count, `all_total` = unfiltered across all statuses (used for header badge).
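
The difference between the two counters is easiest to see on an in-memory list. The sketch below is illustrative only: `list_posts` is a hypothetical stand-in for the route handler, it filters on `status` alone, and the real endpoint queries PostgreSQL rather than a Python list.

```python
def list_posts(all_posts, status=None, limit=20, offset=0):
    """Return one page of posts plus the filtered and unfiltered counts."""
    filtered = [p for p in all_posts if status is None or p["status"] == status]
    return {
        "posts": filtered[offset:offset + limit],
        "total": len(filtered),        # matches the active filter
        "all_total": len(all_posts),   # ignores the filter (header badge)
    }
```

With 12 posts of which 5 are drafts, `list_posts(posts, status="draft")` reports `total` as 5 and `all_total` as 12, matching the response shown above.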

### Workflows

| Method | Path | Description |
|--------|------|-------------|
| POST | `/workflows/trigger` | Manually trigger post generation |

#### POST /workflows/trigger body

```json
{
  "trend_id": 3,
  "agent_version_id": 1,
  "platform": "instagram",
  "post_type": "text_image"
}
```

If `trend_id` is omitted, the scheduler picks the highest-scoring pending trend.

### Trends

| Method | Path | Description |
|--------|------|-------------|
| GET | `/trends` | List trends |
| POST | `/trends` | Create manual trend |

### Agents

| Method | Path | Description |
|--------|------|-------------|
| GET | `/agents` | List agents |
| GET | `/agents/{id}/versions` | Get agent versions |

### Moderator

| Method | Path | Description |
|--------|------|-------------|
| GET | `/moderator/dashboard` | Posts pending approval + stats |

### Health

```
GET /health
→ { "status": "healthy", "database": "ok", "scheduler": "running" }
```

---

## 11. WebSocket Real-Time Updates

File: `src/api/routes/ws.py` + `src/services/notification_listener.py`

### How it works

1. PostgreSQL triggers fire `NOTIFY posts_changed` on INSERT/UPDATE to `posts`
2. Python `asyncpg` listener receives the notification
3. Sets an `asyncio.Event` dirty flag
4. WebSocket hub checks dirty flag and broadcasts `{"type": "posts_changed"}` to all connected browsers
5. The frontend WebSocket handler receives the message and calls `queryClient.invalidateQueries(["posts"])`
6. Dashboard re-fetches and updates without a page reload
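
Steps 2 to 4 can be sketched as a small hub built around `asyncio.Event`. This is an assumption about the mechanism, not the code in `ws.py`: the class `UpdateHub` and its methods are hypothetical, and clients are assumed to expose an async `send_text(str)` method, as FastAPI's `WebSocket` does.

```python
import asyncio
import json


class UpdateHub:
    """Minimal in-memory hub: NOTIFY callbacks set a flag, one loop broadcasts."""

    def __init__(self):
        self.dirty = asyncio.Event()
        self.clients = set()  # objects exposing `async send_text(str)`

    def on_notify(self, *_args):
        # Intended as the LISTEN callback. Setting an already-set event is a
        # no-op, so a burst of NOTIFYs collapses into a single broadcast.
        self.dirty.set()

    async def broadcast_once(self):
        """Wait for the dirty flag, clear it, and push one message to every client."""
        await self.dirty.wait()
        self.dirty.clear()
        message = json.dumps({"type": "posts_changed"})
        for client in list(self.clients):
            try:
                await client.send_text(message)
            except Exception:
                self.clients.discard(client)  # drop connections that fail to send
```

A long-running task would call `broadcast_once()` in a loop. The flag carries no payload, so the browser always re-fetches the current state instead of applying individual row changes.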

### WebSocket endpoint

```
ws://<host>/api/ws/updates
```

### React connection (frontend)

```javascript
const ws = new WebSocket(`${WS_BASE}/api/ws/updates`);
ws.onmessage = (e) => {
  const msg = JSON.parse(e.data);
  if (msg.type === "posts_changed") {
    queryClient.invalidateQueries(["posts"]);
  }
};
```

---

## 12. Frontend Pages

All pages use TanStack Query for data fetching and live updates.

### Dashboard (`/`)

- **Scheduler strip:** running status, posts/day, today's generated + published count, next run time
- **Pipeline visualization:** shows 5 pipeline steps (Trend → LLM → Image → CDN → Draft)
- **KV stats:** total posts, published today, pending approval, trends available
- **Area chart:** post volume over last 7 days

### Posts (`/posts`)

- **Grid of PostCards:** thumbnail, type badge, status badge, time ago
- **Filter bar:** by status and post type
- **Header badge:** total post count (unfiltered)
- **Generate button:** opens GenerateModal
- **PostCard actions:** Preview (eye), Approve (check), Reject (×), Discard (trash) — contextual by status
- **GenerateModal:** 6 post type options (grid), 4 platform pills, Generate button

### Preview Modal

- Full-size image
- Caption text
- Post type and status chips
- Approve / Reject / Discard buttons with loading state
- Prevents double-click via `isPending` flags + backend idempotency guard

### Trends (`/trends`)

- List of detected trends with source, score, status
- Manual trend creation form

### Analytics (`/analytics`)

- Charts for post volume, engagement by type, cost tracking

### Workflows (`/workflows`)

- Workflow execution history
- Manual trigger panel

### Settings (`/settings`)

- Environment config display
- Platform connection status

---

## 13. Environment Variables

Copy `.env.example` to `.env` and fill in:

### Required

```env
DATABASE_URL=postgresql+asyncpg://user:pass@host/db
OPENAI_API_KEY=sk-...           # OpenRouter key (despite name)
OPENAI_API_BASE=https://openrouter.ai/api/v1
OPENAI_MODEL=meta-llama/llama-3.3-70b-instruct:free

CLOUDINARY_CLOUD_NAME=your_cloud
CLOUDINARY_API_KEY=123456
CLOUDINARY_API_SECRET=abc...

INSTAGRAM_ACCESS_TOKEN=EAA...
INSTAGRAM_BUSINESS_ACCOUNT_ID=17841...
```

### Optional (with defaults)

```env
ENVIRONMENT=development          # production → removes [TEST] prefix
AUTO_APPROVE=false               # true → publish immediately after generation
POSTS_PER_DAY=4                  # how many posts to generate per day
POST_PLATFORM=instagram

ALLOWED_ORIGINS=https://your-netlify-app.netlify.app

LOG_LEVEL=INFO
```

### Unused / future

```env
TEMPORAL_HOST=localhost:7233     # Temporal removed in favour of direct pipeline
REDIS_URL=localhost:6379
ANTHROPIC_API_KEY=
LANGFUSE_PUBLIC_KEY=
R2_ACCOUNT_ID=                   # Cloudflare R2 (replaced by Cloudinary)
AWS_ACCESS_KEY_ID=               # Rekognition (NSFW — not active)
```

---

## 14. Deployment Guide

### Backend (Railway)

1. Connect GitHub repo to Railway
2. Set all environment variables in Railway dashboard
3. Railway auto-detects `Procfile`:
   ```
   web: python -m uvicorn src.api.main:app --host 0.0.0.0 --port $PORT --access-log --use-colors 2>&1
   ```
4. On every startup, including the first deploy, migrations run automatically:
   - `src/utils/migrate.py` executes `ALTER TABLE ADD COLUMN IF NOT EXISTS` for all columns
   - Safe to run on every deploy — no-ops if columns already exist

#### Startup sequence

```
init_db()           → create tables if not exist (SQLAlchemy)
run_migrations()    → add any missing columns (idempotent)
install_triggers()  → PG NOTIFY triggers on posts table
start_listener()    → asyncpg LISTEN for notifications
scheduler.start()   → APScheduler jobs begin
```

### Frontend (Netlify)

1. Connect GitHub repo, set build command: `cd frontend && npm run build`
2. Set publish directory: `frontend/dist`
3. Set env var: `VITE_API_BASE_URL=https://your-railway-domain.up.railway.app`
4. Deploy

### Font availability on Railway

On the first `text_image` post after each startup, the system downloads Inter fonts from GitHub to `/tmp/orbis_fonts/`:
- `Inter-Regular.otf` → `/tmp/orbis_fonts/Inter-Regular.ttf`
- `Inter-Bold.otf` → `/tmp/orbis_fonts/Inter-Bold.ttf`

This is automatic. No manual setup needed. Fonts are cached in memory (`_FONT_CACHE`) for the process lifetime.

---

## 15. Local Development

### Requirements

- Python 3.11+
- Node.js 18+
- PostgreSQL (local or Neon free tier)

### Backend

```bash
# Create virtual environment
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Create .env
cp .env.example .env
# Fill in DATABASE_URL, OPENAI_API_KEY, etc.

# Run
uvicorn src.api.main:app --reload --port 8000
```

### Frontend

```bash
cd frontend
npm install

# Create .env.local
echo "VITE_API_BASE_URL=http://localhost:8000" > .env.local

npm run dev
# → http://localhost:5173
```

### Generate a test post manually

```bash
curl -X POST http://localhost:8000/api/workflows/trigger \
  -H "Content-Type: application/json" \
  -d '{"agent_version_id": 1, "platform": "instagram", "post_type": "text_image"}'
```

### VSCode Python setup

If Pyrefly shows `PIL` import errors, select the venv interpreter:
- `Ctrl+Shift+P` → "Python: Select Interpreter" → choose `venv/bin/python` (Windows: `venv\Scripts\python.exe`)

---

## 16. Key Design Decisions

### Why not Temporal?

The original design used Temporal for durable workflow orchestration. It was removed because:
- Railway free tier can't run a separate Temporal server
- The pipeline is short enough (~30s) that durability isn't critical
- APScheduler + direct async functions are simpler and free

Temporal stubs remain in the codebase (`src/temporal_workflows/`) but the functions inside are standalone async functions called directly.

### Why Cloudinary over R2?

Cloudinary was chosen because:
- Free tier (25GB storage, 25GB bandwidth)
- Python SDK with single-call upload from URL or BytesIO
- PIL output is uploaded directly as BytesIO without temp files

R2 env vars are in config but the upload path uses Cloudinary.

### Why Pollinations AI for images?

- Completely free, no API key
- Supports `flux` model with 1080px output
- URL-based (no SDK) — just build a URL, fetch, get image
- Sufficient quality for social cards

### Why PIL for text overlay instead of Cloudinary transformations?

Cloudinary text overlays are possible but:
- Limited typography control
- Can't do gradient backgrounds or custom layout logic
- PIL gives full pixel control for the typography card layout
- Output is uploaded to Cloudinary after compositing

### Why `ALTER TABLE ADD COLUMN IF NOT EXISTS` on every startup?

Railway reuses the same PostgreSQL database across deploys. New columns added in code would cause `column does not exist` errors if migrations weren't run. Running idempotent `IF NOT EXISTS` migrations on every startup ensures the schema is always in sync without a separate migration tool.
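
A minimal sketch of how such idempotent statements could be generated. The column names and types are taken from the `posts` schema in section 4, but which columns `src/utils/migrate.py` actually covers is not stated, so `REQUIRED_COLUMNS` and `migration_statements` are illustrative assumptions.

```python
# Hypothetical subset of columns, named after the `posts` schema above.
REQUIRED_COLUMNS = {
    "posts": {
        "carousel_urls": "JSON",
        "link_url": "VARCHAR",
        "hallucination_score": "FLOAT",
    },
}


def migration_statements(required=REQUIRED_COLUMNS):
    """Build one ALTER TABLE per column; PostgreSQL skips columns that already exist."""
    return [
        f"ALTER TABLE {table} ADD COLUMN IF NOT EXISTS {column} {sql_type}"
        for table, columns in required.items()
        for column, sql_type in columns.items()
    ]


for statement in migration_statements():
    print(statement)
```

This approach only adds columns. Renames, type changes, and drops still need a manual step or a dedicated migration tool.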

### Why `datetime.utcnow() + "Z"` for timestamps?

Python's `datetime.utcnow()` returns a naive datetime with no timezone indicator. JavaScript's `new Date("2025-01-01T10:00:00")` without a `Z` or `+offset` is treated as **local time** by browsers. In IST (UTC+5:30) this made posts appear 5.5h older than they were. Appending `Z` before parsing forces UTC interpretation.
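
On the Python side, the same effect can be had by serializing with an explicit UTC marker. The helper name `to_utc_iso` is hypothetical, and the sketch assumes that naive values already hold UTC, which is what `datetime.utcnow()` produces.

```python
from datetime import datetime, timezone


def to_utc_iso(dt):
    """Serialize a datetime as ISO 8601 with an explicit `Z` suffix."""
    if dt.tzinfo is None:
        # Assumption: naive values (e.g. from datetime.utcnow()) are already UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")


print(to_utc_iso(datetime(2025, 1, 1, 10, 0, 0)))  # 2025-01-01T10:00:00Z
```

`datetime.utcnow()` is deprecated as of Python 3.12. `datetime.now(timezone.utc)` returns an aware value and avoids the ambiguity at the source.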

### Why `[TEST]` prefix for non-prod publishing?

When `ENVIRONMENT != "production"`, any post that gets approved and published will have `[TEST]` prepended to its caption. This lets you test the full approval → publish flow against a real Instagram account without the post looking like real content to followers.

---

## 17. Known Limitations & TODOs

| Item | Status |
|------|--------|
| Twitter/LinkedIn publishing | Wired in model, not yet active |
| AWS Rekognition NSFW moderation | Code exists, not enabled by default |
| Carousel image generation | Only generates 1 image; needs multi-image support |
| Video post type | Schema ready, generation not implemented |
| Firebase auth on frontend | Schema has `users` table; login UI not built |
| Langfuse observability | Config wired; traces not actively sent |
| Rate limiting on API | Not implemented |
| `link` post type | LLM generates content; `link_url` not auto-fetched |
| Trend sources | Reddit and manual only; RSS/Twitter not active |
| Font persistence | `/tmp` is ephemeral on Railway; fonts re-download after each container restart (cached for process lifetime) |

---

*Documentation generated 2026-05-12. For questions, open an issue on the GitHub repo or contact the maintainer.*