+---
+license: apache-2.0
+base_model: Qwen/Qwen2.5-3B-Instruct
+tags:
+  - video-editing
+  - social-media
+  - agent
+  - tool-calling
+  - sft
+  - trl
+  - viralcut
+datasets:
+  - ryu34/viralcut-agent-data
+  - benxh/tiktok-hooks-finetune
+  - NousResearch/hermes-function-calling-v1
+pipeline_tag: text-generation
+---
+# 🎬 ViralCut Agent
+**An autonomous AI agent that transforms raw video footage into professional, viral-worthy social media content.**
+ViralCut Agent is a fine-tuned [Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) model trained with QLoRA SFT on tool-calling trajectories for video editing, social media optimization, and content strategy.
+## What It Does
+| Capability | What it covers |
+|---|---|
+| 🎬 **Video Analysis** | Analyze raw footage, find best moments, detect scenes |
+| ✂️ **Professional Editing** | Trim, transitions, effects, text overlays, color grading via FFmpeg |
+| 🎵 **Audio Production** | Search & add trending royalty-free music, sound effects, audio mixing |
+| 📊 **Viral Optimization** | Score content for TikTok/Instagram/YouTube, optimize for algorithms |
+| 🔍 **Trend Research** | Search current trends, hooks, sounds via web search |
+| 🚫 **AI Slop Detection** | Filter out AI-generated junk content |
+| ✍️ **Caption Generation** | Platform-optimized captions, hashtags, posting strategy |
+## Tools
+The agent was trained to call these tools autonomously:
+```python
+# 1. FFmpeg for video processing
+ffmpeg_cmd(command="ffmpeg -y -i input.mp4 -vf 'eq=saturation=1.3' output.mp4",
+           description="Boost color saturation")
+# 2. Web search for assets and trends
+web_search(query="trending TikTok sounds food 2025", search_type="trending_content")
+web_search(query="royalty free lo-fi beat", search_type="royalty_free_music")
+# 3. Video analysis
+analyze_video(video_path="raw.mp4", analysis_type="full")
+# 4. Virality scoring
+score_virality(video_path="edit.mp4", platform="tiktok", niche="food")
+# 5. Caption generation
+generate_caption(video_description="...", platform="tiktok", tone="casual")
+# 6. AI content detection
+detect_ai_slop(content_path="broll.mp4", check_type="video")
+```
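The calls above are written as Python for readability. At inference time the model emits tool calls as text, which the host program has to parse and route to real functions. The sketch below assumes the Hermes/Qwen-style convention of a JSON object wrapped in `<tool_call>` tags; that output format, the helper names `parse_tool_calls` and `dispatch`, and the stub tool are assumptions for illustration, not the contents of `agent.py`.

```python
import json
import re

# Assumed output format: <tool_call>{"name": ..., "arguments": {...}}</tool_call>
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)


def parse_tool_calls(text):
    """Return (name, arguments) pairs for every well-formed tool call in text."""
    calls = []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip blocks that are not valid JSON
        if isinstance(payload, dict) and "name" in payload:
            calls.append((payload["name"], payload.get("arguments", {})))
    return calls


def dispatch(calls, registry):
    """Run each parsed call against a name -> callable registry."""
    results = []
    for name, args in calls:
        tool = registry.get(name)
        if tool is None:
            results.append({"tool": name, "error": "unknown tool"})
        else:
            results.append({"tool": name, "result": tool(**args)})
    return results


def score_virality(video_path, platform, niche):
    """Hypothetical stub: echoes its inputs instead of scoring anything."""
    return {"video_path": video_path, "platform": platform, "niche": niche}


sample_output = (
    "Scoring the edit now.\n"
    "<tool_call>\n"
    '{"name": "score_virality", "arguments": '
    '{"video_path": "edit.mp4", "platform": "tiktok", "niche": "food"}}\n'
    "</tool_call>"
)

calls = parse_tool_calls(sample_output)
results = dispatch(calls, {"score_virality": score_virality})
print(results)
```

Unknown tool names are reported back as errors rather than raising, so the agent loop can feed the failure to the model and let it retry.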
+## Quick Start
+### Install
+```bash
+pip install transformers torch peft bitsandbytes duckduckgo-search
+```
+### Use as Agent (with real tools)
+```bash
+# Clone the repo
+git clone https://huggingface.co/ryu34/viralcut-agent
+cd viralcut-agent
+# Edit a video
+python agent.py --video raw_footage.mp4 --platform tiktok --niche food
+# Get a content plan (no video needed)
+python agent.py --plan --niche "coffee shop" --platform tiktok
+# Check files for AI slop
+python agent.py --check-slop clip1.mp4 clip2.mp4
+# Interactive mode
+python agent.py
+```
+### Use as Model (inference only)
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+# Load the fine-tuned model and its tokenizer from the Hub.
+model = AutoModelForCausalLM.from_pretrained("ryu34/viralcut-agent", device_map="auto")
+tokenizer = AutoTokenizer.from_pretrained("ryu34/viralcut-agent")
+messages = [
+    {"role": "system", "content": "You are ViralCut Agent..."},
+    {"role": "user", "content": "Edit my beach video into a TikTok with trending music and effects"}
+]
+# Render the conversation with the chat template and open the assistant turn.
+text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+inputs = tokenizer(text, return_tensors="pt").to(model.device)
+outputs = model.generate(**inputs, max_new_tokens=1024)
+# Decode only the newly generated tokens, dropping the prompt and end-of-turn markers.
+print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
+```
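If the tools are not already described in the system prompt, one option is to pass JSON schemas through the `tools` argument of `apply_chat_template`, which the Qwen2.5 chat template accepts. The sketch below builds two such schemas. The tool and parameter names come from the Tools section; the descriptions, the all-string typing, the `required` lists, and the `make_tool_schema` helper are assumptions, and the prompt format this checkpoint was actually trained on may differ.

```python
import json


def make_tool_schema(name, description, params, required):
    """Build one function-style tool schema; every parameter is typed as a string."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {
                    key: {"type": "string", "description": desc}
                    for key, desc in params.items()
                },
                "required": list(required),
            },
        },
    }


# Descriptions and required fields are illustrative guesses.
TOOLS = [
    make_tool_schema(
        "analyze_video",
        "Analyze raw footage and report scenes and strong moments.",
        {
            "video_path": "Path to the source video.",
            "analysis_type": "Kind of analysis to run, e.g. 'full'.",
        },
        required=["video_path"],
    ),
    make_tool_schema(
        "score_virality",
        "Score an edit for a target platform and niche.",
        {
            "video_path": "Path to the edited video.",
            "platform": "Target platform, e.g. 'tiktok'.",
            "niche": "Content niche, e.g. 'food'.",
        },
        required=["video_path", "platform"],
    ),
]

print(json.dumps(TOOLS[0], indent=2))

# With the tokenizer and messages from the snippet above:
# text = tokenizer.apply_chat_template(
#     messages, tools=TOOLS, tokenize=False, add_generation_prompt=True
# )
```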
+## Training
+### Data
+Mixed dataset of ~2,800 examples:
+- **10 synthetic video editing trajectories** — multi-turn conversations showing full edit pipelines (analyze → search → edit → score → caption)
+- **~1,300 TikTok hooks/captions** — real viral content data from [benxh/tiktok-hooks-finetune](https://huggingface.co/datasets/benxh/tiktok-hooks-finetune)
+- **~1,200 general function-calling** — tool-use backbone from [NousResearch/hermes-function-calling-v1](https://huggingface.co/datasets/NousResearch/hermes-function-calling-v1)
+Full dataset: [ryu34/viralcut-agent-data](https://huggingface.co/datasets/ryu34/viralcut-agent-data)
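To make the mix concrete, the short sketch below computes each source's share from the itemized counts. The counts are the approximate figures listed above; they sum to about 2,510, somewhat below the stated ~2,800 total, and the sketch uses only the itemized numbers. The dictionary keys are labels invented for this example.

```python
# Approximate per-source example counts, as itemized in the list above.
SOURCES = {
    "synthetic_editing_trajectories": 10,
    "tiktok_hooks_captions": 1300,
    "hermes_function_calling": 1200,
}


def mix_shares(counts):
    """Return each source's fraction of the combined example count."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


itemized_total = sum(SOURCES.values())
shares = mix_shares(SOURCES)

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

On these numbers the video editing trajectories make up well under 1% of the examples, so most of the tool-calling behavior is learned from the general function-calling data.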
+### Method
+- **Base model**: Qwen/Qwen2.5-3B-Instruct
+- **Method**: QLoRA SFT (4-bit quantization, rank 16, alpha 32)
+- **Training**: 3 epochs, lr=2e-4, cosine schedule, assistant-only loss
+- **Hardware**: T4 16GB GPU (the GPU offered on the free Google Colab tier)
+- **Framework**: TRL v1.3+ SFTTrainer
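Two of the numbers above have a simple closed form. Standard LoRA scales its low-rank update by `alpha / rank`, and a cosine schedule decays the learning rate from its peak toward zero over training. The sketch below works both out in plain Python. It assumes no warmup unless `warmup_steps` is set and uses a made-up total of 100 steps; the real `train.py` may configure these differently.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainHyperparams:
    """Hyperparameters as listed above."""
    lora_rank: int = 16
    lora_alpha: int = 32
    epochs: int = 3
    peak_lr: float = 2e-4

    @property
    def lora_scale(self) -> float:
        # Standard LoRA multiplies the low-rank update by alpha / rank.
        return self.lora_alpha / self.lora_rank


def cosine_lr(step, total_steps, peak_lr, warmup_steps=0):
    """Learning rate at `step`: optional linear warmup, then cosine decay to zero."""
    if warmup_steps and step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


hp = TrainHyperparams()
total_steps = 100  # hypothetical; depends on batch size and dataset length

print("LoRA scale:", hp.lora_scale)
for step in (0, 50, 100):
    print(step, cosine_lr(step, total_steps, hp.peak_lr))
```

With rank 16 and alpha 32 the adapter update is scaled by 2, and halfway through training the learning rate has fallen to half its peak.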
+### Train It Yourself
+```bash
+# Option 1: Google Colab (free T4 GPU)
+# Open: https://huggingface.co/datasets/ryu34/viralcut-agent-data/blob/main/train_colab.ipynb
+# Option 2: Direct script
+wget https://huggingface.co/datasets/ryu34/viralcut-agent-data/resolve/main/train.py
+pip install transformers trl torch datasets accelerate peft bitsandbytes
+python train.py
+```
+## Architecture
+```
+User Request ("Edit my raw footage into a viral TikTok")
+    │
+    ▼
+┌─────────────────────────────────┐
+│   ViralCut Agent (Qwen2.5-3B)   │
+│   Fine-tuned for tool-calling   │
+│                                 │
+│   Thinks → Plans → Calls Tools  │
+└──────────┬──────────────────────┘
+           │
+    ┌──────┼──────┬──────┬────────┐
+    │      │      │      │        │
+    ▼      ▼      ▼      ▼        ▼
+ FFmpeg  Web    Video  Viral   AI Slop
+  Edit  Search  Anal.  Score  Detect
+    │      │      │      │        │
+    └──────┴──────┴──────┴────────┘
+           │
+           ▼
+    Final edited video + caption + strategy
+```
+## Example Output
+**Input:** "I have 8 minutes of raw ramen footage from Tokyo. Make a TikTok."
+**Agent actions:**
+1. 📊 `analyze_video("raw_ramen.mp4", "full")` → Found 8 scenes; best moment: noodle pull (energy 0.9)
+2. 🔍 `web_search("trending TikTok sounds food ASMR 2025")` → Lo-fi city pop trending
+3. 🎵 `web_search("royalty free lo-fi Japanese beat")` → Found "Tokyo Nights" CC BY 4.0
+4. ✂️ `ffmpeg_cmd(...)` → Extracted hook shot with color boost
+5. ✂️ `ffmpeg_cmd(...)` → Speed-ramped broth prep
+6. ✂️ `ffmpeg_cmd(...)` → Assembled with fadeblack + slideright transitions
+7. 🎵 `ffmpeg_cmd(...)` → Mixed lo-fi music at 70% volume with the ambient audio
+8. 📝 `ffmpeg_cmd(...)` → Added text hook + location overlay
+9. 📈 `score_virality(...)` → 82/100
+10. 🚫 `detect_ai_slop(...)` → Authentic ✅
+11. ✍️ `generate_caption(...)` → "This man has been making ramen by hand for 30 years"
+**Output:** 17s vertical TikTok with professional transitions, trending music, and text overlays. Score: 82/100.
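The `ffmpeg_cmd(...)` arguments are elided in the steps above. As a rough idea of what a step like the hook-shot extraction could look like, the sketch below builds an FFmpeg argument list that trims a clip, boosts saturation, and center-crops a landscape source to 9:16. The function name, file names, timestamps, and the 1080x1920 target are assumptions; the commands the agent actually generates are not shown here.

```python
import shlex


def build_hook_clip_cmd(src, dst, start_s, duration_s, saturation=1.3):
    """Build an ffmpeg command that trims, color-boosts and crops to vertical."""
    video_filters = ",".join([
        f"eq=saturation={saturation}",  # color boost
        "crop=ih*9/16:ih",              # centered 9:16 crop (assumes landscape input)
        "scale=1080:1920",              # common vertical-video resolution
    ])
    return [
        "ffmpeg", "-y",
        "-ss", str(start_s),    # seek to the start of the hook shot
        "-i", src,
        "-t", str(duration_s),  # keep only this many seconds
        "-vf", video_filters,
        "-c:a", "aac",
        dst,
    ]


cmd = build_hook_clip_cmd("raw_ramen.mp4", "hook.mp4", start_s=12.5, duration_s=3)
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems when file names contain spaces.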
+## Limitations
+- The model has 3B parameters; for complex creative decisions, larger models (7B+) would likely perform better
+- FFmpeg commands may need adjustment for specific file formats
+- Virality scoring is heuristic-based, not ML-based
+- Web search requires the `duckduckgo-search` package
+- No actual video generation — this is an *editing* agent that works with your existing footage
+## Citation
+```bibtex
+@misc{viralcut-agent-2025,
+  title={ViralCut Agent: Autonomous Video Editing for Social Media},
+  author={ryu34},
+  year={2025},
+  url={https://huggingface.co/ryu34/viralcut-agent}
+}
+```