---
license: apache-2.0
base_model: Qwen/Qwen2.5-3B-Instruct
tags:
  - video-editing
  - social-media
  - agent
  - tool-calling
  - sft
  - trl
  - viralcut
datasets:
  - ryu34/viralcut-agent-data
  - benxh/tiktok-hooks-finetune
  - NousResearch/hermes-function-calling-v1
pipeline_tag: text-generation
---

# 🎬 ViralCut Agent

**An autonomous AI agent that transforms raw video footage into professional, viral-worthy social media content.**

ViralCut Agent is a fine-tuned [Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) model trained with QLoRA SFT on tool-calling trajectories for video editing, social media optimization, and content strategy.

## What It Does

| Capability | How |
|---|---|
| 🎬 **Video Analysis** | Analyze raw footage, find best moments, detect scenes |
| ✂️ **Professional Editing** | Trim, transitions, effects, text overlays, color grading via FFmpeg |
| 🎵 **Audio Production** | Search & add trending royalty-free music, sound effects, audio mixing |
| 📊 **Viral Optimization** | Score content for TikTok/Instagram/YouTube, optimize for algorithms |
| 🔍 **Trend Research** | Search current trends, hooks, sounds via web search |
| 🚫 **AI Slop Detection** | Filter out AI-generated junk content |
| ✍️ **Caption Generation** | Platform-optimized captions, hashtags, posting strategy |

## Tools

The agent was trained to call these tools autonomously:

```python
# 1. FFmpeg for video processing
ffmpeg_cmd(command="ffmpeg -y -i input.mp4 -vf 'eq=saturation=1.3' output.mp4", 
           description="Boost color saturation")

# 2. Web search for assets and trends  
web_search(query="trending TikTok sounds food 2025", search_type="trending_content")
web_search(query="royalty free lo-fi beat", search_type="royalty_free_music")

# 3. Video analysis
analyze_video(video_path="raw.mp4", analysis_type="full")

# 4. Virality scoring
score_virality(video_path="edit.mp4", platform="tiktok", niche="food")

# 5. Caption generation
generate_caption(video_description="...", platform="tiktok", tone="casual")

# 6. AI content detection
detect_ai_slop(content_path="broll.mp4", check_type="video")
```
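Because `ffmpeg_cmd` receives a raw command string written by the model, the code that executes it has to decide how much to trust that string. The following is a minimal sketch of one cautious approach, not the repository's actual implementation: `build_ffmpeg_args` and `run_ffmpeg_cmd` are hypothetical helper names, and the returned dictionary shape is an assumption. The command is split into an argument list and run without a shell, so shell operators in the string are never interpreted.

```python
import shlex
import subprocess


def build_ffmpeg_args(command: str) -> list[str]:
    """Split a model-issued command string into an argv list (no shell involved)."""
    args = shlex.split(command)
    if not args or args[0] != "ffmpeg":
        raise ValueError("only ffmpeg commands are accepted")
    return args


def run_ffmpeg_cmd(command: str, description: str = "") -> dict:
    """Hypothetical executor for the `ffmpeg_cmd` tool; the result shape is an assumption."""
    args = build_ffmpeg_args(command)
    proc = subprocess.run(args, capture_output=True, text=True)
    return {
        "description": description,
        "returncode": proc.returncode,
        "stderr_tail": proc.stderr[-500:],  # ffmpeg writes its log to stderr
    }


# Only the parsing step is exercised here, so ffmpeg does not need to be installed.
print(build_ffmpeg_args("ffmpeg -y -i input.mp4 -vf 'eq=saturation=1.3' output.mp4"))
```

Passing the argument list directly to `subprocess.run` keeps the quoted filter expression intact as a single argument.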

## Quick Start

### Install
```bash
pip install transformers torch peft bitsandbytes duckduckgo-search
```

### Use as Agent (with real tools)
```bash
# Clone the repo
git clone https://huggingface.co/ryu34/viralcut-agent
cd viralcut-agent

# Edit a video
python agent.py --video raw_footage.mp4 --platform tiktok --niche food

# Get a content plan (no video needed)
python agent.py --plan --niche "coffee shop" --platform tiktok

# Check files for AI slop
python agent.py --check-slop clip1.mp4 clip2.mp4

# Interactive mode
python agent.py
```

### Use as Model (inference only)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("ryu34/viralcut-agent", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("ryu34/viralcut-agent")

messages = [
    {"role": "system", "content": "You are ViralCut Agent..."},
    {"role": "user", "content": "Edit my beach video into a TikTok with trending music and effects"}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:]))
```
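In inference-only mode the model returns plain text, so any tool calls have to be extracted before they can be executed. The sketch below assumes the model emits Hermes-style `<tool_call>{...}</tool_call>` JSON blocks, which is the convention used by the Qwen2.5 chat template and the Hermes function-calling data; the exact output format of this fine-tune is not documented here, so treat the pattern as an assumption. `parse_tool_calls` is a hypothetical helper name.

```python
import json
import re

# Assumed wire format: <tool_call>{"name": ..., "arguments": {...}}</tool_call>
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def parse_tool_calls(generated_text: str) -> list[dict]:
    """Return every well-formed tool call found in the generated text."""
    calls = []
    for raw in TOOL_CALL_RE.findall(generated_text):
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of crashing the caller
        if isinstance(payload, dict) and "name" in payload:
            calls.append(
                {"name": payload["name"], "arguments": payload.get("arguments", {})}
            )
    return calls


sample = (
    "I'll start by analyzing the footage.\n"
    '<tool_call>\n{"name": "analyze_video", "arguments": '
    '{"video_path": "raw.mp4", "analysis_type": "full"}}\n</tool_call>'
)
print(parse_tool_calls(sample))
```

Each returned dictionary can then be routed to the matching tool implementation by its `name`.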

## Training

### Data
Mixed dataset of ~2,800 examples:
- **10 synthetic video editing trajectories**: multi-turn conversations showing full edit pipelines (analyze → search → edit → score → caption)
- **~1,300 TikTok hooks/captions**: real viral content data from [benxh/tiktok-hooks-finetune](https://huggingface.co/datasets/benxh/tiktok-hooks-finetune)
- **~1,200 general function-calling examples**: tool-use backbone from [NousResearch/hermes-function-calling-v1](https://huggingface.co/datasets/NousResearch/hermes-function-calling-v1)

Full dataset: [ryu34/viralcut-agent-data](https://huggingface.co/datasets/ryu34/viralcut-agent-data)

### Method
- **Base model**: Qwen/Qwen2.5-3B-Instruct
- **Method**: QLoRA SFT (4-bit quantization, rank 16, alpha 32)
- **Training**: 3 epochs, lr=2e-4, cosine schedule, assistant-only loss
- **Hardware**: T4 16GB GPU (free tier compatible)
- **Framework**: TRL v1.3+ SFTTrainer
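
"Assistant-only loss" means that only tokens produced by the assistant contribute to the training loss; system, user and tool-result tokens are masked out. The sketch below illustrates the idea in plain Python and is not the training script's code: `assistant_only_labels` is a hypothetical helper, and the token ids are made up. It relies on the fact that PyTorch's cross-entropy loss ignores positions labeled `-100` by default.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def assistant_only_labels(input_ids, assistant_mask):
    """Copy input_ids into labels, masking every non-assistant position."""
    if len(input_ids) != len(assistant_mask):
        raise ValueError("input_ids and assistant_mask must have the same length")
    return [
        tok if keep else IGNORE_INDEX
        for tok, keep in zip(input_ids, assistant_mask)
    ]


# Toy sequence: 3 prompt tokens followed by 2 assistant tokens (ids are made up).
input_ids = [101, 102, 103, 201, 202]
assistant_mask = [0, 0, 0, 1, 1]
print(assistant_only_labels(input_ids, assistant_mask))
# [-100, -100, -100, 201, 202]
```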

### Train It Yourself
```bash
# Option 1: Google Colab (free T4 GPU)
# Open: https://huggingface.co/datasets/ryu34/viralcut-agent-data/blob/main/train_colab.ipynb

# Option 2: Direct script
wget https://huggingface.co/datasets/ryu34/viralcut-agent-data/resolve/main/train.py
pip install transformers trl torch datasets accelerate peft bitsandbytes
python train.py
```

## Architecture

```
User Request ("Edit my raw footage into a viral TikTok")
    │
    ▼
┌─────────────────────────────────┐
│   ViralCut Agent (Qwen2.5-3B)   │
│   Fine-tuned for tool-calling   │
│                                 │
│   Thinks → Plans → Calls Tools  │
└──────────┬──────────────────────┘
           │
    ┌──────┼──────────────────────┐
    │      │      │      │        │
    ▼      ▼      ▼      ▼        ▼
 FFmpeg  Web    Video  Viral   AI Slop
  Edit  Search  Anal.  Score  Detect
    │      │      │      │        │
    └──────┴──────┴──────┴────────┘
           │
           ▼
    Final edited video + caption + strategy
```
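
The flow in the diagram can be sketched as a simple loop: the model replies, any requested tools are executed, their results are appended to the conversation, and the loop repeats until the model stops calling tools. This is an illustrative sketch only; `run_agent`, `scripted_model`, `fake_analyze_video` and the message layout are hypothetical stand-ins, not the contents of `agent.py`.

```python
def run_agent(model_step, tools, user_request, max_turns=8):
    """Alternate between model replies and tool execution until no tool is requested."""
    messages = [{"role": "user", "content": user_request}]
    for _ in range(max_turns):
        reply = model_step(messages)
        messages.append({"role": "assistant", "content": reply.get("content", "")})
        calls = reply.get("tool_calls", [])
        if not calls:
            break  # the model produced a final answer
        for call in calls:
            fn = tools.get(call["name"])
            if fn is None:
                result = {"error": f"unknown tool: {call['name']}"}
            else:
                result = fn(**call["arguments"])
            messages.append({"role": "tool", "name": call["name"], "content": result})
    return messages


def scripted_model(messages):
    """Stand-in for the fine-tuned model: request one tool call, then finish."""
    if not any(m["role"] == "tool" for m in messages):
        return {
            "content": "Analyzing the footage first.",
            "tool_calls": [{
                "name": "analyze_video",
                "arguments": {"video_path": "raw.mp4", "analysis_type": "full"},
            }],
        }
    return {"content": "Edit plan ready.", "tool_calls": []}


def fake_analyze_video(video_path, analysis_type):
    """Canned result; a real tool would inspect the file."""
    return {"video_path": video_path, "analysis_type": analysis_type}


history = run_agent(
    scripted_model,
    {"analyze_video": fake_analyze_video},
    "Edit my raw footage into a viral TikTok",
)
print([m["role"] for m in history])
# ['user', 'assistant', 'tool', 'assistant']
```

The `max_turns` cap is a design choice to keep a confused model from looping indefinitely.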

## Example Output

**Input:** "I have 8 minutes of raw ramen footage from Tokyo. Make a TikTok."

**Agent actions:**
1. 📊 `analyze_video(raw_ramen.mp4, "full")` → Found 8 scenes, best: noodle pull at 0.9 energy
2. 🔍 `web_search("trending TikTok sounds food ASMR 2025")` → Lo-fi city pop trending
3. 🎵 `web_search("royalty free lo-fi Japanese beat")` → Found "Tokyo Nights" CC BY 4.0
4. ✂️ `ffmpeg_cmd(...)` → Extracted hook shot with color boost
5. ✂️ `ffmpeg_cmd(...)` → Speed-ramped broth prep
6. ✂️ `ffmpeg_cmd(...)` → Assembled with fadeblack + slideright transitions
7. 🎵 `ffmpeg_cmd(...)` → Mixed lo-fi music at 70% with ambient
8. 📝 `ffmpeg_cmd(...)` → Added text hook + location overlay
9. 📈 `score_virality(...)` → 82/100
10. 🚫 `detect_ai_slop(...)` → Authentic ✅
11. ✍️ `generate_caption(...)` → "This man has been making ramen by hand for 30 years"

**Output:** 17s vertical TikTok with professional transitions, trending music, text overlays. Score: 82/100.
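
The assembly step with `fadeblack` and `slideright` transitions corresponds to FFmpeg's `xfade` filter, where each transition needs an `offset` measured on the already-joined timeline. The sketch below shows how such a filter graph could be built; the actual commands the agent emits are elided above and are not reproduced here. `xfade_chain`, the clip durations and the 0.5 s fade length are all hypothetical.

```python
def xfade_chain(durations, transitions, fade=0.5):
    """Build an xfade filter graph joining clips 0..n-1 with the given transitions.

    Returns (filter_graph, final_video_label, total_duration_seconds).
    """
    if len(transitions) != len(durations) - 1:
        raise ValueError("need exactly one transition between each pair of clips")
    parts, prev, offset = [], "[0:v]", 0.0
    for i, name in enumerate(transitions):
        # Each transition starts `fade` seconds before the joined timeline ends.
        offset += durations[i] - fade
        out = f"[v{i + 1}]"
        parts.append(
            f"{prev}[{i + 1}:v]xfade=transition={name}"
            f":duration={fade}:offset={offset:.2f}{out}"
        )
        prev = out
    total = sum(durations) - fade * len(transitions)
    return ";".join(parts), prev, total


# Hypothetical clip lengths in seconds, chosen only for illustration.
graph, label, total = xfade_chain([3.0, 7.0, 8.0], ["fadeblack", "slideright"])
print(graph)
print(label, total)
```

Each overlap shortens the result by the fade length, so three clips totalling 18 s with two 0.5 s transitions yield a 17 s video. The returned graph is meant for `-filter_complex`, with the final label passed to `-map`.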

## Limitations

- The model has only 3B parameters; larger models (7B+) would likely handle complex creative decisions better
- FFmpeg commands may need adjustment for specific file formats
- Virality scoring is heuristic-based, not ML-based
- Web search requires the `duckduckgo-search` package
- No video generation: this is an *editing* agent that works with your existing footage

## Citation

```bibtex
@misc{viralcut-agent-2025,
  title={ViralCut Agent: Autonomous Video Editing for Social Media},
  author={ryu34},
  year={2025},
  url={https://huggingface.co/ryu34/viralcut-agent}
}
```