| --- |
| title: GearCut |
| emoji: ✂️ |
| colorFrom: gray |
| colorTo: red |
| sdk: gradio |
| sdk_version: "5.29.1" |
| app_file: app.py |
| pinned: true |
| license: other |
| license_name: CKL |
| license_link: https://ameforge.tech |
| --- |
| |
| <div align="center"> |
|
|
| <img src="https://files.manuscdn.com/user_upload_by_module/session_file/310519663479543323/yIhmCPspKLtzwUJf.png" alt="GearCut Interface" width="800"/> |
|
|
| # ✂️ GearCut |
|
|
| **Natural Language Video Editor — Edit videos by describing what you want.** |
|
|
|
|
| [**Try the Demo →**](https://huggingface.co/spaces/AMFORGE/GearCut) · [**AMFORGE Organization**](https://huggingface.co/AMFORGE) · [**Website**](https://ameforge.tech) · [**YouTube**](https://www.youtube.com/@ameforge1) |
|
|
| </div> |
|
|
| --- |
|
|
| ## What is GearCut? |
|
|
| GearCut is a **natural language video editing engine** developed by [AMFORGE](https://huggingface.co/AMFORGE). Instead of learning complex video editing software, you simply describe your edit in plain English — and GearCut's model translates your instruction into a structured list of editing operations that the project compiler then executes. |
|
|
| The core model (`gc_editor`) is built on AMFORGE's in-house **SparseMind** architecture — sparse attention, sparse FFN, dynamic neuron typing, and episodic memory. It contains **28,759,300 parameters (~28.8M)** with a specialized vocabulary of **682 tokens** designed exclusively for video editing semantics. It understands temporal references, clip identifiers, and export configurations, then generates a structured operation plan with frame-accurate precision. |
|
|
| > **"remove the first 3 seconds"** → `[{"op":"trim","clip":"c1","cut":"start","seconds":3.0}]`. Done. |
|
|
| --- |
|
|
| ## Key Features |
|
|
| | Feature | Description | |
| |---|---| |
| | **Natural Language Input** | Describe edits in plain English — no syntax to memorize | |
| | **Trim Operations** | Remove from start, end, or both simultaneously | |
| | **Range Extraction** | Keep only a specific time window (e.g., seconds 120 to 600) | |
| | **Multi-clip Combination** | Concatenate multiple video files in any order | |
| | **Mute / Audio Control** | Strip or manipulate audio tracks via text instruction | |
| | **Web Interface** | Full Gradio UI for browser-based editing without CLI | |
| | **CLI Mode** | Scriptable command-line interface for automation | |
| | **Export Presets** | Named presets such as `youtube_1080p` for consistent output settings | |
|
|
| --- |
|
|
| ## Quick Start |
|
|
| ### Web Interface (Recommended) |
|
|
| The easiest way to use GearCut is through the live Gradio demo on this Space. Upload your video, type your instruction, and download the result. |
|
|
| <img src="https://files.manuscdn.com/user_upload_by_module/session_file/310519663479543323/NbmbzSLHdbkATMKE.png" alt="GearCut Web Interface Demo" width="750"/> |
|
|
| ### Local Installation |
|
|
| **Requirements:** Python 3.10+, ffmpeg installed and accessible in PATH. |
|
|
| ```bash |
| # Clone the repository |
| git clone https://github.com/Volgat/gearcut |
| cd gearcut |
| |
| # Install dependencies |
| pip install -r requirements.txt |
| |
| # Launch the web interface |
| python gearcut_app.py --ui |
| |
| # Or use the CLI directly |
| python gearcut_app.py "your instruction here" video.mp4 |
| ``` |
|
|
| **Project structure after setup:** |
|
|
| <img src="https://files.manuscdn.com/user_upload_by_module/session_file/310519663479543323/UXPTqakKspKwLpSJ.png" alt="GearCut Project Structure" width="650"/> |
|
|
| --- |
|
|
| ## CLI Usage |
|
|
| The command-line interface follows a simple pattern: |
|
|
| ```bash |
| python gearcut_app.py "<instruction>" <video1.mp4> [video2.mp4 ...] |
| ``` |
|
|
| GearCut loads the model, parses your instruction, builds a timeline, and renders the output — all in a single command. |
|
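For automation, the same pattern can be driven from a script. The sketch below only assembles the argument list shown above and passes it to `subprocess.run`. The helper names `build_command` and `run_gearcut` are illustrative and not part of GearCut, and the exit-code behavior is an assumption.

```python
import subprocess
import sys


def build_command(instruction, videos, script="gearcut_app.py"):
    """Assemble the documented CLI call: script, instruction, then video paths."""
    if not videos:
        raise ValueError("at least one video file is required")
    return [sys.executable, script, instruction, *videos]


def run_gearcut(instruction, videos):
    """Run one edit and raise CalledProcessError if the CLI exits non-zero."""
    # Assumption: gearcut_app.py returns a non-zero exit code on failure.
    return subprocess.run(build_command(instruction, videos), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works anywhere.
    print(build_command("remove the first 3 seconds", ["test2.mp4"]))
```

Passing the arguments as a list avoids shell quoting problems when an instruction contains spaces or quotes.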
|
| ### Example 1 — Remove the beginning |
|
|
| ```bash |
| python gearcut_app.py "remove the first 3 seconds and export as output.mp4" test2.mp4 |
| ``` |
|
|
| **Output:** |
| ``` |
| GearCut Editor v1-editor by AMFORGE: 9,721,219 params |
| gc_editor loaded (9,721,219 params, vocab=682) |
| timeline given to the model: clips: c1=test2.mp4(0.0-109.9) |
| operations: [{"op": "trim", "clip": "c1", "cut": "start", "seconds": 3.0}, |
| {"op": "export", "path": "output.mp4", "preset": "youtube_1080p"}] |
| timeline: |
| c1: test2.mp4 [3.0 -> 109.9] (106.9s) |
| rendering -> output.mp4 |
| done: output.mp4 (106.9s) |
| ``` |
|
|
| <img src="https://files.manuscdn.com/user_upload_by_module/session_file/310519663479543323/tuOlkVtqKllpBvtS.png" alt="CLI Trim Demo" width="750"/> |
|
|
| ### Example 2 — Extract a specific range |
|
|
| ```bash |
| python gearcut_app.py "keep only 120 to 600 of clip 1 and export as final.mp4" lab2.mp4 |
| ``` |
|
|
| **Output:** |
| ``` |
| GearCut Editor v1-editor by AMFORGE: 9,721,219 params |
| gc_editor loaded (9,721,219 params, vocab=682) |
| timeline given to the model: clips: c1=lab2.mp4(0.0-848.5) |
| operations: [{"op": "trim", "clip": "c1", "cut": "range", "start": 120, "end": 600}, |
| {"op": "export", "path": "final.mp4", "preset": "youtube_1080p"}] |
| timeline: |
| c1: lab2.mp4 [120.0 -> 600.0] (480.0s) |
| rendering -> final.mp4 |
| done: final.mp4 (480.0s) |
| ``` |
|
|
| <img src="https://files.manuscdn.com/user_upload_by_module/session_file/310519663479543323/hmVNppDceAPJHWSC.png" alt="CLI Range Demo" width="750"/> |
|
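The way a trim operation changes the timeline can be sketched in a few lines. This is not the project compiler, which is private. It is a minimal illustration that uses the operation fields visible in the logs above. The `"end"` cut type and the choice to interpret `range` against the source clip are assumptions.

```python
def apply_trim(span, op):
    """Return the (start, end) span in seconds left after one trim operation."""
    start, end = span
    cut = op["cut"]
    if cut == "start":        # drop N seconds from the beginning
        start += op["seconds"]
    elif cut == "end":        # assumed counterpart; not shown in the logs above
        end -= op["seconds"]
    elif cut == "range":      # keep only [start, end] of the source clip
        start, end = float(op["start"]), float(op["end"])
    else:
        raise ValueError(f"unknown cut type: {cut!r}")
    if not start < end:
        raise ValueError("trim leaves an empty clip")
    return start, end


def apply_operations(clips, operations):
    """clips maps clip id -> (start, end); non-trim operations are ignored."""
    timeline = dict(clips)
    for op in operations:
        if op["op"] == "trim":
            timeline[op["clip"]] = apply_trim(timeline[op["clip"]], op)
    return timeline


ops = [{"op": "trim", "clip": "c1", "cut": "range", "start": 120, "end": 600},
       {"op": "export", "path": "final.mp4", "preset": "youtube_1080p"}]
start, end = apply_operations({"c1": (0.0, 848.5)}, ops)["c1"]
print(f"c1: [{start} -> {end}] ({end - start}s)")
# c1: [120.0 -> 600.0] (480.0s)
```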
|
| ### Example 3 — Combine multiple clips |
|
|
| ```bash |
| python gearcut_app.py "put final1.mov first then export" clip1.mp4 clip2.mp4 |
| ``` |
|
|
| **Output:** |
| ``` |
| GearCut Editor v1-editor by AMFORGE: 9,721,219 params |
| gc_editor loaded (9,721,219 params, vocab=682) |
| operations: [{"op": "export", "path": "final1.mov", "preset": "youtube_1080p"}] |
| timeline: |
| c1: clip1.mp4 [0.0 -> 15.015] (15.015s) |
| c2: clip2.mp4 [0.0 -> 15.015] (15.015s) |
| rendering -> final1.mov |
| done: final1.mov (30.033s) |
| ``` |
|
|
| <img src="https://files.manuscdn.com/user_upload_by_module/session_file/310519663479543323/kFgMBULZgmEoqFot.png" alt="CLI Combine Demo" width="750"/> |
|
|
| --- |
|
|
| ## Web Interface Walkthrough |
|
|
| The Gradio interface provides the same capabilities as the CLI in a browser-friendly format. Upload one or more video files, enter your editing instruction in natural language, and click **Submit**. The processed video appears in the output panel for preview and download. |
|
|
| <img src="https://files.manuscdn.com/user_upload_by_module/session_file/310519663479543323/iXHxrnwfpRSOmCDB.png" alt="GearCut Gradio Editing" width="750"/> |
|
|
| The interface supports side-by-side comparison of the original and edited video, making it easy to verify that the edit matches your intent before downloading. |
|
|
| <img src="https://files.manuscdn.com/user_upload_by_module/session_file/310519663479543323/uAbLLWebiqYpeCxw.png" alt="GearCut Result Comparison" width="750"/> |
|
|
| --- |
|
|
| ## Supported Instructions |
|
|
| GearCut recognizes the instruction patterns below. Because the model was trained on video editing instructions, it tolerates variations in phrasing. |
|
|
| | Intent | Example Instruction | |
| |---|---| |
| | Remove start | `"remove the first 5 seconds"` | |
| | Remove end | `"cut the last 10 seconds and export"` | |
| | Keep range | `"keep only 30 to 120 seconds"` | |
| | Extract segment | `"keep only 2 to 9 of clip 1 and export as nature.mp4"` | |
| | Combine clips | `"put clip1 first then clip2 and export"` | |
| | Reorder | `"put final1.mov first then export"` | |
| | Named export | `"export as output.mp4"` / `"save as final.mov"` | |
|
|
| The grounding module validates parsed operations against the actual clip durations and corrects common misreads (e.g., filename mismatches, out-of-range timestamps) before rendering. |
|
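The kind of check the grounding step performs can be illustrated as follows. The real module (`gearcut_ground.py`) is not public, so this is a guess at the behavior described above: out-of-range timestamps are clamped to the clip duration, and operations that cannot be repaired are rejected. The function name `ground` and the exact rules are hypothetical.

```python
def ground(op, durations):
    """Return a corrected copy of a trim op, or raise ValueError if unfixable.

    durations maps clip id -> clip length in seconds.
    """
    if op.get("op") != "trim":
        return dict(op)  # only trim operations are checked in this sketch
    clip = op.get("clip")
    if clip not in durations:
        raise ValueError(f"unknown clip id: {clip!r}")
    length = durations[clip]
    fixed = dict(op)
    if op["cut"] == "range":
        # Clamp both ends into [0, length].
        fixed["start"] = min(max(float(op["start"]), 0.0), length)
        fixed["end"] = min(max(float(op["end"]), 0.0), length)
        if fixed["start"] >= fixed["end"]:
            raise ValueError("range is empty after clamping")
    elif not 0.0 <= op["seconds"] < length:
        # A start/end cut must leave some footage behind.
        raise ValueError("trim is longer than the clip")
    return fixed


op = {"op": "trim", "clip": "c1", "cut": "range", "start": 120, "end": 900}
print(ground(op, {"c1": 848.5})["end"])
# 848.5
```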
|
| --- |
|
|
| ## Architecture |
|
|
| GearCut is built around a lightweight transformer, `gc_editor`, trained end-to-end on video editing instruction pairs. The Model Details table below lists it as a decoder-only SparseMind model. |
|
|
| ``` |
| Input: natural language instruction + clip metadata |
| ↓ |
| [Tokenizer — vocab 682] |
| ↓ |
| [gc_editor transformer — 9.7M params] |
| ↓ |
| [Grounding module — validates against clip durations] |
| ↓ |
| Structured operation plan (JSON) |
| ↓ |
| [ffmpeg renderer — frame-accurate output] |
| ↓ |
| Output: edited video file |
| ``` |
|
|
| The tokenizer uses a custom vocabulary (`gearcut_tok.vocab`) optimized for temporal expressions, clip references, and export directives. The grounding module acts as a post-processing safety layer that rejects or corrects operations that would produce invalid results (e.g., trim beyond clip duration). |
|
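The last stage of the pipeline hands a grounded timeline to ffmpeg. GearCut's renderer is not public, so the sketch below only shows one plausible way a trimmed span could map to standard ffmpeg options (`-ss` for the in-point, `-t` for the duration). The function name `ffmpeg_trim_args` is hypothetical, and export presets are not modeled.

```python
def ffmpeg_trim_args(src, start, end, dst):
    """Build an ffmpeg command that keeps [start, end] seconds of src."""
    if not 0 <= start < end:
        raise ValueError("need 0 <= start < end")
    return [
        "ffmpeg", "-y",              # overwrite dst without prompting
        "-ss", f"{start:.3f}",       # seek to the in-point (input option)
        "-t", f"{end - start:.3f}",  # keep this many seconds
        "-i", src,
        dst,
    ]


print(" ".join(ffmpeg_trim_args("test2.mp4", 3.0, 109.9, "output.mp4")))
# ffmpeg -y -ss 3.000 -t 106.900 -i test2.mp4 output.mp4
```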
|
| <img src="https://files.manuscdn.com/user_upload_by_module/session_file/310519663479543323/rcFAvaGqNPpySyHF.png" alt="GearCut Architecture Overview" width="750"/> |
|
|
| --- |
|
|
| ## Model Details |
|
|
| | Property | Value | |
| |---|---| |
| | **Architecture** | SparseMind (decoder-only, sparse) | |
| | **Parameters** | 28,759,300 (~28.8M) | |
| | **Hidden size / Layers** | 384 / 8 | |
| | **Context length** | 256 tokens | |
| | **Vocabulary size** | 682 tokens | |
| | **Tokenizer** | GearCut SentencePiece-BPE (`gearcut_tok.vocab` + `gearcut_tok.model`) | |
| | **Precision** | fp32 | |
| | **Model file** | `gc_editor.pt` | |
| | **Version** | v1-editor | |
| | **Developed by** | AMFORGE | |
|
|
| The core modules (`gearcut_compiler.py`, `gearcut_model.py`, `gearcut_infer.py`, `gearcut_ground.py`) are proprietary and maintained in a private repository. The public interface (`gearcut_app.py`, `gearcut_ui.py`) downloads the compiled modules at runtime. |
|
|
| --- |
|
|
| ## Evaluation |
|
|
| Scores are measured on a held-out synthetic validation split. The metrics that matter are not perplexity but whether the generated operations are directly usable: |
|
|
| | Metric | Score | |
| |---|---| |
| | **Valid JSON** | 100.0% | |
| | **Exact match** (operations == reference) | 76.5% | |
| | **Best exact match during training** | 88.0% | |
|
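Both metrics are simple to compute. The evaluation script is not published, so the sketch below is an assumption about how they could be measured: a prediction counts as valid if it parses as JSON, and as an exact match if the parsed operations equal the reference list. The function name `score` is illustrative.

```python
import json


def score(predictions, references):
    """Return (valid_json_rate, exact_match_rate) over paired examples.

    predictions are raw model output strings; references are parsed op lists.
    """
    if not predictions or len(predictions) != len(references):
        raise ValueError("need equally sized, non-empty lists")
    valid = exact = 0
    for raw, ref in zip(predictions, references):
        try:
            ops = json.loads(raw)
        except json.JSONDecodeError:
            continue  # invalid JSON can never be an exact match
        valid += 1
        exact += ops == ref
    n = len(predictions)
    return valid / n, exact / n
```

Comparing parsed structures rather than raw strings means differences in whitespace or key order do not count as mismatches.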
|
| --- |
|
|
| ## Requirements |
|
|
| ``` |
| gradio>=5.0 |
| torch>=2.0 |
| sentencepiece |
| ffmpeg-python |
| ``` |
|
|
| ffmpeg must be installed separately and available in your system PATH. On Windows, place the ffmpeg binary in the `gearcut/ffmpeg/` directory. |
|
|
| --- |
|
|
| ## License |
|
|
| GearCut is released under the **CKL License**. See [ameforge.tech](https://ameforge.tech) for full terms. The model weights and core inference modules are proprietary assets of AMFORGE. |
|
|
| --- |
|
|
| ## About AMFORGE |
|
|
| [AMFORGE](https://huggingface.co/AMFORGE) is an independent AI research studio focused on building efficient, practical AI systems. GearCut is part of a broader research direction exploring natural language interfaces for creative tools. |
|
|
| - **Website:** [ameforge.tech](https://ameforge.tech) |
| - **YouTube:** [youtube.com/@ameforge1](https://www.youtube.com/@ameforge1) |
| - **GitHub:** [github.com/Volgat](https://github.com/Volgat) |
| - **Contact:** contact@ameforge.tech |
|
|
| --- |
|
|
| <div align="center"> |
| <sub>Built with ❤️ by AMFORGE · <a href="https://ameforge.tech">ameforge.tech</a></sub> |
| </div> |
|
|