---
license: apache-2.0
language:
- en
tags:
- video-editing
- instruction-following
- structured-generation
- text-to-json
- ffmpeg
- gearcut
- sparse-transformer
pipeline_tag: text-generation
inference: false
---

# GearCut Editor (gc_editor)

**gc_editor** is a compact instruction-to-operations model that powers
GearCut, an ultra-lightweight, FFmpeg-based
video editor. It translates a plain-English editing instruction into a list of
structured editing **operations** (JSON) that GearCut's `project -> ffmpeg`
compiler then executes. It is designed to run **locally, on CPU**, so the editor
needs no cloud service and no user video ever leaves the machine.

Developed by **AMEFORGE**. Built on the in-house **SparseMind** architecture
(sparse attention + sparse FFN, dynamic neuron typing, and episodic memory).

## What it does

- **Input:** the current timeline state + a natural-language instruction.
- **Output:** a JSON array of editing operations.

```text
INPUT
clips: c1=intro.mp4(0.0-8.0) | remove the first 3 seconds of the clip =>

OUTPUT
[{"op":"trim","clip":"c1","in":3.0,"out":8.0}]
```
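
The prompt in the example above can be assembled programmatically. The helper below is a minimal sketch, not part of GearCut: `build_prompt` and the `Clip` tuple are hypothetical names, and both the one-decimal timestamp formatting and the space used to join several clips are assumptions, since this card only shows a single-clip prompt.

```python
from typing import Iterable, Tuple

# (clip id, file name, start seconds, end seconds)
Clip = Tuple[str, str, float, float]


def build_prompt(clips: Iterable[Clip], instruction: str) -> str:
    """Serialize the timeline state and the instruction into one prompt string."""
    parts = [f"{cid}={name}({start:.1f}-{end:.1f})" for cid, name, start, end in clips]
    # Assumption: multiple clips are separated by a single space; only the
    # single-clip layout is documented in the model card.
    return f"clips: {' '.join(parts)} | {instruction.strip()} =>"


prompt = build_prompt(
    [("c1", "intro.mp4", 0.0, 8.0)],
    "remove the first 3 seconds of the clip",
)
print(prompt)
```

For the single clip shown, this reproduces the `INPUT` line of the example exactly.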

Supported operations (v1): `trim`, `split`, `import`, `append`, `delete`,
`reorder`, `export`.
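
Because the output is plain JSON with a closed set of operation names, it is cheap to check before handing it to the compiler. The following is a sketch of such a check; `parse_operations` and `SUPPORTED_OPS` are illustrative names, and the check only looks at the `op` field because the full schema of each operation is not documented here.

```python
import json

# The v1 operation names listed in this model card.
SUPPORTED_OPS = {"trim", "split", "import", "append", "delete", "reorder", "export"}


def parse_operations(text: str) -> list:
    """Parse model output and reject anything that is not a list of v1 operations."""
    ops = json.loads(text)  # raises json.JSONDecodeError (a ValueError) on invalid JSON
    if not isinstance(ops, list):
        raise ValueError("expected a JSON array of operations")
    for op in ops:
        if not isinstance(op, dict) or op.get("op") not in SUPPORTED_OPS:
            raise ValueError(f"unsupported or malformed operation: {op!r}")
    return ops


ops = parse_operations('[{"op":"trim","clip":"c1","in":3.0,"out":8.0}]')
print(ops[0]["op"], ops[0]["in"], ops[0]["out"])
```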

## Model details

| Property | Value |
|---|---|
| Architecture | SparseMind (decoder-only, sparse) |
| Parameters | 8,132,608 (~8.1M) |
| Hidden size / layers | 256 / 6 |
| Context length | 256 tokens |
| Tokenizer | GearCut dedicated SentencePiece-BPE, vocab 682 |
| Precision | fp32 |

## Evaluation

Measured on a held-out synthetic validation split, with the model's "use diversity"
setting set to false. Perplexity is not the meaningful metric here; what matters
is whether the generated operations are usable:

| Metric | Score |
|---|---|
| Valid JSON | 100.0% |
| Exact match (operations == reference) | 88.8% |
| Best exact match during training | 87.5% |
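
The two headline metrics can be computed along these lines. This is a sketch rather than the evaluation script used for the numbers above: `score` is a hypothetical helper, and it assumes that exact match compares the parsed JSON structures (so key order and whitespace do not matter) instead of raw strings. The sample data, including the `preset` field, is made up for illustration.

```python
import json


def score(predictions, references):
    """Return the valid-JSON rate and exact-match rate as fractions in [0, 1].

    predictions: list of raw model output strings.
    references:  list of reference operation lists (already parsed), non-empty.
    """
    valid = exact = 0
    for pred_text, ref_ops in zip(predictions, references):
        try:
            pred_ops = json.loads(pred_text)
        except json.JSONDecodeError:
            continue  # invalid JSON counts against both metrics
        valid += 1
        exact += pred_ops == ref_ops  # structural comparison of parsed JSON
    n = len(references)
    return {"valid_json": valid / n, "exact_match": exact / n}


# Made-up sample: one exact match, one truncated output, one wrong operation.
predictions = [
    '[{"op":"delete","clip":"c2"}]',
    '[{"op":"trim"',
    '[{"op":"export"}]',
]
references = [
    [{"op": "delete", "clip": "c2"}],
    [{"op": "trim", "clip": "c1", "in": 3.0, "out": 8.0}],
    [{"op": "export", "preset": "720p"}],
]
print(score(predictions, references))
```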

## Training data

Trained for 3,000 steps on **60,000** synthetically generated
`(timeline + instruction -> operations)` examples. The generator covers the v1
operation set with varied phrasings, clip references, file names, timestamps, and presets.

## Intended use & scope

Intended as the natural-language command layer inside the GearCut editor. It is
**not** a general-purpose assistant and only emits GearCut operations.
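
To make the hand-off from the command layer to FFmpeg concrete, here is one way a `trim` operation could be turned into an `ffmpeg` argument list. This is an illustrative sketch, not GearCut's actual compiler: `trim_to_ffmpeg`, the `sources` mapping, and the output file name are hypothetical, and it assumes `in` and `out` are seconds measured in the source file.

```python
import shlex


def trim_to_ffmpeg(op: dict, sources: dict, out_path: str) -> list:
    """Build an ffmpeg argument list that keeps the [in, out] range of a clip."""
    if op.get("op") != "trim":
        raise ValueError("only trim operations are handled in this sketch")
    src = sources[op["clip"]]  # map the clip id (e.g. "c1") to its source file
    # -ss/-to select the time range; "-c copy" avoids re-encoding, at the cost
    # of cutting on keyframes rather than exact frames.
    return ["ffmpeg", "-i", src, "-ss", str(op["in"]), "-to", str(op["out"]),
            "-c", "copy", out_path]


cmd = trim_to_ffmpeg(
    {"op": "trim", "clip": "c1", "in": 3.0, "out": 8.0},
    {"c1": "intro.mp4"},
    "c1_trimmed.mp4",
)
print(shlex.join(cmd))  # printable form; pass `cmd` itself to subprocess.run
```

Building an argument list instead of a shell string avoids quoting problems with file names that contain spaces.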

## Limitations

- **Synthetic training data.** The model is strongest on phrasings close to the
  generator's templates. Unusual real-world wording may be handled less reliably
  until the data is expanded with real examples.
- **English only (v1).** A bilingual (EN/FR) version is planned.
- **Narrow operation set (v1).** Transitions, multi-track, and effects are not
  yet covered.
- **Custom architecture.** The HF inference widget is disabled; load and run the
  model with the snippet below.

## How to use

```python
# Download gc_editor.pt and the GearCut tokenizer from this repo, then rebuild the
# SparseMind model from the config stored in the checkpoint and load the weights.
import json

import sentencepiece as spm
import torch

ckpt = torch.load("gc_editor.pt", map_location="cpu")
cfg = ckpt["config"]  # the exact training config

# SparseMind and Config are the model classes; they are not defined in this snippet.
# model = SparseMind(Config(**cfg)); model.load_state_dict(ckpt["model"]); model.eval()

sp = spm.SentencePieceProcessor()
sp.Load("gearcut_tok.model")

prompt = "clips: c1=intro.mp4(0.0-8.0) | remove the first 3 seconds of the clip =>"
ids = sp.EncodeAsIds(prompt)
# Generate from ids, stop at EOS, decode, then json.loads the text to get the operations.
```
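
The "generate ; stop at EOS" step could look like the loop below. It is a sketch under assumptions: greedy (argmax) decoding is assumed because the card does not state the decoding strategy, and `next_logits` is a hypothetical callable that wraps the model and returns next-token logits for a list of token ids. The default `max_len` of 256 mirrors the context length in the table above. A toy stand-in replaces the real model so the example runs on its own.

```python
from typing import Callable, List, Sequence


def greedy_decode(next_logits: Callable[[List[int]], Sequence[float]],
                  prompt_ids: List[int], eos_id: int, max_len: int = 256) -> List[int]:
    """Extend prompt_ids greedily until EOS or the context limit; return new tokens only."""
    ids = list(prompt_ids)
    out: List[int] = []
    while len(ids) < max_len:
        logits = next_logits(ids)
        tok = max(range(len(logits)), key=logits.__getitem__)  # argmax over the vocab
        if tok == eos_id:
            break
        ids.append(tok)
        out.append(tok)
    return out


# Toy stand-in for the model: always emits tokens 5, 7, then EOS (id 2).
SCRIPT = [5, 7, 2]
PROMPT_IDS = [1, 3]


def toy_next_logits(ids: List[int]) -> List[float]:
    logits = [0.0] * 8
    logits[SCRIPT[len(ids) - len(PROMPT_IDS)]] = 1.0
    return logits


print(greedy_decode(toy_next_logits, PROMPT_IDS, eos_id=2))
```

With the real model, `next_logits` would run a forward pass and return the logits at the last position, `eos_id` would come from `sp.eos_id()`, and the returned ids would be turned back into text with `sp.DecodeIds(...)` before `json.loads`. The model's forward signature is not documented in this card, so that wrapper is left to the reader.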

## Citation

```bibtex
@misc{gearcut_editor,
  title  = {GearCut Editor: an instruction-to-operations model for lightweight video editing},
  author = {AMEFORGE},
  year   = {2026},
  note   = {Built on the SparseMind architecture}
}
```