---
license: apache-2.0
language:
- en
tags:
- video-editing
- instruction-following
- structured-generation
- text-to-json
- ffmpeg
- gearcut
- sparse-transformer
pipeline_tag: text-generation
inference: false
---

# GearCut Editor (gc_editor)

**gc_editor** is a compact instruction-to-operations model that powers GearCut, an ultra-lightweight, FFmpeg-based video editor. It translates a plain-English editing instruction into a list of structured editing **operations** (JSON) that GearCut's `project -> ffmpeg` compiler then executes.

It is designed to run **locally, on CPU**, so the editor needs no cloud service and no user video ever leaves the machine.

Developed by **AMEFORGE**. Built on the in-house **SparseMind** architecture (sparse attention + sparse FFN, dynamic neuron typing, and episodic memory).

## What it does

- **Input:** the current timeline state + a natural-language instruction.
- **Output:** a JSON array of editing operations.

```text
INPUT   clips: c1=intro.mp4(0.0-8.0) | remove the first 3 seconds of the clip =>
OUTPUT  [{"op":"trim","clip":"c1","in":3.0,"out":8.0}]
```

Supported operations (v1): `trim`, `split`, `import`, `append`, `delete`, `reorder`, `export`.

## Model details

| Property | Value |
|---|---|
| Architecture | SparseMind (decoder-only, sparse) |
| Parameters | 8,132,608 (~8.1M) |
| Hidden size / layers | 256 / 6 |
| Context length | 256 tokens |
| Tokenizer | GearCut dedicated SentencePiece-BPE, vocab 682 |
| Precision | fp32 |

## Evaluation

The model was configured with use diversity = false. Scores are measured on a held-out synthetic validation split. Perplexity is not the meaningful metric here; what matters is whether the generated operations are usable:

| Metric | Score |
|---|---|
| Valid JSON | 100.0% |
| Exact match (operations == reference) | 88.8% |
| Best exact match during training | 87.5% |

## Training data

Trained on **60,000** synthetically generated `(timeline + instruction -> operations)` examples for 3,000 steps.
The generator covers the v1 operation set with varied phrasings, clip references, file names, timestamps, and presets.

## Intended use & scope

Intended as the natural-language command layer inside the GearCut editor. It is **not** a general-purpose assistant and only emits GearCut operations.

## Limitations

- **Synthetic training data.** The model is strongest on phrasings close to the generator's templates. Unusual real-world wording may be handled less reliably until the data is expanded with real examples.
- **English only (v1).** A bilingual (EN/FR) version is planned.
- **Narrow operation set (v1).** Transitions, multi-track, and effects are not yet covered.
- **Custom architecture.** The HF inference widget is disabled; load and run the model with the snippet below.

## How to use

```python
# Download gc_editor.pt and the GearCut tokenizer from this repo, then rebuild the
# SparseMind model from the config stored in the checkpoint and load the weights.
import sentencepiece as spm
import torch

ckpt = torch.load("gc_editor.pt", map_location="cpu")
cfg = ckpt["config"]  # the exact training config

# These steps need the SparseMind and Config classes to be available, so they stay commented:
# model = SparseMind(Config(**cfg))
# model.load_state_dict(ckpt["model"])
# model.eval()

sp = spm.SentencePieceProcessor()
sp.Load("gearcut_tok.model")

prompt = "clips: c1=intro.mp4(0.0-8.0) | remove the first 3 seconds of the clip =>"
ids = sp.EncodeAsIds(prompt)
# Generate from `ids`, stop at EOS, decode, then json.loads() the output.
```

## Citation

```bibtex
@misc{gearcut_editor,
  title  = {GearCut Editor: an instruction-to-operations model for lightweight video editing},
  author = {AMEFORGE},
  year   = {2026},
  note   = {Built on the SparseMind architecture}
}
```
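
## Example: building a prompt and checking the output

Since exact match on the validation split is below 100%, a caller may want to check generated operations before handing them to the compiler. The following is a minimal sketch, not part of GearCut itself: `build_prompt` and `parse_operations` are hypothetical helper names, the prompt format is copied from the single-clip example in this card, and the `trim` field names come from that same example. The rule that `in` must be less than `out` is an assumption, and the fields of the other operations are not documented here, so they are left unchecked.

```python
import json

# Operation names listed under "Supported operations (v1)".
V1_OPS = {"trim", "split", "import", "append", "delete", "reorder", "export"}


def build_prompt(clip_id, filename, start, end, instruction):
    """Format a single-clip prompt in the style of this card's example."""
    return f"clips: {clip_id}={filename}({start}-{end}) | {instruction} =>"


def parse_operations(text):
    """Parse model output into a list of operations, or raise ValueError."""
    try:
        ops = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(ops, list):
        raise ValueError("expected a JSON array of operations")
    for i, op in enumerate(ops):
        if not isinstance(op, dict) or op.get("op") not in V1_OPS:
            raise ValueError(f"operation {i} has a missing or unknown 'op'")
        if op["op"] == "trim":
            # Field names follow the card's trim example; the range check is an assumption.
            if not isinstance(op.get("clip"), str):
                raise ValueError(f"operation {i}: 'clip' must be a string")
            t_in, t_out = op.get("in"), op.get("out")
            if not all(isinstance(t, (int, float)) for t in (t_in, t_out)):
                raise ValueError(f"operation {i}: 'in' and 'out' must be numbers")
            if t_in >= t_out:
                raise ValueError(f"operation {i}: 'in' must be less than 'out'")
    return ops


prompt = build_prompt("c1", "intro.mp4", 0.0, 8.0, "remove the first 3 seconds of the clip")
ops = parse_operations('[{"op":"trim","clip":"c1","in":3.0,"out":8.0}]')
```

Raising `ValueError` lets the caller decide whether to retry generation or report the problem to the user instead of passing a malformed operation on to the compiler.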