---
language: en
license: apache-2.0
library_name: transformers.js
pipeline_tag: token-classification
tags:
- grammatical-error-correction
- gector
- onnx
- transformers.js
---

# GECToR Base 2020 (ONNX)

ONNX-quantized version of the original GECToR model from Grammarly, for browser-based grammatical error correction with [Transformers.js](https://huggingface.co/docs/transformers.js).

## Original Model

- **Source**: [Grammarly GECToR](https://github.com/grammarly/gector)
- **Paper**: [GECToR – Grammatical Error Correction: Tag, Not Rewrite](https://arxiv.org/abs/2005.12592) (BEA Workshop 2020)
- **Architecture**: RoBERTa-Base + token classification head
- **Parameters**: ~125M

## Conversion Details

- **Format**: ONNX
- **Quantization**: INT8 (dynamic quantization)
- **Size**: ~125 MB
- **Conversion**: Manual export from the original PyTorch (AllenNLP) checkpoint

## How It Works

GECToR uses a token classification approach: instead of generating corrected text, it predicts an edit operation for each token:

- `$KEEP` - Keep the token unchanged
- `$DELETE` - Remove the token
- `$REPLACE_word` - Replace the token with a specific word
- `$APPEND_word` - Append a word after the token
- `$TRANSFORM_*` - Apply a transformation (case, verb form, etc.)

The model runs iteratively (typically 2-3 passes) until no more edits are predicted. A sketch of this decoding loop is included at the end of this card.

## Usage with Transformers.js

```javascript
import { pipeline } from '@huggingface/transformers';

const classifier = await pipeline(
  'token-classification',
  'YOUR_USERNAME/gector-base-2020',
  { dtype: 'q8' }
);

const result = await classifier('He go to school yesterday.');
// Returns token predictions with edit tags
```

## Performance

Faster than the 2024 version, with slightly lower accuracy; a good balance of speed and quality for in-browser correction.

## License

Apache 2.0 (following the original model license)
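
## Decoding Example (Sketch)

The snippet below is a minimal, illustrative sketch of the iterative decoding loop described in "How It Works": it maps predicted edit tags back onto the input tokens and re-runs the model until only `$KEEP` is predicted. The `applyEdits` and `correct` helpers are not part of this repository; the sketch assumes one tag per whitespace-separated token and ignores subword alignment, the special start token, and `$TRANSFORM_*` tags, all of which the original GECToR pipeline handles.

```javascript
import { pipeline } from '@huggingface/transformers';

const classifier = await pipeline(
  'token-classification',
  'YOUR_USERNAME/gector-base-2020',
  { dtype: 'q8' }
);

// Illustrative helper (not part of this repo): applies GECToR edit tags
// to a token sequence. $TRANSFORM_* tags are treated as $KEEP here.
function applyEdits(tokens, tags) {
  const output = [];
  tokens.forEach((token, i) => {
    const tag = tags[i] ?? '$KEEP';
    if (tag === '$DELETE') {
      // Drop the token.
    } else if (tag.startsWith('$REPLACE_')) {
      output.push(tag.slice('$REPLACE_'.length));
    } else if (tag.startsWith('$APPEND_')) {
      output.push(token, tag.slice('$APPEND_'.length));
    } else {
      // $KEEP, $TRANSFORM_*, or anything unrecognized: keep as-is.
      output.push(token);
    }
  });
  return output;
}

// Iterative correction: re-run the model until only $KEEP is predicted
// or the maximum number of passes is reached.
async function correct(text, maxPasses = 3) {
  let tokens = text.split(' ');
  for (let pass = 0; pass < maxPasses; pass++) {
    const predictions = await classifier(tokens.join(' '));
    // Simplification: assumes one prediction per whitespace token;
    // the original GECToR aligns tags to the first subword of each word.
    const tags = predictions.map((p) => p.entity);
    if (tags.every((t) => t === '$KEEP')) break;
    tokens = applyEdits(tokens, tags);
  }
  return tokens.join(' ');
}

console.log(await correct('He go to school yesterday.'));
```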