metadata
language: en
license: apache-2.0
library_name: transformers.js
pipeline_tag: token-classification
tags:
  - grammatical-error-correction
  - gector
  - onnx
  - transformers.js

GECToR Base 2020 (ONNX)

An INT8-quantized ONNX export of the original GECToR model from Grammarly, intended for browser-based grammatical error correction with Transformers.js.

Original Model

Conversion Details

  • Format: ONNX
  • Quantization: INT8 (dynamic quantization)
  • Size: ~125MB
  • Conversion method: Manual export from PyTorch (AllenNLP format)

How It Works

GECToR treats correction as token classification. Instead of generating corrected text, it predicts one edit operation per token:

  • $KEEP - Keep token unchanged
  • $DELETE - Remove token
  • $REPLACE_word - Replace with specific word
  • $APPEND_word - Append word after token
  • $TRANSFORM_* - Apply transformation (case, verb form, etc.)
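
To make the tag semantics above concrete, here is a minimal sketch of applying one tag per token. The helper name `applyEdits` is hypothetical and not part of this model or of Transformers.js, and the two `$TRANSFORM_CASE_*` branches are illustrative assumptions that cover only a small part of the transformation tags.

```javascript
// Hypothetical helper: apply one edit tag per token and return the new token list.
// Only two case transforms are handled here; that subset is an assumption made
// for illustration, not the model's full tag vocabulary.
function applyEdits(tokens, tags) {
  const out = [];
  tokens.forEach((token, i) => {
    const tag = tags[i] ?? '$KEEP'; // a missing tag is treated as $KEEP
    if (tag === '$DELETE') {
      return; // drop the token
    }
    if (tag.startsWith('$REPLACE_')) {
      out.push(tag.slice('$REPLACE_'.length));
    } else if (tag.startsWith('$APPEND_')) {
      out.push(token, tag.slice('$APPEND_'.length));
    } else if (tag === '$TRANSFORM_CASE_CAPITAL') {
      out.push(token.charAt(0).toUpperCase() + token.slice(1));
    } else if (tag === '$TRANSFORM_CASE_LOWER') {
      out.push(token.toLowerCase());
    } else {
      out.push(token); // $KEEP, plus any tag this sketch does not handle
    }
  });
  return out;
}

// Hand-written tags for illustration (not real model output).
const exampleTokens = ['he', 'go', 'to', 'to', 'school', 'yesterday'];
const exampleTags = [
  '$TRANSFORM_CASE_CAPITAL', '$REPLACE_went', '$KEEP', '$DELETE', '$KEEP', '$APPEND_.',
];
console.log(applyEdits(exampleTokens, exampleTags).join(' '));
// prints: He went to school yesterday .
```

Unhandled tags fall through to the keep branch, so an incomplete tag table leaves the text unchanged rather than corrupting it.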

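The model is applied iteratively: the edits predicted in one pass are applied to the text, and the result is fed back in. This repeats (typically 2-3 passes) until no more edits are predicted.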
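
The iterative procedure can be sketched as a loop that stops at a fixed point or after a pass limit. `correctIteratively` and `correctOnce` are hypothetical names: `correctOnce` stands for whatever function runs a single predict-and-apply pass and returns the edited text, and the toy version below is a stand-in for the model, not its actual behavior.

```javascript
// Hypothetical driver: repeat single correction passes until the text stops
// changing or maxPasses is reached. `correctOnce` may be sync or async.
async function correctIteratively(text, correctOnce, maxPasses = 3) {
  let current = text;
  let passes = 0;
  while (passes < maxPasses) {
    const next = await correctOnce(current);
    passes += 1;
    if (next === current) break; // no edits were made in this pass
    current = next;
  }
  return { text: current, passes };
}

// Toy stand-in for the model: fixes at most one known error per pass.
const toyFixes = [[' go ', ' went '], ['yesterday .', 'yesterday.']];
const toyCorrectOnce = (s) => {
  for (const [bad, good] of toyFixes) {
    if (s.includes(bad)) return s.replace(bad, good);
  }
  return s;
};

correctIteratively('He go to school yesterday .', toyCorrectOnce)
  .then((result) => console.log(result));
// resolves to { text: 'He went to school yesterday.', passes: 3 }
```

The third pass changes nothing, which is what signals convergence. The pass limit guards against predictions that oscillate between two states.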

Usage with Transformers.js

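import { pipeline } from '@huggingface/transformers';

// Load the quantized ONNX model; replace YOUR_USERNAME with the actual Hub namespace.
const classifier = await pipeline(
  'token-classification',
  'YOUR_USERNAME/gector-base-2020',
  { dtype: 'q8' } // select the INT8-quantized weights
);

// Top-level await requires an ES module context.
const result = await classifier('He go to school yesterday.');
// Returns per-token predictions, each carrying an edit tag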
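
As a rough sketch of post-processing, the predictions can be reduced to the tokens that actually need editing. This assumes each entry has the usual token-classification shape `{ word, entity, score, index }`, with the edit tag in `entity`. The helper `selectEdits` and the `minScore` threshold are hypothetical, and the sample data is hand-written rather than real model output.

```javascript
// Hypothetical helper: keep only confident, non-$KEEP predictions.
// Assumed entry shape: { word, entity, score, index }.
function selectEdits(predictions, minScore = 0.5) {
  return predictions
    .filter((p) => p.entity !== '$KEEP' && p.score >= minScore)
    .map((p) => ({ index: p.index, word: p.word, tag: p.entity }));
}

// Mock predictions, written by hand for illustration only.
const mockPredictions = [
  { word: 'He', entity: '$KEEP', score: 0.99, index: 1 },
  { word: 'go', entity: '$REPLACE_went', score: 0.91, index: 2 },
  { word: 'to', entity: '$DELETE', score: 0.2, index: 3 },
];
console.log(selectEdits(mockPredictions));
// [ { index: 2, word: 'go', tag: '$REPLACE_went' } ]
```

Raising `minScore` trades recall for precision: fewer edits are applied, but each one is more likely to be correct.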

Performance

This model is faster than the 2024 version at the cost of slightly lower accuracy, which makes it a good balance of speed and quality.

License

Apache 2.0 (following original model license)