---
language: en
license: apache-2.0
library_name: transformers.js
pipeline_tag: token-classification
tags:
- grammatical-error-correction
- gector
- onnx
- transformers.js
---

# GECToR Base 2020 (ONNX)

ONNX quantized version of the original GECToR model from Grammarly, for browser-based grammatical error correction with [Transformers.js](https://huggingface.co/docs/transformers.js).

## Original Model

- **Source**: [Grammarly GECToR](https://github.com/grammarly/gector)
- **Paper**: [GECToR – Grammatical Error Correction: Tag, Not Rewrite](https://arxiv.org/abs/2005.12592) (BEA Workshop 2020)
- **Architecture**: RoBERTa-Base + token classification head
- **Parameters**: ~125M

## Conversion Details

- **Format**: ONNX
- **Quantization**: INT8 (dynamic quantization)
- **Size**: ~125 MB
- **Conversion method**: manual export from the original PyTorch (AllenNLP) checkpoint

## How It Works

GECToR treats grammatical error correction as token classification. Instead of generating corrected text, it predicts an edit operation for each token:

- `$KEEP` - keep the token unchanged
- `$DELETE` - remove the token
- `$REPLACE_word` - replace the token with a specific word
- `$APPEND_word` - append a word after the token
- `$TRANSFORM_*` - apply a transformation (case, verb form, etc.)

The model runs iteratively (typically 2-3 passes) until no further edits are predicted.
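
To illustrate how the tags are applied, here is a minimal sketch of one edit pass. This is illustrative, not the official GECToR post-processing code: in particular, the real model resolves `$TRANSFORM_*` tags via lookup tables, which are omitted here.

```javascript
// Minimal sketch: apply one pass of GECToR-style edit tags to tokens.
// $TRANSFORM_* tags are left unapplied for simplicity.
function applyEdits(tokens, tags) {
  const out = [];
  tokens.forEach((token, i) => {
    const tag = tags[i] || '$KEEP';
    if (tag === '$DELETE') return;                   // drop the token
    if (tag.startsWith('$REPLACE_')) {
      out.push(tag.slice('$REPLACE_'.length));       // swap in the new word
    } else if (tag.startsWith('$APPEND_')) {
      out.push(token, tag.slice('$APPEND_'.length)); // keep, then insert after
    } else {
      out.push(token);                               // $KEEP (and unhandled $TRANSFORM_*)
    }
  });
  return out;
}

const tokens = ['He', 'go', 'to', 'school', 'yesterday', '.'];
const tags = ['$KEEP', '$REPLACE_went', '$KEEP', '$KEEP', '$KEEP', '$KEEP'];
const corrected = applyEdits(tokens, tags).join(' ');
// corrected === 'He went to school yesterday .'
```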

## Usage with Transformers.js

```javascript
import { pipeline } from '@huggingface/transformers';

const classifier = await pipeline(
  'token-classification',
  'YOUR_USERNAME/gector-base-2020',
  { dtype: 'q8' }
);

const result = await classifier('He go to school yesterday.');
// Returns token predictions with edit tags
```
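
The iterative decoding can be sketched as a simple loop. In this sketch, `predictTags` is a toy stub standing in for a real model call (such as the pipeline above), so the loop itself is self-contained; only `$KEEP` and `$REPLACE_` tags are handled.

```javascript
// Sketch of GECToR's iterative decoding: re-tag and re-edit the sentence
// until a pass predicts only $KEEP (or a pass limit is reached).
// predictTags is a toy stub, not the real model.
function predictTags(tokens) {
  return tokens.map((t) => (t === 'go' ? '$REPLACE_went' : '$KEEP'));
}

function applyTags(tokens, tags) {
  // Only $KEEP/$REPLACE_ handled here; see the full tag set above.
  return tokens.map((t, i) =>
    tags[i].startsWith('$REPLACE_') ? tags[i].slice('$REPLACE_'.length) : t
  );
}

function correct(tokens, maxPasses = 3) {
  for (let pass = 0; pass < maxPasses; pass++) {
    const tags = predictTags(tokens);
    if (tags.every((tag) => tag === '$KEEP')) break; // converged: no edits left
    tokens = applyTags(tokens, tags);
  }
  return tokens;
}

const fixed = correct(['He', 'go', 'to', 'school', 'yesterday', '.']);
// fixed.join(' ') === 'He went to school yesterday .'
```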

## Performance

This 2020 model is faster than the 2024 version, with slightly lower accuracy, making it a good balance of speed and quality.

## License

Apache 2.0 (following the original model's license)