CoEdIT-Large ONNX (INT8 Quantized)

ONNX export of grammarly/coedit-large (770M params, flan-t5-large) optimized for @huggingface/transformers v3+.

Includes both FP32 and INT8 quantized versions. The INT8 quantized model is ~780 MB in total and runs in the browser via WASM or WebGPU.

Usage

import { pipeline } from '@huggingface/transformers';

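// Transformers.js v3+ selects the weight variant with `dtype`;
// 'q8' loads the *_quantized.onnx files, so the v2-era `quantized` flag is not needed.
const pipe = await pipeline('text2text-generation', 'rabden/coedit-large-onnx', {
  dtype: 'q8',
});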

const result = await pipe(
  'Fix grammatical errors in this sentence: ' +
  'The protocol utilize a novel encryption scheme that ensure data integrity across multiple node.',
  {
    max_new_tokens: 64,
  }
);

console.log(result[0].generated_text);
// "The protocol utilizes a novel encryption scheme that ensures data integrity across multiple nodes."
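The prompt in the example above is an instruction prefix followed by the sentence to edit. The following is a minimal sketch of helpers for building such prompts and editing several sentences in a row. `GRAMMAR_INSTRUCTION`, `buildPrompt`, and `editSentences` are hypothetical names introduced here for illustration; only the grammar instruction itself comes from this card.

```javascript
// Instruction prefix copied from the usage example on this card.
const GRAMMAR_INSTRUCTION = 'Fix grammatical errors in this sentence:';

// Hypothetical helper: joins an instruction and a sentence into one prompt.
function buildPrompt(sentence, instruction = GRAMMAR_INSTRUCTION) {
  return `${instruction.trim()} ${sentence.trim()}`;
}

// Hypothetical helper: runs each sentence through an existing pipeline `pipe`
// (created as in the usage example) and collects the edited text.
async function editSentences(pipe, sentences, options = { max_new_tokens: 64 }) {
  const edited = [];
  for (const sentence of sentences) {
    const result = await pipe(buildPrompt(sentence), options);
    edited.push(result[0].generated_text);
  }
  return edited;
}
```

Processing sentences one at a time keeps each input short, which fits the per-sentence usage shown above.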

Generation Config

The model ships with repetition_penalty: 1.5 as its default to discourage repetitive output. You can override it per call:

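// `pipe` is the pipeline created in the Usage section; `text` is your instruction-prefixed input.
const result = await pipe(text, {
  max_new_tokens: 64,
  repetition_penalty: 1.0, // 1.0 applies no penalty
});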
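To make the override behaviour concrete, here is a small sketch of per-call options taking precedence over defaults. `CARD_DEFAULTS` and `resolveOptions` are hypothetical names; the values only mirror the ones mentioned on this card, and the library's own handling of generation parameters may differ in detail.

```javascript
// Values mentioned on this card: the shipped repetition penalty and the
// max_new_tokens used in the usage example (assumed here as a convenient default).
const CARD_DEFAULTS = { max_new_tokens: 64, repetition_penalty: 1.5 };

// Hypothetical helper: later spread entries win, so overrides replace defaults.
function resolveOptions(overrides = {}) {
  return { ...CARD_DEFAULTS, ...overrides };
}

console.log(resolveOptions());
console.log(resolveOptions({ repetition_penalty: 1.0 }));
```

The resulting object can be passed as the second argument to `pipe`.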

Files

File	Size	Description
`onnx/encoder_model_quantized.onnx`	326 MB	INT8 quantized encoder
`onnx/decoder_model_merged_quantized.onnx`	454 MB	INT8 quantized decoder (with lm_head)
`onnx/encoder_model.onnx`	1302 MB	FP32 encoder
`onnx/decoder_model_merged.onnx`	1812 MB	FP32 decoder (with lm_head)
`config.json`	—	T5 config
`generation_config.json`	—	Generation parameters
`tokenizer.json` / `spiece.model`	—	T5 tokenizer
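As a quick check of the download sizes, the sketch below sums the encoder and decoder sizes from the table for each variant. The `FILES` object and `totalSizeMB` function are illustrative names, not part of any library.

```javascript
// File sizes in MB, copied from the table above.
const FILES = {
  q8: {
    'onnx/encoder_model_quantized.onnx': 326,
    'onnx/decoder_model_merged_quantized.onnx': 454,
  },
  fp32: {
    'onnx/encoder_model.onnx': 1302,
    'onnx/decoder_model_merged.onnx': 1812,
  },
};

// Sums the model file sizes for one variant ('q8' or 'fp32').
function totalSizeMB(variant) {
  return Object.values(FILES[variant]).reduce((sum, mb) => sum + mb, 0);
}

console.log(totalSizeMB('q8'));   // 780
console.log(totalSizeMB('fp32')); // 3114
console.log((totalSizeMB('fp32') / totalSizeMB('q8')).toFixed(1)); // 4.0
```

The INT8 files come to about a quarter of the FP32 size, which matches the ~780 MB total quoted earlier.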

Performance

Tested on Node.js (WASM backend, Intel Xeon, quantized):

Load time: ~7s (cached)
Inference: 300ms–1300ms per sentence (varies with length)
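These figures depend on hardware and input length, so it can be worth measuring on your own machine. A minimal timing sketch follows; `timeCall` is a hypothetical helper, and it relies on the global `performance` object available in browsers and recent Node.js versions.

```javascript
// Hypothetical helper: awaits `fn` once and returns its result
// together with the elapsed wall-clock time in milliseconds.
async function timeCall(fn) {
  const start = performance.now();
  const result = await fn();
  return { result, ms: performance.now() - start };
}

// Example use with an existing pipeline `pipe` and an instruction-prefixed `prompt`:
// const { result, ms } = await timeCall(() => pipe(prompt, { max_new_tokens: 64 }));
// console.log(`${ms.toFixed(0)} ms`, result[0].generated_text);
```

Timing the `pipeline(...)` call in the same way gives the load time.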

The WebGPU backend is faster but requires a browser with WebGPU support.
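A browser-oriented sketch of choosing the backend at load time is shown below. `pickDevice` and `loadEditor` are hypothetical helpers. The sketch assumes that checking for `navigator.gpu` is a sufficient first test, although its presence alone does not guarantee that a usable GPU adapter exists.

```javascript
// Hypothetical helper: returns 'webgpu' when the environment exposes
// navigator.gpu, otherwise falls back to 'wasm'. Intended for browsers.
function pickDevice(nav = globalThis.navigator) {
  return nav && 'gpu' in nav ? 'webgpu' : 'wasm';
}

// Hypothetical loader: imports the library lazily and creates the pipeline
// with the INT8 weights on the selected device.
async function loadEditor() {
  const { pipeline } = await import('@huggingface/transformers');
  return pipeline('text2text-generation', 'rabden/coedit-large-onnx', {
    dtype: 'q8',
    device: pickDevice(),
  });
}
```

If WebGPU initialization fails at runtime, one option is to catch the error and retry with `device: 'wasm'`.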

Model Details

Base model: google/flan-t5-large, fine-tuned on the CoEdIT dataset
Architecture: T5 encoder-decoder (24 layers, d_model=1024, 16 heads)
Task: Text editing via instruction tuning
Paper: CoEdIT: Text Editing by Task-Specific Instruction Tuning
Original: grammarly/coedit-large (gated, requires accepting terms)

License

CC-BY-NC-4.0 (same as the original model).


Paper

CoEdIT: Text Editing by Task-Specific Instruction Tuning (arXiv:2305.09857, published May 17, 2023)