---
base_model:
- LiquidAI/LFM2.5-1.2B-Instruct
library_name: peft
pipeline_tag: text-generation
license: apache-2.0
language:
- en
datasets:
- jhu-clsp/jfleg
tags:
- lora
- peft
- qlora
- grammar-correction
- adapter
- adapters
---

# Model Card for LiquidAI Grammarly (LoRA)

## Model Details

### Model Description

This repository contains **LoRA adapter weights** fine-tuned for **English grammar correction**. The adapters are trained on top of the **LiquidAI/LFM2.5-1.2B-Instruct** base model using **QLoRA**.

The model is designed to:

- Correct grammatical errors
- Preserve the original meaning
- Minimize unnecessary rewrites

This repository **does not contain the base model weights**, only the LoRA adapters.

---

## ⚠️ About Hugging Face Auto-Generated Code Snippets

Hugging Face may display examples such as:

```python
pipeline("text-generation", model="arjunverma2004/LiquidAI-grammarly-lora")
```

or

```python
AutoModel.from_pretrained("arjunverma2004/LiquidAI-grammarly-lora")
```

These examples are automatically generated by the Hub and will not work for this repository, because it contains only LoRA adapters that must be loaded on top of the base model.
Working usage code is provided below under "How to Get Started with the Model".

### Developed by

Independent contributor

### Funded by

Not applicable

### Shared by

Community contribution

### Model type

Causal language model (LoRA adapters)

### Language(s)

English

### License

Apache 2.0 (inherits the base model license)

### Finetuned from model

LiquidAI/LFM2.5-1.2B-Instruct

---

## Model Sources

- **Base model repository**: https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct
- **Paper**: Not available
- **Demo**: Not available

---

## Uses

### Direct Use

- English grammar correction
- Proofreading short and medium-length texts
- Educational and language-learning tools

### Downstream Use

- Writing assistants
- Grammar-checking pipelines
- Preprocessing text for downstream NLP tasks

### Out-of-Scope Use

- Content generation beyond grammar correction
- Legal, medical, or professional advice
- Multilingual grammar correction

---

## Bias, Risks, and Limitations

- The model may reflect biases present in the training data.
- It may over-correct stylistic choices in creative writing.
- It is optimized for **grammatical correctness**, not factual accuracy.
- Performance may degrade on very long or highly technical texts.
---

## Recommendations

Users should:

- Review corrections before final use
- Avoid relying on the model for high-stakes or sensitive applications
- Combine the model with human review for best results

---

## How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_name = "LiquidAI/LFM2.5-1.2B-Instruct"
adapter_name = "arjunverma2004/LiquidAI-grammarly-lora"

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    device_map="auto",
    trust_remote_code=True,
)

# Attach LoRA adapters
model = PeftModel.from_pretrained(
    base_model,
    adapter_name,
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    base_model_name,
    trust_remote_code=True,
)
tokenizer.pad_token = tokenizer.eos_token
```

```python
from transformers import pipeline

# Use the predefined prompt template
sentence = """Write this sentence correctly: Here was no promise of morning except that we looked up through the trees we saw how low the forest had swung .
""" dic1 = [{'role': "user", 'content': sentence}] prompt = tokenizer.apply_chat_template(dic1, tokenize=False, add_generation_prompt=True) # Run our instruction-tuned model pipe = pipeline(task="text-generation", model=merged_model, tokenizer=tokenizer) print(pipe(prompt)[0]["generated_text"]) ``` ## Training Details ### Training Data - **JFLEG (JHU Fluency-Extended GUG Corpus)** - Dataset focused on grammatical error correction with multiple human references ### Training Procedure #### Preprocessing - Inputs formatted using the base model’s chat template - Each example consists of an erroneous sentence and a corrected version #### Training Hyperparameters - Training regime: Supervised Fine-Tuning (SFT) - Method: QLoRA - Precision: 4-bit (NF4) - Max sequence length: 512 tokens - Optimizer: AdamW (via TRL) - PEFT: LoRA ### Speeds, Sizes, Times - Training performed on a single GPU - Lightweight adapter-only training --- ## Evaluation ### Testing Data - Held-out samples from JFLEG - Custom manually written grammatical error examples ### Factors - Error type (tense, agreement, articles, prepositions) - Sentence length - Error density ### Metrics - Training loss (cross-entropy) - Qualitative human evaluation - (Optional) GLEU score ### Results - Rapid loss convergence - High-quality grammatical corrections - Minimal semantic drift --- ## Summary ### Model Examination The model demonstrates strong grammatical correction capabilities while preserving sentence meaning. It performs best on common ESL-style grammatical errors. 
---

## Environmental Impact

- Hardware Type: NVIDIA GPU (single device)
- Hours Used: < 5 hours
- Cloud Provider: Google Colab
- Compute Region: Not specified
- Carbon Emitted: Not estimated

---

## Technical Specifications

### Model Architecture and Objective

- Base architecture: Transformer-based causal language model
- Objective: Next-token prediction on grammar-corrected text

### Compute Infrastructure

- Single-GPU training with quantization

### Hardware

- NVIDIA GPU (Google Colab environment)

### Software

- Python
- PyTorch
- Hugging Face Transformers
- TRL
- PEFT
- bitsandbytes

---

## Citation

### BibTeX

```bibtex
@misc{liquidai_grammarly_lora,
  title={LiquidAI Grammarly LoRA},
  author={Anonymous},
  year={2026},
  url={https://huggingface.co/arjunverma2004/LiquidAI-grammarly-lora}
}
```

### APA

LiquidAI Grammarly LoRA. (2026). Hugging Face. https://huggingface.co/arjunverma2004/LiquidAI-grammarly-lora

### Glossary

- **LoRA**: Low-Rank Adaptation
- **QLoRA**: Quantized LoRA
- **SFT**: Supervised Fine-Tuning
- **JFLEG**: JHU Fluency-Extended GUG Corpus, a grammar-correction dataset