LingoEDU-4B

📜 Paper 💻 GitHub Repo

📖 Model Introduction

LingoEDU-4B is a specialization of Qwen/Qwen3-4B for document structure analysis.

With this model, we transform a linear discourse sequence into a condensed hierarchical tree, where every node is strictly anchored to the source via coordinate pointers.
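The exact output schema is defined in the paper and repository; as a purely illustrative sketch (the node fields, titles, and inclusive-index convention below are assumptions for illustration, not the model's actual format), an anchored tree might look like:

```python
from dataclasses import dataclass, field

@dataclass
class EDUNode:
    # (start, end) sentence indices into the source document (inclusive);
    # nodes carry coordinate pointers, never copied source text.
    span: tuple
    title: str
    children: list = field(default_factory=list)

sentences = [
    "Intro sentence.",      # 0
    "Background detail.",   # 1
    "Method overview.",     # 2
    "Experiment result.",   # 3
]

# A condensed hierarchical tree over the linear discourse sequence.
tree = EDUNode(span=(0, 3), title="Document", children=[
    EDUNode(span=(0, 1), title="Introduction"),
    EDUNode(span=(2, 3), title="Method & Results"),
])

def resolve(node: EDUNode) -> list:
    """Recover the original sentences a node points at."""
    start, end = node.span
    return sentences[start:end + 1]

print(resolve(tree.children[0]))
```

Because every node resolves back to source spans by index, the tree is faithful by construction: no content can be hallucinated into a node, only pointed at.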

📊 Performance on StructBench

| Method | Type | TED (Structure) ↓ | DLA (Accuracy) ↑ | Cost ($/doc) ↓ | Latency (s) ↓ |
|---|---|---|---|---|---|
| GPT-4o | General LLM* | 6.22 | 29.03% | 0.0210 | - |
| GPT-4.1 | | 6.35 | 37.90% | 0.0168 | - |
| OpenAI o3 | | 5.51 | 28.63% | 0.0168 | - |
| OpenAI o4-mini | | 5.87 | 32.66% | 0.0092 | - |
| Claude-3.7-Sonnet | | 6.65 | 35.08% | 0.0286 | - |
| Claude-4 | | 5.08 | 43.15% | 0.0286 | - |
| Gemini-2.5-flash | | 5.82 | 27.82% | 0.0040 | - |
| Gemini-2.5-pro | | 5.61 | 32.66% | 0.0162 | - |
| DeepSeek-V3 | | 6.32 | 33.47% | 0.0012 | - |
| DeepSeek-R1 | | 6.26 | 30.65% | 0.0046 | - |
| Qwen3-32B | | 6.52 | 26.21% | 0.0012 | 10.17† |
| Qwen3-235B | | 7.67 | 19.10% | 0.0012 | - |
| Jina-Reader | Parser API | 17.04 | - | 0.0004 | - |
| Firecrawl | | 16.81 | - | 0.0007 | - |
| Our Method (LingoEDU) | Specialized | 4.77 | 49.60% | 0.0007 | 1.20† |

🚀 Quickstart

Get the system prompt, article input, and guidance grammar

  • System prompt: a fixed system prompt
  • Article input: an input string built from the sentence-segmented article
  • Guidance grammar: a Lark grammar built from the sentence-segmented article

See our GitHub repository DeepLangAI/LingoEDU for the construction details.
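As a rough illustration (the bracketed numbering scheme here is hypothetical; the real prompt and grammar builders live in the repository above), sentence segmentation for the article input might look like:

```python
# Hypothetical sketch: index each sentence so the model can refer to
# it by coordinate instead of copying text. The real input format is
# produced by the builders in DeepLangAI/LingoEDU.
sentences = [
    "LingoEDU condenses documents into trees.",
    "Each node references source sentences by index.",
]

article_input = "\n".join(
    f"[{i}] {sent}" for i, sent in enumerate(sentences)
)
print(article_input)
```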

Generate with vLLM

from vllm import LLM
from vllm.config import StructuredOutputsConfig
from vllm.sampling_params import SamplingParams, StructuredOutputsParams

if __name__ == "__main__":

    model_name = "deeplang-ai/LingoEDU-4B"

    # Enable grammar-constrained decoding via the guidance backend
    vllm_llm = LLM(
        model=model_name,
        structured_outputs_config=StructuredOutputsConfig(backend="guidance"),
    )
    tokenizer = vllm_llm.get_tokenizer()

    # Build these three inputs as described above (see DeepLangAI/LingoEDU)
    system_prompt = ...
    article_input = ...
    guidance_grammar = ...

    prompt_str = tokenizer.apply_chat_template(
        [
            {
                "role": "system",
                "content": system_prompt,
            },
            {
                "role": "user",
                "content": article_input,
            },
        ],
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=False,
    )

    output = vllm_llm.generate(
        prompts=[prompt_str],
        sampling_params=SamplingParams(
            temperature=0.0,
            top_k=-1,
            top_p=1.0,
            max_tokens=8*1024,
            skip_special_tokens=False,
            n=1,
            structured_outputs=StructuredOutputsParams(grammar=guidance_grammar),
        ),
    )[0].outputs[0].text
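vLLM's guidance backend accepts grammars in Lark syntax. As a toy example only (this is not LingoEDU's real grammar, which is generated per article by the repository's builders), a grammar constraining the model to emit a bracketed list of valid sentence indices might look like:

```python
# Hypothetical Lark grammar for illustration: the output must be a
# comma-separated list of indices 0-3 enclosed in square brackets,
# e.g. "[0,2]". The real grammar encodes the full tree schema.
guidance_grammar = r"""
start: "[" INDEX ("," INDEX)* "]"
INDEX: /[0-3]/
"""
print(guidance_grammar.strip())
```

Because decoding is constrained by the grammar, the model cannot produce pointers outside the article's actual sentence range.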

📌 Limitations

  • Not fine-tuned for general chat.
  • Handles only text-based documents; no multimodal input.

πŸ“ Citation

If you find our work helpful, please consider citing it.

@misc{zhou2025contextedusfaithfulstructured,
      title={From Context to EDUs: Faithful and Structured Context Compression via Elementary Discourse Unit Decomposition}, 
      author={Yiqing Zhou and Yu Lei and Shuzheng Si and Qingyan Sun and Wei Wang and Yifei Wu and Hao Wen and Gang Chen and Fanchao Qi and Maosong Sun},
      year={2025},
      eprint={2512.14244},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.14244}, 
}