# LingoEDU-4B

[Paper](https://arxiv.org/abs/2512.14244) | [GitHub Repo](https://github.com/DeepLangAI/LingoEDU)

## Model Introduction
LingoEDU-4B is a specialization of Qwen/Qwen3-4B for document structure analysis.
It transforms a linear discourse sequence into a condensed hierarchical tree in which every node is strictly anchored to the source text via coordinate pointers.
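To make the idea of coordinate-anchored nodes concrete, here is an illustrative sketch (not the model's actual output schema): each node in the outline carries a `(start, end)` sentence-index span, so its content can always be recovered verbatim from the source rather than paraphrased. The `Node` class and `resolve` helper below are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    title: str                      # section label in the outline
    span: tuple[int, int]           # (start, end) sentence indices, inclusive
    children: list["Node"] = field(default_factory=list)

def resolve(node: Node, sentences: list[str]) -> list[str]:
    """Recover the exact source sentences a node is anchored to."""
    start, end = node.span
    return sentences[start:end + 1]

sentences = [
    "Transformers use attention.",      # index 0
    "Attention scales quadratically.",  # index 1
    "We propose a sparse variant.",     # index 2
]
tree = Node("Background", (0, 1), [Node("Cost", (1, 1))])
print(resolve(tree, sentences))
# → ['Transformers use attention.', 'Attention scales quadratically.']
```

Because every node resolves to a slice of the original sentence list, the tree can be verified against the source and never hallucinates content.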
## Performance on StructBench
| Method | Type | TED (Structure) ↓ | DLA (Accuracy) ↑ | Cost ($/doc) ↓ | Latency (s) ↓ |
|---|---|---|---|---|---|
| GPT-4o | General LLM | 6.22 | 29.03% | 0.0210 | - |
| GPT-4.1 | General LLM | 6.35 | 37.90% | 0.0168 | - |
| OpenAI o3 | General LLM | 5.51 | 28.63% | 0.0168 | - |
| OpenAI o4-mini | General LLM | 5.87 | 32.66% | 0.0092 | - |
| Claude-3.7-Sonnet | General LLM | 6.65 | 35.08% | 0.0286 | - |
| Claude-4 | General LLM | 5.08 | 43.15% | 0.0286 | - |
| Gemini-2.5-flash | General LLM | 5.82 | 27.82% | 0.0040 | - |
| Gemini-2.5-pro | General LLM | 5.61 | 32.66% | 0.0162 | - |
| DeepSeek-V3 | General LLM | 6.32 | 33.47% | 0.0012 | - |
| DeepSeek-R1 | General LLM | 6.26 | 30.65% | 0.0046 | - |
| Qwen3-32B | General LLM | 6.52 | 26.21% | 0.0012 | 10.17 |
| Qwen3-235B | General LLM | 7.67 | 19.10% | 0.0012 | - |
| Jina-Reader | Parser API | 17.04 | - | 0.0004 | - |
| Firecrawl | Parser API | 16.81 | - | 0.0007 | - |
| Our Method (LingoEDU) | Specialized | 4.77 | 49.60% | 0.0007 | 1.20 |
## Quickstart

### Prepare the system prompt, article input, and guidance grammar
- System prompt: a fixed system prompt
- Article input: an input string built from the sentence-segmented article
- Guidance grammar: a Lark grammar built from the sentence-segmented article

See our GitHub repository DeepLangAI/LingoEDU for details.
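As a rough sketch of what these inputs might look like (the authoritative prompt format and grammar builder live in the DeepLangAI/LingoEDU repository; `build_article_input` and `build_guidance_grammar` are hypothetical helpers), one could number each sentence so the model can emit coordinate pointers, and derive a Lark grammar that only accepts indices present in that particular article:

```python
def build_article_input(sentences: list[str]) -> str:
    # Prefix each sentence with its index so the model can refer to it
    # by coordinate rather than by copying text.
    return "\n".join(f"[{i}] {s}" for i, s in enumerate(sentences))

def build_guidance_grammar(sentences: list[str]) -> str:
    # A toy Lark grammar whose terminals are exactly the valid sentence
    # indices for this article, illustrating how guided decoding can pin
    # outputs to legal coordinates.
    indices = " | ".join(f'"{i}"' for i in range(len(sentences)))
    return f'start: index ("," index)*\nindex: {indices}\n'

sents = ["First sentence.", "Second sentence."]
print(build_article_input(sents))
# → [0] First sentence.
#   [1] Second sentence.
print(build_guidance_grammar(sents))
```

Because the grammar is rebuilt per article, the constrained decoder cannot emit an index that points outside the source.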
### Generate with vLLM
```python
from vllm import LLM
from vllm.config import StructuredOutputsConfig
from vllm.sampling_params import SamplingParams, StructuredOutputsParams

if __name__ == "__main__":
    model_name = "deeplang-ai/LingoEDU-4B"
    vllm_llm = LLM(
        model=model_name,
        # Use the guidance backend so generation can be constrained by a Lark grammar.
        structured_outputs_config=StructuredOutputsConfig(backend="guidance"),
    )
    tokenizer = vllm_llm.get_tokenizer()

    system_prompt = ...
    article_input = ...
    guidance_grammar = ...

    prompt_str = tokenizer.apply_chat_template(
        [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": article_input},
        ],
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=False,
    )

    output = vllm_llm.generate(
        prompts=[prompt_str],
        sampling_params=SamplingParams(
            temperature=0.0,  # greedy decoding for deterministic structure output
            top_k=-1,
            top_p=1.0,
            max_tokens=8 * 1024,
            skip_special_tokens=False,
            n=1,
            # Constrain the output to the article-specific grammar.
            structured_outputs=StructuredOutputsParams(grammar=guidance_grammar),
        ),
    )[0].outputs[0].text
    print(output)
```
## Limitations
- Not fine-tuned for general chat.
- Handles only text-based documents; no multimodal input.
## Citation

If you find our work helpful, please cite:
```bibtex
@misc{zhou2025contextedusfaithfulstructured,
      title={From Context to EDUs: Faithful and Structured Context Compression via Elementary Discourse Unit Decomposition},
      author={Yiqing Zhou and Yu Lei and Shuzheng Si and Qingyan Sun and Wei Wang and Yifei Wu and Hao Wen and Gang Chen and Fanchao Qi and Maosong Sun},
      year={2025},
      eprint={2512.14244},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.14244},
}
```