retrico-lm-4b

retrico-lm-4b is a 4B-parameter language model built for universal structured information extraction. Give it any text and a JSON schema, and it returns a JSON object that follows that schema, in most cases with no post-processing required (see the Valid JSON Rate figures under Benchmarks).

The model handles the full spectrum of extraction tasks from a single interface: plain text, Markdown, HTML, and XML as input; flat facts, deeply nested objects, typed arrays, NER, and open relation extraction as output. There is no need to switch between specialized models — one template drives all extraction modes.

Built on Qwen3.5-4B, retrico-lm-4b is designed for production use and works best served via vLLM.

Key Features

Universal input — plain text, Markdown documents, HTML pages, XML feeds
Universal output — flat facts, nested objects, typed arrays, entity lists, relation triplets
Template-driven — define any JSON schema and the model populates it from the input
Typed fields — respects string, integer, float, nested objects, arrays of objects
Null-safe — missing values return as null or [], never hallucinated
Production-ready — optimized for vLLM with language_model_only=True
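
The typed-field and null-safe behaviour listed above can be checked on the client side. The following is a minimal sketch, not part of the model's tooling: `conforms` is a hypothetical helper that assumes the template conventions used in this card ("string", "integer", "float", nested dicts, and one-element lists describing the item type) and treats null as acceptable for any field.

```python
def conforms(value, template):
    """Return True if `value` matches the shape described by `template`."""
    if value is None:
        return True  # null is accepted anywhere a value may be missing
    if isinstance(template, dict):
        return (
            isinstance(value, dict)
            and set(value) == set(template)
            and all(conforms(value[k], template[k]) for k in template)
        )
    if isinstance(template, list):
        # A one-element list template describes the type of every item.
        return isinstance(value, list) and all(conforms(v, template[0]) for v in value)
    if template == "string":
        return isinstance(value, str)
    if template == "integer":
        return isinstance(value, int) and not isinstance(value, bool)
    if template == "float":
        # Whole numbers are accepted for float fields (e.g. 1500000).
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    return False


template = {"name": "string", "mass_kg": "integer", "instruments": [{"name": "string"}]}
good = {"name": "JWST", "mass_kg": 6161, "instruments": [{"name": "NIRCam"}]}
bad = {"name": "JWST", "mass_kg": "6,161 kg", "instruments": []}

print(conforms(good, template))  # True
print(conforms(bad, template))   # False
```

Accepting integers for "float" fields is a design choice made here so that values such as 1500000 pass; tighten it if your pipeline needs strict floats.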

Training

The model was trained in two stages:

Supervised fine-tuning on synthetic data — training examples were generated using a large teacher LLM across a diverse set of domains and schema types
Post-training on human-annotated data — further refined on a high-quality human-annotated dataset to improve precision, grounding, and schema adherence

Usage

The model uses a hybrid attention architecture and requires language_model_only=True and trust_remote_code=True.

from vllm import LLM, SamplingParams
from transformers import AutoTokenizer
import json

model_name = "knowledgator/retrico-lm-4b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

llm = LLM(
    model=model_name,
    language_model_only=True,
    gpu_memory_utilization=0.85,
    max_model_len=65536,
    trust_remote_code=True,
    dtype="bfloat16",
    enforce_eager=True,
)

sampling_params = SamplingParams(max_tokens=4096, temperature=0.0)

def build_prompt(text, template):
    if isinstance(template, (dict, list)):
        template = json.dumps(template, indent=1, ensure_ascii=False)
    content = (
        "/no_think\n"
        "Extract information from the following text according to the JSON template.\n\n"
        "Important rules:\n"
        "- If a field's value is not mentioned or cannot be found in the text, set it to null.\n"
        "- Do not infer, guess, or hallucinate values that are not explicitly stated.\n"
        "- If the template is completely unrelated to the text, return all fields as null.\n"
        "- For list fields with no values found, return [] not [null].\n"
        "- For dict/object fields with no values found, return {} not null.\n\n"
        f"Template:\n{template}\n\nText:\n{text}\n\n"
        "Return only the extracted JSON, nothing else."
    )
    return tokenizer.apply_chat_template(
        [{"role": "user", "content": content}],
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=False,
    )

def extract(text, template):
    prompt = build_prompt(text, template)
    output = llm.generate([prompt], sampling_params)[0]
    raw = output.outputs[0].text.strip()
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return {"__raw__": raw}
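
When `json.loads` fails, `extract` returns the raw string under `__raw__`. A slightly more forgiving parser can be sketched as follows. It assumes that some failures come from text or code fences wrapped around an otherwise valid object, which the source does not confirm; `parse_json_lenient` is a hypothetical helper name.

```python
import json


def parse_json_lenient(raw):
    """Parse model output as JSON, tolerating stray text around the object."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    # Fall back to the outermost {...} span, if there is one.
    start, end = raw.find("{"), raw.rfind("}")
    if start != -1 and end > start:
        try:
            return json.loads(raw[start:end + 1])
        except json.JSONDecodeError:
            pass
    # Same fallback shape as extract() uses.
    return {"__raw__": raw}


print(parse_json_lenient('```json\n{"a": 1}\n```'))  # {'a': 1}
```

Truncated outputs (for example, when `max_tokens` is reached) are not repaired by this sketch and still come back as `__raw__`.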

Examples

Plain Text Extraction

Structured fact extraction from prose. The model handles deeply nested schemas, typed numeric fields, arrays of objects, and null-safe output for fields absent in the source.

James Webb Space Telescope — deeply nested schema with typed fields, array of instrument objects, and multi-agency list

Input

text = """
The James Webb Space Telescope (JWST) is a space telescope designed to conduct infrared
astronomy. It was launched on December 25, 2021, from the Guiana Space Centre in Kourou,
French Guiana, aboard an Ariane 5 rocket. The telescope reached its final destination at
the second Lagrange point (L2), approximately 1.5 million kilometers from Earth, on
January 24, 2022, after a 30-day journey.

JWST has a primary mirror diameter of 6.5 meters composed of 18 hexagonal gold-coated
beryllium segments. Its total mass is 6,161 kg and it operates at a temperature of
approximately minus 233 degrees Celsius. The telescope cost $10 billion to develop over
20 years, led by NASA in partnership with ESA and the Canadian Space Agency.

The telescope carries four scientific instruments: NIRCam (Near Infrared Camera) built by
the University of Arizona, NIRSpec (Near Infrared Spectrograph) built by ESA, MIRI
(Mid-Infrared Instrument) built jointly by NASA and ESA, and FGS/NIRISS (Fine Guidance
Sensor and Near Infrared Imager and Slitless Spectrograph) built by CSA. Its science
operations are managed by the Space Telescope Science Institute in Baltimore, Maryland.
The expected mission lifetime is 10 years, with fuel reserves potentially extending it to 20.
"""

template = {
    "name": "string",
    "abbreviation": "string",
    "type": "string",
    "launch_date": "string",
    "launch_site": "string",
    "launch_vehicle": "string",
    "destination": "string",
    "distance_from_earth_km": "float",
    "arrival_date": "string",
    "journey_duration_days": "integer",
    "specifications": {
        "primary_mirror_diameter_m": "float",
        "mirror_segments": "integer",
        "mirror_material": "string",
        "mirror_coating": "string",
        "total_mass_kg": "integer",
        "operating_temperature_celsius": "integer"
    },
    "cost_billion_usd": "float",
    "development_years": "integer",
    "mission_lifetime": {
        "expected_years": "integer",
        "maximum_years": "integer"
    },
    "agencies": ["string"],
    "science_operations_center": "string",
    "instruments": [
        {
            "name": "string",
            "abbreviation": "string",
            "built_by": "string"
        }
    ]
}

Output

{
  "name": "James Webb Space Telescope",
  "abbreviation": "JWST",
  "type": "space telescope",
  "launch_date": "2021-12-25",
  "launch_site": "Guiana Space Centre",
  "launch_vehicle": "Ariane 5",
  "destination": "second Lagrange point (L2)",
  "distance_from_earth_km": 1500000,
  "arrival_date": "2022-01-24",
  "journey_duration_days": 30,
  "specifications": {
    "primary_mirror_diameter_m": 6.5,
    "mirror_segments": 18,
    "mirror_material": "beryllium",
    "mirror_coating": "gold",
    "total_mass_kg": 6161,
    "operating_temperature_celsius": -233
  },
  "cost_billion_usd": 10.0,
  "development_years": 20,
  "mission_lifetime": {
    "expected_years": 10,
    "maximum_years": 20
  },
  "agencies": ["NASA", "ESA", "Canadian Space Agency"],
  "science_operations_center": "Space Telescope Science Institute",
  "instruments": [
    {"name": "NIRCam", "abbreviation": "Near Infrared Camera", "built_by": "University of Arizona"},
    {"name": "NIRSpec", "abbreviation": "Near Infrared Spectrograph", "built_by": "ESA"},
    {"name": "MIRI", "abbreviation": "Mid-Infrared Instrument", "built_by": "NASA and ESA"},
    {"name": "FGS/NIRISS", "abbreviation": "Fine Guidance Sensor and Near Infrared Imager and Slitless Spectrograph", "built_by": "CSA"}
  ]
}

NVIDIA FY2024 Financials — financial report with null-safe segment data: growth percentages absent in the source correctly return as null

Input

text = """
NVIDIA Corporation, headquartered in Santa Clara, California, reported exceptional financial
results for fiscal year 2024. Total revenue reached $60.9 billion, a 122% increase year-over-year,
driven primarily by the Data Center segment which generated $47.5 billion, up 217% from the
prior year. The Gaming segment contributed $10.4 billion, while Professional Visualization
brought in $1.7 billion and Automotive $1.1 billion.

Net income for the year was $29.8 billion compared to $4.4 billion in fiscal 2023, representing
a 581% increase. Gross margin expanded to 72.7% from 56.9%. The company returned $9.5 billion
to shareholders through buybacks and paid dividends of $395 million. As of January 2024,
NVIDIA employed approximately 29,600 people worldwide. CEO Jensen Huang founded the company
in 1993 alongside Chris Malachowsky and Curtis Priem.
"""

template = {
    "company": "string",
    "headquarters": "string",
    "fiscal_year": "integer",
    "ceo": "string",
    "other_founders": ["string"],
    "founded": "integer",
    "employees": "integer",
    "financials": {
        "total_revenue_billion": "float",
        "revenue_growth_yoy_pct": "float",
        "net_income_billion": "float",
        "net_income_growth_pct": "float",
        "gross_margin_pct": "float",
        "shareholder_returns_billion": "float",
        "dividends_million": "float"
    },
    "segments": [
        {
            "name": "string",
            "revenue_billion": "float",
            "growth_pct": "float"
        }
    ]
}

Output

{
  "company": "NVIDIA Corporation",
  "headquarters": "Santa Clara, California",
  "fiscal_year": 2024,
  "ceo": "Jensen Huang",
  "other_founders": ["Chris Malachowsky", "Curtis Priem"],
  "founded": 1993,
  "employees": 29600,
  "financials": {
    "total_revenue_billion": 60.9,
    "revenue_growth_yoy_pct": 122.0,
    "net_income_billion": 29.8,
    "net_income_growth_pct": 581.0,
    "gross_margin_pct": 72.7,
    "shareholder_returns_billion": 9.5,
    "dividends_million": 395.0
  },
  "segments": [
    {"name": "Data Center", "revenue_billion": 47.5, "growth_pct": 217.0},
    {"name": "Gaming", "revenue_billion": 10.4, "growth_pct": null},
    {"name": "Professional Visualization", "revenue_billion": 1.7, "growth_pct": null},
    {"name": "Automotive", "revenue_billion": 1.1, "growth_pct": null}
  ]
}
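
To see at a glance which fields the model left empty, the parsed result can be walked for nulls. This is an illustrative sketch; `null_paths` is a hypothetical helper, and the data is a shortened copy of the output above.

```python
def null_paths(value, prefix=""):
    """Yield a dotted path for every null (None) in a parsed JSON value."""
    if value is None:
        yield prefix
    elif isinstance(value, dict):
        for key, child in value.items():
            yield from null_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for i, child in enumerate(value):
            yield from null_paths(child, f"{prefix}[{i}]")


result = {
    "company": "NVIDIA Corporation",
    "segments": [
        {"name": "Data Center", "revenue_billion": 47.5, "growth_pct": 217.0},
        {"name": "Gaming", "revenue_billion": 10.4, "growth_pct": None},
        {"name": "Automotive", "revenue_billion": 1.1, "growth_pct": None},
    ],
}

print(list(null_paths(result)))
# ['segments[1].growth_pct', 'segments[2].growth_pct']
```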

Moderna mRNA-1273 Clinical Trial — dense numerical extraction: efficacy stats, confidence intervals, demographic breakdowns, and adverse event arrays from a single paragraph

Input

text = """
The Moderna mRNA-1273 vaccine clinical trial enrolled 30,420 participants across 99 sites
in the United States between July 27 and October 23, 2020. Participants were randomly assigned
in a 1:1 ratio to receive two injections of either 100 micrograms of mRNA-1273 or placebo,
administered 28 days apart. The median age of participants was 51.4 years; 47.3% were female,
24.8% were Hispanic or Latino, and 10.2% were Black or African American.

The primary endpoint was prevention of COVID-19 illness with onset at least 14 days after
the second injection. The trial reported 185 cases of COVID-19, with 11 in the mRNA-1273 group
and 174 in the placebo group, yielding a vaccine efficacy of 94.1% with a 95% confidence
interval of 89.3% to 96.8%. Severe COVID-19 occurred in 30 participants, all in the placebo
group, suggesting 100% efficacy against severe disease. The most common adverse events were
injection-site pain (92% of participants), fatigue (70%), headache (64.7%), and myalgia (61.5%).
The trial was led by principal investigator Lindsey Baden at Brigham and Women's Hospital.
"""

template = {
    "vaccine_name": "string",
    "trial_sites": "integer",
    "enrollment": "integer",
    "dosage_mcg": "integer",
    "efficacy": {
        "overall_pct": "float",
        "confidence_interval": {"lower": "float", "upper": "float"},
        "against_severe_disease_pct": "float"
    },
    "covid_cases": {
        "total": "integer",
        "vaccine_group": "integer",
        "placebo_group": "integer"
    },
    "adverse_events": [
        {"name": "string", "frequency_pct": "float"}
    ]
}

Output

{
  "vaccine_name": "mRNA-1273",
  "trial_sites": 99,
  "enrollment": 30420,
  "dosage_mcg": 100,
  "efficacy": {
    "overall_pct": 94.1,
    "confidence_interval": {"lower": 89.3, "upper": 96.8},
    "against_severe_disease_pct": 100.0
  },
  "covid_cases": {
    "total": 185,
    "vaccine_group": 11,
    "placebo_group": 174
  },
  "adverse_events": [
    {"name": "injection-site pain", "frequency_pct": 92.0},
    {"name": "fatigue", "frequency_pct": 70.0},
    {"name": "headache", "frequency_pct": 64.7},
    {"name": "myalgia", "frequency_pct": 61.5}
  ]
}
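
Dense numeric output like this lends itself to cheap sanity checks after extraction. The sketch below is an assumption about what a downstream pipeline might verify, not something the model does itself; `check_trial_numbers` is a hypothetical helper and expects all fields to be non-null.

```python
def check_trial_numbers(result):
    """Return a list of human-readable problems found in the extracted trial data."""
    problems = []
    cases = result["covid_cases"]
    if cases["vaccine_group"] + cases["placebo_group"] != cases["total"]:
        problems.append("case counts do not add up to the total")
    eff = result["efficacy"]
    ci = eff["confidence_interval"]
    if not ci["lower"] <= eff["overall_pct"] <= ci["upper"]:
        problems.append("point estimate lies outside its confidence interval")
    return problems


result = {
    "efficacy": {
        "overall_pct": 94.1,
        "confidence_interval": {"lower": 89.3, "upper": 96.8},
    },
    "covid_cases": {"total": 185, "vaccine_group": 11, "placebo_group": 174},
}

print(check_trial_numbers(result))  # []
```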

Markdown Extraction

ML Reading List — repeated hierarchical entries: ### heading + bullet list blocks parsed into a uniform array of structured paper objects

Input

text = """
# Machine Learning Reading List

## Foundational Papers

### Attention Is All You Need
- **Authors:** Vaswani, Shazeer, Parmar et al.
- **Year:** 2017
- **Venue:** NeurIPS
- **Citations:** 90,000+
- **Key contribution:** Introduced the Transformer architecture
- **Tags:** transformers, attention, NLP

### ImageNet Classification with Deep CNNs
- **Authors:** Krizhevsky, Sutskever, Hinton
- **Year:** 2012
- **Venue:** NeurIPS
- **Citations:** 120,000+
- **Key contribution:** Demonstrated deep CNNs on large-scale image recognition
- **Tags:** CNN, computer vision, deep learning

### Playing Atari with Deep Reinforcement Learning
- **Authors:** Mnih et al.
- **Year:** 2013
- **Venue:** NIPS Workshop
- **Citations:** 18,000+
- **Key contribution:** Combined deep learning with reinforcement learning
- **Tags:** RL, DQN, Atari
"""

template = {
    "list_name": "string",
    "papers": [
        {
            "title": "string",
            "authors": ["string"],
            "year": "integer",
            "venue": "string",
            "citations": "string",
            "key_contribution": "string",
            "tags": ["string"]
        }
    ]
}

Output

{
  "list_name": "Machine Learning Reading List",
  "papers": [
    {
      "title": "Attention Is All You Need",
      "authors": ["Vaswani", "Shazeer", "Parmar"],
      "year": 2017,
      "venue": "NeurIPS",
      "citations": "90,000+",
      "key_contribution": "Introduced the Transformer architecture",
      "tags": ["transformers", "attention", "NLP"]
    },
    {
      "title": "ImageNet Classification with Deep CNNs",
      "authors": ["Krizhevsky", "Sutskever", "Hinton"],
      "year": 2012,
      "venue": "NeurIPS",
      "citations": "120,000+",
      "key_contribution": "Demonstrated deep CNNs on large-scale image recognition",
      "tags": ["CNN", "computer vision", "deep learning"]
    },
    {
      "title": "Playing Atari with Deep Reinforcement Learning",
      "authors": ["Mnih"],
      "year": 2013,
      "venue": "NIPS Workshop",
      "citations": "18,000+",
      "key_contribution": "Combined deep learning with reinforcement learning",
      "tags": ["RL", "DQN", "Atari"]
    }
  ]
}

HTML Extraction

NeurIPS 2024 — structured data from HTML markup: speaker objects from div.speaker elements, numeric attendance from span tags, nested objects

Input

text = """
<div class="conference">
  <h1>NeurIPS 2024</h1>
  <p>Dates: <time>December 10-15, 2024</time> |
  Venue: <span>Vancouver Convention Centre</span>, Canada</p>
  <p>Organized by <strong>NeurIPS Foundation</strong>.
  Expected attendance: <span>15,000</span> researchers from <span>90</span> countries.</p>
  <div class="speakers">
    <div class="speaker"><strong>Yann LeCun</strong> — Meta AI —
    Talk: <em>The Future of Self-Supervised Learning</em></div>
    <div class="speaker"><strong>Yoshua Bengio</strong> — Mila —
    Talk: <em>AI Safety and Alignment</em></div>
    <div class="speaker"><strong>Fei-Fei Li</strong> — Stanford —
    Talk: <em>Spatial Intelligence</em></div>
  </div>
</div>
"""

template = {
    "event_name": "string",
    "dates": "string",
    "venue": "string",
    "country": "string",
    "organizer": "string",
    "attendance": {
        "expected": "integer",
        "countries": "integer"
    },
    "keynote_speakers": [
        {
            "name": "string",
            "affiliation": "string",
            "talk_title": "string"
        }
    ]
}

Output

{
  "event_name": "NeurIPS 2024",
  "dates": "December 10-15, 2024",
  "venue": "Vancouver Convention Centre",
  "country": "Canada",
  "organizer": "NeurIPS Foundation",
  "attendance": {
    "expected": 15000,
    "countries": 90
  },
  "keynote_speakers": [
    {"name": "Yann LeCun", "affiliation": "Meta AI", "talk_title": "The Future of Self-Supervised Learning"},
    {"name": "Yoshua Bengio", "affiliation": "Mila", "talk_title": "AI Safety and Alignment"},
    {"name": "Fei-Fei Li", "affiliation": "Stanford", "talk_title": "Spatial Intelligence"}
  ]
}

XML Extraction

Apollo Program — XML attributes mapped to JSON fields: <period start="1961" end="1972"/> and <member role="commander"> resolved into typed schema fields

Input

text = """
<?xml version="1.0" encoding="UTF-8"?>
<project>
  <name>Apollo</name>
  <organization>NASA</organization>
  <period start="1961" end="1972"/>
  <budget_billion_usd>25.4</budget_billion_usd>
  <director>George Low</director>
  <missions>
    <mission>
      <name>Apollo 11</name>
      <date>1969-07-16</date>
      <result>success</result>
      <crew>
        <member role="commander">Neil Armstrong</member>
        <member role="lunar_module_pilot">Buzz Aldrin</member>
        <member role="command_module_pilot">Michael Collins</member>
      </crew>
    </mission>
    <mission>
      <name>Apollo 13</name>
      <date>1970-04-11</date>
      <result>failure</result>
      <crew>
        <member role="commander">Jim Lovell</member>
        <member role="lunar_module_pilot">Fred Haise</member>
        <member role="command_module_pilot">Jack Swigert</member>
      </crew>
    </mission>
  </missions>
</project>
"""

template = {
    "project_name": "string",
    "organization": "string",
    "director": "string",
    "period": {
        "start": "integer",
        "end": "integer"
    },
    "budget_billion_usd": "float",
    "missions": [
        {
            "name": "string",
            "date": "string",
            "result": "string",
            "crew": [
                {
                    "name": "string",
                    "role": "string"
                }
            ]
        }
    ]
}

Output

{
  "project_name": "Apollo",
  "organization": "NASA",
  "director": "George Low",
  "period": {"start": 1961, "end": 1972},
  "budget_billion_usd": 25.4,
  "missions": [
    {
      "name": "Apollo 11",
      "date": "1969-07-16",
      "result": "success",
      "crew": [
        {"name": "Neil Armstrong", "role": "commander"},
        {"name": "Buzz Aldrin", "role": "lunar_module_pilot"},
        {"name": "Michael Collins", "role": "command_module_pilot"}
      ]
    },
    {
      "name": "Apollo 13",
      "date": "1970-04-11",
      "result": "failure",
      "crew": [
        {"name": "Jim Lovell", "role": "commander"},
        {"name": "Fred Haise", "role": "lunar_module_pilot"},
        {"name": "Jack Swigert", "role": "command_module_pilot"}
      ]
    }
  ]
}
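
For well-formed XML with a known layout, the same mapping can be produced deterministically, which gives a reference to compare model output against. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened copy of the input; `parse_project` is a hypothetical helper and covers only part of the template.

```python
import xml.etree.ElementTree as ET

xml_text = """
<project>
  <name>Apollo</name>
  <period start="1961" end="1972"/>
  <missions>
    <mission>
      <name>Apollo 11</name>
      <crew>
        <member role="commander">Neil Armstrong</member>
        <member role="lunar_module_pilot">Buzz Aldrin</member>
      </crew>
    </mission>
  </missions>
</project>
"""


def parse_project(text):
    root = ET.fromstring(text.strip())
    period = root.find("period")
    return {
        "project_name": root.findtext("name"),
        # XML attributes are strings, so cast them to the integers the template asks for.
        "period": {"start": int(period.get("start")), "end": int(period.get("end"))},
        "missions": [
            {
                "name": mission.findtext("name"),
                "crew": [
                    {"name": member.text, "role": member.get("role")}
                    for member in mission.iter("member")
                ],
            }
            for mission in root.iter("mission")
        ],
    }


reference = parse_project(xml_text)
print(reference["period"])  # {'start': 1961, 'end': 1972}
```

A hand-written parser like this breaks as soon as the layout changes, which is the case the template-driven approach is meant to cover.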

Relation Extraction

Open-domain NER and relation extraction. No predefined label sets — entity types and relation types are inferred directly from the text.

ARM Holdings — multi-hop corporate chain: acquisition, licensing, blocked deal, IPO, and CEO succession all extracted from one paragraph

Input

text = """
In September 2016, Softbank acquired ARM Holdings for $32 billion, the largest
ever acquisition of a European technology company at the time. ARM's CEO Simon Segars
continued to lead the company under SoftBank ownership. Masayoshi Son, founder and CEO
of SoftBank, personally championed the deal. ARM Holdings, originally spun out of Acorn
Computers in 1990, licenses its chip architecture to Apple, Qualcomm, Samsung, and NVIDIA
among others. In 2020, NVIDIA announced an agreement to acquire ARM from SoftBank for
$40 billion, a deal that was ultimately blocked by regulators in the UK, EU, and US in
February 2022 following objections from Qualcomm, Google, and Microsoft. ARM subsequently
went public on the NASDAQ in September 2023, with SoftBank retaining a 90% stake.
Rene Haas replaced Simon Segars as CEO of ARM in February 2022 following the deal's collapse.
"""

template = {
    "entities": [
        {"entity": "string", "type": "string"}
    ],
    "relations": [
        {"head": "string", "relation": "string", "tail": "string"}
    ]
}

Output

{
  "entities": [
    {"entity": "Softbank", "type": "organization"},
    {"entity": "ARM Holdings", "type": "organization"},
    {"entity": "Simon Segars", "type": "person"},
    {"entity": "Masayoshi Son", "type": "person"},
    {"entity": "Acorn Computers", "type": "organization"},
    {"entity": "Apple", "type": "organization"},
    {"entity": "Qualcomm", "type": "organization"},
    {"entity": "Samsung", "type": "organization"},
    {"entity": "NVIDIA", "type": "organization"},
    {"entity": "Rene Haas", "type": "person"},
    {"entity": "NASDAQ", "type": "organization"}
  ],
  "relations": [
    {"head": "Softbank", "relation": "acquired", "tail": "ARM Holdings"},
    {"head": "ARM Holdings", "relation": "spun out of", "tail": "Acorn Computers"},
    {"head": "ARM Holdings", "relation": "licenses to", "tail": "Apple"},
    {"head": "ARM Holdings", "relation": "licenses to", "tail": "Qualcomm"},
    {"head": "ARM Holdings", "relation": "licenses to", "tail": "Samsung"},
    {"head": "ARM Holdings", "relation": "licenses to", "tail": "NVIDIA"},
    {"head": "NVIDIA", "relation": "announced agreement to acquire", "tail": "ARM Holdings"},
    {"head": "ARM Holdings", "relation": "went public on", "tail": "NASDAQ"},
    {"head": "Rene Haas", "relation": "replaced", "tail": "Simon Segars"}
  ]
}

Higgs boson / CERN — diverse entity types (particle, facility, experiment) across institutions, experiments, and individuals

Input

text = """
The discovery of the Higgs boson was announced on July 4, 2012, by CERN, the European
Organization for Nuclear Research, based in Geneva, Switzerland. The discovery was made
using the Large Hadron Collider, the world's largest particle accelerator, built in a
27-kilometer tunnel beneath the French-Swiss border. Two independent experiments confirmed
the discovery: ATLAS, led by Fabiola Gianotti, and CMS, led by Joe Incandela. Peter Higgs
of the University of Edinburgh and François Englert of the Université Libre de Bruxelles
were awarded the Nobel Prize in Physics in 2013 for their theoretical prediction of the
particle in 1964. Robert Brout, who co-authored the original paper with Englert, had died
in 2011 and was therefore ineligible for the prize. CERN's Director General at the time
of the announcement was Rolf Heuer.
"""

template = {
    "entities": [
        {"entity": "string", "type": "string"}
    ],
    "relations": [
        {"head": "string", "relation": "string", "tail": "string"}
    ]
}

Output

{
  "entities": [
    {"entity": "Higgs boson", "type": "particle"},
    {"entity": "CERN", "type": "organization"},
    {"entity": "Geneva", "type": "location"},
    {"entity": "Switzerland", "type": "location"},
    {"entity": "Large Hadron Collider", "type": "facility"},
    {"entity": "ATLAS", "type": "experiment"},
    {"entity": "Fabiola Gianotti", "type": "person"},
    {"entity": "CMS", "type": "experiment"},
    {"entity": "Joe Incandela", "type": "person"},
    {"entity": "Peter Higgs", "type": "person"},
    {"entity": "University of Edinburgh", "type": "organization"},
    {"entity": "François Englert", "type": "person"},
    {"entity": "Université Libre de Bruxelles", "type": "organization"},
    {"entity": "Robert Brout", "type": "person"},
    {"entity": "Rolf Heuer", "type": "person"}
  ],
  "relations": [
    {"head": "Higgs boson", "relation": "announced by", "tail": "CERN"},
    {"head": "CERN", "relation": "based in", "tail": "Geneva"},
    {"head": "Higgs boson", "relation": "discovered using", "tail": "Large Hadron Collider"},
    {"head": "ATLAS", "relation": "led by", "tail": "Fabiola Gianotti"},
    {"head": "CMS", "relation": "led by", "tail": "Joe Incandela"},
    {"head": "Peter Higgs", "relation": "affiliated with", "tail": "University of Edinburgh"},
    {"head": "François Englert", "relation": "affiliated with", "tail": "Université Libre de Bruxelles"},
    {"head": "Peter Higgs", "relation": "awarded", "tail": "Nobel Prize in Physics"},
    {"head": "François Englert", "relation": "awarded", "tail": "Nobel Prize in Physics"},
    {"head": "Peter Higgs", "relation": "predicted", "tail": "Higgs boson"},
    {"head": "François Englert", "relation": "predicted", "tail": "Higgs boson"},
    {"head": "Robert Brout", "relation": "co-authored with", "tail": "François Englert"},
    {"head": "CERN", "relation": "Director General", "tail": "Rolf Heuer"}
  ]
}
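
In open-domain mode, relation heads and tails are not guaranteed to appear in the entity list: in the output above, "Nobel Prize in Physics" is used as a tail but is not listed as an entity. If your pipeline needs a closed graph, a check along these lines can flag such cases. This is a sketch; `dangling_endpoints` is a hypothetical helper and the data is a shortened copy of the output.

```python
def dangling_endpoints(result):
    """Return relation heads/tails that are missing from the entity list."""
    known = {e["entity"] for e in result["entities"]}
    missing = []
    for rel in result["relations"]:
        for side in ("head", "tail"):
            if rel[side] not in known and rel[side] not in missing:
                missing.append(rel[side])
    return missing


result = {
    "entities": [
        {"entity": "Peter Higgs", "type": "person"},
        {"entity": "Higgs boson", "type": "particle"},
    ],
    "relations": [
        {"head": "Peter Higgs", "relation": "predicted", "tail": "Higgs boson"},
        {"head": "Peter Higgs", "relation": "awarded", "tail": "Nobel Prize in Physics"},
    ],
}

print(dangling_endpoints(result))  # ['Nobel Prize in Physics']
```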

OpenAI — dense organizational history: founding, funding, role transitions, board conflict, and resignations into 20 entities and 19 relations from a multi-paragraph document

Input

text = """
OpenAI was founded in December 2015 as a nonprofit by Sam Altman, Greg Brockman, Ilya
Sutskever, Wojciech Zaremba, John Schulman, and Elon Musk, with a $1 billion funding
commitment from Musk, Peter Thiel, Reid Hoffman, and Amazon Web Services. Musk resigned
from the board in 2018 citing potential conflicts of interest with Tesla's AI ambitions.

In 2019, OpenAI transitioned to a capped-profit model and received a $1 billion investment
from Microsoft. Ilya Sutskever, who had previously worked as a researcher at Google Brain
under Geoffrey Hinton, became OpenAI's Chief Scientist. Greg Brockman, who had previously
been CTO of Stripe, became OpenAI's President. Sam Altman, formerly president of Y Combinator,
became CEO.

GPT-4, released in March 2023, was developed under the leadership of Ilya Sutskever and
deployed in Microsoft's Azure OpenAI Service. In November 2023, the OpenAI board briefly
fired Sam Altman, citing concerns about his candor. Greg Brockman resigned in protest.
Altman was reinstated five days later following pressure from Microsoft and OpenAI employees.
Ilya Sutskever, who had voted to fire Altman, later expressed regret and subsequently
resigned in May 2024 to found Safe Superintelligence Inc. together with Daniel Gross
and Daniel Levy.
"""

template = {
    "entities": [
        {"entity": "string", "type": "string"}
    ],
    "relations": [
        {"head": "string", "relation": "string", "tail": "string"}
    ]
}

Output

{
  "entities": [
    {"entity": "OpenAI", "type": "organization"},
    {"entity": "Sam Altman", "type": "person"},
    {"entity": "Greg Brockman", "type": "person"},
    {"entity": "Ilya Sutskever", "type": "person"},
    {"entity": "Wojciech Zaremba", "type": "person"},
    {"entity": "John Schulman", "type": "person"},
    {"entity": "Elon Musk", "type": "person"},
    {"entity": "Peter Thiel", "type": "person"},
    {"entity": "Reid Hoffman", "type": "person"},
    {"entity": "Amazon Web Services", "type": "organization"},
    {"entity": "Tesla", "type": "organization"},
    {"entity": "Microsoft", "type": "organization"},
    {"entity": "Google Brain", "type": "organization"},
    {"entity": "Geoffrey Hinton", "type": "person"},
    {"entity": "Stripe", "type": "organization"},
    {"entity": "Y Combinator", "type": "organization"},
    {"entity": "GPT-4", "type": "technology"},
    {"entity": "Safe Superintelligence Inc.", "type": "organization"},
    {"entity": "Daniel Gross", "type": "person"},
    {"entity": "Daniel Levy", "type": "person"}
  ],
  "relations": [
    {"head": "OpenAI", "relation": "founded_by", "tail": "Sam Altman"},
    {"head": "OpenAI", "relation": "founded_by", "tail": "Greg Brockman"},
    {"head": "OpenAI", "relation": "founded_by", "tail": "Ilya Sutskever"},
    {"head": "OpenAI", "relation": "founded_by", "tail": "Elon Musk"},
    {"head": "OpenAI", "relation": "funding_from", "tail": "Peter Thiel"},
    {"head": "OpenAI", "relation": "funding_from", "tail": "Amazon Web Services"},
    {"head": "Elon Musk", "relation": "resigned_from", "tail": "OpenAI board"},
    {"head": "OpenAI", "relation": "received_investment_from", "tail": "Microsoft"},
    {"head": "Ilya Sutskever", "relation": "worked_at", "tail": "Google Brain"},
    {"head": "Ilya Sutskever", "relation": "became_role_at", "tail": "OpenAI Chief Scientist"},
    {"head": "Greg Brockman", "relation": "worked_as", "tail": "CTO of Stripe"},
    {"head": "Greg Brockman", "relation": "became_role_at", "tail": "OpenAI President"},
    {"head": "Sam Altman", "relation": "worked_as", "tail": "president of Y Combinator"},
    {"head": "GPT-4", "relation": "developed_under_leadership_of", "tail": "Ilya Sutskever"},
    {"head": "GPT-4", "relation": "deployed_in", "tail": "Microsoft's Azure OpenAI Service"},
    {"head": "Greg Brockman", "relation": "resigned_in_protest_of", "tail": "firing of Sam Altman"},
    {"head": "Ilya Sutskever", "relation": "voted_to_fire", "tail": "Sam Altman"},
    {"head": "Ilya Sutskever", "relation": "resigned", "tail": "May 2024"},
    {"head": "Ilya Sutskever", "relation": "founded", "tail": "Safe Superintelligence Inc."}
  ]
}
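
The triplets can be loaded into a simple adjacency structure for downstream use, such as building a knowledge graph. A minimal sketch, assuming every relation record carries the `head`, `relation`, and `tail` keys from the template; `group_by_head` is a hypothetical helper.

```python
from collections import defaultdict


def group_by_head(relations):
    """Map each head entity to the (relation, tail) pairs it participates in."""
    graph = defaultdict(list)
    for rel in relations:
        graph[rel["head"]].append((rel["relation"], rel["tail"]))
    return dict(graph)


relations = [
    {"head": "Greg Brockman", "relation": "worked_as", "tail": "CTO of Stripe"},
    {"head": "Greg Brockman", "relation": "became_role_at", "tail": "OpenAI President"},
    {"head": "Sam Altman", "relation": "worked_as", "tail": "president of Y Combinator"},
]

graph = group_by_head(relations)
print(graph["Greg Brockman"])
# [('worked_as', 'CTO of Stripe'), ('became_role_at', 'OpenAI President')]
```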

Constrained Relation Extraction

The open-domain template shown above infers entity and relation types freely from the text. For benchmarking and production pipelines where label sets are fixed, you can constrain the model by injecting allowed types directly into the prompt:

import json

TEMPLATE = json.dumps({
    "entities": [{"entity": "string", "type": "string"}],
    "relations": [{"head": "string", "relation": "string", "tail": "string"}]
}, indent=1)

def build_prompt(text: str, entity_types: list[str], relation_types: list[str]) -> str:
    et_str = ", ".join(entity_types)
    rt_str = ", ".join(relation_types)
    return (
        "/no_think\n"
        "Extract entities and relations from the following text according to the JSON template.\n\n"
        "Important rules:\n"
        "- If a field's value is not mentioned or cannot be found in the text, set it to null.\n"
        "- Do not infer, guess, or hallucinate values that are not explicitly stated.\n"
        "- For list fields with no values found, return [] not [null].\n"
        "- Entity text must be exact substrings from the input text.\n"
        f"- Entity types must be one of: {et_str}\n"
        f"- Relation types must be one of: {rt_str}\n\n"
        f"Template:\n{TEMPLATE}\n\n"
        f"Text:\n{text}\n\n"
        "Return only the extracted JSON, nothing else."
    )

This is the setup used to produce the benchmark results below.
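
Prompt-level constraints are instructions, not guarantees, so it can be useful to enforce them again on the parsed output. The sketch below is an assumption about how such a post-filter might look, not the evaluation code behind the benchmarks: `filter_prediction` is a hypothetical helper, the label names are made up for illustration, and dropping relations whose head or tail is not a kept entity is an extra rule that the prompt itself does not state.

```python
def filter_prediction(pred, text, entity_types, relation_types):
    """Keep only entities and relations that satisfy the prompt's constraints."""
    entities = [
        e for e in (pred.get("entities") or [])
        if e.get("type") in entity_types
        and e.get("entity")
        and e["entity"] in text  # entity text must be an exact substring
    ]
    names = {e["entity"] for e in entities}
    relations = [
        r for r in (pred.get("relations") or [])
        if r.get("relation") in relation_types
        and r.get("head") in names
        and r.get("tail") in names
    ]
    return {"entities": entities, "relations": relations}


text = "Marie Curie worked at the University of Paris."
pred = {
    "entities": [
        {"entity": "Marie Curie", "type": "person"},
        {"entity": "University of Paris", "type": "organization"},
        {"entity": "Sorbonne", "type": "organization"},  # not in the text
    ],
    "relations": [
        {"head": "Marie Curie", "relation": "works_at", "tail": "University of Paris"},
        {"head": "Marie Curie", "relation": "born_in", "tail": "University of Paris"},
    ],
}

clean = filter_prediction(pred, text, ["person", "organization"], ["works_at"])
print(len(clean["entities"]), len(clean["relations"]))  # 2 1
```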

Benchmarks

All benchmarks are zero-shot — the model was not trained on any of these datasets.

Benchmark Charts

[Bar charts: WL Graph F1, ROUGE-L, and Valid JSON rate (overall); WL Graph F1 by input length (256–1023, 1024–3999, and ≥4000 tokens); Valid JSON rate by input length (1024–3999 and ≥4000 tokens); Micro-F1 on CrossRE (test · ai/news/science) and DocRED (validation). The charted values are listed in the tables in the following sections.]

Relation Extraction

Evaluated on two standard RE benchmarks with constrained entity and relation type sets (see prompt format above).

CrossRE — cross-domain relation extraction. Evaluated on the test split, domains: ai, news, science.

DocRED — document-level relation extraction from Wikipedia and Wikidata. Evaluated on the validation split.

Dataset	Model	Micro-F1	Macro-F1	Precision	Recall
CrossRE	retrico-lm-4b	8.5	7.3	7.7	9.6
CrossRE	fastino/gliner2-large-v1	2.0	2.0	2.3	1.7
DocRED	retrico-lm-4b	14.5	6.7	13.3	15.9
DocRED	fastino/gliner2-large-v1	13.8	6.9	13.0	14.6
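
As a worked check on the table, and assuming the Precision and Recall columns are micro-averaged (the source does not say so), Micro-F1 should equal their harmonic mean. Recomputing it from the rounded values reproduces the reported figures to within rounding:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (both on the same 0-100 scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (dataset, model, precision, recall, reported Micro-F1), copied from the table above
rows = [
    ("CrossRE", "retrico-lm-4b", 7.7, 9.6, 8.5),
    ("CrossRE", "fastino/gliner2-large-v1", 2.3, 1.7, 2.0),
    ("DocRED", "retrico-lm-4b", 13.3, 15.9, 14.5),
    ("DocRED", "fastino/gliner2-large-v1", 13.0, 14.6, 13.8),
]

for dataset, model, p, r, reported in rows:
    print(f"{dataset:8s} {model:26s} F1={f1(p, r):.2f} (reported {reported})")
```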

Comparison with Large Language Models — Human-Annotated Eval Split

Evaluated on an internal held-out set with human-annotated ground truth. Metrics:

WL Graph F1 — graph-based metric that converts predicted and reference JSON into trees, computes semantic node embeddings, and propagates via Weisfeiler-Leman message passing. Captures both structural correctness and semantic similarity of extracted values.
ROUGE-L — longest common subsequence overlap between predicted and reference JSON strings.
Valid JSON Rate — fraction of outputs that parse as valid JSON.
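
The two simpler metrics can be sketched in a few lines. This is an illustration, not the evaluation code used for the tables in this section: whitespace tokenization and the F1 form of ROUGE-L are assumptions, since the card does not specify either, and the function names are hypothetical.

```python
import json


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction, reference):
    """ROUGE-L as the F1 of LCS-based precision and recall over whitespace tokens."""
    pred, ref = prediction.split(), reference.split()
    lcs = lcs_length(pred, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(pred), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def valid_json_rate(outputs):
    """Fraction of output strings that parse as JSON."""
    def parses(s):
        try:
            json.loads(s)
            return True
        except json.JSONDecodeError:
            return False
    return sum(parses(s) for s in outputs) / len(outputs)


print(rouge_l_f1("a b c d", "a c d e"))  # 0.75
```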

Model	WL Graph F1	ROUGE-L	Valid JSON Rate
retrico-lm-4b	0.7606	0.5323	96.0%
openai/gpt-oss-120b	0.7868	0.5204	98.6%
Meta-Llama-3.3-70B-Instruct	0.7837	0.5503	98.9%
DeepSeek-V3.1	0.7821	0.5253	96.8%
Qwen3-32B	0.7329	0.4852	93.9%
numind/NuExtract3	0.7302	0.3747	92.5%

Valid JSON Rate by input length:

Token bucket	gpt-oss-120b	Llama-3.3-70B	DeepSeek-V3.1	retrico-lm-4b	Qwen3-32B	NuExtract3
256–1023	100%	100%	100%	100%	100%	98.8%
1024–3999	100%	98.7%	97.3%	98.0%	97.1%	92.1%
≥4000	93.0%	66.7%	20.0%	33.3%	66.7%	33.3%

WL Graph F1 by input length:

Token bucket	gpt-oss-120b	Llama-3.3-70B	DeepSeek-V3.1	retrico-lm-4b	Qwen3-32B	NuExtract3
256–1023	34.3%	74.5%	76.2%	74.8%	61.2%	78.3%
1024–3999	82.2%	83.6%	83.5%	82.5%	76.7%	75.9%
≥4000	76.4%	25.1%	7.9%	34.1%	53.8%	7.5%

Citation

@misc{knowledgator2025retrico,
  title={retrico-lm: Schema-Guided Structured Information Extraction},
  author={Knowledgator Engineering},
  year={2025},
  url={https://huggingface.co/knowledgator}
}
