BIBFRAME-OLMo 1B

A fine-tuned 1B parameter language model for correcting malformed BIBFRAME RDF/XML to produce valid, well-formed output following Library of Congress conventions.

Model Details

Property        Value
Base Model      amd/AMD-OLMo-1B
Parameters      1.2B
Training        LoRA fine-tuning, merged for deployment
Training Data   ~8,500 Library of Congress BIBFRAME records
Task            BIBFRAME RDF/XML correction
License         Apache 2.0

Quick Start

VS Code Extension (Recommended)

The easiest way to use this model is through the BIBFRAME Vibe VS Code extension:

  1. Install the extension from the VS Code marketplace
  2. Configure in VS Code settings:
    {
      "bf.huggingFaceModel": "jimfhahn/bibframe-olmo-1b",
      "bf.huggingFaceToken": "hf_your_token_here"
    }
    
  3. Use @bf-vibe /correct in GitHub Copilot Chat to fix BIBFRAME records

Inference Endpoints (Production)

Deploy your own endpoint for production use:

  1. Click Deploy → Inference Endpoints above
  2. Select Text Generation Inference (TGI)
  3. Choose instance: nvidia-t4 (recommended) or cpu-xlarge
  4. Configure in VS Code:
    {
      "bf.huggingFaceEndpoint": "https://your-endpoint.us-east-1.aws.endpoints.huggingface.cloud",
      "bf.huggingFaceToken": "hf_your_token_here"
    }
    

Python API

from transformers import pipeline

pipe = pipeline("text-generation", model="jimfhahn/bibframe-olmo-1b")

prompt = """<|system|>
You are a BIBFRAME expert. Fix the following malformed RDF/XML to produce valid BIBFRAME.
<|user|>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:bf="http://id.loc.gov/ontologies/bibframe/">
  <bf:Work>
    <bf:title>Example Book</bf:title>
  </bf:Work>
</rdf:RDF>
<|assistant|>
"""

# Low temperature keeps the correction close to the input structure;
# do_sample=True is needed for temperature to take effect, and
# return_full_text=False strips the prompt from the returned text.
result = pipe(prompt, max_new_tokens=1024, temperature=0.1,
              do_sample=True, return_full_text=False)
print(result[0]["generated_text"])

cURL (Inference API)

curl https://router.huggingface.co/hf-inference/models/jimfhahn/bibframe-olmo-1b \
  -X POST \
  -H "Authorization: Bearer $HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": "<|system|>\nFix the BIBFRAME RDF/XML.\n<|user|>\n<your-rdf-here>\n<|assistant|>\n",
    "parameters": {"max_new_tokens": 1024, "temperature": 0.1}
  }'
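The same request can be issued from Python with only the standard library. This is a sketch of building (not sending) the request; the endpoint URL matches the cURL example above, and the token value is a placeholder you must replace:

```python
import json
import urllib.request

API_URL = "https://router.huggingface.co/hf-inference/models/jimfhahn/bibframe-olmo-1b"
HF_TOKEN = "hf_your_token_here"  # placeholder: substitute your own token

# Same payload shape as the cURL example.
payload = {
    "inputs": "<|system|>\nFix the BIBFRAME RDF/XML.\n<|user|>\n<your-rdf-here>\n<|assistant|>\n",
    "parameters": {"max_new_tokens": 1024, "temperature": 0.1},
}

req = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Authorization": f"Bearer {HF_TOKEN}",
             "Content-Type": "application/json"},
)
# response = urllib.request.urlopen(req)  # uncomment to actually send
```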

What It Fixes

This model corrects common BIBFRAME errors:

  • ❌ Missing required properties (bf:title, bf:adminMetadata)
  • ❌ Wrong namespace prefixes (bibframe: → bf:)
  • ❌ Literal values where resources expected
  • ❌ Missing rdf:type declarations
  • ❌ Invalid property nesting
  • ❌ Malformed URIs
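Several of these errors (unbound prefixes, broken nesting, malformed markup) show up as XML well-formedness failures, which you can detect cheaply before or after inference. A minimal sketch using the standard library — note this checks XML well-formedness only, not BIBFRAME validity, which still needs SHACL:

```python
import xml.etree.ElementTree as ET

def is_well_formed(rdf_xml: str) -> bool:
    """Return True if the string parses as XML at all."""
    try:
        ET.fromstring(rdf_xml)
        return True
    except ET.ParseError:
        return False

# bf: is never declared here, so parsing fails with an unbound-prefix error.
broken = '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"><bf:Work/></rdf:RDF>'
fixed = ('<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
         'xmlns:bf="http://id.loc.gov/ontologies/bibframe/"><bf:Work/></rdf:RDF>')
```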

Prompt Format

The model expects this chat format:

<|system|>
You are a BIBFRAME expert. Fix the following malformed RDF/XML to produce valid BIBFRAME.
<|user|>
[Your invalid RDF/XML here]
<|assistant|>
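The template above is easy to assemble programmatically. A small hypothetical helper (not part of any published package) that wraps a record in this chat format:

```python
SYSTEM = ("You are a BIBFRAME expert. Fix the following malformed "
          "RDF/XML to produce valid BIBFRAME.")

def build_prompt(rdf_xml: str) -> str:
    """Wrap an RDF/XML record in the model's expected chat template."""
    return f"<|system|>\n{SYSTEM}\n<|user|>\n{rdf_xml}\n<|assistant|>\n"

prompt = build_prompt("<rdf:RDF>...</rdf:RDF>")
```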

Training Data

Trained on jimfhahn/bibframe-corrections:

  • Source: Library of Congress (id.loc.gov)
  • Records: ~4,100 Works + ~5,000 Instances
  • Diversity: 102 facets (subjects, languages, time periods, formats, genres)
  • Method: Synthetic corruptions → model learns to restore valid RDF/XML
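The exact corruption pipeline is not documented in this card, but one corruption type from the list above (wrong namespace prefixes) can be sketched: rename the element prefixes while deliberately leaving the xmlns:bf declaration untouched, so the record becomes invalid and the model must learn to restore it. An illustrative sketch, not the published training code:

```python
def corrupt_prefix(rdf_xml: str) -> str:
    """Illustrative corruption: rewrite bf: element prefixes to bibframe:
    without updating the xmlns:bf declaration, yielding unbound prefixes."""
    return rdf_xml.replace("<bf:", "<bibframe:").replace("</bf:", "</bibframe:")

record = ('<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
          'xmlns:bf="http://id.loc.gov/ontologies/bibframe/"><bf:Work/></rdf:RDF>')
corrupted = corrupt_prefix(record)
```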

Limitations

  • Trained exclusively on Library of Congress BIBFRAME; may not generalize to other implementations
  • Cannot fix semantic errors (wrong subject headings), only structural/syntactic issues
  • Large RDF documents may exceed context length (4096 tokens)
  • Recommendation: Validate output with SHACL shapes before production use
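For the context-length limitation, a cheap pre-check can flag records that are unlikely to fit before you spend an inference call. This sketch uses a rough characters-per-token heuristic (an assumption, not the model's actual tokenizer); for exact counts, load the model's tokenizer instead:

```python
MAX_TOKENS = 4096            # model context window (per the limitations above)
RESERVED_FOR_OUTPUT = 1024   # budget for the corrected record
CHARS_PER_TOKEN = 4          # rough heuristic; use the real tokenizer for exact counts

def fits_in_context(rdf_xml: str) -> bool:
    """Estimate whether a record plus its correction fits the context window."""
    est_input_tokens = len(rdf_xml) / CHARS_PER_TOKEN
    return est_input_tokens + RESERVED_FOR_OUTPUT <= MAX_TOKENS
```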

Ecosystem

Project                Description
BIBFRAME Vibe          VS Code extension for BIBFRAME cataloging
mcp4rdf-core           SHACL validation service
bibframe-corrections   Training dataset
bibframe-olmo-1b-v2    Original LoRA adapter

Citation

@misc{bibframe-olmo-2026,
  author = {Hahn, Jim},
  title = {BIBFRAME-OLMo-1B: Fine-tuned OLMo for BIBFRAME Correction},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/jimfhahn/bibframe-olmo-1b}
}

License

Apache 2.0
