DEPRECATED — Please use gemma-4-E2B-sec-extraction-GGUF-v2 instead.

v2 was trained on a combined instruction + corrective dataset (3,957 examples vs. 2,726) and shows measurably stronger extraction quality, including a 2 percentage point reduction in hallucination rate versus the base Gemma 4 E2B model (10.7% vs. 12.7%). v2 also improves symbol compliance (+0.9%), reduces bare number errors (-0.8%), and eliminates year-as-value hallucinations entirely.

This v1 model remains available for reproducibility but is no longer recommended for production use.


Gemma 4 E2B — SEC Financial Extraction (v1, GGUF) [DEPRECATED]

A fine-tuned Gemma 4 E2B model specialized for extracting structured financial data from SEC Exhibit 10 material contracts. Quantized to Q4_K_M GGUF for efficient local inference.

What This Model Does

Given raw text from an SEC filing (employment agreements, credit facilities, merger agreements, etc.), this model extracts structured JSON containing:

  • Metadata — effective dates and contracting party names
  • Financial terms — dollar amounts and percentages classified into 13 categories (salary, bonus, severance, equity_grant, credit_facility, interest_rate, etc.)
  • Debt covenants — financial maintenance tests classified into 7 categories (leverage_ratio, interest_coverage, debt_service, net_worth, etc.)
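For illustration, a minimal sketch of the kind of JSON this produces. The exact field names (`metadata`, `financial_terms`, `covenants`, etc.) are assumptions based on the categories above, not the model's documented schema:

```python
import json

# Hypothetical extraction output; field names are illustrative assumptions.
example = {
    "metadata": {
        "effective_date": "2024-01-15",
        "parties": ["Acme Corp.", "Jane Doe"],
    },
    "financial_terms": [
        {"type": "salary", "value": "$750,000", "context": "annual base salary"},
        {"type": "interest_rate", "value": "4.25%", "context": "applicable margin"},
    ],
    "covenants": [
        {"type": "leverage_ratio", "value": "3.50 to 1.00"},
    ],
}

print(json.dumps(example, indent=2))
```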

Why You Should Use v2 Instead

| Metric | v1 | v2 | Delta |
|---|---|---|---|
| Hallucination phrase rate | n/a | 10.7% (vs. 12.7% base) | -2.0 pp |
| Symbol compliance | n/a | 84.3% (vs. 83.4% base) | +0.9 pp |
| Bare number rate | n/a | 8.8% (vs. 9.6% base) | -0.8 pp |
| Year-as-value errors | n/a | 0 (vs. 1 base) | Eliminated |
| Training examples | 2,726 | 3,957 | +45% |
| Training signal | Positive only | Positive + corrective + hard negatives | Richer |

(n/a: v1 value not reported; the delta for that metric is measured against the base model.)

v2 is a strict upgrade: same base model, same hardware requirements, and better extraction quality across every measured dimension.

Upgrade: TheTokenFactory/gemma-4-E2B-sec-extraction-GGUF-v2

Usage

LM Studio

  1. Download gemma-4-E2B-it.Q4_K_M.gguf (3.4 GB)
  2. Import into LM Studio
  3. Set GPU Layers to max (35/35), Context Length to 4096
  4. Send extraction prompts via the chat API at http://localhost:1234/v1

Python (via OpenAI-compatible API)

from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="gemma",
    temperature=0.1,
    messages=[
        {"role": "system", "content": "You are a financial analyst AI. Extract ALL monetary dollar amounts and financial percentages. Output strictly as JSON."},
        {"role": "user", "content": "<contract text here>"},
    ],
)
print(response.choices[0].message.content)
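Even with a strict system prompt, the model may wrap its JSON in a Markdown code fence. A small helper to recover the JSON payload either way (this helper is an assumption for downstream use, not part of the model card):

```python
import json
import re

def parse_extraction(raw: str) -> dict:
    """Strip an optional Markdown code fence and parse the JSON payload."""
    fenced = re.search(r"`{3}(?:json)?\s*(.*?)`{3}", raw, re.DOTALL)
    text = fenced.group(1) if fenced else raw
    return json.loads(text.strip())

# Works on both fenced and bare model output:
fence = "`" * 3
raw = f"{fence}json\n" + '{"financial_terms": []}\n' + fence
print(parse_extraction(raw))                        # -> {'financial_terms': []}
print(parse_extraction('{"financial_terms": []}'))  # -> {'financial_terms': []}
```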

Training Details

| Parameter | Value |
|---|---|
| Base model | unsloth/gemma-4-E2B-it |
| Method | QLoRA (4-bit) via Unsloth |
| LoRA rank | 8 |
| LoRA alpha | 8 |
| Epochs | 3 |
| Learning rate | 2e-4 |
| Max sequence length | 2,048 tokens |
| Training examples | 2,726 (positive only) |
| Quantization | Q4_K_M |
| Hardware | Google Colab T4 (16 GB VRAM) |
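With rank 8 and alpha 8, the LoRA scaling factor alpha/r is 1.0. A pure-Python sketch of the bookkeeping this implies (illustrative only; the projection shape below is hypothetical, not Gemma's actual dimensions):

```python
# Illustrative LoRA bookkeeping, not the actual training code.

def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """A (d_out x d_in) weight gains adapters A (r x d_in) and B (d_out x r)."""
    return r * d_in + d_out * r

r, alpha = 8, 8
scaling = alpha / r  # delta_W = (alpha / r) * B @ A is added to the frozen weight
print(scaling)  # -> 1.0

# e.g. a hypothetical 2048 x 2048 projection:
print(lora_trainable_params(2048, 2048, r))  # -> 32768
```

Because alpha equals the rank here, the low-rank update is applied at full strength without rescaling.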

Financial Term Types (13 categories)

`salary`, `bonus`, `severance`, `retirement_benefit`, `equity_grant`, `credit_facility`, `loan_amount`, `interest_rate`, `fee`, `threshold`, `purchase_price`, `compensation`, `other`

Covenant Types (7 categories)

`leverage_ratio`, `interest_coverage`, `debt_service`, `net_worth`, `liquidity`, `fixed_charge`, `other`

Hardware Requirements

| Setup | VRAM | Notes |
|---|---|---|
| RTX 4050 / 4060 (6 GB) | 3.4 GB model + KV cache | Full GPU offload, 4096 context |
| RTX 3060 / 4070 (8+ GB) | | Comfortable headroom |
| CPU-only | ~4 GB RAM | Slower, but works |

Limitations

  • Temporal scope: Trained on S&P 500 filings from a 6-month window
  • Universe: Large-cap US equities only (S&P 500)
  • Language: English only
  • Label quality: Silver-standard (model-generated, not human-annotated)
  • No corrective training: v1 was trained only on positive examples, without the corrective/hard-negative signal that improves v2

License

CC-BY-4.0. SEC filings are public domain; this model's value is in the fine-tuning for structured extraction.


