Acervo Extractor — Qwen3.5-9B Fine-Tuned

A fine-tuned version of Qwen3.5-9B specialized in knowledge graph extraction from conversations. Given a conversation turn and existing graph context, the model outputs structured JSON with topic classification, entities, relations, and facts.

Built for Acervo — a semantic compression layer for AI agents that replaces raw conversation history with compressed knowledge graph nodes.

What it does

Input: A conversation message + existing graph nodes as context

Output: Structured JSON with:

  • Topic classification — same / subtopic / changed
  • Entities — people, projects, technologies, events, places, etc.
  • Relations — uses_technology, maintains, part_of, participated_in, etc.
  • Facts — specific claims attached to existing entities
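
The four components above can be sketched as typed Python. The field names follow the worked example in this card; the annotations themselves are an assumption for illustration, not a published schema.

```python
from typing import Optional, TypedDict

class Topic(TypedDict):
    action: str  # "same" | "subtopic" | "changed"

class Entity(TypedDict):
    id: str
    label: str
    type: str                   # one of the 8 entity types
    layer: str                  # "PERSONAL" | "UNIVERSAL"
    attributes: dict
    facts: list
    existing_id: Optional[str]  # set when the entity matches an existing node

class Relation(TypedDict):
    source: str
    target: str
    relation: str               # e.g. "uses_technology"

class Fact(TypedDict):
    entity: str                 # id of the node the fact attaches to
    text: str
    speaker: str                # e.g. "user"

class Extraction(TypedDict):
    topic: Topic
    entities: list[Entity]
    relations: list[Relation]
    facts: list[Fact]
```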

Example

Input:

EXISTING NODES:
[{"id": "beacon", "label": "Beacon", "type": "project", "layer": "PERSONAL"}]

TOPIC HINT: same (high confidence from keyword match)
CURRENT TOPIC: Beacon development

PREVIOUS ASSISTANT: How's the project going?
USER: Beacon ya tiene 50 mil usuarios y estamos migrando a Kubernetes.

Output:

{
  "topic": {"action": "same"},
  "entities": [
    {
      "id": "kubernetes",
      "label": "Kubernetes",
      "type": "technology",
      "layer": "UNIVERSAL",
      "attributes": {},
      "facts": [],
      "existing_id": null
    }
  ],
  "relations": [
    {"source": "beacon", "target": "kubernetes", "relation": "uses_technology"}
  ],
  "facts": [
    {"entity": "beacon", "text": "Has 50,000 users", "speaker": "user"}
  ]
}
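
A consumer then has to fold this output back into the graph. Below is a minimal merge sketch; `merge` is a hypothetical helper (not part of Acervo's published API) that creates new nodes, reuses `existing_id` to avoid duplicates, and attaches facts to known nodes:

```python
def merge(nodes: list[dict], extraction: dict) -> list[dict]:
    """Fold one extraction result into a list of graph nodes (hypothetical)."""
    by_id = {n["id"]: n for n in nodes}
    for ent in extraction.get("entities", []):
        # existing_id means "this is a node we already have": reuse it
        node_id = ent.get("existing_id") or ent["id"]
        node = by_id.setdefault(node_id, {
            "id": node_id,
            "label": ent["label"],
            "type": ent["type"],
            "layer": ent["layer"],
            "facts": [],
        })
        node.setdefault("facts", []).extend(ent.get("facts", []))
    for fact in extraction.get("facts", []):
        node = by_id.get(fact["entity"])
        if node is not None:  # facts only attach to known nodes
            node.setdefault("facts", []).append(fact["text"])
    return list(by_id.values())
```

Run on the example above, this would add a `kubernetes` node and attach "Has 50,000 users" to the existing `beacon` node.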

Key capabilities

  • Bilingual — handles English and Spanish input natively
  • Empty output — returns empty arrays for small talk and pure queries (no hallucinated entities)
  • Dedup awareness — references existing nodes via existing_id instead of creating duplicates
  • Implicit references — maps "our project", "the app", "Alice's work" to existing graph nodes
  • Event extraction — creates event nodes with participants, narrative position, and chronological markers
  • Controlled vocabulary — uses strict enums for entity types (8) and relation types (16)
  • Topic detection — classifies same/subtopic/changed, with an optional hint from upstream classifiers

Training details

  • Base model — Qwen/Qwen3.5-9B
  • Method — LoRA (QLoRA, 4-bit)
  • Framework — Unsloth + Transformers
  • Dataset size — ~582 examples (450 base + 112 supplementary + 20 stress test)
  • Training — initial 3 epochs (lr=2e-4) + incremental 2 epochs (lr=5e-5)
  • Max sequence length — 2048
  • Languages — English (70%), Spanish (30%)

Dataset composition

  • Facts about existing entities (30%) — "Our project has 50k users" → fact on an existing node
  • New entity extraction (20%) — first mentions of people, projects, technologies
  • Empty output: small talk / queries (15%) — "Thanks!", "What tech does X use?" → empty arrays
  • Topic changes (10%) — implicit and explicit topic switches
  • Subtopic shifts (10%) — diving deeper into an aspect of the current topic
  • Literary events (5%) — events with narrative_position and chronological_marker
  • Corrections / updates (5%) — "We switched from React to Vue"
  • Dedup / existing references (5%) — "nuestro proyecto" ("our project") → existing_id: "beacon"
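
For illustration, one training example in this distribution can be assembled from the prompt template shown under Usage. `build_example` is a hypothetical helper, not part of the released training code:

```python
import json

SYSTEM = (
    "You are a knowledge extractor for a personal knowledge graph. "
    "Analyze the conversation and return a single JSON object with topic "
    "classification, entities, relations, and facts. Output valid JSON "
    "only, no markdown, no explanation."
)

def build_example(nodes, topic_hint, current_topic, prev_assistant, user_msg, target):
    """Assemble one chat-format example (hypothetical helper)."""
    user = (
        f"EXISTING NODES:\n{json.dumps(nodes)}\n\n"
        f"TOPIC HINT: {topic_hint}\nCURRENT TOPIC: {current_topic}\n\n"
        f"PREVIOUS ASSISTANT: {prev_assistant}\nUSER: {user_msg}"
    )
    return {"messages": [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": user},
        # The target extraction is serialized as the assistant turn
        {"role": "assistant", "content": json.dumps(target)},
    ]}
```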

Schema

Entity types (enum)

person, organization, project, technology, place, event, document, concept

Relation types (enum)

part_of, created_by, maintains, works_at, member_of,
uses_technology, depends_on, alternative_to,
located_in, deployed_on, produces, serves, documented_in,
participated_in, triggered_by, resulted_in

Layers

  • PERSONAL — user owns, created, or directly uses it
  • UNIVERSAL — public knowledge (technologies, fictional characters, cities)
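
Because the vocabulary is closed, client code can reject off-schema output before it reaches the graph. The enum values below are copied from this card; `validate` itself is a hypothetical helper:

```python
ENTITY_TYPES = {"person", "organization", "project", "technology",
                "place", "event", "document", "concept"}
RELATION_TYPES = {"part_of", "created_by", "maintains", "works_at",
                  "member_of", "uses_technology", "depends_on",
                  "alternative_to", "located_in", "deployed_on",
                  "produces", "serves", "documented_in",
                  "participated_in", "triggered_by", "resulted_in"}
LAYERS = {"PERSONAL", "UNIVERSAL"}

def validate(extraction: dict) -> list[str]:
    """Return a list of schema violations (empty means valid)."""
    errors = []
    for ent in extraction.get("entities", []):
        if ent.get("type") not in ENTITY_TYPES:
            errors.append(f"bad entity type: {ent.get('type')}")
        if ent.get("layer") not in LAYERS:
            errors.append(f"bad layer: {ent.get('layer')}")
    for rel in extraction.get("relations", []):
        if rel.get("relation") not in RELATION_TYPES:
            errors.append(f"bad relation: {rel.get('relation')}")
    return errors
```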

Usage

With Transformers + LoRA

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the fine-tuned LoRA adapter on top
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3.5-9B", device_map="auto")
model = PeftModel.from_pretrained(base_model, "sandyeveliz/acervo-extractor-qwen3.5-9b")
tokenizer = AutoTokenizer.from_pretrained("sandyeveliz/acervo-extractor-qwen3.5-9b")

messages = [
    {"role": "system", "content": "You are a knowledge extractor for a personal knowledge graph. Analyze the conversation and return a single JSON object with topic classification, entities, relations, and facts. Output valid JSON only, no markdown, no explanation."},
    {"role": "user", "content": "EXISTING NODES:\n[]\n\nTOPIC HINT: unresolved\nCURRENT TOPIC: null\n\nPREVIOUS ASSISTANT: null\nUSER: I work at Acme Corp building a React app called Beacon with PostgreSQL."}
]

inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)
outputs = model.generate(inputs.to(model.device), max_new_tokens=1024, temperature=0.1)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
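
The model is trained to emit bare JSON, but generated text can still occasionally arrive wrapped in markdown fences; a defensive parse is cheap. This is a sketch, not part of the released code:

```python
import json

def parse_extraction(text: str) -> dict:
    """Parse model output into a dict, tolerating stray markdown fences."""
    text = text.strip()
    if text.startswith("```"):
        # drop an opening fence like ```json and the closing fence
        text = text.split("\n", 1)[1]
        text = text.rsplit("```", 1)[0]
    return json.loads(text)
```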

With Unsloth (recommended for inference)

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "sandyeveliz/acervo-extractor-qwen3.5-9b",
    max_seq_length=2048, load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

With Acervo (intended use)

from acervo import Acervo, OpenAIClient

llm = OpenAIClient(base_url="http://localhost:1234/v1", model="acervo-extractor")
memory = Acervo(llm=llm, owner="user")

Intended use

This model is designed as the extraction component inside Acervo, a semantic compression layer for AI agents. It replaces general-purpose LLM calls for topic detection and entity extraction with a specialized, faster model.

It can also be used standalone for:

  • Building knowledge graphs from conversations
  • Structured entity/relation extraction from text
  • Topic detection in multi-turn dialogues

License

Apache 2.0 — same as the base model.
