LLM4Variants-Llama-3.2-1B-Instruct

Dual-head sentence classifier that predicts the ACMG evidence code and its strength from ClinVar submission comments. This checkpoint placed #2 in the grid search, where configurations were ranked by joint test accuracy.

The model wraps the backbone meta-llama/Llama-3.2-1B-Instruct with two heads on top of mean-pooled hidden states:

code head: 28-way ACMG evidence code (PVS1, PS1–PS4, PM1–PM6, PP1–PP5, BA1, BS1–BS4, BP1–BP7 + NO_KEYWORD)
strength head: 6-way strength (Supporting, Moderate, Strong, VeryStrong, NotMet, NoStrength), conditioned on a learned embedding of the predicted code
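The two-head layout described above can be sketched roughly as follows. This is an illustrative stand-in, not the released DualHeadLLM: the class name `DualHeadSketch`, the single linear layer per head, the concatenation of the pooled vector with the code embedding, and the masked mean pooling are all assumptions. Random tensors replace the backbone's hidden states so the example runs without downloading any weights.

```python
import torch
import torch.nn as nn


class DualHeadSketch(nn.Module):
    """Illustrative two-head classifier; not the released DualHeadLLM."""

    def __init__(self, hidden_size, num_keywords=28, num_strengths=6, code_emb_dim=64):
        super().__init__()
        # Assumption: each head is a single linear layer.
        self.code_head = nn.Linear(hidden_size, num_keywords)
        self.code_embeddings = nn.Embedding(num_keywords, code_emb_dim)
        self.strength_head = nn.Linear(hidden_size + code_emb_dim, num_strengths)

    def forward(self, hidden_states, attention_mask):
        # Mean-pool over real tokens only, ignoring padding positions.
        mask = attention_mask.unsqueeze(-1).to(hidden_states.dtype)
        pooled = (hidden_states * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)

        code_logits = self.code_head(pooled)
        predicted_code = code_logits.argmax(dim=-1)

        # Condition the strength head on an embedding of the predicted code
        # (assumption: conditioning is done by concatenation).
        code_vec = self.code_embeddings(predicted_code)
        strength_logits = self.strength_head(torch.cat([pooled, code_vec], dim=-1))
        return code_logits, strength_logits


# Random tensors stand in for the backbone's last hidden states.
batch, seq_len, hidden = 2, 5, 16
hidden_states = torch.randn(batch, seq_len, hidden)
attention_mask = torch.tensor([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])

model = DualHeadSketch(hidden_size=hidden)
code_logits, strength_logits = model(hidden_states, attention_mask)
print(code_logits.shape, strength_logits.shape)
```

The attribute names code_head, code_embeddings, and strength_head follow the state-dict components listed under Files; everything else in the sketch is a guess at one reasonable design.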

Test metrics

Metric	Value
Code accuracy	0.9270
Strength accuracy	0.9346
Joint accuracy	0.8773
Strength accuracy given correct code	0.9464
Code weighted-F1	0.9265
Strength weighted-F1	0.9322
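As a quick consistency check on the table, assume joint accuracy counts an example as correct only when both the code and the strength are right. Under that assumption it factors into code accuracy times strength accuracy on the correctly coded examples, and the reported numbers agree to four decimal places:

```python
# Values copied from the test metrics table.
code_accuracy = 0.9270
strength_given_correct_code = 0.9464
reported_joint_accuracy = 0.8773

# P(both correct) = P(code correct) * P(strength correct | code correct)
joint_accuracy = code_accuracy * strength_given_correct_code
print(round(joint_accuracy, 4))
```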

Training configuration

Hyperparameter	Value
Learning rate	0.0001
Effective batch size	128
Epochs	8
Max length	256
λ (strength-loss weight)	1.0
Code embedding dim	64
Negative ratio	0.25
Seed	42
Train / val / test size	19161 / 1278 / 5110
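One plausible reading of λ is a weight on the strength loss in a combined objective, code loss plus λ times strength loss. The sketch below assumes that form and assumes cross-entropy for both heads; neither is confirmed by this card, and the function name `dual_head_loss` is made up for illustration.

```python
import torch
import torch.nn.functional as F


def dual_head_loss(code_logits, strength_logits, code_labels, strength_labels, lam=1.0):
    """Assumed objective: code cross-entropy + lam * strength cross-entropy."""
    code_loss = F.cross_entropy(code_logits, code_labels)
    strength_loss = F.cross_entropy(strength_logits, strength_labels)
    return code_loss + lam * strength_loss


# Random logits and labels, shaped like a batch of 4 examples.
torch.manual_seed(42)
code_logits = torch.randn(4, 28)
strength_logits = torch.randn(4, 6)
code_labels = torch.tensor([0, 5, 12, 27])
strength_labels = torch.tensor([0, 1, 2, 5])

loss = dual_head_loss(code_logits, strength_logits, code_labels, strength_labels, lam=1.0)
print(loss.item())
```

With λ = 1.0, as in the table, the two heads would contribute equally to the gradient under this formulation.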

Files

model.safetensors — full state dict (backbone + code_head + code_embeddings + strength_head).
label_mappings.json — keyword2id / strength2id (and reverse).
tokenizer files + chat_template.jinja.
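To turn predicted class ids back into labels, the mappings file can be read and inverted. The sketch below assumes label_mappings.json has top-level keys keyword2id and strength2id, each mapping a label string to an integer id; the helper names and the toy mapping are hypothetical, and the toy ids are not the real ones.

```python
import json
import os
import tempfile


def load_label_maps(path):
    """Read the mappings file and build id -> label lookups by inversion."""
    with open(path, encoding="utf-8") as f:
        mappings = json.load(f)
    id2keyword = {int(i): label for label, i in mappings["keyword2id"].items()}
    id2strength = {int(i): label for label, i in mappings["strength2id"].items()}
    return id2keyword, id2strength


def decode(code_id, strength_id, id2keyword, id2strength):
    """Map a pair of predicted ids to (evidence code, strength) labels."""
    return id2keyword[code_id], id2strength[strength_id]


# Toy file for illustration only; the ids below are invented.
toy = {
    "keyword2id": {"PVS1": 0, "PS1": 1},
    "strength2id": {"Strong": 0, "Moderate": 1},
}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "label_mappings.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(toy, f)
    id2keyword, id2strength = load_label_maps(path)

print(decode(1, 0, id2keyword, id2strength))
```

For the real file, the path would come from hf_hub_download("HFXM/LLM4Variants-Llama-3.2-1B-Instruct", "label_mappings.json").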

Loading

This is a custom nn.Module (DualHeadLLM), not a transformers AutoModel. Reconstruct the module (see train_dual_head.py), then load the weights:

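from safetensors.torch import load_file
from huggingface_hub import hf_hub_download
from train_dual_head import DualHeadLLM  # assumes train_dual_head.py is importable

# Rebuild the module with the head sizes used in training.
model = DualHeadLLM("meta-llama/Llama-3.2-1B-Instruct", num_keywords=28, num_strengths=6)
# Download the checkpoint and load the full state dict (backbone and heads).
state = load_file(hf_hub_download("HFXM/LLM4Variants-Llama-3.2-1B-Instruct", "model.safetensors"))
model.load_state_dict(state)
model.eval()  # switch to inference mode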
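Before calling load_state_dict, it can help to check that the checkpoint's keys match the reconstructed module. The sketch below groups keys by their top-level prefix, assuming the four components listed under Files (backbone, code_head, code_embeddings, strength_head) are the module's attribute names. The helper names and the toy key list are hypothetical.

```python
from collections import Counter

# Assumption: these component names are the top-level state-dict prefixes.
EXPECTED_PREFIXES = {"backbone", "code_head", "code_embeddings", "strength_head"}


def summarize_prefixes(state_keys):
    """Count tensors under each top-level prefix (text before the first dot)."""
    return Counter(key.split(".", 1)[0] for key in state_keys)


def unexpected_prefixes(state_keys, expected=EXPECTED_PREFIXES):
    """Return prefixes present in the checkpoint but not in the expected set."""
    return sorted(set(summarize_prefixes(state_keys)) - set(expected))


# Toy key names for illustration; a real checkpoint has many more entries.
toy_keys = [
    "backbone.embed_tokens.weight",
    "code_head.weight",
    "code_head.bias",
    "code_embeddings.weight",
    "strength_head.weight",
]
counts = summarize_prefixes(toy_keys)
print(dict(counts))
print(unexpected_prefixes(toy_keys))
```

With a loaded checkpoint, the same helpers would be called as summarize_prefixes(state.keys()). A non-empty result from unexpected_prefixes suggests the module definition and the checkpoint are out of sync.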

