LLM4Variants-Llama-3.2-1B-Instruct

Dual-head sentence classifier that predicts the ACMG evidence code and its strength from ClinVar submission comments. This checkpoint is rank #2 of the grid search (configurations ranked by joint test accuracy).

The model wraps the backbone meta-llama/Llama-3.2-1B-Instruct with two heads on top of mean-pooled hidden states:

  • code head — 28-way ACMG evidence code (PVS1, PS1–PS4, PM1–PM6, PP1–PP5, BA1, BS1–BS4, BP1–BP7 + NO_KEYWORD)
  • strength head — 6-way strength, conditioned on a learned embedding of the predicted code (Supporting, Moderate, Strong, VeryStrong, NotMet, NoStrength)
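The data flow described above can be sketched without any deep-learning framework. This is a minimal illustration under stated assumptions, not the code in train_dual_head.py: it assumes the strength head receives the pooled vector concatenated with the embedding of the argmax code, and every name and dimension here (masked_mean_pool, dual_head_forward, the tiny weight matrices) is hypothetical. The real model uses a 64-dimensional code embedding and the Llama hidden size.

```python
def masked_mean_pool(hidden_states, attention_mask):
    """Average the token vectors whose mask is 1, so padding is ignored."""
    kept = [vec for vec, m in zip(hidden_states, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def linear(weights, bias, x):
    """Plain dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])


def dual_head_forward(pooled, code_w, code_b, code_emb, strength_w, strength_b):
    """Predict the code first, then condition the strength head on it."""
    code_id = argmax(linear(code_w, code_b, pooled))
    # Assumption: conditioning is done by concatenating the code embedding.
    strength_input = pooled + code_emb[code_id]
    strength_id = argmax(linear(strength_w, strength_b, strength_input))
    return code_id, strength_id


# Toy example: 3 tokens (the last one is padding), hidden size 2,
# 2 codes, 2 strengths, code embedding dim 1.
hidden = [[1.0, 3.0], [3.0, 5.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = masked_mean_pool(hidden, mask)

code_w, code_b = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
code_emb = [[1.0], [-1.0]]
strength_w, strength_b = [[1.0, 0.0, 0.0], [0.0, 0.0, -10.0]], [0.0, 0.0]

code_id, strength_id = dual_head_forward(
    pooled, code_w, code_b, code_emb, strength_w, strength_b
)
```

In this toy setup the strength prediction changes if the code embedding is swapped, which is the point of conditioning the second head on the first.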

Test metrics

Metric                         Value
Code accuracy                  0.9270
Strength accuracy              0.9346
Joint accuracy                 0.8773
Strength acc | correct code    0.9464
Code weighted-F1               0.9265
Strength weighted-F1           0.9322
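The rows above are linked: a sentence counts as jointly correct only when the code is right and the strength is right for that sentence, so joint accuracy equals code accuracy times the strength accuracy among correctly coded sentences. The sketch below shows one way these metrics could be computed. The function name and the toy labels are illustrative assumptions, not taken from the evaluation script.

```python
def dual_head_metrics(code_pred, code_gold, strength_pred, strength_gold):
    """Accuracy metrics for paired code/strength predictions."""
    n = len(code_gold)
    code_ok = [p == g for p, g in zip(code_pred, code_gold)]
    strength_ok = [p == g for p, g in zip(strength_pred, strength_gold)]
    joint_ok = [c and s for c, s in zip(code_ok, strength_ok)]
    n_code_ok = sum(code_ok)
    return {
        "code_acc": n_code_ok / n,
        "strength_acc": sum(strength_ok) / n,
        "joint_acc": sum(joint_ok) / n,
        # Strength accuracy restricted to sentences whose code is correct.
        "strength_acc_given_code": sum(joint_ok) / n_code_ok if n_code_ok else float("nan"),
    }


# Toy labels, for illustration only.
toy = dual_head_metrics(
    code_pred=["PVS1", "PM2", "BS1", "PP3"],
    code_gold=["PVS1", "PM2", "BA1", "PP3"],
    strength_pred=["VeryStrong", "Supporting", "NoStrength", "Supporting"],
    strength_gold=["VeryStrong", "Moderate", "NoStrength", "Supporting"],
)

# Consistency check on the reported numbers: 0.9270 * 0.9464 rounds to 0.8773.
reported_joint = round(0.9270 * 0.9464, 4)
```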

Training configuration

Hyperparameter              Value
Learning rate               0.0001
Effective batch size        128
Epochs                      8
Max length                  256
λ (strength loss)           1.0
Code emb dim                64
Negative ratio              0.25
Seed                        42
Train / val / test size     19161 / 1278 / 5110
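A few quantities can be derived from this configuration. The sketch below rests on assumptions the card does not state: that the last partial batch of each epoch is kept, and that λ weights the strength loss in a simple sum (total = code loss + λ × strength loss). The helper name combined_loss is hypothetical.

```python
import math

train_size, val_size, test_size = 19161, 1278, 5110
effective_batch_size = 128
epochs = 8

# Optimizer steps, assuming the final partial batch is not dropped.
steps_per_epoch = math.ceil(train_size / effective_batch_size)
total_steps = steps_per_epoch * epochs

# Split proportions implied by the dataset sizes.
total_size = train_size + val_size + test_size
split = {
    "train": round(train_size / total_size, 2),
    "val": round(val_size / total_size, 2),
    "test": round(test_size / total_size, 2),
}


def combined_loss(code_loss, strength_loss, lam=1.0):
    """Assumed form of the joint objective: lam scales the strength term."""
    return code_loss + lam * strength_loss
```

Under these assumptions the run takes 150 optimizer steps per epoch and 1200 in total, and the data is split roughly 75 / 5 / 20.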

Files

  • model.safetensors — full state dict (backbone + code_head + code_embeddings + strength_head).
  • label_mappings.json: keyword2id / strength2id (and reverse).
  • tokenizer files + chat_template.jinja.
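To turn predicted class indices back into labels, the mappings can be inverted after loading. The sketch below assumes label_mappings.json has top-level keys named keyword2id and strength2id, each mapping a label string to an integer id. The function names and the toy mapping are hypothetical, and the file's own reverse mappings can be used instead if their key names are known.

```python
import json


def build_decoders(mappings):
    """Invert label -> id dictionaries into id -> label dictionaries."""
    id2keyword = {int(i): label for label, i in mappings["keyword2id"].items()}
    id2strength = {int(i): label for label, i in mappings["strength2id"].items()}
    return id2keyword, id2strength


def load_decoders(path):
    """Read a label_mappings.json file from disk and invert it."""
    with open(path, encoding="utf-8") as f:
        return build_decoders(json.load(f))


def decode(code_id, strength_id, id2keyword, id2strength):
    return id2keyword[code_id], id2strength[strength_id]


# Toy mapping for illustration; the real ids come from label_mappings.json.
toy_mappings = {
    "keyword2id": {"PVS1": 0, "PM2": 1, "NO_KEYWORD": 2},
    "strength2id": {"Moderate": 0, "VeryStrong": 1},
}
id2keyword, id2strength = build_decoders(toy_mappings)
label = decode(1, 0, id2keyword, id2strength)
```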

Loading

This is a custom nn.Module (DualHeadLLM), not a transformers AutoModel. Reconstruct the module (see train_dual_head.py), then load the weights:

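from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# Assumption: train_dual_head.py is on your Python path and defines DualHeadLLM.
from train_dual_head import DualHeadLLM

# Rebuild the architecture, then load the fine-tuned weights from the Hub.
model = DualHeadLLM("meta-llama/Llama-3.2-1B-Instruct", num_keywords=28, num_strengths=6)
state = load_file(hf_hub_download("HFXM/LLM4Variants-Llama-3.2-1B-Instruct", "model.safetensors"))
model.load_state_dict(state)
model.eval()  # switch off dropout for inference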
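Because the module is rebuilt by hand, a mismatch between the reconstructed class and the checkpoint shows up as missing or unexpected keys. The helpers below are a small, hypothetical sketch for inspecting key names before loading. The toy key names are assumptions modeled on the components listed under Files and may not match the real checkpoint.

```python
from collections import Counter


def summarize_keys(keys):
    """Count parameters per top-level component (text before the first dot)."""
    return Counter(key.split(".", 1)[0] for key in keys)


def diff_keys(checkpoint_keys, module_keys):
    """Return (missing, unexpected) relative to the reconstructed module."""
    ckpt, mod = set(checkpoint_keys), set(module_keys)
    return sorted(mod - ckpt), sorted(ckpt - mod)


# Toy key names for illustration only.
toy_checkpoint = [
    "backbone.embed.weight",
    "backbone.norm.weight",
    "code_head.weight",
    "code_embeddings.weight",
    "strength_head.weight",
]
toy_module = toy_checkpoint[:-1] + ["strength_head.0.weight"]

summary = summarize_keys(toy_checkpoint)
missing, unexpected = diff_keys(toy_checkpoint, toy_module)
```

With the real objects this would be called as diff_keys(state.keys(), model.state_dict().keys()). PyTorch reports the same information when load_state_dict is called with strict=False, through the missing_keys and unexpected_keys fields of its return value.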