chest2vec_4b_chest

A chest-radiology text embedding model: Qwen/Qwen3-Embedding-4B contrastively LoRA-adapted for chest CT / CXR report retrieval. (Larger sibling of chest2vec_0.6b_chest.)

The embedding is the left-padding-aware last-token (EOS) pooled final hidden state, L2-normalized.
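
The pooling rule above can be sketched in plain Python. This is an illustrative sketch, not the model's own code: `last_token_pool` and `l2_normalize` are hypothetical names, and hidden states are nested lists instead of tensors. It assumes that when every sequence ends in a real token the batch is left-padded (or unpadded), so the last position is the last token. Otherwise the last real token sits at index (number of real tokens - 1).

```python
import math

def last_token_pool(hidden, mask):
    """Pick the hidden state of the last real token in each sequence.

    hidden: [batch][seq][dim] nested lists; mask: [batch][seq] of 0/1.
    """
    # If every row has a real token in the final slot, the batch is
    # left-padded (or unpadded) and the final position is the last token.
    left_padded = all(row[-1] == 1 for row in mask)
    pooled = []
    for states, row in zip(hidden, mask):
        idx = len(row) - 1 if left_padded else sum(row) - 1
        pooled.append(states[idx])
    return pooled

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

# Toy batch: 2 sequences, 3 positions, 2 dims; the first sequence is left-padded.
hidden = [[[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]],
          [[1.0, 0.0], [0.0, 2.0], [0.0, 5.0]]]
mask = [[0, 1, 1],
        [1, 1, 1]]
embeddings = [l2_normalize(v) for v in last_token_pool(hidden, mask)]
```

Because the result has unit length, a dot product between two embeddings is their cosine similarity.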

Self-contained AutoModel (recommended)

The LoRA adapter is merged into the weights (model.safetensors, bf16) and the tokenizer is bundled, so the model loads with no chest2vec package and no download of the base Qwen3-Embedding weights:

from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("lukeingawesome/chest2vec_4b_chest", trust_remote_code=True).eval()
tok   = AutoTokenizer.from_pretrained("lukeingawesome/chest2vec_4b_chest", trust_remote_code=True)

docs = ["Bibasilar atelectasis with small bilateral pleural effusions. Cardiomegaly.",
        "Several pulmonary nodules, largest 6 mm in the right lower lobe. Mild emphysema."]
doc_emb = model.embed(docs, tokenizer=tok)            # [N, 2560] float32, L2-normalized

q_emb = model.embed(["pleural effusion"], tokenizer=tok,
                    instruction="Retrieve chest CT reports relevant to the query")
sims = q_emb @ doc_emb.T

model(input_ids, attention_mask) returns pooler_output (the normalized embedding); the model.embed(...) helper handles tokenization, EOS/last-token pooling, and batching.
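
Since the embeddings are L2-normalized, each entry of `sims` is a cosine similarity, and retrieval reduces to sorting by score. A minimal sketch of that ranking step follows. `rank_documents` is a hypothetical helper, vectors are plain Python lists, and the 3-d vectors are made-up stand-ins for real 2560-d embeddings.

```python
def rank_documents(query_vec, doc_vecs, top_k=5):
    """Return (doc_index, score) pairs sorted by descending dot product."""
    scores = [sum(q * d for q, d in zip(query_vec, doc)) for doc in doc_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:top_k]]

# Toy unit vectors standing in for real embeddings.
query = [1.0, 0.0, 0.0]
corpus = [[0.0, 1.0, 0.0], [0.8, 0.6, 0.0], [0.6, 0.0, 0.8]]
ranked = rank_documents(query, corpus, top_k=2)
```

For a large corpus, a vector index would usually replace this exhaustive sort. The sketch only shows the scoring logic.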

Details

  • Base: Qwen/Qwen3-Embedding-4B (Apache-2.0) — architecture rebuilt from the bundled config; merged weights loaded from this repo.
  • Embedding dim: 2560 · max length: 512 · pooling: last-token (EOS) + L2-norm.
  • Precision: bf16 weights (~8 GB). Default attention implementation: sdpa (set attn_implementation="flash_attention_2" for speed on Ampere or newer GPUs).
  • Merged weights reproduce the original adapter-based embeddings to cosine ≥ 0.9996 (bf16-rounding only).
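
An agreement figure like the cosine bound above could be checked by embedding the same texts through both loaders and taking the worst paired cosine. The sketch below shows only that comparison. `min_paired_cosine` is a hypothetical helper, and the 2-d vectors are made-up stand-ins, not measured embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors given as lists."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def min_paired_cosine(embs_a, embs_b):
    """Worst-case cosine between matching rows of two embedding sets."""
    return min(cosine(a, b) for a, b in zip(embs_a, embs_b))

# Toy stand-ins for adapter-based vs merged embeddings of the same texts.
adapter_embs = [[0.6, 0.8], [1.0, 0.0]]
merged_embs = [[0.6001, 0.7999], [0.9999, 0.0001]]
worst = min_paired_cosine(adapter_embs, merged_embs)
agrees = worst >= 0.9996  # threshold taken from the figure quoted above
```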

Matryoshka embeddings

Trained with the same Matryoshka (MRL) recipe as the 0.6B sibling, so the 2560-d embedding can be truncated to 512 or 256 dimensions (keep the first N dims and re-normalize):

emb_512 = model.embed(docs, tokenizer=tok, dim=512)   # [N, 512], L2-normalized
emb_256 = model.embed(docs, tokenizer=tok, dim=256)   # [N, 256]

Use the same dim for queries and corpus. Recommended dims: 2560 (full) · 512 · 256 (config.matryoshka_dims).
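
If full 2560-d embeddings are already stored, the truncation described above (keep the first N dims, then re-normalize) can be applied offline instead of re-running the model. A minimal sketch follows. `truncate_embedding` is a hypothetical helper, not part of the model's API, and the 4-d vector is a toy stand-in.

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and L2-normalize the result."""
    if dim > len(vec):
        raise ValueError("dim exceeds embedding size")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

full = [0.5, 0.5, 0.5, 0.5]          # toy unit vector standing in for a 2560-d embedding
short = truncate_embedding(full, 2)  # first 2 dims, rescaled to unit length
```

Re-normalizing matters: without it, dot products between truncated vectors are no longer cosine similarities.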

Recommended instructions

The model is instruction-conditioned: prepend a task instruction as Instruct: {instruction}\nQuery: {report} (handled by model.embed(..., instruction=...)). It was trained on chest CT and CXR reports across the task families below — use the matching string. Convention: apply the instruction to the query side; embed the corpus/documents without it.
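
For pipelines that tokenize text themselves rather than calling `model.embed`, the prompt format above can be built by hand. The sketch below follows the stated template literally. `format_query` and `format_document` are hypothetical helpers, and the assumption that `model.embed` produces exactly this string internally should be verified.

```python
def format_query(instruction, text):
    """Build the instruction-prefixed query string from the stated template."""
    return f"Instruct: {instruction}\nQuery: {text}"

def format_document(text):
    """Corpus documents are embedded as-is, with no instruction."""
    return text

prompt = format_query(
    "Retrieve the chest CT report that is similar to the given report.",
    "Small right pleural effusion.",
)
```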

1. Retrieval (report → similar report)

Instruction, by modality:

  • CT: Retrieve the chest CT report that is similar to the given report.
  • CXR: Retrieve the CXR report that is similar to the given report.
  • CXR (ignore prior/comparison): Retrieve the CXR report that is similar to the given report with prior reference omitted.

2. Summarization-oriented embedding

Instruction, by modality:

  • CT: Summarize the following chest CT report
  • CXR: Summarize the following CXR report
  • Generic: Summarize the given report.

3. Entity extraction / classification (full leaf taxonomy)

Instruction, by modality:

  • CT: Given the following chest CT report, extract the presence/absence of entities
  • CXR: Given the following CXR report, extract the presence/absence of entities

4. Entity extraction — upper / coarse class

Instruction, by modality:

  • CT: Given the following chest CT report, extract the presence/absence of upper-level entities
  • CXR: Given the following CXR report, extract the presence/absence of upper class entities

5. Anatomy-specific extraction

Template: From the following chest {CT report | X-ray report}, extract and return only the findings related to {REGION}, ignoring all information about other structures.

  • CT regions: lungs · airways and trachea · pleura · mediastinum and hilum · cardiovascular system · chest wall · bones and spine · upper abdomen · lower neck
  • CXR regions: lungs and airways · pleura · hila and mediastinum · cardiovascular system · musculoskeletal structures and chest wall · tubes, catheters, and support devices · abdomen
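
# Task-conditioned embeddings; `report` is a placeholder for any chest CT report text.
report = "Bibasilar atelectasis with small bilateral pleural effusions. Cardiomegaly."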
q = model.embed(["pleural effusion and cardiomegaly"], tokenizer=tok,
                instruction="Retrieve the chest CT report that is similar to the given report.")
lungs = model.embed([report], tokenizer=tok,
                    instruction="From the following chest CT report, extract and return only the findings related to the lungs, ignoring all information about other structures.")
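
The anatomy-specific template can be filled programmatically so that region names stay consistent with the lists above. This is a sketch under two assumptions. First, `anatomy_instruction` and the `REGIONS` and `REPORT_NAME` tables are hypothetical names introduced here. Second, the article "the " is prepended by default because the only worked example writes "the lungs"; whether every region was phrased that way in training is not confirmed.

```python
REGIONS = {
    "CT": ["lungs", "airways and trachea", "pleura", "mediastinum and hilum",
           "cardiovascular system", "chest wall", "bones and spine",
           "upper abdomen", "lower neck"],
    "CXR": ["lungs and airways", "pleura", "hila and mediastinum",
            "cardiovascular system", "musculoskeletal structures and chest wall",
            "tubes, catheters, and support devices", "abdomen"],
}
REPORT_NAME = {"CT": "CT report", "CXR": "X-ray report"}

def anatomy_instruction(modality, region, article="the "):
    """Fill the anatomy template; `article` is an assumption (see lead-in)."""
    if region not in REGIONS.get(modality, []):
        raise ValueError(f"unknown region {region!r} for modality {modality!r}")
    return (f"From the following chest {REPORT_NAME[modality]}, extract and return "
            f"only the findings related to {article}{region}, "
            "ignoring all information about other structures.")

lungs_instruction = anatomy_instruction("CT", "lungs")
```

Validating the region against the per-modality list catches mismatches such as asking for "abdomen" on CT, where the listed region is "upper abdomen".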

For supervised classification, embed reports under the relevant instruction (e.g. the extraction one) and train a linear head on the embeddings.
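
The linear-head idea can be illustrated with a tiny binary logistic regression trained by gradient descent. This is a toy sketch, not the authors' training setup: `train_linear_head` and `predict` are hypothetical names, and the 2-d vectors and labels are made up to stand in for real instruction-conditioned embeddings. In practice a library classifier would normally be used.

```python
import math

def train_linear_head(X, y, lr=0.5, epochs=200):
    """Fit binary logistic regression (weights + bias) by batch gradient descent."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, label in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # predicted probability of class 1
            err = p - label                   # gradient of log loss w.r.t. z
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    """Return 1 if the linear score is positive, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Toy 2-d "embeddings"; label 1 = finding present, 0 = absent.
X = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-0.6, 0.8]]
y = [1, 1, 0, 0]
w, b = train_linear_head(X, y)
preds = [predict(w, b, x) for x in X]
```

The embedding model stays frozen in this setup; only the small head is trained on top of precomputed vectors.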

Legacy loader (still available)

The original delta-weights layout is retained for backward compatibility: chest2vec.py + chest2vec_config.json + contrastive/ (LoRA adapter). That path uses the chest2vec package (Chest2Vec.from_pretrained) and downloads the base Qwen3-Embedding weights at runtime. New users should prefer the self-contained AutoModel path above.
