chest2vec_4b_chest
A chest-radiology text embedding model: Qwen/Qwen3-Embedding-4B
contrastively LoRA-adapted for chest CT / CXR report retrieval. (Larger sibling of
chest2vec_0.6b_chest.)
The embedding is the left-padding-aware last-token (EOS) pooled final hidden state, L2-normalized.
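As a rough illustration of what "left-padding-aware last-token pooling" means, the sketch below picks the final real token of each sequence and L2-normalizes it. It is framework-free and uses toy values; the function names `last_token_pool` and `l2_normalize` are illustrative assumptions, not this repo's actual implementation, which runs on tensors inside the model.

```python
import math

def last_token_pool(hidden, mask):
    """hidden: [B][T][D] final hidden states; mask: [B][T] of 0/1. Returns [B][D]."""
    # If every row ends in a real token, the batch is left-padded (or unpadded),
    # so the last position holds the EOS token for all rows.
    left_padded = all(row[-1] == 1 for row in mask)
    pooled = []
    for states, row in zip(hidden, mask):
        # Otherwise assume right padding: the last real token sits at sum(mask) - 1.
        idx = len(row) - 1 if left_padded else sum(row) - 1
        pooled.append(states[idx])
    return pooled

def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

# Toy batch (placeholder values): 2 sequences, 3 positions, 2 hidden dims.
hidden = [[[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]],   # left-padded: pad, token, EOS
          [[1.0, 1.0], [0.0, 2.0], [6.0, 8.0]]]   # no padding
mask = [[0, 1, 1], [1, 1, 1]]
embeddings = [l2_normalize(v) for v in last_token_pool(hidden, mask)]
```

Checking the mask's last column first is what makes the pooling robust to either padding side: with left padding the EOS state is always at position -1, while with right padding it has to be located per row.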
Self-contained AutoModel (recommended)
The LoRA adapter is merged into the weights (model.safetensors, bf16) and the tokenizer is
bundled, so the model loads with no chest2vec package and no download of the base
Qwen3-Embedding weights:
from transformers import AutoModel, AutoTokenizer
model = AutoModel.from_pretrained("lukeingawesome/chest2vec_4b_chest", trust_remote_code=True).eval()
tok = AutoTokenizer.from_pretrained("lukeingawesome/chest2vec_4b_chest", trust_remote_code=True)
docs = ["Bibasilar atelectasis with small bilateral pleural effusions. Cardiomegaly.",
        "Several pulmonary nodules, largest 6 mm in the right lower lobe. Mild emphysema."]
doc_emb = model.embed(docs, tokenizer=tok)  # [N, 2560] float32, L2-normalized
q_emb = model.embed(["pleural effusion"], tokenizer=tok,
                    instruction="Retrieve chest CT reports relevant to the query")
sims = q_emb @ doc_emb.T  # dot product of unit vectors = cosine similarity
model(input_ids, attention_mask) returns pooler_output (the normalized embedding); the
model.embed(...) helper handles tokenization, EOS/last-token pooling, and batching.
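Because the embeddings are L2-normalized, the dot product in `sims` is already the cosine similarity, so retrieval reduces to sorting by that score. The sketch below shows the ranking step on toy unit vectors; `rank_documents` is a hypothetical helper, not part of the model's API.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rank_documents(query_vec, doc_vecs, top_k=None):
    """Return (doc_index, score) pairs sorted by descending similarity."""
    scores = [(i, dot(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores if top_k is None else scores[:top_k]

# Toy unit vectors standing in for real embeddings (placeholder values).
query = [1.0, 0.0]
corpus = [[0.6, 0.8], [0.8, 0.6], [0.0, 1.0]]
ranking = rank_documents(query, corpus, top_k=2)
```

With real outputs, the same idea applies to one row of `q_emb` against the rows of `doc_emb`; for large corpora a vectorized sort or an approximate-nearest-neighbor index would normally replace this loop.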
Details
- Base: Qwen/Qwen3-Embedding-4B (Apache-2.0) — architecture rebuilt from the bundled config; merged weights loaded from this repo.
- Embedding dim: 2560 · max length: 512 · pooling: last-token (EOS) + L2-norm.
- Precision: bf16 weights (~8 GB). Default attention: sdpa (set attn_implementation="flash_attention_2" for speed on Ampere+).
- Merged weights reproduce the original adapter-based embeddings to cosine ≥ 0.9996 (bf16-rounding only).
Matryoshka embeddings
Trained with the same Matryoshka (MRL) recipe as the 0.6B sibling, so the 2560-d embedding can be truncated to 512 or 256 dimensions (keep the first N dims and re-normalize):
emb_512 = model.embed(docs, tokenizer=tok, dim=512) # [N, 512], L2-normalized
emb_256 = model.embed(docs, tokenizer=tok, dim=256) # [N, 256]
Use the same dim for queries and corpus. Recommended dims: 2560 (full) · 512 · 256
(config.matryoshka_dims).
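The truncation rule described above (keep the first N dims, then re-normalize) can be sketched in a few lines. This is a framework-free illustration on a toy vector; `truncate_embedding` is a hypothetical name, and in practice passing `dim=` to `model.embed` does this for you.

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    if not 0 < dim <= len(vec):
        raise ValueError(f"dim must be in 1..{len(vec)}, got {dim}")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

# Toy 4-d unit vector standing in for a 2560-d embedding (placeholder values).
full = [0.5, 0.5, 0.5, 0.5]
short = truncate_embedding(full, 2)
```

Re-normalizing matters because a truncated vector no longer has unit length, so plain dot products would stop being cosine similarities.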
Recommended instructions
The model is instruction-conditioned: prepend a task instruction as
Instruct: {instruction}\nQuery: {report} (handled by model.embed(..., instruction=...)). It was
trained on chest CT and CXR reports across the task families below — use the matching string.
Convention: apply the instruction to the query side; embed the corpus/documents without it.
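For readers who want to see the prompt layout explicitly, here is a minimal sketch of the format string above. `format_query` is a hypothetical helper; the exact whitespace handling inside `model.embed` is not shown in this card, so treat this as an approximation of what the `instruction=` argument does.

```python
def format_query(text, instruction=None):
    """Build the instruction-conditioned input; documents pass through unchanged."""
    if not instruction:
        return text
    return f"Instruct: {instruction}\nQuery: {text}"

query_text = format_query(
    "pleural effusion and cardiomegaly",
    "Retrieve the chest CT report that is similar to the given report.",
)
document_text = format_query("Bibasilar atelectasis with small bilateral pleural effusions.")
```

Leaving `instruction` unset for documents mirrors the convention stated above: only the query side carries the task instruction.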
1. Retrieval (report → similar report)
| Modality | Instruction |
|---|---|
| CT | Retrieve the chest CT report that is similar to the given report. |
| CXR | Retrieve the CXR report that is similar to the given report. |
| CXR (ignore prior/comparison) | Retrieve the CXR report that is similar to the given report with prior reference omitted. |
2. Summarization-oriented embedding
| Modality | Instruction |
|---|---|
| CT | Summarize the following chest CT report |
| CXR | Summarize the following CXR report |
| generic | Summarize the given report. |
3. Entity extraction / classification (full leaf taxonomy)
| Modality | Instruction |
|---|---|
| CT | Given the following chest CT report, extract the presence/absence of entities |
| CXR | Given the following CXR report, extract the presence/absence of entities |
4. Entity extraction — upper / coarse class
| Modality | Instruction |
|---|---|
| CT | Given the following chest CT report, extract the presence/absence of upper-level entities |
| CXR | Given the following CXR report, extract the presence/absence of upper class entities |
5. Anatomy-specific extraction — template:
From the following chest {CT report | X-ray report}, extract and return only the findings related to {REGION}, ignoring all information about other structures.
- CT regions: lungs · airways and trachea · pleura · mediastinum and hilum · cardiovascular system · chest wall · bones and spine · upper abdomen · lower neck
- CXR regions: lungs and airways · pleura · hila and mediastinum · cardiovascular system · musculoskeletal structures and chest wall · tubes, catheters, and support devices · abdomen
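The template and region lists above can be combined programmatically to avoid typos in long instruction strings. The sketch below is an assumption-laden helper, not part of the model's API: `anatomy_instruction` and its `article` parameter are hypothetical, the latter added because the card's own example writes "the lungs" while the region list says "lungs".

```python
CT_REGIONS = ("lungs", "airways and trachea", "pleura", "mediastinum and hilum",
              "cardiovascular system", "chest wall", "bones and spine",
              "upper abdomen", "lower neck")
CXR_REGIONS = ("lungs and airways", "pleura", "hila and mediastinum",
               "cardiovascular system", "musculoskeletal structures and chest wall",
               "tubes, catheters, and support devices", "abdomen")
TEMPLATE = ("From the following chest {kind}, extract and return only the findings "
            "related to {region}, ignoring all information about other structures.")
MODALITIES = {"CT": (CT_REGIONS, "CT report"), "CXR": (CXR_REGIONS, "X-ray report")}

def anatomy_instruction(modality, region, article=""):
    """Fill the anatomy template; reject regions not listed for the modality."""
    if modality not in MODALITIES:
        raise ValueError(f"unknown modality: {modality!r}")
    regions, kind = MODALITIES[modality]
    if region not in regions:
        raise ValueError(f"{region!r} is not a listed {modality} region")
    # Optional article (e.g. "the") is prepended to the region text verbatim.
    return TEMPLATE.format(kind=kind, region=f"{article} {region}".strip())

lungs_instruction = anatomy_instruction("CT", "lungs", article="the")
```

Validating the region against the per-modality list catches mismatches such as asking for "lower neck" on a CXR report, which the card does not list as a trained region.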
# Task-conditioned embedding
q = model.embed(["pleural effusion and cardiomegaly"], tokenizer=tok,
                instruction="Retrieve the chest CT report that is similar to the given report.")
report = docs[1]  # placeholder: any chest CT report string (reuses the example above)
lungs = model.embed([report], tokenizer=tok,
                    instruction="From the following chest CT report, extract and return only the findings related to the lungs, ignoring all information about other structures.")
For supervised classification, embed reports under the relevant instruction (e.g. the extraction one) and train a linear head on the embeddings.
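To make the "linear head" idea concrete, the sketch below trains a tiny logistic-regression classifier with plain SGD on toy 2-d vectors that stand in for report embeddings. Everything here is illustrative: the data, labels, and function names are assumptions, and in practice a library classifier such as scikit-learn's `LogisticRegression` on the real 2560-d embeddings would be the more typical choice.

```python
import math

def train_linear_head(X, y, lr=0.5, epochs=100):
    """Binary logistic regression trained with per-example gradient steps."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))     # predicted probability of class 1
            grad = p - target                  # d(log-loss)/dz
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Toy unit vectors standing in for embeddings; labels are hypothetical
# (e.g. 1 = finding present, 0 = finding absent).
X = [[1.0, 0.0], [0.96, 0.28], [0.0, 1.0], [0.28, 0.96]]
y = [1, 1, 0, 0]
w, b = train_linear_head(X, y)
predictions = [predict(w, b, x) for x in X]
```

The embedding model stays frozen in this setup; only the weights `w` and bias `b` of the head are learned.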
Legacy loader (still available)
The original delta-weights layout is retained for backward compatibility: chest2vec.py +
chest2vec_config.json + contrastive/ (LoRA adapter). That path uses the chest2vec package
(Chest2Vec.from_pretrained) and downloads the base Qwen3-Embedding weights at runtime. New users
should prefer the self-contained AutoModel path above.