duttaprat/HViLM-base


duttaprat committed on Nov 5, 2025

Commit ddf07ae · verified · 1 Parent(s): 8358e80

Create README.md

Files changed (1)

README.md +52 -0


	@@ -0,0 +1,52 @@

+---
+license: apache-2.0
+tags:
+- genomics
+- dnabert
+- virology
+- foundation-model
+- hvilm
+---
+# HViLM-base: A Foundation Model for Viral Genomics
+This is the base pre-trained model for **HViLM**, as described in the paper:
+**"HViLM: A Foundation Model for Viral Genomics Enables Multi-Task Prediction of Pathogenicity, Transmissibility, and Host Tropism"**
+- **Paper:** [Link to your arXiv paper will go here]
+- **Fine-tuned Models:**
+    - `duttaprat/HViLM-finetuned-pathogenicity` (coming soon)
+    - `duttaprat/HViLM-finetuned-host-tropism` (coming soon)
+    - `duttaprat/HViLM-finetuned-transmissibility-R0` (coming soon)
+## Model Description
+(Paste your abstract here)
+## How to Use
+Loading this model requires `trust_remote_code=True` because the repository ships custom architecture files (`bert_layers.py`, etc.).
+```python
+from transformers import AutoTokenizer, AutoModel
+import torch
+repo_id = "duttaprat/HViLM-base"
+# Download the tokenizer and model files from the Hub
+tokenizer = AutoTokenizer.from_pretrained(repo_id)
+model = AutoModel.from_pretrained(
+    repo_id,
+    trust_remote_code=True  # required: the repo's custom architecture code must be executed
+)
+print("Model and tokenizer loaded successfully!")
+# Example: Get embeddings for a sequence
+sequence = "ATGCGTACGT..."
+inputs = tokenizer(sequence, return_tensors="pt")
+with torch.no_grad():
+    outputs = model(**inputs)
+    embeddings = outputs[0]  # per-token hidden states; indexing works whether the custom model returns a tuple or a ModelOutput
+print(embeddings.shape)