Upload folder using huggingface_hub (#1)

- Upload folder using huggingface_hub (9a999e2cc1a3541b78b3f6c7f084cfb3c820fb5f)

Files changed (5)

2_train_GM12878_DNase.ipynb +0 -0
3_evaluate_model.ipynb +0 -0
README.md +42 -3
model.ckpt +3 -0
output.log +40 -0

2_train_GM12878_DNase.ipynb ADDED

The diff for this file is too large to render.

3_evaluate_model.ipynb ADDED

The diff for this file is too large to render.

README.md CHANGED

@@ -1,3 +1,42 @@
----
-license: mit
----

+---
+# 1. Metadata Block
+license: mit
+library_name: pytorch-lightning
+pipeline_tag: tabular-regression
+tags:
+- biology
+- genomics
+datasets:
+- Genentech/GM12878_dnase-data
+---
+# GM12878_dnase-model
+## Model Description
+This is a single-task regression model that takes 2114 bp genomic intervals (hg38) as input and predicts the total GM12878 DNase-seq coverage in the central 1000 bp of each interval. It is described in Lal et al. 2025 (https://www.nature.com/articles/s41592-025-02868-z).
+- **Architecture:** DilatedConvModel (gReLU)
+- **Input:** Genomic sequences (hg38)
+- **Output:** Total DNase-seq coverage in the central 1000 bp.
+## Repository Content
+1. `model.ckpt`: The trained model weights and hyperparameters (PyTorch Lightning checkpoint).
+2. `2_train_GM12878_DNase.ipynb`: Jupyter notebook for training the model.
+3. `3_evaluate_model.ipynb`: Jupyter notebook for evaluating the trained model.
+4. `output.log`: Training logs.
+## How to use
+To load this model for inference or fine-tuning, use the `grelu` interface:
+```python
+from grelu.lightning import LightningModel
+from huggingface_hub import hf_hub_download
+ckpt_path = hf_hub_download(
+    repo_id="Genentech/GM12878_dnase-model",
+    filename="model.ckpt"
+)
+model = LightningModel.load_from_checkpoint(ckpt_path)
+model.eval()
+```
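
The README states that the model reads a 2114 bp interval and predicts coverage over its central 1000 bp. The sketch below only illustrates that geometry. It assumes the output window is exactly centered and that coordinates are 0-based and half-open; `central_window` is a hypothetical helper, not part of gReLU, whose own cropping logic may differ.

```python
from __future__ import annotations

# Lengths taken from the model description in the README.
INPUT_LEN = 2114   # bp seen by the model
OUTPUT_LEN = 1000  # bp over which coverage is summed


def central_window(start: int, end: int, output_len: int = OUTPUT_LEN) -> tuple[int, int]:
    """Return the centered sub-interval of `output_len` bp (0-based, half-open).

    Hypothetical helper for illustration; assumes an exactly centered window.
    """
    length = end - start
    if length < output_len:
        raise ValueError("interval is shorter than the requested window")
    flank = (length - output_len) // 2
    return start + flank, start + flank + output_len


# Flank on each side of the output window: (2114 - 1000) / 2 = 557 bp.
flank = (INPUT_LEN - OUTPUT_LEN) // 2

# Example: an input interval starting at position 10,000.
window = central_window(10_000, 10_000 + INPUT_LEN)
print(flank, window)
```

Under these assumptions each input carries 557 bp of sequence context on either side of the region whose coverage is predicted.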

model.ckpt ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f266b296f49b4e97e3ac52e94594da0aaae1d467778ea58b421e1ee7c4482bea
+size 31906519
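
The three lines above are a Git LFS pointer: the `oid` is the SHA-256 digest of the real file and `size` is its length in bytes. As a rough sketch, a downloaded `model.ckpt` could be checked against the pointer as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the pointer text is copied from the diff with the leading `+` markers removed.

```python
import hashlib
from pathlib import Path

# Pointer contents as shown in the diff (without the leading "+").
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:f266b296f49b4e97e3ac52e94594da0aaae1d467778ea58b421e1ee7c4482bea
size 31906519
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and digest named by the pointer."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so large checkpoints are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

After downloading the checkpoint, `matches_pointer(ckpt_path, pointer)` should return `True` if the local file is the one this commit refers to.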

output.log ADDED

@@ -0,0 +1,40 @@

+wandb:   1 of 1 files downloaded.
+Selecting training samples
+Keeping 390473 intervals
+Selecting validation samples
+Keeping 21987 intervals
+Selecting test samples
+Keeping 22595 intervals
+Final sizes: train: (390473, 3), val: (21987, 3), test: (22595, 3)
+GPU available: True (cuda), used: True
+TPU available: False, using: 0 TPU cores
+HPU available: False, using: 0 HPUs
+/opt/conda/lib/python3.11/site-packages/pytorch_lightning/loggers/wandb.py:397: There is a wandb run already in progress and newly created instances of `WandbLogger` will reuse this run. If this is not desired, call `wandb.finish()` before instantiating `WandbLogger`.
+LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
+Validation DataLoader 0: 100%|██████████| 43/43 [00:09<00:00,  4.60it/s]
+/opt/conda/lib/python3.11/site-packages/torchmetrics/utilities/prints.py:43: UserWarning: The variance of predictions or target is close to zero. This can cause instability in Pearson correlationcoefficient, leading to wrong results. Consider re-scaling the input if possible or computing using alarger dtype (currently using torch.float32).
+LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
+  | Name         | Type             | Params | Mode
+----------------------------------------------------------
+0 | model        | DilatedConvModel | 6.3 M  | train
+1 | loss         | MSELoss          | 0      | train
+2 | activation   | Identity         | 0      | train
+3 | val_metrics  | MetricCollection | 0      | train
+4 | test_metrics | MetricCollection | 0      | train
+5 | transform    | Identity         | 0      | train
+----------------------------------------------------------
+6.3 M     Trainable params
+0         Non-trainable params
+6.3 M     Total params
+25.358    Total estimated model params size (MB)
+131       Modules in train mode
+0         Modules in eval mode
+Epoch 14: 100%|██████████| 763/763 [07:51<00:00,  1.62it/s, v_num=arkd, train_loss_step=0.551, train_loss_epoch=0.456]
+/opt/conda/lib/python3.11/site-packages/torchmetrics/utilities/prints.py:43: UserWarning: The variance of predictions or target is close to zero. This can cause instability in Pearson correlationcoefficient, leading to wrong results. Consider re-scaling the input if possible or computing using alarger dtype (currently using torch.float32).
+`Trainer.fit` stopped: `max_epochs=15` reached.
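
The log does not state the batch size, but the sample counts and progress-bar totals constrain it. The sketch below is an inference from the log, not something the log confirms. It assumes each progress bar counts batches from a single dataloader that keeps the last partial batch, and that the reported model size uses 4 bytes per parameter (float32) with 1 MB = 10^6 bytes.

```python
# Figures copied from output.log.
n_train, n_val = 390_473, 21_987   # "Keeping ... intervals"
train_steps, val_steps = 763, 43   # progress-bar totals
size_mb = 25.358                   # "Total estimated model params size (MB)"


def batch_sizes(n_samples: int, n_batches: int) -> list[int]:
    """List every batch size b with ceil(n_samples / b) == n_batches.

    Assumes the last partial batch is kept (no drop_last).
    """
    return [
        b for b in range(1, n_samples + 1)
        if -(-n_samples // b) == n_batches  # integer ceiling division
    ]


train_candidates = batch_sizes(n_train, train_steps)
val_candidates = batch_sizes(n_val, val_steps)
common = sorted(set(train_candidates) & set(val_candidates))

# Parameter count implied by the size estimate, assuming float32 weights.
approx_params = size_mb * 1e6 / 4

print(train_candidates, val_candidates[0], val_candidates[-1], common)
print(round(approx_params))
```

Under these assumptions the only batch size consistent with both progress bars is 512, and the size estimate corresponds to roughly 6.34 million parameters, in line with the 6.3 M shown in the summary table. If training were sharded across several devices or the last batch were dropped, the same counts would imply a different per-device batch size.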