Upload folder using huggingface_hub

by avantikalal - opened Jan 27

base: refs/heads/main ← from: refs/pr/1

+1056

-3

Files changed (4)

2_train.ipynb +0 -0
README.md +41 -3
model.ckpt +3 -0
output.log +52 -0

2_train.ipynb ADDED

The diff for this file is too large to render. See raw diff

README.md CHANGED

@@ -1,3 +1,41 @@
----
-license: mit
----

+---
+# 1. Metadata Block
+license: mit
+library_name: pytorch-lightning
+pipeline_tag: tabular-classification
+tags:
+- biology
+- genomics
+datasets:
+- Genentech/human-atac-catlas-data
+---
+# human-atac-catlas-model
+## Model Description
+This model is a multi-task binary classifier that predicts chromatin accessibility across 204 cell types. It was trained by fine-tuning the Enformer model on the CATlas human enhancer dataset using the `grelu` library.
+- **Architecture:** Fine-tuned Enformer
+- **Input:** Genomic sequences (hg38)
+- **Output:** Binary accessibility predictions, one per cell type (204 tasks).
+## Repository Content
+1. `model.ckpt`: The trained model weights and hyperparameters (PyTorch Lightning checkpoint).
+2. `2_train.ipynb`: Jupyter notebook containing the training logic, architecture definition, and evaluation loops.
+3. `output.log`: Training logs.
+## How to use
+To load this model for inference or fine-tuning, use the `grelu` interface:
+```python
+from grelu.lightning import LightningModel
+from huggingface_hub import hf_hub_download
+ckpt_path = hf_hub_download(
+    repo_id="Genentech/human-atac-catlas-model",
+    filename="model.ckpt"
+)
+model = LightningModel.load_from_checkpoint(ckpt_path)
+model.eval()
+```
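
Until this pull request is merged, `model.ckpt` exists only on the PR ref (`refs/pr/1`), not on `main`. The sketch below shows one way to fetch the file from that ref by passing `revision` to `hf_hub_download`. The helper names `checkpoint_request` and `download_checkpoint` are hypothetical and introduced only for illustration; the repository ID and filename are taken from the README above.

```python
REPO_ID = "Genentech/human-atac-catlas-model"  # repo ID as given in the README
FILENAME = "model.ckpt"


def checkpoint_request(revision=None):
    """Build keyword arguments for hf_hub_download (hypothetical helper)."""
    kwargs = {"repo_id": REPO_ID, "filename": FILENAME}
    if revision is not None:
        # A branch name, tag, commit hash, or PR ref such as "refs/pr/1".
        kwargs["revision"] = revision
    return kwargs


def download_checkpoint(revision=None):
    """Download the checkpoint and return its local path (needs network access)."""
    # Imported lazily so the helper above can be used without the package installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(**checkpoint_request(revision))


# Example (not executed here): fetch the file from this PR before it is merged.
# ckpt_path = download_checkpoint(revision="refs/pr/1")
```

After the merge, omitting `revision` should resolve to the default branch, which matches the snippet in the README.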

model.ckpt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:74e9b1d42b3b61eab7574bd62c170e075b3c87132e060390e29296192988fdc3
+size 344440758
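
The Git LFS pointer above records the SHA-256 digest and byte size of the real checkpoint, so a downloaded copy can be checked against it. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the expected values are copied from the pointer in this diff.

```python
import hashlib
from pathlib import Path

# Values copied from the LFS pointer shown in this diff.
EXPECTED_SHA256 = "74e9b1d42b3b61eab7574bd62c170e075b3c87132e060390e29296192988fdc3"
EXPECTED_SIZE = 344440758


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_lfs_pointer(path, expected_sha256=EXPECTED_SHA256, expected_size=EXPECTED_SIZE):
    """Check a local file against the size and digest from an LFS pointer."""
    path = Path(path)
    # Compare the cheap size check first, then the full digest.
    if path.stat().st_size != expected_size:
        return False
    return sha256_of_file(path) == expected_sha256


# Example (not executed here): verify a checkpoint already on disk.
# print(matches_lfs_pointer("model.ckpt"))
```

Checking the size before hashing avoids reading a roughly 344 MB file when the download is obviously truncated.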

output.log ADDED

	@@ -0,0 +1,52 @@

+wandb: Downloading large artifact dataset:latest, 179.17MB. 1 files...
+wandb:   1 of 1 files downloaded.
+Done. 0:0:0.3
+/opt/conda/lib/python3.11/site-packages/anndata/_core/aligned_df.py:68: ImplicitModificationWarning: Transforming to str index.
+wandb: WARNING Calling wandb.login() after wandb.init() has no effect.
+wandb: Downloading large artifact human_state_dict:latest, 939.29MB. 1 files...
+wandb:   1 of 1 files downloaded.
+Done. 0:0:0.7
+/opt/conda/lib/python3.11/site-packages/grelu/model/models.py:771: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
+GPU available: True (cuda), used: True
+TPU available: False, using: 0 TPU cores
+HPU available: False, using: 0 HPUs
+/opt/conda/lib/python3.11/site-packages/pytorch_lightning/loggers/wandb.py:397: UserWarning: There is a wandb run already in progress and newly created instances of `WandbLogger` will reuse this run. If this is not desired, call `wandb.finish()` before instantiating `WandbLogger`.
+LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [1]
+Validation DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 24/24 [00:08<00:00,  2.84it/s]
+LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [1]
+  | Name         | Type                    | Params | Mode
+-----------------------------------------------------------------
+0 | model        | EnformerPretrainedModel | 72.1 M | train
+1 | loss         | BCEWithLogitsLoss       | 0      | train
+2 | val_metrics  | MetricCollection        | 0      | train
+3 | test_metrics | MetricCollection        | 0      | train
+4 | transform    | Identity                | 0      | train
+-----------------------------------------------------------------
+72.1 M    Trainable params
+0         Non-trainable params
+72.1 M    Total params
+288.279   Total estimated model params size (MB)
+240       Modules in train mode
+0         Modules in eval mode
+Epoch 9: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 319/319 [03:28<00:00,  1.53it/s, v_num=t24e, train_loss_step=0.118, train_loss_epoch=0.143]
+Testing DataLoader 0: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 284/284 [00:09<00:00, 28.44it/s]
+`Trainer.fit` stopped: `max_epochs=10` reached.
+wandb: WARNING Calling wandb.login() after wandb.init() has no effect.
+wandb: Downloading large artifact human_state_dict:latest, 939.29MB. 1 files...
+wandb:   1 of 1 files downloaded.
+Done. 0:0:0.7
+/opt/conda/lib/python3.11/site-packages/grelu/model/models.py:771: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
+GPU available: True (cuda), used: True
+TPU available: False, using: 0 TPU cores
+HPU available: False, using: 0 HPUs
+LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [1]
+CPU times: user 13.7 s, sys: 1.66 s, total: 15.4 s
+Wall time: 15.7 s
+/opt/conda/lib/python3.11/site-packages/plotnine/stats/stat_bin.py:109: PlotnineWarning: 'stat_bin()' using 'bins = 19'. Pick better value with 'binwidth'.
+GPU available: True (cuda), used: True
+TPU available: False, using: 0 TPU cores
+HPU available: False, using: 0 HPUs
+LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [1]
+/opt/conda/lib/python3.11/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:425: PossibleUserWarning: The 'predict_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=255` in the `DataLoader` to improve performance.
+Predicting DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 71/71 [00:04<00:00, 14.21it/s]
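
The model summary in `output.log` reports 72.1 M parameters and an estimated size of 288.279 MB. These two figures are consistent if each parameter is stored as a 32-bit float and a megabyte is counted as 10^6 bytes; both are assumptions about how the estimate is computed, not something the log states. A small sketch of that arithmetic, with hypothetical function names:

```python
def estimated_size_mb(num_params, bytes_per_param=4):
    """Estimated size in MB (10**6 bytes), assuming float32 parameters by default."""
    return num_params * bytes_per_param / 1e6


def params_from_size_mb(size_mb, bytes_per_param=4):
    """Invert the estimate: approximate parameter count from a size in MB."""
    return size_mb * 1e6 / bytes_per_param


# Back out the parameter count implied by the 288.279 MB figure in the log.
approx_params = params_from_size_mb(288.279)
print(f"{approx_params / 1e6:.1f} M parameters")
```

Under these assumptions the size in the log implies about 72.07 million parameters, which rounds to the 72.1 M shown in the summary table.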