LookingGlass Optimal Temperature Classifier
Predicts whether a DNA read originates from a gene encoding an enzyme whose optimal temperature is psychrophilic (<15°C), mesophilic (20-40°C), or thermophilic (>50°C). Reported accuracy is 70.1%.
This is a pure PyTorch implementation fine-tuned from the LookingGlass base model.
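The three temperature ranges leave two gaps (15-20°C and 40-50°C). The sketch below shows one way to bin an enzyme's optimal temperature into the three classes. It assumes the mesophilic bounds are inclusive and that values falling in a gap belong to no class; neither detail is stated here, and `temperature_class` is a hypothetical helper, not part of the repository.

```python
from typing import Optional


def temperature_class(optimum_celsius: float) -> Optional[str]:
    """Map an optimal temperature (°C) to a class name, or None if it falls in a gap.

    Assumption: the 20-40°C bounds are inclusive and the 15-20°C and
    40-50°C gaps are left unlabeled.
    """
    if optimum_celsius < 15:
        return "psychrophilic"
    if 20 <= optimum_celsius <= 40:
        return "mesophilic"
    if optimum_celsius > 50:
        return "thermophilic"
    return None  # falls between the stated ranges


# Print the class for a few example optima, including one in a gap (45°C)
for t in (4.0, 37.0, 45.0, 80.0):
    print(t, temperature_class(t))
```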
Links
- Paper: Deep learning of a bacterial and archaeal universal language of life enables transfer learning and illuminates microbial dark matter (Nature Communications, 2022)
- GitHub: ahoarfrost/LookingGlass
- Base Model: HoarfrostLab/lookingglass-v1
Citation
@article{hoarfrost2022deep,
  title={Deep learning of a bacterial and archaeal universal language of life enables transfer learning and illuminates microbial dark matter},
  author={Hoarfrost, Adrienne and Aptekmann, Ariel and Farfanuk, Gaetan and Bromberg, Yana},
  journal={Nature Communications},
  volume={13},
  number={1},
  pages={2606},
  year={2022},
  publisher={Nature Publishing Group}
}
Model
| Property | Value |
|---|---|
| Architecture | LookingGlass encoder + classification head |
| Encoder | AWD-LSTM (3-layer, unidirectional) |
| Classes | 3 (psychrophilic, mesophilic, thermophilic) |
| Parameters | ~17M |
Installation
pip install torch
git clone https://huggingface.co/HoarfrostLab/LGv1_OptimalTempClassifier
cd LGv1_OptimalTempClassifier
Usage
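# Run from inside the cloned repository directory
from lookingglass_classifier import LookingGlassClassifier, LookingGlassTokenizer
# Load the model from the current directory and switch to inference mode
model = LookingGlassClassifier.from_pretrained('.')
tokenizer = LookingGlassTokenizer()
model.eval()
# Tokenize a batch of DNA reads
inputs = tokenizer(["GATTACA", "ATCGATCGATCG"], return_tensors=True)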
# Get predictions
predictions = model.predict(inputs['input_ids'])
print(predictions) # tensor([class_idx, class_idx])
# Get probabilities
probs = model.predict_proba(inputs['input_ids'])
print(probs.shape) # torch.Size([2, 3])
# Get raw logits
logits = model(inputs['input_ids'])
print(logits.shape) # torch.Size([2, 3])
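The predictions above are class indices, and this README does not state which index corresponds to which class. The sketch below turns raw logits into a label and a confidence per read using plain Python. The label order in `ASSUMED_LABELS` is an assumption for illustration only and should be checked against the repository before use; `softmax` and `label_reads` are hypothetical helpers, not part of `lookingglass_classifier`.

```python
import math

# ASSUMPTION: index 0/1/2 map to these names in this order.
# Verify against the repository's configuration before relying on it.
ASSUMED_LABELS = ["psychrophilic", "mesophilic", "thermophilic"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_reads(batch_logits, labels=ASSUMED_LABELS):
    """Return a (label, probability) pair for each row of logits."""
    results = []
    for row in batch_logits:
        probs = softmax(row)
        idx = max(range(len(probs)), key=probs.__getitem__)
        results.append((labels[idx], probs[idx]))
    return results


# Made-up logits standing in for model output of shape [2, 3]
example_logits = [[0.2, 1.5, -0.3], [2.0, 0.1, 0.4]]
for label, p in label_reads(example_logits):
    print(f"{label}: {p:.3f}")
```

With the real model, `logits.tolist()` converts the tensor into the nested list this helper expects.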
License
MIT License