---
title: CRISPR Array Detection
emoji: 🧬
colorFrom: gray
colorTo: gray
sdk: docker
pinned: false
license: mit
short_description: Detect CRISPR arrays in DNA sequences
---
# crispr-detect
BERT-based CRISPR array detection in prokaryotic genomes.
## Model
| property | value |
|---|---|
| architecture | BERT, 24 layers, 768 hidden, 430M params |
| input | DNA sequence (min 1000 bp) |
| output | per-position probability (0-1) |
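
The input and output rows above can be illustrated with a short sketch. It is not the code shipped in this Space: the names `validate_sequence` and `call_regions`, the `ACGTN` alphabet, and the 0.5 threshold are assumptions for illustration only. Only the 1000 bp minimum and the 0-1 per-position output come from the table.

```python
MIN_LENGTH_BP = 1000          # minimum input length from the table above
ALLOWED_BASES = set("ACGTN")  # assumption: the real app may accept a different alphabet


def validate_sequence(seq: str) -> str:
    """Return the cleaned, upper-cased sequence or raise ValueError."""
    seq = "".join(seq.split()).upper()  # drop whitespace and line breaks
    if len(seq) < MIN_LENGTH_BP:
        raise ValueError(f"sequence has {len(seq)} bp, need at least {MIN_LENGTH_BP}")
    bad = set(seq) - ALLOWED_BASES
    if bad:
        raise ValueError(f"unexpected characters: {sorted(bad)}")
    return seq


def call_regions(probs, threshold=0.5):
    """Turn per-position probabilities into half-open (start, end) intervals.

    The threshold is a hypothetical default, not a value taken from the model.
    """
    regions, start = [], None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i                     # a region opens at position i
        elif p < threshold and start is not None:
            regions.append((start, i))    # the region closes just before i
            start = None
    if start is not None:                 # region runs to the end of the input
        regions.append((start, len(probs)))
    return regions
```

For example, `call_regions([0.1, 0.9, 0.8, 0.2, 0.7])` yields two intervals, one covering positions 1-2 and one covering position 4.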
## Deployment
### Push changes
```bash
# local checkout of the Space repository
cd /vol/hpcprojects/pmuench/crispr_tool/crispr-hf-space
git add -A
git commit -m "description"   # replace with a real commit message
git push
```
### Git credentials (first time)
```bash
git config --global credential.helper store
huggingface-cli login
# paste token from https://huggingface.co/settings/tokens
```
### Clone fresh
```bash
git clone https://huggingface.co/spaces/genomenet/crispr-array-detection
```
### Space settings (HuggingFace web UI)
- SDK: Docker
- Hardware: CPU Basic works for the default demo; T4 GPU is recommended for long sequences or low stride values
- Visibility: Public
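
The hardware note can be made concrete with a rough cost estimate. The sketch below assumes the app scans the input with a fixed-length sliding window, which this README does not confirm. The function name `count_windows` and the 1000 bp window in the usage note are hypothetical. It shows why a smaller stride multiplies the number of forward passes.

```python
import math


def count_windows(seq_len: int, window: int, stride: int) -> int:
    """Number of windows needed to cover seq_len positions.

    Assumes a plain sliding-window scan; the real app may batch or pad differently.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if seq_len <= window:
        return 1  # a single window covers the whole sequence
    # one window at position 0, then one per stride step until the end is covered
    return math.ceil((seq_len - window) / stride) + 1
```

With a hypothetical 1000 bp window, a 10,000 bp sequence needs 19 windows at stride 500 and 91 at stride 100, so the model runs roughly five times as often.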
### Model weights
Hosted at: https://huggingface.co/genomenet/crispr-bert-model
Downloaded automatically via `huggingface_hub` at startup.
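
A minimal sketch of such a startup download, assuming the whole model repository is fetched with `snapshot_download`. The actual logic lives in `inference/model_loader.py` and may differ. The function name `fetch_model_weights` and the optional `cache_dir` argument are illustrative.

```python
MODEL_REPO_ID = "genomenet/crispr-bert-model"  # repo id taken from the URL above


def fetch_model_weights(repo_id: str = MODEL_REPO_ID, cache_dir=None) -> str:
    """Download the model repository (or reuse the local cache) and return its path."""
    # Imported lazily so this module still loads where huggingface_hub is absent.
    from huggingface_hub import snapshot_download

    # snapshot_download skips files that are already present in the cache.
    return snapshot_download(repo_id=repo_id, cache_dir=cache_dir)
```

Calling `fetch_model_weights()` once at startup returns a local directory, so restarts can reuse the cached files for as long as the cache directory persists.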
## Local dev
```bash
pip install -r requirements.txt
python app.py
# http://localhost:7860
```
## Files
```
├── app.py                  # gradio app
├── inference/
│   ├── model_loader.py     # model download
│   ├── tokenizer.py        # sequence validation
│   └── inference.py        # prediction
├── Dockerfile
└── requirements.txt
```
## Acknowledgements
- Ziyu Mu (HZI BIFO)
- DFG SPP 2141 (MC 172)
- BMBF GenomeNet