Daniil Litvinov committed on
Commit
5f6420b
·
1 Parent(s): 202740c

Added README.md

Files changed (1)
  1. README.md +175 -11
README.md CHANGED
@@ -1,11 +1,175 @@
- ---
- license: mit
- tags:
- - model_hub_mixin
- - pytorch_model_hub_mixin
- ---
-
- This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
- - Code: stoic
- - Paper: [More Information Needed]
- - Docs: [More Information Needed]
+ # Stoic
+
+ Fast and accurate protein stoichiometry prediction.
+
+
+ [![license](https://img.shields.io/badge/license-MIT-blue)](https://github.com/PickyBinders/stoic/blob/master/LICENSE.txt)
+ [![bioRxiv](https://img.shields.io/badge/bioRxiv-2026.03.13.711535-blue.svg)](https://www.biorxiv.org/content/10.64898/2026.03.13.711535)
+ [![codecov](https://codecov.io/gh/PickyBinders/stoic/branch/main/graph/badge.svg)](https://codecov.io/gh/PickyBinders/stoic)
+ [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/PickyBinders/stoic/blob/main/stoic_colab.ipynb)
+ [![Open in Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/PickyBinders/stoic-space)
+ [![HuggingFace model](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue)](https://huggingface.co/PickyBinders/stoic)
+
+ ![Model Architecture](images/Figure_architecture.png)
+
+ Stoic predicts copy numbers for protein complex components directly from sequence, and can also export AF3-ready JSON based on the top predicted stoichiometries.
+
+ Web version (Hugging Face Space): [stoic-space](https://huggingface.co/spaces/PickyBinders/stoic-space)
+ Pre-print: [Stoic: Fast and accurate protein stoichiometry prediction](https://www.biorxiv.org/content/10.64898/2026.03.13.711535v1.abstract?%3Fcollection=)
+
+ ## Installation
+
+ ### 1. Create and activate an environment
+
+ #### `venv`
+
+ ```bash
+ python -m venv .venv
+ source .venv/bin/activate
+ ```
+
+ #### `conda` / `mamba`
+
+ ```bash
+ mamba create -n stoic-env python=3.10 -y
+ mamba activate stoic-env
+ ```
+
+ ### 2. Install Stoic (after env activation)
+
+ #### Install from local clone (editable)
+
+ ```bash
+ git clone https://github.com/PickyBinders/stoic.git
+ cd stoic
+ python -m pip install --upgrade pip
+ python -m pip install -e .
+ ```
+
+ #### Install directly from GitHub
+
+ ```bash
+ python -m pip install git+https://github.com/PickyBinders/stoic.git
+ ```
+
+ > **Note:** The first inference run requires an internet connection to download model weights from Hugging Face. Subsequent runs reuse cached files from `~/.cache/huggingface`, so offline usage works once the model is cached.
+
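Whether the weights are already cached can be checked before going offline. The following is a minimal sketch, assuming the default Hugging Face Hub cache layout (`hub/models--<org>--<name>/snapshots` under the cache root) and no `HF_HOME` or `HF_HUB_CACHE` override. The helpers `hub_cache_dir` and `is_cached` are hypothetical names introduced here and are not part of Stoic.

```python
from pathlib import Path
from typing import Optional


def hub_cache_dir(repo_id: str, cache_root: Optional[Path] = None) -> Path:
    """Return the directory where the Hub cache is assumed to store `repo_id`."""
    if cache_root is None:
        # Default location named in the note above.
        cache_root = Path.home() / ".cache" / "huggingface"
    # Assumed layout: <root>/hub/models--<org>--<name>
    return cache_root / "hub" / ("models--" + repo_id.replace("/", "--"))


def is_cached(repo_id: str, cache_root: Optional[Path] = None) -> bool:
    """True if at least one snapshot directory exists for `repo_id`."""
    snapshots = hub_cache_dir(repo_id, cache_root) / "snapshots"
    return snapshots.is_dir() and any(snapshots.iterdir())
```

Calling `is_cached("PickyBinders/stoic")` after a first successful run should then indicate whether offline use is likely to work. It only checks that a snapshot directory exists, not that every file in it is complete.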
+ ## Predict Stoichiometry from CLI
+
+ The `stoic_predict_stoichiometry` command supports:
+
+ 1. a list of sequences,
+ 2. a single FASTA file,
+ 3. a directory of FASTA files (each FASTA treated as a separate complex).
+
+ ```text
+ usage: stoic_predict_stoichiometry [-h]
+                                    [--sequences SEQ [SEQ ...] | --input-path INPUT_PATH]
+                                    [--model MODEL]
+                                    [--top-n TOP_N]
+                                    [--return-residue-weights]
+                                    [--max-inference-seq-len MAX_INFERENCE_SEQ_LEN]
+                                    [--output-dir OUTPUT_DIR]
+                                    [--device DEVICE]
+
+ options:
+   -h, --help            show this help message and exit
+   --sequences SEQ [SEQ ...]
+                         Protein sequences (one per unique chain)
+   --input-path INPUT_PATH
+                         Path to a FASTA file or a directory with FASTA files
+   --model MODEL         HuggingFace model name or local path (default: PickyBinders/stoic)
+   --top-n TOP_N         Number of top stoichiometry candidates (default: 3)
+   --return-residue-weights
+                         Return residue weights and save residue-level predictions
+   --max-inference-seq-len MAX_INFERENCE_SEQ_LEN
+                         Maximum sequence length for full-length inference
+   --output-dir OUTPUT_DIR
+                         Output directory for predictions and AF3 JSON files
+   --device DEVICE       Device to use, e.g. cuda or cpu (default: auto-detect)
+ ```
+
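When the command is driven from a script, the flags listed in the usage text can be assembled programmatically. This is a minimal sketch that uses only the flags shown above. `build_command` is a hypothetical helper, not part of Stoic, and its insistence on exactly one of `sequences` or `input_path` is a choice of this sketch, based on the usage line showing the two as alternatives.

```python
from typing import List, Optional, Sequence


def build_command(
    sequences: Optional[Sequence[str]] = None,
    input_path: Optional[str] = None,
    top_n: int = 3,
    output_dir: Optional[str] = None,
    device: Optional[str] = None,
    return_residue_weights: bool = False,
) -> List[str]:
    """Build an argv list for stoic_predict_stoichiometry."""
    # --sequences and --input-path are alternatives in the usage line.
    if (sequences is None) == (input_path is None):
        raise ValueError("pass exactly one of sequences or input_path")
    cmd = ["stoic_predict_stoichiometry"]
    if sequences is not None:
        cmd += ["--sequences", *sequences]
    else:
        cmd += ["--input-path", input_path]
    cmd += ["--top-n", str(top_n)]
    if output_dir is not None:
        cmd += ["--output-dir", output_dir]
    if device is not None:
        cmd += ["--device", device]
    if return_residue_weights:
        cmd.append("--return-residue-weights")
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` in an environment where Stoic is installed.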
+ ### Sequence list
+
+ ```bash
+ stoic_predict_stoichiometry \
+   --sequences "SENECA" "VIRTVS" \
+   --top-n 3
+ ```
+
+ ### Single FASTA file
+
+ ```bash
+ stoic_predict_stoichiometry \
+   --input-path path/to/complex.fasta \
+   --top-n 3
+ ```
+
+ ### Directory of FASTA files
+
+ ```bash
+ stoic_predict_stoichiometry \
+   --input-path path/to/fasta_dir \
+   --top-n 3 \
+   --output-dir stoic_predictions
+ ```
+
+ In directory mode, outputs are saved per complex (`<fasta_stem>.json`, `<fasta_stem>_af3_input.json`, and optional residue predictions).
+
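A directory for this mode can be prepared from sequences held in memory. The sketch below writes one FASTA file per complex with one record per unique chain. It assumes that a `.fasta` extension is accepted (as in the single-file example) and that record names are free-form. `write_complex_fastas` and the `chain_<i>` headers are placeholders invented for this illustration.

```python
from pathlib import Path
from typing import Dict, List


def write_complex_fastas(complexes: Dict[str, List[str]], out_dir: Path) -> List[Path]:
    """Write <name>.fasta for each complex and return the written paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, sequences in complexes.items():
        path = out_dir / f"{name}.fasta"
        # One FASTA record per unique chain; header names are placeholders.
        records = [f">chain_{i}\n{seq}\n" for i, seq in enumerate(sequences, start=1)]
        path.write_text("".join(records))
        written.append(path)
    return written
```

Each file stem then becomes the `<fasta_stem>` used in the per-complex output names.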
+ ### Output files
+
+ When `--output-dir` is provided:
+
+ - single input (sequence list or single FASTA):
+   - `results.json`
+   - `af3_input.json`
+   - `residue_predictions.pkl` (if `--return-residue-weights`)
+ - FASTA directory input:
+   - `<complex_name>.json`
+   - `<complex_name>_af3_input.json`
+   - `<complex_name>_residue_predictions.pkl` (if `--return-residue-weights`)
+
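The naming scheme above can be expressed as a small helper, for example to check that a run produced everything expected. This is a sketch built from the file names listed here. `expected_outputs` is a hypothetical function, not part of Stoic.

```python
from pathlib import Path
from typing import List, Optional


def expected_outputs(
    output_dir: str,
    complex_name: Optional[str] = None,
    residue_weights: bool = False,
) -> List[Path]:
    """List the files expected in `output_dir` for one complex.

    complex_name=None models a single input (sequence list or single FASTA);
    otherwise it is the FASTA stem used in directory mode.
    """
    if complex_name is None:
        names = ["results.json", "af3_input.json"]
        if residue_weights:
            names.append("residue_predictions.pkl")
    else:
        names = [f"{complex_name}.json", f"{complex_name}_af3_input.json"]
        if residue_weights:
            names.append(f"{complex_name}_residue_predictions.pkl")
    return [Path(output_dir) / name for name in names]
```

A check such as `all(p.exists() for p in expected_outputs("stoic_predictions", "complex_a"))` can then confirm a completed run.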
+ ## Use as a Python API
+
+ ### High-level inference helper
+
+ ```python
+ from stoic.predict_stoichiometry import predict_stoichiometry
+
+ results = predict_stoichiometry(
+     sequences=["SENECA", "VIRTVS"],  # or FASTA path / FASTA dir path
+     model_name="PickyBinders/stoic",
+     top_n=3,
+ )
+ print(results)
+ ```
+
+ ### Load model directly from Hugging Face
+
+ ```python
+ import torch
+ from stoic.model import Stoic
+
+
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+ model = Stoic.from_pretrained("PickyBinders/stoic")
+ model.eval().to(device)
+ pred = model.predict_stoichiometry(["SENECA", "VIRTVS"], top_n=3)
+ print(pred)
+ ```
+
+ ## Citation
+
+ If you use Stoic, please cite:
+
+ ```text
+ @article{litvinov2026stoic,
+   title = {Stoic: Fast and accurate protein stoichiometry prediction},
+   author = {Litvinov, Daniil and Pantolini, Lorenzo and {\v{S}}krinjar, Peter and Tauriello, Gerardo and McCafferty, Caitlyn L and Engel, Benjamin D and Schwede, Torsten and Durairaj, Janani},
+   journal = {bioRxiv},
+   year = {2026},
+   doi = {10.64898/2026.03.13.711535},
+   url = {https://www.biorxiv.org/content/10.64898/2026.03.13.711535v1}
+ }
+ ```
+