---
title: ORD Reagent Index Builder
emoji: 🧪
colorFrom: blue
colorTo: purple
sdk: gradio
app_file: app.py
pinned: false
license: apache-2.0
---

# ORD Reagent Index Builder

Fast search index builder for the Open Reaction Database (2.7M reactions), running on Hugging Face Spaces.

## Features

- ✅ **No Docker** - Pure Python with Gradio
- ✅ **Fast** - 10-20 minutes on HF servers
- ✅ **Simple** - Single click to start
- ✅ **Smart** - PubChem chemical name lookup
- ✅ **Streaming** - Memory-efficient processing

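
The PubChem name lookup listed above could look roughly like the sketch below. It is not the code in `app.py`: it assumes the public PubChem PUG REST endpoint and its `IUPACName` property, and the function names (`pubchem_name_url`, `parse_iupac_name`, `lookup_name`) are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base URL of the public PubChem PUG REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pubchem_name_url(smiles: str) -> str:
    """Build a PUG REST URL asking for the IUPAC name of a SMILES string."""
    # SMILES may contain '=', '#', '/' and similar characters, so percent-encode
    # the whole string. Unusual SMILES may still need a POST request instead.
    encoded = quote(smiles, safe="")
    return f"{PUG_REST}/compound/smiles/{encoded}/property/IUPACName/JSON"


def parse_iupac_name(payload: dict):
    """Return the first IUPAC name in a property response, or None."""
    properties = payload.get("PropertyTable", {}).get("Properties", [])
    for entry in properties:
        if "IUPACName" in entry:
            return entry["IUPACName"]
    return None


def lookup_name(smiles: str, timeout: float = 10.0):
    """Fetch the IUPAC name for one SMILES string (makes a network request)."""
    with urlopen(pubchem_name_url(smiles), timeout=timeout) as response:
        return parse_iupac_name(json.load(response))
```

Keeping URL construction and response parsing separate from the network call makes both easy to check offline. A real builder would also want rate limiting and caching, which this sketch leaves out.
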
## Setup

1. Create a Space with the Gradio SDK
2. Add `HF_TOKEN` as a Space secret
3. Click "Start Building Index"
4. Watch the progress
5. The dataset uploads automatically to `smitathkr1/ord-reagent-index`

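
Space secrets are exposed to the app as environment variables, so the upload step can read `HF_TOKEN` from the environment. The sketch below shows one way that might be wired up. The helper names `get_hf_token` and `upload_index` are hypothetical, and the row format is an assumption; only the secret name and the target repository come from the steps above.

```python
import os


def get_hf_token(env=os.environ) -> str:
    """Return the Hugging Face token, failing early if the secret is missing."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it as a Space secret.")
    return token


def upload_index(rows, repo_id="smitathkr1/ord-reagent-index"):
    """Push a list of row dicts to the Hub (requires the `datasets` package)."""
    # Imported lazily so get_hf_token() can be used without `datasets` installed.
    from datasets import Dataset

    Dataset.from_list(rows).push_to_hub(repo_id, token=get_hf_token())
```

Checking the token before the build starts avoids spending 10-20 minutes on indexing only to fail at the upload.
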
## Usage

```python
from datasets import load_dataset

# Load the index (returns a DatasetDict; filter() is applied to every split)
ds = load_dataset('smitathkr1/ord-reagent-index')

# Exact match on a SMILES string (here: benzene)
smiles_results = ds.filter(lambda x: x['search_term'] == 'c1ccccc1' and x['search_type'] == 'smiles')

# Prefix match on reagent names
name_results = ds.filter(lambda x: x['search_term'].startswith('water'))
```

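
Each `filter` call scans the whole dataset, so repeated exact-match queries may be faster against an in-memory lookup built once. The sketch below is one possible approach, not part of the Space itself. The sample rows and the `'name'` value for `search_type` are hypothetical; only the `search_term` and `search_type` columns come from the example above.

```python
from collections import defaultdict


def build_lookup(rows):
    """Group rows by (search_type, search_term) for constant-time exact matches."""
    lookup = defaultdict(list)
    for row in rows:
        lookup[(row["search_type"], row["search_term"])].append(row)
    return lookup


# Hypothetical sample rows standing in for one split of the loaded dataset.
sample_rows = [
    {"search_term": "c1ccccc1", "search_type": "smiles"},
    {"search_term": "water", "search_type": "name"},
    {"search_term": "c1ccccc1", "search_type": "smiles"},
]

lookup = build_lookup(sample_rows)

# Use .get() so that a missing key does not add an empty entry to the lookup.
benzene_hits = lookup.get(("smiles", "c1ccccc1"), [])
print(len(benzene_hits))
```

With the real index, pass a single split of the loaded dataset to `build_lookup` instead of `sample_rows`. This trades memory for speed, so it only makes sense when the index fits comfortably in RAM.
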
## Performance

- **Local PC:** 45-60 minutes
- **HF Spaces:** 10-20 minutes
- **Speedup:** roughly 2-6x faster, going by the times above

## About

Built with:

- **Gradio** - Web UI
- **Hugging Face Datasets** - Data handling
- **PubChem** - Chemical name lookup