---
title: ORD Reagent Index Builder
emoji: 🧪
colorFrom: blue
colorTo: purple
sdk: gradio
app_file: app.py
pinned: false
license: apache-2.0
---

# ORD Reagent Index Builder

Fast search index builder for the Open Reaction Database (2.7M reactions) on Hugging Face Spaces.

## Features

✅ **No Docker** - Pure Python with Gradio  
✅ **Fast** - 10-20 minutes on HF servers  
✅ **Simple** - Single click to start  
✅ **Smart** - PubChem chemical name lookup  
✅ **Streaming** - Memory-efficient processing  
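
The streaming idea can be sketched with plain generators: rows are produced one reaction at a time and grouped into small batches, so the full 2.7M-reaction dataset never has to sit in memory. This is only an illustration, not the code in `app.py`. The input record shape (`reaction_id`, `reagents`), the `"name"` search type, and the helper names `iter_index_rows` and `batched` are assumptions; only `search_term` and `search_type` come from the index schema shown under Usage.

```python
from itertools import islice


def iter_index_rows(reactions):
    """Yield one index row per reagent identifier, one reaction at a time.

    `reactions` is any iterable of dicts; the keys used here are hypothetical.
    """
    for reaction in reactions:
        for reagent in reaction.get("reagents", []):
            # Emit a row for each identifier the reagent actually has.
            for search_type in ("smiles", "name"):
                term = reagent.get(search_type)
                if term:
                    yield {
                        "search_term": term,
                        "search_type": search_type,
                        "reaction_id": reaction["reaction_id"],
                    }


def batched(rows, size):
    """Group a row iterator into lists of at most `size` rows."""
    iterator = iter(rows)
    while batch := list(islice(iterator, size)):
        yield batch
```

Because both functions are lazy, only one batch of rows is held in memory at any time.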

## Setup

1. Create a Space with the Gradio SDK
2. Add `HF_TOKEN` as a Space secret
3. Click "Start Building Index"
4. Watch the progress
5. Dataset auto-uploads to `smitathkr1/ord-reagent-index`
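
Space secrets are exposed to the app as environment variables, so steps 2 and 5 could look roughly like the sketch below. It is not the actual `app.py`: the function names `get_hf_token` and `upload_index` are hypothetical, and it assumes the index rows are available as a list of dicts.

```python
import os


def get_hf_token():
    """Read the HF_TOKEN Space secret from the environment."""
    token = os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it as a Space secret.")
    return token


def upload_index(rows, repo_id="smitathkr1/ord-reagent-index"):
    """Push a list of row dicts to the Hub as a dataset (hypothetical helper)."""
    # Imported here so the token check above works without `datasets` installed.
    from datasets import Dataset

    Dataset.from_list(rows).push_to_hub(repo_id, token=get_hf_token())
```

Failing early on a missing token avoids running the full build only to have the upload rejected at the end.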

## Usage

```python
from datasets import load_dataset

# Load the index
ds = load_dataset('smitathkr1/ord-reagent-index')

# Search for SMILES
smiles_results = ds.filter(lambda x: x['search_term'] == 'c1ccccc1' and x['search_type'] == 'smiles')

# Search for reagent names
name_results = ds.filter(lambda x: x['search_term'].startswith('water'))
```
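
The two filters above can also be written as small reusable predicates. The sketch below assumes some rows may have a missing `search_term` and that name matching should ignore case; neither is confirmed by the dataset itself, and the helper names are hypothetical.

```python
def is_exact_smiles(row, smiles):
    """True if the row is a SMILES entry that matches `smiles` exactly."""
    return row.get("search_type") == "smiles" and row.get("search_term") == smiles


def name_starts_with(row, prefix):
    """True if the row's search term starts with `prefix`, ignoring case."""
    term = row.get("search_term") or ""  # tolerate missing or None terms
    return term.lower().startswith(prefix.lower())


# Usage with the dataset loaded above:
# smiles_results = ds.filter(lambda row: is_exact_smiles(row, "c1ccccc1"))
# name_results = ds.filter(lambda row: name_starts_with(row, "water"))
```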

## Performance

- **Local PC:** 45-60 minutes  
- **HF Spaces:** 10-20 minutes  
- **Speedup:** roughly 2-6x faster (based on the timings above)

## About

Built with:
- **Gradio** - Web UI
- **Hugging Face Datasets** - Data handling
- **PubChem** - Chemical name lookup
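
For reference, a chemical name lookup against PubChem can be done through its PUG REST interface. The sketch below requests the `IUPACName` property for a SMILES string; it is an illustration of the general approach, not necessarily how this Space queries PubChem, and the function names are hypothetical. SMILES containing characters such as `/` may need to be sent as a POST body or query argument instead of in the URL path.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pubchem_name_url(smiles):
    """Build the PUG REST URL that returns the IUPAC name for a SMILES string."""
    encoded = quote(smiles, safe="")  # percent-encode characters such as '=' and '#'
    return f"{PUG_REST}/compound/smiles/{encoded}/property/IUPACName/JSON"


def lookup_name(smiles, timeout=10):
    """Return the IUPAC name for `smiles`, or None if PubChem lists none."""
    with urlopen(pubchem_name_url(smiles), timeout=timeout) as response:
        payload = json.load(response)
    properties = payload["PropertyTable"]["Properties"]
    return properties[0].get("IUPACName") if properties else None
```

When looking up many reagents, caching results and throttling requests helps stay within PubChem's usage limits.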