title: Spatial Transcriptomics Viewer
emoji: 🧬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit

Spatial Transcriptomics Viewer

A web-based tool for visualizing spatial gene expression from AnnData (.h5ad) files.

Features

  • Interactive Visualization: Explore spatial gene expression with interactive Plotly plots
  • Memory Efficient: Uses AnnData backed mode for handling large datasets
  • Flexible Input: Load data from URLs (HuggingFace, Zenodo) or upload files
  • Single-Gene Queries: Visualize expression of individual genes across spatial coordinates
  • Expression Statistics: Get detailed statistics for each gene
  • Customizable: Adjust point size, color scale, and transformations
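The README does not say which statistics the tool reports per gene. As a rough illustration only, the sketch below computes a few common summaries with NumPy; the function name `expression_stats` and the choice of fields are assumptions, not the tool's actual output.

```python
import numpy as np

def expression_stats(values):
    """Summarize one gene's expression across all spots or cells."""
    v = np.asarray(values, dtype=float).ravel()
    if v.size == 0:
        return {"n_obs": 0, "n_expressing": 0, "pct_expressing": 0.0,
                "mean": 0.0, "median": 0.0, "max": 0.0}
    expressed = v > 0  # a spot "expresses" the gene if its value is positive
    return {
        "n_obs": int(v.size),
        "n_expressing": int(expressed.sum()),
        "pct_expressing": float(100.0 * expressed.mean()),
        "mean": float(v.mean()),
        "median": float(np.median(v)),
        "max": float(v.max()),
    }

# Toy values for four spots
stats = expression_stats([0, 0, 2, 6])
print(stats)
```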

Quick Start

Using the Public Demo

  1. Visit the Space URL
  2. Load your data:
    • URL: Paste a link to your h5ad file
    • Upload: Upload your h5ad file directly (< 2GB recommended)
  3. Enter a gene name and visualize!

For Heavy Usage: Duplicate This Space

For large files or frequent use, we recommend duplicating this Space to your account:

  1. Click the menu at the top right
  2. Select "Duplicate this Space"
  3. Choose your HuggingFace account
  4. (Optional) Upgrade to persistent storage for better performance

Benefits of Duplicating:

  • Independent computing resources
  • No queueing with other users
  • Private data processing
  • Customizable settings
  • Optional paid upgrades for more resources

Data Requirements

Your h5ad file must contain:

  • adata.obsm['spatial']: 2D spatial coordinates (N × 2 array)
  • Gene expression data in adata.X
  • Gene names in adata.var_names
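The requirements above can be checked before uploading. This is a minimal sketch, not the tool's own validator: `find_problems` is a hypothetical helper that relies only on attributes AnnData objects expose (`obsm`, `X`, `n_obs`, `n_vars`). A `SimpleNamespace` stand-in is used so the snippet runs without a file; pass a real `AnnData` object in practice.

```python
from types import SimpleNamespace

import numpy as np

def find_problems(adata):
    """Return a list of problems; an empty list means the object looks usable."""
    problems = []
    if "spatial" not in adata.obsm:
        problems.append("adata.obsm['spatial'] is missing")
    else:
        coords = np.asarray(adata.obsm["spatial"])
        if coords.ndim != 2 or coords.shape[1] != 2:
            problems.append(
                f"adata.obsm['spatial'] has shape {coords.shape}, expected (N, 2)"
            )
        elif coords.shape[0] != adata.n_obs:
            problems.append("adata.obsm['spatial'] row count differs from adata.n_obs")
    if adata.X is None:
        problems.append("adata.X is empty")
    if adata.n_vars == 0:
        problems.append("adata.var_names is empty")
    return problems

# Stand-in objects with the same attribute names as AnnData
good = SimpleNamespace(obsm={"spatial": np.zeros((3, 2))},
                       X=np.zeros((3, 5)), n_obs=3, n_vars=5)
bad = SimpleNamespace(obsm={"spatial": np.zeros((3, 3))},
                      X=np.zeros((3, 5)), n_obs=3, n_vars=5)
print(find_problems(good))
print(find_problems(bad))
```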

Supported formats:

  • Visium (10x Genomics)
  • MERFISH
  • seqFISH
  • Any spatial transcriptomics data in AnnData format

How It Works

Architecture

User Input (URL/Upload)
    ↓
Load h5ad with backed='r' (memory efficient)
    ↓
Validate spatial coordinates
    ↓
Query single gene expression
    ↓
Plotly interactive visualization
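The last step of the flow above might look like the sketch below. It builds a Plotly figure specification as a plain dictionary, so it runs without Plotly installed. The function name, default point size, color scale, and the reversed y-axis (common for image-based coordinates) are assumptions, not the tool's actual settings.

```python
import json

def spatial_figure_spec(x, y, expression, gene, point_size=4, colorscale="Viridis"):
    """Build a Plotly figure spec: one WebGL scatter trace colored by expression."""
    return {
        "data": [{
            "type": "scattergl",  # WebGL scatter copes better with many points
            "mode": "markers",
            "x": [float(v) for v in x],
            "y": [float(v) for v in y],
            "marker": {
                "color": [float(v) for v in expression],
                "colorscale": colorscale,
                "size": point_size,
                "showscale": True,
                "colorbar": {"title": {"text": gene}},
            },
        }],
        "layout": {
            "title": {"text": f"{gene} spatial expression"},
            "xaxis": {"title": {"text": "x"}},
            # Equal aspect ratio; y reversed on the assumption of image-style coordinates
            "yaxis": {"title": {"text": "y"}, "scaleanchor": "x", "autorange": "reversed"},
        },
    }

spec = spatial_figure_spec([0, 1], [2, 3], [0.5, 1.5], "GAPDH")
print(json.dumps(spec)[:60])
```

With Plotly installed, `plotly.graph_objects.Figure(spec)` should turn this dictionary into an interactive figure.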

Memory Efficiency

This tool uses AnnData's backed mode (backed='r'), which means:

  • Files are read from disk on-demand
  • Only requested data is loaded into memory
  • Can handle files much larger than available RAM
  • Suitable for large-scale spatial transcriptomics datasets

Technical Details

Stack

  • Frontend: Gradio 4.0+
  • Backend: Python 3.9+
  • Data: AnnData, scanpy
  • Visualization: Plotly
  • Platform: Hugging Face Spaces

File Size Limits

Public Space:

  • Recommended: < 2GB
  • Maximum: ~10GB (may be slow)

Duplicated Space (Free):

  • Recommended: < 5GB
  • With persistent storage upgrade: 50GB+

URL Sources

Supported domains for URL input:

  • huggingface.co - HuggingFace Datasets
  • zenodo.org - Zenodo repositories
  • s3.amazonaws.com - S3 buckets
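A domain allow-list like the one above could be enforced roughly as follows. This is a sketch under stated assumptions: the README does not say whether subdomains (for example bucket-style S3 hosts) or plain `http` are accepted, so accepting subdomains and requiring `https` are choices made here for illustration.

```python
from urllib.parse import urlparse

ALLOWED_DOMAINS = ("huggingface.co", "zenodo.org", "s3.amazonaws.com")

def is_supported_url(url):
    """True if the URL is https and its host is an allowed domain or a subdomain of one."""
    parsed = urlparse(url)
    if parsed.scheme != "https":  # assumption: only https is accepted
        return False
    host = (parsed.hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)

print(is_supported_url("https://zenodo.org/records/123/files/data.h5ad"))
print(is_supported_url("https://example.com/data.h5ad"))
```

Matching on `"." + domain` rather than a bare suffix keeps look-alike hosts such as `evilhuggingface.co` from passing.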

Usage Examples

Example 1: Visualize from HuggingFace Dataset

# If you have an h5ad file in a HuggingFace dataset:
URL = "https://huggingface.co/datasets/{username}/{dataset}/resolve/main/data.h5ad"

# Paste this URL in the tool and load
# Then enter gene names like: "GAPDH", "ACTB", "MYC"
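The URL pattern in Example 1 can be assembled programmatically. The helper below is a hypothetical convenience, and `alice` and `brain-visium` are placeholder names.

```python
from urllib.parse import quote

def hf_dataset_file_url(username, dataset, filename, revision="main"):
    """Build a direct-download URL for a file in a HuggingFace dataset repo."""
    # Percent-encode each path segment but keep "/" between folders
    path = "/".join(quote(part, safe="") for part in filename.split("/"))
    return (
        f"https://huggingface.co/datasets/{username}/{dataset}"
        f"/resolve/{quote(revision, safe='')}/{path}"
    )

url = hf_dataset_file_url("alice", "brain-visium", "data.h5ad")
print(url)
```

If `huggingface_hub` is installed, its `hf_hub_url` function with `repo_type="dataset"` builds this kind of URL as well.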

Example 2: Prepare Your Own Data

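import scanpy as sc

# Load a Visium run. sc.read_visium expects the Space Ranger output
# directory (the one holding filtered_feature_bc_matrix.h5 and spatial/),
# not the spatial/ folder itself, and fills adata.obsm['spatial'] for you.
adata = sc.read_visium("path/to/spaceranger_outs")
adata.var_names_make_unique()

# For other platforms, attach coordinates yourself as an N × 2 array whose
# rows follow the order of adata.obs_names:
# adata.obsm['spatial'] = coords

# Save as h5ad
adata.write("your_spatial_data.h5ad")

# Upload to a HuggingFace Dataset or use the file directly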

Privacy & Data Security

Public Space

  • Files are processed in temporary storage
  • No permanent data retention
  • Cleared after session ends
  • Not suitable for sensitive data

Duplicated Private Space

  • Data stays in your account
  • Full control over access
  • Suitable for private research data
  • Can delete anytime

Limitations

  • No preprocessing: Tool does not normalize, scale, or transform data
  • Read-only: Cannot modify or save h5ad files
  • Single gene: Visualize one gene at a time
  • 2D spatial only: Requires 2D coordinates in obsm['spatial']
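Because the tool does no preprocessing, any normalization has to happen before the file is saved. With scanpy this is usually `sc.pp.normalize_total(adata, target_sum=1e4)` followed by `sc.pp.log1p(adata)`. The NumPy sketch below shows the same arithmetic on a small dense matrix; it is an illustration, not a replacement for scanpy on real (often sparse) data.

```python
import numpy as np

def log_normalize(counts, target_sum=1e4):
    """Scale each row (spot) to target_sum total counts, then apply log(1 + x)."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # leave all-zero rows at zero instead of dividing by 0
    return np.log1p(counts / totals * target_sum)

raw = np.array([[1, 1], [0, 0], [3, 1]])
normalized = log_normalize(raw)
print(normalized.shape)
```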

Troubleshooting

"Spatial coordinates not found"

  • Check that your h5ad contains adata.obsm['spatial']
  • Ensure it's a 2D array (N × 2)

"Gene not found"

  • Check gene name spelling
  • Use exact gene names from adata.var_names
  • Tool will suggest similar gene names
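How the tool picks similar names is not documented. One simple way to get such suggestions, shown here purely as an illustration, is the standard library's `difflib.get_close_matches`, applied case-insensitively; `suggest_genes` is a hypothetical helper.

```python
import difflib

def suggest_genes(query, var_names, n=5, cutoff=0.6):
    """Return up to n gene names that resemble the query, ignoring case."""
    by_upper = {}
    for name in var_names:
        # Map the upper-cased form back to the original spelling
        by_upper.setdefault(str(name).upper(), str(name))
    matches = difflib.get_close_matches(query.upper(), list(by_upper), n=n, cutoff=cutoff)
    return [by_upper[m] for m in matches]

genes = ["GAPDH", "ACTB", "MYC"]
print(suggest_genes("gapd", genes))
```

With a real file, pass `adata.var_names` as `var_names`.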

"File too large" or slow loading

  • Try duplicating the Space for more resources
  • Consider subsetting your data
  • Use URL input instead of upload

Memory errors

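  • Check that the file is within the size limits above: backed mode keeps the expression matrix on disk, but coordinates and metadata are still loaded into RAM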
  • Duplicate Space for more RAM
  • Consider downsampling your dataset
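Downsampling can be done locally before uploading. The sketch below draws a reproducible random subset of observation indices; the function name and the seed default are assumptions. The indices are returned sorted, which keeps the original spot order and suits HDF5-backed reads.

```python
import numpy as np

def downsample_indices(n_obs, n_keep, seed=0):
    """Pick n_keep distinct row indices out of n_obs, in ascending order."""
    rng = np.random.default_rng(seed)
    n_keep = min(n_keep, n_obs)
    return np.sort(rng.choice(n_obs, size=n_keep, replace=False))

idx = downsample_indices(1000, 100)
print(idx[:5])
```

With an in-memory AnnData object, `adata[idx].copy().write_h5ad("subset.h5ad")` would then save the reduced dataset (the output file name is a placeholder).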

Development

Local Setup

# Clone the repository
git clone <repo_url>
cd spatial-viewer

# Install dependencies
pip install -r requirements.txt

# Run locally
python app.py

Project Structure

spatial-viewer/
├── app.py                  # Main Gradio application
├── utils/
│   ├── __init__.py
│   ├── loader.py           # H5ad loading with backed mode
│   ├── validator.py        # AnnData validation
│   └── plot.py             # Plotly visualization
├── data/
│   └── demo.h5ad           # (Optional) Demo dataset
├── requirements.txt        # Python dependencies
├── README.md               # This file
└── .huggingface/
    └── space_config.yaml   # HF Space configuration

Contributing

Contributions welcome! Areas for improvement:

  • Multi-gene visualization
  • Additional plot types
  • Performance optimizations
  • UI enhancements
  • Documentation

Citation

If you use this tool in your research, please cite:

@software{spatial_viewer,
  title = {Spatial Transcriptomics Viewer},
  author = {Your Name},
  year = {2025},
  url = {https://huggingface.co/spaces/...}
}

License

MIT License - see LICENSE file for details

Acknowledgments


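Summary

Features

A web-based tool for visualizing spatial transcriptomics gene expression from AnnData (.h5ad) files.

Key features:

  • Interactive visualization
  • Memory efficient (supports large files)
  • Flexible input (URL or upload)
  • Single-gene expression queries
  • Expression statistics

Usage

  1. Load data: provide a URL or upload an h5ad file
  2. Enter a gene name: type the gene you want to view
  3. Visualize: view the spatial expression plot and statistics

Large Files or Frequent Use

For large h5ad files (> 2GB) or frequent use, we recommend duplicating this Space to your account:

  • Independent computing resources
  • No queueing
  • Data privacy
  • Optional paid upgrades

Data Requirements

Your h5ad file must contain:

  • adata.obsm['spatial']: spatial coordinates (N × 2)
  • adata.X: gene expression data
  • adata.var_names: gene names

Visium, MERFISH, seqFISH, and similar formats are supported.

How It Works

Uses AnnData's backed mode (backed='r'):

  • Reads data from disk on demand
  • Minimizes memory usage
  • Can handle files larger than available RAM
  • Suitable for large-scale spatial transcriptomics data

Built for the spatial transcriptomics research community 🧬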