metadata
title: Bilingual Concordancer
emoji: 🔎
colorFrom: blue
colorTo: gray
sdk: docker
app_port: 7860
pinned: false
Bilingual Concordancer (Streamlit)
This Space serves a bilingual concordancer with word alignment highlighting.
Required corpus files
Place these Parquet files in the Space root (same folder as stwebm_parquet_wa.py):
- UNPCwa.parquet
- NTURegswa.parquet
- SATwa.parquet
- VOAwa.parquet
The app expects these exact filenames.
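Because the filenames must match exactly, a quick pre-flight check can catch a missing or misnamed file before the Space is built. The following is a minimal sketch and not part of the app: the helper name `missing_corpus_files` is hypothetical, and it assumes the script is run from the Space root.

```python
from pathlib import Path

# The four corpus files named above; names are case-sensitive.
REQUIRED_FILES = (
    "UNPCwa.parquet",
    "NTURegswa.parquet",
    "SATwa.parquet",
    "VOAwa.parquet",
)


def missing_corpus_files(root="."):
    """Return the required Parquet files that are not present in `root`.

    Hypothetical helper for illustration; the app itself may check differently.
    """
    root = Path(root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_corpus_files()
    if missing:
        print("Missing corpus files:", ", ".join(missing))
    else:
        print("All corpus files found.")
```

An empty list means all four files are in place.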
Fancy regex mode
Fancy regex is implemented via a Rust Polars expression plugin under:
rust/fancy_regex_expr_plugin/
The Docker build compiles this plugin automatically.
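As background, the regex engine built into Polars is based on the Rust `regex` crate, which does not support lookaround or backreferences; a fancy regex mode is presumably what makes such patterns available in searches. The sketch below uses Python's standard `re` module purely to illustrate what those two features match. It does not call the plugin, and the sample sentences and the `matches` helper are made up for illustration.

```python
import re

# Lookahead: match "bank" only when it is followed by " account".
lookahead = re.compile(r"\bbank(?= account)")

# Backreference: match an immediately repeated word such as "is is".
repeated = re.compile(r"\b(\w+) \1\b")


def matches(pattern, sentence):
    """Return every substring of `sentence` matched by `pattern`."""
    return [m.group(0) for m in pattern.finditer(sentence)]


if __name__ == "__main__":
    print(matches(lookahead, "a bank account near the river bank"))
    print(matches(repeated, "it is is fine"))
```

Only the first "bank" is returned by the lookahead pattern, since the second is not followed by " account".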
Notes
- search_history.log is created at runtime and appended to on each search.
- Download exports are available as HTML, DOCX, and XLSX.
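The append-on-each-search behaviour of search_history.log can be pictured with a short sketch. The line format here (UTC timestamp, tab, query) and the function name `log_search` are assumptions for illustration only; the app's actual log format is not documented in this README.

```python
from datetime import datetime, timezone
from pathlib import Path


def log_search(query, log_path="search_history.log"):
    """Append one line per search; the file is created on first use.

    Hypothetical format: ISO 8601 UTC timestamp, a tab, then the query.
    """
    line = f"{datetime.now(timezone.utc).isoformat()}\t{query}\n"
    # Mode "a" creates the file if it does not exist and never truncates it.
    with Path(log_path).open("a", encoding="utf-8") as fh:
        fh.write(line)
    return line
```

Opening in append mode means earlier entries persist for as long as the file itself does.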