Ruben Tsui
---
title: Bilingual Concordancer
emoji: 🔎
colorFrom: blue
colorTo: gray
sdk: docker
app_port: 7860
pinned: false
---
# Bilingual Concordancer (Streamlit)
This Space serves a bilingual concordancer with word alignment highlighting.
## Required corpus files
Place these Parquet files in the Space root (same folder as `stwebm_parquet_wa.py`):
- `UNPCwa.parquet`
- `NTURegswa.parquet`
- `SATwa.parquet`
- `VOAwa.parquet`
The app expects these exact filenames.
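Before starting the app, it can help to confirm that all four files are in place. The following is a minimal sketch, not part of the app itself: the helper name `missing_corpora` and the constant `REQUIRED_CORPORA` are hypothetical, and it assumes the check is run from the Space root.

```python
from pathlib import Path

# Filenames exactly as listed in this README.
REQUIRED_CORPORA = (
    "UNPCwa.parquet",
    "NTURegswa.parquet",
    "SATwa.parquet",
    "VOAwa.parquet",
)


def missing_corpora(root: Path) -> list[str]:
    """Return the required corpus filenames that are not present in `root`."""
    return [name for name in REQUIRED_CORPORA if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumption: the current working directory is the Space root.
    missing = missing_corpora(Path.cwd())
    if missing:
        print("Missing corpus files:", ", ".join(missing))
    else:
        print("All corpus files found.")
```

The comparison is by exact filename, so a file named `unpcwa.parquet` would still be reported as missing on a case-sensitive filesystem.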
## Fancy regex mode
`Fancy regex` mode is implemented as a Rust Polars expression plugin in `rust/fancy_regex_expr_plugin/`.
The Docker build compiles this plugin automatically.
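This README does not list which constructs the mode enables. Assuming the plugin wraps the Rust `fancy-regex` crate (the directory name suggests this, but it is not confirmed here), the additions over the standard Rust `regex` engine would be lookaround and backreferences. The sketch below uses Python's built-in `re` module, which also supports both, purely to illustrate the kind of pattern involved; it does not call the plugin, and the variable names are hypothetical.

```python
import re

# Backreference: the same word repeated twice in a row, e.g. "the the".
repeated_word = re.compile(r"\b(\w+)\s+\1\b")

# Negative lookahead: "data" only when it is not followed by "base".
data_not_base = re.compile(r"\bdata(?!base)")

for text in ("this is the the example", "database schema", "data set"):
    print(
        repr(text),
        "repeated word:", bool(repeated_word.search(text)),
        "data (not database):", bool(data_not_base.search(text)),
    )
```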
## Notes
- `search_history.log` is created at runtime and appended to on each search.
- Exports are available for download as HTML, DOCX, and XLSX.
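
The create-then-append behaviour of `search_history.log` can be sketched as follows. This is an illustration only: the function `log_search` and the tab-separated line format (timestamp, corpus, query) are assumptions, since this README does not document what the app actually writes.

```python
from datetime import datetime, timezone
from pathlib import Path

LOG_PATH = Path("search_history.log")  # filename taken from this README


def log_search(query: str, corpus: str, log_path: Path = LOG_PATH) -> None:
    """Append one line per search; the file is created if it does not exist."""
    timestamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    # Hypothetical line format: timestamp, corpus, query (tab-separated).
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(f"{timestamp}\t{corpus}\t{query}\n")
```

Opening the file in `"a"` mode covers both cases in the note above: the first search creates the file, and later searches add to the end without overwriting earlier entries.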