Ruben Tsui
metadata
title: Bilingual Concordancer
emoji: 🔎
colorFrom: blue
colorTo: gray
sdk: docker
app_port: 7860
pinned: false

Bilingual Concordancer (Streamlit)

This Space serves a Streamlit-based bilingual concordancer with word-alignment highlighting. It runs as a Docker Space and listens on port 7860.

Required corpus files

Place these Parquet files in the Space root (same folder as stwebm_parquet_wa.py):

  • UNPCwa.parquet
  • NTURegswa.parquet
  • SATwa.parquet
  • VOAwa.parquet

The app expects these exact filenames.
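
A quick pre-flight check can catch a missing or misnamed corpus file before the app starts. The following is a minimal sketch and not part of the app: the helper name `missing_corpus_files` is hypothetical, and it assumes the corpus files sit next to stwebm_parquet_wa.py as described above.

```python
from pathlib import Path

# Filenames listed in this README; the app expects them verbatim.
REQUIRED_CORPORA = (
    "UNPCwa.parquet",
    "NTURegswa.parquet",
    "SATwa.parquet",
    "VOAwa.parquet",
)


def missing_corpus_files(root="."):
    """Return the required Parquet filenames that are absent from `root`.

    Hypothetical helper for illustration; `root` should be the Space root.
    """
    root_path = Path(root)
    return [name for name in REQUIRED_CORPORA if not (root_path / name).is_file()]


if __name__ == "__main__":
    missing = missing_corpus_files()
    if missing:
        print("Missing corpus files:", ", ".join(missing))
    else:
        print("All corpus files found.")
```

Running the script from the Space root lists any file that still needs to be uploaded.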

Fancy regex mode

Fancy regex mode is implemented as a Polars expression plugin written in Rust, located under:

  • rust/fancy_regex_expr_plugin/

The Docker build compiles this plugin automatically.
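
For context, "fancy" patterns usually means features such as lookaround and backreferences, which the Rust regex crate used by Polars' built-in string matching does not support. The sketch below illustrates those two features with Python's standard re module as a stand-in. It does not call the plugin, whose interface is not documented here; it assumes the plugin accepts patterns of this kind, and the sample sentences are invented.

```python
import re

# Invented sample lines, standing in for one text column of a corpus.
lines = [
    "The committee committee adopted the resolution.",
    "The Security Council adopted the resolution.",
    "The council adopted the draft.",
]

# Backreference: a word immediately repeated (e.g. "committee committee").
repeated_word = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)

# Negative lookbehind: "council" not preceded by "Security ".
bare_council = re.compile(r"(?<!Security )\bcouncil\b", re.IGNORECASE)

repeated_hits = [s for s in lines if repeated_word.search(s)]
council_hits = [s for s in lines if bare_council.search(s)]

print(repeated_hits)
print(council_hits)
```

Here the first pattern selects only the sentence with the doubled word, and the second selects only the sentence where "council" appears without "Security" in front of it.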

Notes

  • search_history.log is created at runtime and appended to on each search.
  • Search results can be downloaded as HTML, DOCX, or XLSX.
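
The logging behaviour in the first note can be pictured as an append-mode write. This is a sketch under assumptions: the function name `log_search`, the tab-separated line format, and the logged fields are all hypothetical, since the app's actual log format is not documented here.

```python
from datetime import datetime, timezone
from pathlib import Path


def log_search(query, corpus, log_path="search_history.log"):
    """Append one line describing a search (hypothetical format)."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    # Mode "a" creates the file on first use and appends afterwards,
    # so earlier entries are never overwritten.
    with Path(log_path).open("a", encoding="utf-8") as fh:
        fh.write(f"{stamp}\t{corpus}\t{query}\n")
```

Because the file only grows, it may be worth rotating or clearing it periodically on a long-running Space.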