metadata
title: Grounding Snippet Generator
emoji: 🚀
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
- streamlit
pinned: false
short_description: Streamlit template space
title: Snippet Generator
emoji: ✂️
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.28.0
app_file: app.py
pinned: false
license: mit
✂️ Snippet Generator
Recreates Google Vertex AI / Gemini grounding-style extractive snippets.
How it works
- Sentence Segmentation - Splits document on sentence boundaries and newlines, filters noise (URLs, questions, low-alpha content)
- Cross-Encoder Scoring - Uses cross-encoder/ms-marco-electra-base to score query-sentence relevance
- Budget Selection - Picks top-scoring sentences within character/count limits
- Document-Order Stitching - Reassembles in original order with ... for gaps
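The four steps above can be sketched in plain Python. This is a minimal illustration, not the app's actual code: the function names, the 0.6 alphabetic-ratio threshold, the default limits, and the word-overlap `overlap_scorer` (a stand-in for the cross-encoder so the sketch runs without model downloads) are all assumptions.

```python
import re
from typing import Callable, List, Sequence


def split_sentences(text: str) -> List[str]:
    """Split on newlines and on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+|\n+", text)
    return [p.strip() for p in parts if p.strip()]


def is_noise(sentence: str, min_alpha_ratio: float = 0.6) -> bool:
    """Flag URLs, questions, and low-alpha content (threshold is an assumption)."""
    if re.search(r"https?://|www\.", sentence):
        return True
    if sentence.endswith("?"):
        return True
    alpha = sum(ch.isalpha() for ch in sentence)
    return alpha / len(sentence) < min_alpha_ratio


def select_within_budget(candidates: Sequence[int], scores: Sequence[float],
                         sentences: Sequence[str], max_chars: int,
                         max_sentences: int) -> List[int]:
    """Greedily take the highest-scoring sentences that still fit the limits."""
    ranked = sorted(zip(scores, candidates), key=lambda p: p[0], reverse=True)
    chosen, used = [], 0
    for _, idx in ranked:
        if len(chosen) >= max_sentences:
            break
        if used + len(sentences[idx]) > max_chars:
            continue  # too long for the remaining budget, try the next one
        chosen.append(idx)
        used += len(sentences[idx])
    return sorted(chosen)  # back to document order


def stitch(sentences: Sequence[str], chosen: Sequence[int]) -> str:
    """Join chosen sentences, inserting '...' where sentences were skipped."""
    out, prev = [], None
    for idx in chosen:
        if prev is not None and idx != prev + 1:
            out.append("...")
        out.append(sentences[idx])
        prev = idx
    return " ".join(out)


def overlap_scorer(query: str, sentences: Sequence[str]) -> List[float]:
    """Stand-in scorer: counts query words shared with each sentence."""
    q = set(re.findall(r"\w+", query.lower()))
    return [float(len(q & set(re.findall(r"\w+", s.lower())))) for s in sentences]


def generate_snippet(query: str, document: str,
                     scorer: Callable[[str, Sequence[str]], Sequence[float]],
                     max_chars: int = 300, max_sentences: int = 3) -> str:
    sentences = split_sentences(document)
    candidates = [i for i, s in enumerate(sentences) if not is_noise(s)]
    scores = scorer(query, [sentences[i] for i in candidates])
    picked = select_within_budget(candidates, scores, sentences,
                                  max_chars, max_sentences)
    return stitch(sentences, picked)
```

This sketch only marks gaps between selected sentences; it does not add a leading or trailing ... when text before or after the snippet is dropped.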
Model
Uses an MS MARCO cross-encoder trained on search relevance data (query-passage ranking), a task closely related to snippet generation.
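As a rough sketch of how query-sentence scoring with this model might look, assuming the sentence-transformers package is installed: `load_cross_encoder` and `score_sentences` are hypothetical helper names, not taken from the app. The import is deferred so the scoring helper can be tried with any object that exposes a `predict(pairs)` method.

```python
from typing import Any, List, Sequence


def load_cross_encoder(name: str = "cross-encoder/ms-marco-electra-base") -> Any:
    """Load the cross-encoder; downloads weights on first use."""
    from sentence_transformers import CrossEncoder  # deferred, optional dependency
    return CrossEncoder(name)


def score_sentences(model: Any, query: str, sentences: Sequence[str]) -> List[float]:
    """Score each (query, sentence) pair; a higher score means more relevant."""
    if not sentences:
        return []
    pairs = [(query, s) for s in sentences]
    return [float(x) for x in model.predict(pairs)]
```

Typical use would be `scores = score_sentences(load_cross_encoder(), query, sentences)`. Because a cross-encoder reads the query and the sentence together, each pair needs its own forward pass, so scoring cost grows with the number of candidate sentences.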
Usage
- Enter a search query
- Paste document content
- Adjust settings (max chars, max sentences)
- Click "Generate Snippet"
Example
Query: best prostate cancer treatment in the world
Output:
This makes Asklepios one of the best prostate cancer treatment centers for foreigners. ... Spain is a leader in prostate cancer treatment, with top clinics like Centro Médico Teknon, Quironsalud Madrid, and Hospital Quiron Barcelona. ...