---
title: Ask GC Library Guides
emoji: ๐
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: 1.44.0
app_file: app.py
pinned: false
hf_oauth: false
---
# Ask GC Library Guides (RAG Demo)

This Space demonstrates a Retrieval-Augmented Generation (RAG) application built with Streamlit. It allows users to ask questions about the CUNY Graduate Center library guides.
**How it works:**

1. **Data Source:** Pre-computed embeddings (`BAAI/bge-m3`), documents, and metadata are loaded from the Hugging Face Dataset `Zwounds/Libguides_Embeddings` (originally sourced from `extracted_content.jsonl`).
2. **Database Initialization:** On startup, the application downloads the dataset and loads it into an in-memory ChromaDB collection stored in a temporary directory. This avoids slow re-embedding on every startup.
3. **Query Processing:**
   * User queries are optionally expanded using the generation model (`google/gemma-3-27b-it` via the HF Inference API).
   * Queries are embedded with a local copy of the `BAAI/bge-m3` model loaded into the Space.
   * ChromaDB performs a similarity search with the query embedding against the pre-computed document embeddings.
4. **Generation:** The retrieved chunks and the original query are passed to the `google/gemma-3-27b-it` model via the Hugging Face Inference API to generate the final answer.
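The retrieval and prompt-assembly steps (3 and 4) can be sketched in plain Python. This is a toy illustration, not the app's actual code: the cosine-similarity search stands in for ChromaDB's nearest-neighbour query, and the 3-dimensional embedding vectors are made-up values rather than real `BAAI/bge-m3` output:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_emb, doc_embs, docs, k=2):
    # Rank document chunks by similarity to the query embedding
    # (what the vector database does internally) and keep the top k.
    ranked = sorted(zip(docs, doc_embs),
                    key=lambda pair: cosine(query_emb, pair[1]),
                    reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(query, chunks):
    # Concatenate the retrieved chunks as context for the generation model.
    context = "\n\n".join(chunks)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Toy corpus with 3-dimensional embeddings (bge-m3 actually produces 1024 dims).
docs = ["Interlibrary loan guide", "Citation styles guide", "Room booking guide"]
embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
top = retrieve([0.9, 0.1, 0.0], embs, docs, k=1)
print(top)  # ['Interlibrary loan guide']
```

In the real app the prompt is sent to the generation model over the Inference API; here the pipeline stops at the assembled prompt string.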
**Configuration:**

* **Embedding:** Pre-computed `BAAI/bge-m3` embeddings loaded from the HF Dataset `Zwounds/Libguides_Embeddings`; query embedding uses the local `BAAI/bge-m3` model.
* **Generation Model:** `google/gemma-3-27b-it` (via the HF Inference API).
* **Required Secret:** A Hugging Face User Access Token must be added as a Space Secret named `HF_TOKEN`.
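Space Secrets are exposed to the running app as environment variables, so the token lookup is a one-liner. A minimal sketch, assuming the secret is named `HF_TOKEN` as above (the helper name and error message are illustrative, not the app's actual code):

```python
import os

def get_hf_token():
    # Space Secrets appear in os.environ at runtime.
    token = os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; add it as a Space Secret so the app "
            "can call the Hugging Face Inference API."
        )
    return token
```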
**Note:** Startup involves downloading the dataset and loading it into the ChromaDB collection, which is much faster than re-embedding all documents.