---
title: Related Work Finder
emoji: 📚
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
license: cc-by-nc-4.0
---

# Related Work Finder

**Live Demo**: [Related Work Finder](https://huggingface.co/spaces/upk/Related-Work-Finder)

A search engine designed to find relevant papers from the ACL Anthology.

## Current Features

- **Keyword Search**: Full-text search using SQLite's FTS5 engine, with support for boolean logic and special characters.
- **Abstract Search**: Uses the KeyBERT NLP transformer to extract keywords from user-provided abstracts and runs a targeted search.
- **Dataset Parsing**: Automatically downloads the `anthology+abstracts.bib.gz` dataset from ACL servers on startup and parses the BibTeX data (filtered for years 2022–2026).
- **Web Interface**: A responsive UI displaying paper titles, authors, venues, years, and abstracts. Results can be sorted by relevance or year.
- **Deployment**: Includes a `pytest` suite running via GitHub Actions, and a `Dockerfile` configured for deployment to Hugging Face Spaces.

## Tech Stack

- **Frontend**: HTML5, CSS3, JavaScript
- **Backend**: Python 3.10, FastAPI, Uvicorn
- **Database**: SQLite (FTS5 Virtual Tables)
- **NLP**: KeyBERT

## Local Setup

1. Create a virtual environment and install dependencies:
   ```bash
   python -m venv venv
   source venv/bin/activate
   pip install -r requirements.txt
   ```

2. Start the backend server:
   ```bash
   cd backend
   uvicorn main:app --reload
   ```

3. Open your web browser and navigate to:
   `http://127.0.0.1:8000/frontend/index.html`

## Planned for Phase 2

- **Fuzzy Searching**: Implement robust typo-tolerance and spelling correction (e.g. using `pyspellchecker` or trigram matching) to handle slight misspellings in search queries.
- **Semantic Scholar Integration**: Pull citation counts and sort papers by impact/citations instead of just basic relevance.
- **Advanced PDF Parsing**: Extract text and insights directly from full paper PDFs or images instead of just using abstracts.
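
## Example: Keyword Search Sketch

The Keyword Search feature listed under Current Features relies on an SQLite FTS5 virtual table. The snippet below is a minimal sketch of that idea, not the project's actual code: the table name `papers`, its columns, the sample rows, and the helpers `build_index`, `quote_term`, and `search` are all hypothetical. It assumes the SQLite build bundled with your Python has FTS5 enabled, which is common but not guaranteed.

```python
import sqlite3

# Hypothetical sample rows (title, abstract, year); the real app would load
# these from the parsed ACL Anthology BibTeX data.
PAPERS = [
    ("Low-Resource Machine Translation with Adapters",
     "We study adapter tuning for machine translation in low-resource settings.",
     2023),
    ("Prompting Large Language Models for Summarization",
     "We evaluate prompting strategies for abstractive summarization.",
     2024),
    ("A Survey of Neural Machine Translation",
     "This survey reviews neural approaches to translation.",
     2022),
]


def build_index(rows):
    """Create an in-memory FTS5 table and load the rows into it."""
    conn = sqlite3.connect(":memory:")
    # year is stored but UNINDEXED, so it is not tokenized for full-text search.
    conn.execute(
        "CREATE VIRTUAL TABLE papers USING fts5(title, abstract, year UNINDEXED)"
    )
    conn.executemany(
        "INSERT INTO papers (title, abstract, year) VALUES (?, ?, ?)", rows
    )
    return conn


def quote_term(term):
    """Wrap a term in double quotes so FTS5 treats it as a string literal.

    Without quoting, characters such as '-' are parsed as query syntax.
    Embedded double quotes are escaped by doubling them.
    """
    return '"' + term.replace('"', '""') + '"'


def search(conn, terms, sort="relevance"):
    """Return (title, year) rows matching all terms."""
    query = " AND ".join(quote_term(t) for t in terms)
    # bm25() returns lower scores for better matches, so ascending order
    # puts the most relevant rows first. The ORDER BY clause comes from a
    # fixed set of strings, never from user input.
    order = "bm25(papers)" if sort == "relevance" else "year DESC"
    sql = f"SELECT title, year FROM papers WHERE papers MATCH ? ORDER BY {order}"
    return conn.execute(sql, (query,)).fetchall()


if __name__ == "__main__":
    conn = build_index(PAPERS)
    for title, year in search(conn, ["machine", "translation"], sort="year"):
        print(year, title)
```

Quoting each term before joining with `AND` is one way to accept input such as `low-resource` without triggering an FTS5 syntax error. A real query parser would likely also need to pass through user-supplied `OR` and `NOT` operators.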