Spaces: fortuala / CitingLLM

fortuala committed on Oct 27, 2024

Commit 8a0115a (verified) · 1 Parent(s): f4dc45e

Update README.md

Files changed (1)

README.md +28 -56

@@ -11,86 +11,58 @@ license: mit
 short_description: Helps you match bunches of text with your reference notes
 ---
 # Research Notes Matcher
-## Overview
-The **Research Notes Matcher** is a web application designed to help users find relevant research notes based on a given text input. By leveraging text similarity algorithms, this application allows users to upload a CSV file containing research notes and retrieve the top five notes that most closely match their input.
 ## Features
-- **File Upload:** Users can upload a CSV file containing their research notes.
-- **Text Input:** Users can enter free text that describes their query or topic of interest.
-- **Top 5 Matching Entries:** The application outputs the five most relevant notes, along with their sources and sections, based on text similarity.
 ## Requirements
-To run this application, you need the following Python libraries:
-- `gradio`: For creating the web interface.
-- `pandas`: For data manipulation and handling CSV files.
-- `scikit-learn`: For text processing and calculating similarity.
-You can install these libraries using pip:
 ```bash
-pip install gradio pandas scikit-learn
 ```
-## Input File Format
-The input file must be a CSV file containing the following columns:
-- **Source**: A string representing the source of the research note (e.g., author names, book title, etc.).
-- **Section**: A string representing the section or chapter title related to the note.
-- **Notes**: A string containing the actual content of the research note.
-### Example of CSV Structure
-```plaintext
-Source,Section,Notes
-"Author Name, Book Title","Chapter 1: Introduction","This is the content of the first note..."
-"Another Author, Another Book","Chapter 2: Background","This note discusses background information..."
-```
-## How to Use
-1. **Upload the CSV file**: Click on the upload button and select a CSV file containing your research notes.
-2. **Enter your text**: In the provided text box, type the content or query related to the research topic you are interested in.
-3. **Submit**: Click the "Submit" button to process your input.
-4. **View Results**: The application will display the top five matching entries based on cosine similarity, formatted for easy reading.
-### Output Format
-The output will consist of the top five matching notes presented in a readable format, which includes:
-- **Notes**: The content of the matching note.
-- **Source**: The source from which the note is taken.
-- **Section**: The section or chapter related to the note.
-Each entry will be separated by a line for clarity, as shown below:
-```
-**Notes:** This is the content of the matching note...
-**Source:** Author Name, Book Title
-**Section:** Chapter 1: Introduction
--------------------------------------
 ```
-## Technical Explanation
-The application works by:
-1. **Uploading and Reading the CSV File**: The user uploads a CSV file, which is read into a DataFrame using `pandas`.
-2. **Data Validation**: The application checks that the necessary columns ('Source', 'Section', 'Notes') are present and handles any missing values by replacing them with empty strings.
-3. **Text Processing**: The notes and sections are combined into a single text column, which is then vectorized using `TfidfVectorizer` to create a matrix of TF-IDF features.
-4. **Cosine Similarity Calculation**: The application calculates the cosine similarity between the user’s input and the notes using the vectorized representations. This identifies the five notes most similar to the input text.
-5. **Formatting the Output**: The results are formatted in a user-friendly manner, making it easy for the user to read and understand the top matching entries.
-## Conclusion
-The Research Notes Matcher is a powerful tool for quickly finding relevant research notes based on a user's input. It simplifies the process of sifting through large amounts of information, making it easier to find insights and connections in research literature.
----
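
The steps listed in the removed Technical Explanation (read the CSV into a DataFrame, validate the columns, combine section and note text, vectorize with `TfidfVectorizer`, rank by cosine similarity) can be sketched roughly as below. This is a minimal sketch and not the Space's actual `app.py`: the names `top_matches_tfidf` and `REQUIRED_COLUMNS` and the added `Score` column are hypothetical and chosen only for illustration.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Column names taken from the README; the constant name itself is hypothetical.
REQUIRED_COLUMNS = ["Source", "Section", "Notes"]


def top_matches_tfidf(notes: pd.DataFrame, query: str, k: int = 5) -> pd.DataFrame:
    """Return the k rows of `notes` most similar to `query` (hypothetical helper)."""
    missing = [c for c in REQUIRED_COLUMNS if c not in notes.columns]
    if missing:
        raise ValueError(f"CSV is missing required columns: {missing}")

    # Replace missing values with empty strings, as the README describes.
    notes = notes[REQUIRED_COLUMNS].fillna("").astype(str)

    # Combine section and note text into one field per row.
    combined = notes["Section"] + " " + notes["Notes"]

    # Fit TF-IDF on the notes, then project the query into the same space.
    vectorizer = TfidfVectorizer()
    note_matrix = vectorizer.fit_transform(combined)
    query_vector = vectorizer.transform([query])

    # Cosine similarity between the query and every note.
    scores = cosine_similarity(query_vector, note_matrix).ravel()

    # Positions of the k highest scores, best first.
    top_idx = scores.argsort()[::-1][:k]
    result = notes.iloc[top_idx].copy()
    result["Score"] = scores[top_idx]
    return result
```

With a file on disk, a call might look like `top_matches_tfidf(pd.read_csv("notes.csv"), "my query")`, where `notes.csv` is a placeholder file name.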

 short_description: Helps you match bunches of text with your reference notes
 ---
 # Research Notes Matcher
+This application allows you to find the top 5 matching research notes based on your input text. The tool uses a pre-trained language model from Hugging Face's Sentence Transformers to compute semantic similarity between the notes and the user input.
 ## Features
+- **Upload CSV**: Upload a CSV file containing research notes.
+- **Text Input**: Enter your text to find the most relevant notes.
+- **Semantic Matching**: The application uses a Sentence Transformer to provide more meaningful matches than keyword-based methods such as TF-IDF.
 ## Requirements
+Make sure to install the following packages:
 ```bash
+pip install gradio pandas sentence-transformers scikit-learn
 ```
+## Usage
+1. Run the application.
+2. Upload a CSV file with the columns **Source**, **Section**, and **Notes**.
+3. Type your content in the provided textbox.
+4. Click the submit button to see the top 5 matching entries.
+## Sample CSV Format
+Your CSV file should have the following columns:
+| Source  | Section  | Notes  |
+|---------|----------|--------|
+| Source1 | Section1 | Note1  |
+| Source2 | Section2 | Note2  |
+## Launching the Application
+To run the application, execute the following command in your terminal:
+```bash
+python app.py
 ```
+Replace `app.py` with the name of your Python file if it's different.
+## License
+This project is licensed under the MIT License.
+## Acknowledgements
+- [Gradio](https://gradio.app/) for creating the user interface.
+- [Hugging Face](https://huggingface.co/sentence-transformers) for providing Sentence Transformers.
+----
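
The semantic matching described in the updated README could look roughly like the sketch below. The README does not say which model or ranking code the Space uses, so everything here is an assumption for illustration: the model name `all-MiniLM-L6-v2`, the helper names `top_k_by_cosine` and `match_notes`, and the choice to compute cosine similarity with NumPy rather than scikit-learn.

```python
import numpy as np


def top_k_by_cosine(query_vec, note_vecs, k=5):
    """Return (indices, scores) of the k rows of note_vecs closest to query_vec."""
    query_vec = np.asarray(query_vec, dtype=float)
    note_vecs = np.asarray(note_vecs, dtype=float)
    # Normalise so the dot product equals cosine similarity; guard against zero vectors.
    q = query_vec / max(np.linalg.norm(query_vec), 1e-12)
    norms = np.linalg.norm(note_vecs, axis=1, keepdims=True)
    n = note_vecs / np.maximum(norms, 1e-12)
    scores = n @ q
    # Highest similarity first.
    order = np.argsort(-scores)[:k]
    return order, scores[order]


def match_notes(notes, query, k=5, model_name="all-MiniLM-L6-v2"):
    """Embed notes and query with a Sentence Transformer, then rank by cosine similarity."""
    # Imported lazily so the ranking helper works without the model installed.
    from sentence_transformers import SentenceTransformer

    notes = list(notes)
    model = SentenceTransformer(model_name)  # model choice is an assumption
    note_vecs = model.encode(notes)
    query_vec = model.encode([query])[0]
    order, scores = top_k_by_cosine(query_vec, note_vecs, k)
    return [(notes[i], float(s)) for i, s in zip(order, scores)]
```

Keeping the ranking step separate from the embedding step means the ranking can be checked with small hand-written vectors, while `match_notes` downloads the model on first use.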