| title: Dish Decode 2 | |
| emoji: π | |
| colorFrom: purple | |
| colorTo: pink | |
| sdk: docker | |
| pinned: false | |
| license: mit | |
| short_description: Structure recipe information from videos | |
| # 🍽️ Recipe Extraction API | |
| This project is a Flask-based API that extracts structured recipe information from cooking tutorial videos! It uses the **Deepgram API** for audio transcription, **Tesseract OCR** for text extraction from video frames, and the **Gemini API** to generate a well-structured recipe document. | |
| --- | |
| ## 📦 Project Setup | |
| Follow these steps to set up and run the project on your local machine. | |
| ### 1️⃣ Clone the Repository | |
| ```bash | |
| git clone <your-repo-url> | |
| cd <your-repo-folder> | |
| ``` | |
| ### 2️⃣ Install Dependencies | |
| Make sure you have Python installed (Python 3.8 or above is recommended). Install the required libraries using pip: | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| ### 3️⃣ Install Tesseract OCR | |
| Ensure **Tesseract OCR** is installed on your system. You can download it here: [Tesseract GitHub](https://github.com/tesseract-ocr/tesseract) | |
| Add Tesseract to your system path and make sure to note its installation location. | |
| #### On Windows: | |
| Add the folder that contains `tesseract.exe` to your `PATH` environment variable, e.g.: | |
| ```plaintext | |
| C:\Program Files\Tesseract-OCR | |
| ``` | |
| #### On macOS (using Homebrew): | |
| ```bash | |
| brew install tesseract | |
| ``` | |
| #### On Ubuntu: | |
| ```bash | |
| sudo apt-get install tesseract-ocr | |
| ``` | |
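To confirm that the install step worked, a quick check that the `tesseract` executable is reachable on the system path can help. The snippet below is a minimal sketch using only the standard library; the helper name `find_missing_tools` is hypothetical and not part of this project.

```python
import shutil


def find_missing_tools(tools=("tesseract",), which=shutil.which):
    """Return the names in `tools` that cannot be found on the system PATH.

    `which` is injectable so the check can be exercised without the tools installed.
    """
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Tesseract is available on PATH.")
```

The same helper can be called with `("tesseract", "ffmpeg")` once both tools are installed.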
| ### 4️⃣ Set Up Environment Variables | |
| Create a `.env` file in the root directory and add your API keys: | |
| ```plaintext | |
| FIRST_API_KEY=<Your Gemini API Key> | |
| SECOND_API_KEY=<Your Deepgram API Key> | |
| ``` | |
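As a rough illustration of how these two variables might be read at startup, the sketch below validates that both keys are present before the server does any work. It assumes the `.env` values have already been loaded into the process environment (for example with a loader such as `python-dotenv`); the function name `load_api_keys` and the returned dictionary keys are hypothetical, not taken from the project's code.

```python
import os

# Variable names come from the .env example above.
REQUIRED_KEYS = ("FIRST_API_KEY", "SECOND_API_KEY")


def load_api_keys(env=None):
    """Read both API keys from a mapping (defaults to os.environ).

    Raises RuntimeError listing every missing or empty variable.
    """
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Hypothetical labels: FIRST_API_KEY holds the Gemini key, SECOND_API_KEY the Deepgram key.
    return {"gemini": env["FIRST_API_KEY"], "deepgram": env["SECOND_API_KEY"]}
```

Failing early with a clear message tends to be easier to debug than an authentication error raised later, in the middle of processing a video.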
| ### 5️⃣ Install FFmpeg | |
| This project uses **FFmpeg** for converting MP4 videos to WAV audio. Install it via the following: | |
| #### On macOS (using Homebrew): | |
| ```bash | |
| brew install ffmpeg | |
| ``` | |
| #### On Ubuntu: | |
| ```bash | |
| sudo apt-get install ffmpeg | |
| ``` | |
| #### On Windows: | |
| Download FFmpeg from [FFmpeg.org](https://ffmpeg.org/download.html) and add it to your system path. | |
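For reference, the MP4-to-WAV step can be sketched as a small wrapper around the `ffmpeg` command line. This is an illustration under assumptions, not the project's actual conversion code: the function names are hypothetical, and the 16 kHz mono PCM output is a common choice for speech transcription rather than a setting confirmed by this README.

```python
import subprocess


def build_ffmpeg_command(video_path, audio_path, sample_rate=16000):
    """Build the ffmpeg argument list for extracting WAV audio from a video."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", video_path,         # input video
        "-vn",                    # drop the video stream
        "-acodec", "pcm_s16le",   # 16-bit PCM, the usual WAV encoding
        "-ar", str(sample_rate),  # sample rate in Hz (assumed value)
        "-ac", "1",               # mono (assumed value)
        audio_path,
    ]


def convert_to_wav(video_path, audio_path):
    """Run ffmpeg and raise CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(video_path, audio_path), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.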
| --- | |
| ## Running the Project | |
| Start the Flask server with the following command: | |
| ```bash | |
| python app.py | |
| ``` | |
| If everything is set up correctly, you should see: | |
| ```plaintext | |
| * Running on http://127.0.0.1:5000/ | |
| ``` | |
| --- | |
| ## 📡 API Endpoints | |
| ### Health Check | |
| **Endpoint:** `GET /` | |
| Check if the API is running. | |
| ```bash | |
| curl http://127.0.0.1:5000/ | |
| ``` | |
| **Response:** | |
| ```json | |
| { | |
| "status": "success", | |
| "message": "API is running successfully!" | |
| } | |
| ``` | |
| ### 🍲 Recipe Extraction | |
| **Endpoint:** `POST /process-video` | |
| #### Request Body: | |
| Send a JSON payload with a video URL: | |
| ```json | |
| { | |
| "videoUrl": "<URL-of-the-cooking-video>" | |
| } | |
| ``` | |
| #### Example Using `curl`: | |
| ```bash | |
| curl -X POST http://127.0.0.1:5000/process-video \ | |
| -H "Content-Type: application/json" \ | |
| -d '{"videoUrl": "https://example.com/video.mp4"}' | |
| ``` | |
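The same request can be made from Python. The sketch below uses only the standard library and mirrors the `curl` call above; the helper names and the generous default timeout are assumptions (video processing may take a while), not part of the API.

```python
import json
import urllib.request


def build_process_video_request(video_url, base_url="http://127.0.0.1:5000"):
    """Build a POST request for /process-video with a JSON body."""
    payload = json.dumps({"videoUrl": video_url}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/process-video",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def process_video(video_url, timeout=600):
    """Send the request to a running server and return the parsed JSON response."""
    request = build_process_video_request(video_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires the Flask server to be running locally.
    print(process_video("https://example.com/video.mp4"))
```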
| #### Sample Response: | |
| ```json | |
| { | |
| "**1. Recipe Name:**": "Beef Wellington", | |
| "**2. Ingredients List:**": "* Fillet of beef\n* Olive oil\n* Salt\n* Pepper", | |
| "**3. Steps for Preparation:**": "1. Sear the beef fillet\n2. Brush with mustard", | |
| "**4. Cooking Techniques Used:**": "* Searing\n* Wrapping", | |
| "**5. Equipment Needed:**": "* Hot pan\n* Blender", | |
| "**6. Nutritional Information:**": "High in protein and fat", | |
| "**7. Serving size:**": "2-4 people", | |
| "**8. Special Notes or Variations:**": "Use horseradish instead of mustard", | |
| "**9. Festive or Thematic Relevance:**": "Christmas alternative to roast turkey" | |
| } | |
| ``` | |
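The keys in the sample response carry Markdown bold markers, numbering, and trailing colons, which makes them awkward to use in client code. One possible way to handle this on the client side is to normalize them into plain identifiers. This is a sketch only; `normalize_key` and `normalize_recipe` are hypothetical helpers, not functions provided by the API.

```python
import re


def normalize_key(key):
    """Turn a key like '**1. Recipe Name:**' into 'recipe_name'."""
    text = key.strip().strip("*").strip()   # remove the surrounding bold markers
    text = re.sub(r"^\d+\.\s*", "", text)   # remove the leading "1. " numbering
    text = text.rstrip(":").strip()         # remove the trailing colon
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def normalize_recipe(raw):
    """Return a copy of the response with normalized keys; values are left unchanged."""
    return {normalize_key(key): value for key, value in raw.items()}


if __name__ == "__main__":
    sample = {"**1. Recipe Name:**": "Beef Wellington", "**7. Serving size:**": "2-4 people"}
    print(normalize_recipe(sample))
```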
| --- | |
| ## 🛠️ Key Features | |
| - **Deepgram API** for accurate audio transcription. | |
| - **Tesseract OCR** for extracting text from video frames. | |
| - **Gemini API** for generating structured recipe information. | |
| - **FFmpeg** for seamless MP4-to-WAV conversion. | |
| - Combines the audio transcript with on-screen text from video frames for more complete results. 🎯 | |
| --- | |
| ## 🧪 Testing | |
| Use tools like **Postman** or **curl** to test the API endpoints. | |
| --- | |
| ## 🤝 Contributions | |
| Contributions are welcome! Feel free to submit a pull request or open an issue for any enhancements or bug fixes. | |
| --- | |
| ## License | |
| This project is licensed under the MIT License. | |
| --- | |
| ### Happy Coding and Bon Appétit! 👨‍🍳👩‍🍳 | |
| Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference | |