
VQA Accessibility Enhancement - Setup Guide

Backend Setup

1. Install Python Dependencies

cd c:\Users\rdeva\Downloads\vqa_coes
pip install -r requirements_api.txt

2. Configure Groq API Key

  1. Get your Groq API key from: https://console.groq.com/keys
  2. Create a .env file in the project root:
    copy .env.example .env
    
  3. Edit .env and add your API key:
    GROQ_API_KEY=your_actual_groq_api_key_here
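
How the backend reads this key is not shown in this guide. As a minimal sketch, assuming the key may come from either the process environment or a simple KEY=VALUE .env file, a lookup could work as follows. The helper names `read_env_file` and `get_groq_api_key` are hypothetical, and the real backend may use a library such as python-dotenv instead.

```python
import os
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_groq_api_key(env_path=".env"):
    """Return the key from the environment, falling back to the .env file."""
    # Hypothetical helper: the environment variable takes precedence over the file.
    key = os.environ.get("GROQ_API_KEY") or read_env_file(env_path).get("GROQ_API_KEY")
    if not key:
        raise ValueError("Groq API key not found")
    return key
```

Calling `get_groq_api_key()` from the project root before starting the server is a quick way to confirm the .env file is being picked up.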
    

3. Start Backend Server

python backend_api.py

The server starts on http://localhost:8000. Leave this terminal open while you use the app.


Frontend Setup

1. Install Node Dependencies

cd ui
npm install

This installs all frontend packages, including expo-speech, which provides the text-to-speech functionality.

2. Start Expo App

npm start

Then:

  • Press a for Android emulator
  • Press i for iOS simulator
  • Scan QR code with Expo Go app for physical device

Testing the Features

Image Display Fix

  1. Open the app
  2. Tap "Camera" or "Gallery" to select an image
  3. Expected: Image should display correctly (no blank screen)

LLM Description Feature

  1. Upload an image
  2. Enter a question (e.g., "What color is the car?")
  3. Tap "Ask Question"
  4. Expected:
    • Original answer appears in the "Answer" card
    • "Accessible Description" card appears below it with a two-sentence description
    • Speaker icon button is visible

Text-to-Speech

  1. Ask a question and wait for the answer and its description to appear
  2. Tap the speaker icon (🔊) in the "Accessible Description" card
  3. Expected: The description is read aloud
  4. Tap the stop icon (⏹️) to stop playback

Troubleshooting

Backend Issues

Groq API Key Error

ValueError: Groq API key not found

Solution: Make sure the .env file exists in the project root and contains GROQ_API_KEY=your_key

Models Not Loading

❌ Base checkpoint not found

Solution: Ensure vqa_checkpoint.pt and vqa_spatial_checkpoint.pt are in the project root
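
To check for the checkpoint files before starting the server, a small script along these lines may help. This is a sketch only: the two file names come from this guide, while the function name `missing_checkpoints` and the default of the current directory as project root are assumptions.

```python
from pathlib import Path

# File names are taken from this guide; the project-root default is an assumption.
REQUIRED_CHECKPOINTS = ("vqa_checkpoint.pt", "vqa_spatial_checkpoint.pt")


def missing_checkpoints(project_root="."):
    """Return the required checkpoint files that are absent from project_root."""
    root = Path(project_root)
    return [name for name in REQUIRED_CHECKPOINTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoints:", ", ".join(missing))
    else:
        print("All checkpoints found.")
```

Run it from the project root; any file it lists needs to be copied or downloaded there before the backend can load the models.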

Frontend Issues

Image Not Displaying

  • Make sure you've run npm install to get the latest expo-image package
  • Check console logs for image URI format issues

Text-to-Speech Not Working

  • Ensure the device volume is turned up and the device is not in silent mode
  • Check that the expo-speech package is installed
  • On the iOS simulator, speech may not work; test on a physical device

Cannot Connect to Backend

  • Verify that the backend is running on port 8000
  • Update ui/src/config/api.js with the correct backend URL
  • For physical devices, localhost refers to the device itself, so use ngrok or your computer's local IP address instead
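
One way to confirm that something is listening on the backend port is a plain TCP connection test. The sketch below uses only the Python standard library. The function name `is_port_open` is hypothetical, and the check only shows that the port accepts connections, not that the VQA API itself is healthy.

```python
import socket


def is_port_open(host="127.0.0.1", port=8000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    status = "reachable" if is_port_open() else "not reachable"
    print(f"Backend on port 8000 is {status}")
```

To test the address a physical device would use, run the script from another machine on the same network and pass your computer's local IP as `host`.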

Features Summary

✅ Fixed: Image display issue (using expo-image instead of react-native Image)
✅ Added: Groq LLM integration for 2-sentence descriptions
✅ Added: Text-to-speech accessibility feature
✅ Added: Visual distinction between raw answer and description
✅ Added: Fallback mode when Groq API is unavailable