VQA Accessibility Enhancement - Setup Guide
Backend Setup
1. Install Python Dependencies
cd c:\Users\rdeva\Downloads\vqa_coes
pip install -r requirements_api.txt
2. Configure Groq API Key
- Get your Groq API key from: https://console.groq.com/keys
- Create a .env file in the project root: copy .env.example .env
- Edit .env and add your API key: GROQ_API_KEY=your_actual_groq_api_key_here
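To confirm the key is picked up before starting the server, a small standalone check can help. The following is a minimal sketch, not the code in backend_api.py (which is not shown here): the helper names `load_env_file` and `get_groq_api_key` are hypothetical, and the parser only handles plain KEY=VALUE lines.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything without a KEY=VALUE shape.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_groq_api_key(path=".env"):
    """Return GROQ_API_KEY from the environment or the .env file, else raise."""
    key = os.environ.get("GROQ_API_KEY") or load_env_file(path).get("GROQ_API_KEY")
    if not key:
        # Same message as the error listed under Troubleshooting.
        raise ValueError("Groq API key not found")
    return key
```

Calling `get_groq_api_key()` from the project root should return the key you pasted into .env; a ValueError means the file is missing, misnamed, or has no GROQ_API_KEY line.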
3. Start Backend Server
python backend_api.py
The server will start on http://localhost:8000
Frontend Setup
1. Install Node Dependencies
cd ui
npm install
This will install the new expo-speech package for text-to-speech functionality.
2. Start Expo App
npm start
Then:
- Press a for Android emulator
- Press i for iOS simulator
- Scan QR code with Expo Go app for physical device
Testing the Features
Image Display Fix
- Open the app
- Tap "Camera" or "Gallery" to select an image
- Expected: Image should display correctly (no blank screen)
LLM Description Feature
- Upload an image
- Enter a question (e.g., "What color is the car?")
- Tap "Ask Question"
- Expected:
  - Original answer appears in the "Answer" card
  - "Accessible Description" card appears below with a 2-sentence description
  - Speaker icon button is visible
Text-to-Speech
- Ask a question and wait for the answer and its description to appear
- Tap the speaker icon in the "Accessible Description" card
- Expected: The description is read aloud
- Tap the stop icon (⏹️) to stop playback
Troubleshooting
Backend Issues
Groq API Key Error
ValueError: Groq API key not found
Solution: Make sure .env file exists with GROQ_API_KEY=your_key
Models Not Loading
Base checkpoint not found
Solution: Ensure vqa_checkpoint.pt and vqa_spatial_checkpoint.pt are in the project root
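A quick way to see which checkpoint is missing is to check the project root directly. This is a minimal sketch: the two file names come from the note above, while the function name `find_missing_checkpoints` is hypothetical and not part of the project.

```python
from pathlib import Path

# File names taken from the troubleshooting note above.
REQUIRED_CHECKPOINTS = ("vqa_checkpoint.pt", "vqa_spatial_checkpoint.pt")


def find_missing_checkpoints(project_root="."):
    """Return the required checkpoint file names that are absent from project_root."""
    root = Path(project_root)
    return [name for name in REQUIRED_CHECKPOINTS if not (root / name).is_file()]
```

Run `find_missing_checkpoints()` from the project root; an empty list means both files are where the backend expects them.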
Frontend Issues
Image Not Displaying
- Make sure you've run npm install to get the latest expo-image package
- Check console logs for image URI format issues
Text-to-Speech Not Working
- Ensure device volume is turned up
- Check that the expo-speech package is installed
- On iOS simulator, speech may not work (test on physical device)
Cannot Connect to Backend
- Verify backend is running on port 8000
- Update ui/src/config/api.js with the correct backend URL
- For physical devices, use ngrok or your computer's local IP
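The first and last checks above can be scripted. The sketch below uses only the Python standard library; the function names are hypothetical, and `guess_local_ip` is a best-effort heuristic that may return the wrong address on machines with several network interfaces.

```python
import socket


def backend_is_reachable(host="localhost", port=8000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def guess_local_ip():
    """Best-effort LAN IP of this machine; falls back to 127.0.0.1."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only selects a route,
        # which reveals the local address the OS would use for outbound traffic.
        probe.connect(("10.255.255.255", 1))
        return probe.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        probe.close()
```

If `backend_is_reachable()` returns True on your computer but the phone still cannot connect, try a URL of the form `http://<local IP>:8000` in ui/src/config/api.js, using the address from `guess_local_ip()`, and make sure both devices are on the same network.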
Features Summary
- Fixed: Image display issue (using expo-image instead of react-native Image)
- Added: Groq LLM integration for 2-sentence descriptions
- Added: Text-to-speech accessibility feature
- Added: Visual distinction between raw answer and description
- Added: Fallback mode when Groq API is unavailable
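This guide does not show how the fallback mode is implemented. One possible shape, given purely as an illustrative sketch, is to wrap the LLM call and return a templated two-sentence description when it fails. Everything here is an assumption: `describe_with_fallback`, the `llm_describe` callable, and the fallback wording are hypothetical and may differ from what the backend actually does.

```python
def describe_with_fallback(question, answer, llm_describe):
    """Return (description, used_fallback).

    llm_describe is a hypothetical callable (question, answer) -> str that
    wraps the Groq request; any exception or empty result triggers the fallback.
    """
    try:
        description = llm_describe(question, answer)
        if description and description.strip():
            return description.strip(), False
    except Exception:
        # Network errors, a missing API key, or rate limits all end up here.
        pass
    # Assumed fallback wording: two plain sentences built from the raw answer.
    fallback = f"The question was '{question}'. The model's answer is '{answer}'."
    return fallback, True
```

Returning the `used_fallback` flag alongside the text would let the UI indicate when the description did not come from the LLM.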