# VQA Accessibility Enhancement - Setup Guide

## Backend Setup

### 1. Install Python Dependencies

```bash
cd c:\Users\rdeva\Downloads\vqa_coes
pip install -r requirements_api.txt
```

### 2. Configure Groq API Key

1. Get your Groq API key from: https://console.groq.com/keys
2. Create a `.env` file in the project root:
   ```bash
   copy .env.example .env
   ```
3. Edit `.env` and add your API key:
   ```
   GROQ_API_KEY=your_actual_groq_api_key_here
   ```

### 3. Start Backend Server

```bash
python backend_api.py
```

The server starts on `http://localhost:8000`.

---

## Frontend Setup

### 1. Install Node Dependencies

```bash
cd ui
npm install
```

This installs the new `expo-speech` package for text-to-speech functionality.

### 2. Start Expo App

```bash
npm start
```

Then:
- Press `a` for the Android emulator
- Press `i` for the iOS simulator
- Scan the QR code with the Expo Go app for a physical device

---

## Testing the Features

### Image Display Fix

1. Open the app
2. Tap "Camera" or "Gallery" to select an image
3. **Expected**: The image displays correctly (no blank screen)

### LLM Description Feature

1. Upload an image
2. Enter a question (e.g., "What color is the car?")
3. Tap "Ask Question"
4. **Expected**:
   - The original answer appears in the "Answer" card
   - The "Accessible Description" card appears below it with a two-sentence description
   - The speaker icon button is visible

### Text-to-Speech

1. Get an answer with a description
2. Tap the speaker icon (🔊) in the "Accessible Description" card
3. **Expected**: The description is read aloud
4. Tap the stop icon (⏹️) to stop playback

---

## Troubleshooting

### Backend Issues

**Groq API Key Error**

```
ValueError: Groq API key not found
```

**Solution**: Make sure the `.env` file exists and contains `GROQ_API_KEY=your_key`.

**Models Not Loading**

```
❌ Base checkpoint not found
```

**Solution**: Ensure `vqa_checkpoint.pt` and `vqa_spatial_checkpoint.pt` are in the project root.

### Frontend Issues

**Image Not Displaying**
- Make sure you've run `npm install` to get the latest `expo-image` package
- Check the console logs for image URI format issues

**Text-to-Speech Not Working**
- Ensure the device volume is turned up
- Check that the `expo-speech` package is installed
- Speech may not work on the iOS simulator (test on a physical device)

**Cannot Connect to Backend**
- Verify the backend is running on port 8000
- Update `ui/src/config/api.js` with the correct backend URL
- For physical devices, use ngrok or your computer's local IP

---

## Features Summary

- ✅ **Fixed**: Image display issue (using `expo-image` instead of the react-native `Image`)
- ✅ **Added**: Groq LLM integration for two-sentence descriptions
- ✅ **Added**: Text-to-speech accessibility feature
- ✅ **Added**: Visual distinction between raw answer and description
- ✅ **Added**: Fallback mode when the Groq API is unavailable
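
The two backend checks from Troubleshooting (is `GROQ_API_KEY` set, and is anything listening on port 8000) can be scripted as a quick preflight. The sketch below is not part of the project: the function names `read_env_file`, `groq_key_configured`, and `backend_reachable` are hypothetical, it uses only the Python standard library, and it assumes a plain `KEY=VALUE` format in `.env`. It only confirms that a TCP connection succeeds; it does not validate the key with Groq or call any backend endpoint.

```python
import os
import socket
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def groq_key_configured(env_path=".env"):
    """Return True if GROQ_API_KEY is set in the environment or the .env file."""
    key = os.environ.get("GROQ_API_KEY") or read_env_file(env_path).get("GROQ_API_KEY", "")
    # The placeholder value from this guide does not count as a real key.
    return bool(key) and key != "your_actual_groq_api_key_here"


def backend_reachable(host="localhost", port=8000, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Groq key configured:", groq_key_configured())
    print("Backend reachable on port 8000:", backend_reachable())
```

Run it from the project root so the relative `.env` path resolves. When testing from another machine on the same network, pass that computer's local IP as `host` instead of `localhost`.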