	# VQA Accessibility Enhancement - Setup Guide

	## Backend Setup

	### 1. Install Python Dependencies
	```bash
	# Example path; replace it with the location of your local copy of the project
	cd c:\Users\rdeva\Downloads\vqa_coes
	pip install -r requirements_api.txt
	```

	### 2. Configure Groq API Key

	1. Get your Groq API key from: https://console.groq.com/keys
	2. Create a `.env` file in the project root:
	```bash
	# Windows command; on macOS or Linux use: cp .env.example .env
	copy .env.example .env
	```
	3. Edit `.env` and add your API key:
	```
	GROQ_API_KEY=your_actual_groq_api_key_here
	```
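
Before starting the server, it can help to confirm that the key is actually in place. The following is a minimal sketch, not the loader used by `backend_api.py` (which this guide does not show): it assumes `.env` contains plain `KEY=VALUE` lines, and the helper names `read_env_file` and `groq_key_is_set` are hypothetical.

```python
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def groq_key_is_set(path=".env"):
    """Return True if GROQ_API_KEY is present and is not the template placeholder."""
    env_path = Path(path)
    if not env_path.is_file():
        return False
    key = read_env_file(env_path).get("GROQ_API_KEY", "")
    return bool(key) and key != "your_actual_groq_api_key_here"


if __name__ == "__main__":
    print("GROQ_API_KEY configured:", groq_key_is_set())
```

Run it from the project root. It only checks that a non-placeholder value exists, not that Groq accepts the key.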

	### 3. Start Backend Server
	```bash
	python backend_api.py
	```

	The server starts at `http://localhost:8000`.
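
To confirm that something is listening on that port without relying on any particular endpoint (this guide does not list the API routes), a plain TCP check is enough. This is a minimal sketch using only the standard library; `backend_is_listening` is a hypothetical helper name.

```python
import socket


def backend_is_listening(host="localhost", port=8000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    print("Backend reachable:", backend_is_listening())
```

A `True` result only shows that the port accepts connections, not that the models have finished loading.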

	---

	## Frontend Setup

	### 1. Install Node Dependencies
	```bash
	cd ui
	npm install
	```

	This will install the new `expo-speech` package for text-to-speech functionality.

	### 2. Start Expo App
	```bash
	npm start
	```

	Then:
	- Press `a` for Android emulator
	- Press `i` for iOS simulator
	- Scan the QR code with the Expo Go app to run on a physical device

	---

	## Testing the Features

	### Image Display Fix
	1. Open the app
	2. Tap "Camera" or "Gallery" to select an image
	3. Expected: The image displays correctly (no blank screen)

	### LLM Description Feature
	1. Upload an image
	2. Enter a question (e.g., "What color is the car?")
	3. Tap "Ask Question"
	4. Expected:
	- Original answer appears in the "Answer" card
	- "Accessible Description" card appears below it with a two-sentence description
	- Speaker icon button is visible

	### Text-to-Speech
	1. Ask a question and wait for the answer and its description to appear
	2. Tap the speaker icon (🔊) in the "Accessible Description" card
	3. Expected: The description is read aloud
	4. Tap the stop icon (⏹️) to stop playback

	---

	## Troubleshooting

	### Backend Issues

	Groq API Key Error
	```
	ValueError: Groq API key not found
	```
	Solution: Make sure a `.env` file exists in the project root and contains `GROQ_API_KEY=your_key`.

	Models Not Loading
	```
	❌ Base checkpoint not found
	```
	Solution: Ensure that `vqa_checkpoint.pt` and `vqa_spatial_checkpoint.pt` are both in the project root.
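
A quick way to see which checkpoint is missing is to check for both file names directly. This is a minimal sketch that assumes it is run from the project root; `missing_checkpoints` is a hypothetical helper, and it only checks that the files exist, not that they load.

```python
from pathlib import Path

# File names taken from the troubleshooting note above.
CHECKPOINTS = ("vqa_checkpoint.pt", "vqa_spatial_checkpoint.pt")


def missing_checkpoints(project_root="."):
    """Return the checkpoint file names that are absent from project_root."""
    root = Path(project_root)
    return [name for name in CHECKPOINTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoint files:", ", ".join(missing))
    else:
        print("All checkpoint files found.")
```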

	### Frontend Issues

	Image Not Displaying
	- Make sure you've run `npm install` to get the latest `expo-image` package
	- Check console logs for image URI format issues

	Text-to-Speech Not Working
	- Ensure device volume is turned up
	- Check that `expo-speech` package is installed
	- On the iOS simulator, speech may not work; test on a physical device instead

	Cannot Connect to Backend
	- Verify backend is running on port 8000
	- Update `ui/src/config/api.js` with correct backend URL
	- For physical devices, `localhost` refers to the device itself, so use ngrok or your computer's local IP address instead
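
If you are unsure of your computer's local IP address, the sketch below makes a best-effort guess and prints a candidate backend URL. It is an illustration only: `local_ip` and `backend_url` are hypothetical helpers, the result can be wrong on machines with several network interfaces or a VPN, and the layout of `ui/src/config/api.js` is not shown in this guide, so the URL still has to be copied in by hand.

```python
import socket


def local_ip():
    """Best-effort guess at this machine's LAN address; falls back to 127.0.0.1."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only selects the
        # outgoing interface, whose address getsockname() then reports.
        sock.connect(("8.8.8.8", 80))
        return sock.getsockname()[0]
    except OSError:
        # No usable route (for example, offline): fall back to loopback.
        return "127.0.0.1"
    finally:
        sock.close()


def backend_url(port=8000):
    """Build the URL a device on the same network would use to reach the backend."""
    return f"http://{local_ip()}:{port}"


if __name__ == "__main__":
    print("Try this backend URL on the device:", backend_url())
```

The phone and the computer need to be on the same network for this address to work. If the script prints `127.0.0.1`, no network route was found and the URL will not work from another device.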

	---

	## Features Summary

	- ✅ Fixed: Image display issue (uses `expo-image` instead of React Native's `Image` component)
	- ✅ Added: Groq LLM integration for two-sentence descriptions
	- ✅ Added: Text-to-speech accessibility feature
	- ✅ Added: Visual distinction between the raw answer and the description
	- ✅ Added: Fallback mode when the Groq API is unavailable