# How to Run the VQA Mobile App
## Quick Overview
You now have a complete React Native mobile app for Visual Question Answering! Here's what was created:
### ✅ What's Built
1. **Backend API** (`backend_api.py`)
   - FastAPI server wrapping your ensemble VQA models
   - Automatic routing between base and spatial models
   - Image upload and question answering endpoints
2. **Mobile App** (`ui/` folder)
   - Beautiful React Native app with Expo
   - Google OAuth authentication
   - Camera and gallery image picker
   - Question input and answer display
   - Model routing visualization
## 🎯 Running the App (4 Steps)
### Step 1: Start the Backend Server
```bash
# Open PowerShell/Terminal
cd c:\Users\rdeva\Downloads\vqa_coes
# Install API dependencies (FIRST TIME ONLY)
# If you get import errors, run this:
pip install fastapi uvicorn python-multipart
# Start the server
python start_backend.py
# Or: python backend_api.py
```
> **Note**: If you get "ModuleNotFoundError", see [IMPORT_ERRORS_FIX.md](file:///c:/Users/rdeva/Downloads/vqa_coes/IMPORT_ERRORS_FIX.md) for solutions.
✅ **Keep this window open!** The server must stay running.
You should see:
```
INITIALIZING ENSEMBLE VQA SYSTEM
✅ Ensemble ready!
```
### Step 2: Configure the Mobile App
1. **Find your local IP address:**
   ```bash
   ipconfig
   ```
   Look for "IPv4 Address" (e.g., `192.168.1.100`)
2. **Update the API URL:**
   - Open: `ui\src\config\api.js`
   - Change line 8:
     ```javascript
     export const API_BASE_URL = 'http://YOUR_IP_HERE:8000';
     ```
   - Example:
     ```javascript
     export const API_BASE_URL = 'http://192.168.1.100:8000';
     ```
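A malformed address in `api.js` is an easy mistake to make at this step. As a minimal sketch, the value can be assembled and sanity-checked before pasting it in. The helper name `buildApiBaseUrl` is hypothetical and not part of the project; port 8000 is taken from the example above.

```javascript
// Hypothetical helper (not part of the project): builds the value for
// API_BASE_URL from an IPv4 address and a port, rejecting malformed input.
function buildApiBaseUrl(ip, port = 8000) {
  const octets = String(ip).trim().split('.');
  const validIp =
    octets.length === 4 &&
    octets.every((o) => /^\d{1,3}$/.test(o) && Number(o) <= 255);
  if (!validIp) {
    throw new Error(`Not a valid IPv4 address: ${ip}`);
  }
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Not a valid port: ${port}`);
  }
  // The phone talks to the backend over the local network using plain http.
  return `http://${octets.join('.')}:${port}`;
}

console.log(buildApiBaseUrl('192.168.1.100')); // http://192.168.1.100:8000
```

Passing the unedited placeholder (`YOUR_IP_HERE`) throws, which makes a forgotten edit obvious.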
### Step 3: Start the Mobile App
```bash
# Open a NEW PowerShell/Terminal window
cd c:\Users\rdeva\Downloads\vqa_coes\ui
# Start Expo
npm start
```
You'll see a QR code in the terminal.
### Step 4: Run on Your Phone
1. **Install Expo Go** on your smartphone:
   - [Android - Play Store](https://play.google.com/store/apps/details?id=host.exp.exponent)
   - [iOS - App Store](https://apps.apple.com/app/expo-go/id982107779)
2. **Scan the QR code:**
   - Android: Open Expo Go → Scan QR
   - iOS: Open Camera → Scan QR → Tap notification
3. **Wait for the app to load** (first time takes ~1-2 minutes)
## 📱 Using the App
### Option A: Test Without Google Login
For quick testing, you can bypass Google authentication:
1. Open `ui\App.js`
2. Find lines 23-27 and replace them with:
```javascript
<Stack.Screen name="Home" component={HomeScreen} />
```
3. Save and reload the app (shake phone → Reload)
### Option B: Set Up Google Login
1. Go to [Google Cloud Console](https://console.cloud.google.com/)
2. Create a new project
3. Enable Google+ API
4. Create OAuth 2.0 credentials
5. Update `ui\src\config\google.js` with your client IDs
### Testing VQA Functionality
1. **Select an image:**
   - Tap "Camera" to take a photo
   - Tap "Gallery" to choose an existing image
2. **Ask a question:**
   - Type your question (e.g., "What color is the car?")
   - Tap "Ask Question"
3. **View the answer:**
   - See the AI-generated answer
   - Check which model was used:
     - **Base Model** - General questions
     - **Spatial Model** - Spatial questions (left, right, above, etc.)
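Behind these taps, the app has to send the image and the question to the backend. The following is a rough sketch of what such a request could look like, not the project's actual client: the endpoint path `/api/answer`, the form field names `image` and `question`, and the function name `askQuestion` are all assumptions. The real values live in `ui/src/services/api.js` and `backend_api.py`.

```javascript
// Sketch only: the endpoint path and form field names below are assumptions,
// not taken from backend_api.py.
const ASSUMED_ENDPOINT = '/api/answer'; // hypothetical path

async function askQuestion(baseUrl, imageUri, question, fetchImpl = fetch) {
  if (!question || !question.trim()) {
    throw new Error('Question must not be empty');
  }
  const form = new FormData();
  // React Native's FormData accepts a { uri, name, type } object as a file part.
  form.append('image', { uri: imageUri, name: 'photo.jpg', type: 'image/jpeg' });
  form.append('question', question.trim());

  const response = await fetchImpl(`${baseUrl}${ASSUMED_ENDPOINT}`, {
    method: 'POST',
    body: form,
  });
  if (!response.ok) {
    throw new Error(`Backend returned HTTP ${response.status}`);
  }
  // The shape of the JSON body depends on the backend.
  return response.json();
}
```

The `fetchImpl` parameter exists only so the function can be exercised with a stub; inside the app the global `fetch` would be used.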
## 🧪 Example Questions to Try
### General Questions (Base Model)
- "What color is the car?"
- "How many people are in the image?"
- "What room is this?"
- "Is there a dog?"
### Spatial Questions (Spatial Model)
- "What is to the right of the table?"
- "What is above the chair?"
- "What is next to the door?"
- "What is on the left side?"
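The split between the two lists above can be approximated with a simple keyword check. This is an illustrative sketch only: the phrase list and the function name `guessModel` are assumptions, and the actual routing in `ensemble_vqa_app.py` may use entirely different rules.

```javascript
// Illustrative phrase list; the real ensemble may route differently.
const SPATIAL_PHRASES = [
  'left', 'right', 'above', 'below', 'under', 'behind', 'next to', 'in front of',
];

// Guesses which model a question would be routed to ('spatial' or 'base').
function guessModel(question) {
  const q = question.toLowerCase();
  // Match whole words/phrases so that "bright" does not count as "right".
  const isSpatial = SPATIAL_PHRASES.some((p) => new RegExp(`\\b${p}\\b`).test(q));
  return isSpatial ? 'spatial' : 'base';
}

console.log(guessModel('What is to the right of the table?')); // spatial
console.log(guessModel('What color is the car?'));             // base
```

Comparing this guess with the model shown in the app is a quick way to notice questions that are routed unexpectedly.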
## 🔧 Troubleshooting
### "Cannot connect to server"
- ✅ Check backend is running (`python backend_api.py`)
- ✅ Verify IP address in `api.js` matches your computer's IP
- ✅ Ensure phone and computer are on the **same WiFi network**
- ✅ Check Windows Firewall isn't blocking port 8000
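The checks above can be partly automated with a small probe. The sketch below is hypothetical (the function name `checkBackend` is not part of the project) and assumes the backend still serves FastAPI's default interactive docs page at `/docs`; if that page has been disabled, substitute any endpoint the backend is known to expose.

```javascript
// Sketch: probes the backend with a timeout. Requesting `/docs` assumes
// FastAPI's default docs page has not been disabled.
async function checkBackend(baseUrl, { fetchImpl = fetch, timeoutMs = 3000 } = {}) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(`${baseUrl}/docs`, { signal: controller.signal });
    return { reachable: true, status: response.status };
  } catch (err) {
    // A wrong IP, a firewall block, a stopped server, or a timeout all land here.
    const reason = err.name === 'AbortError' ? 'timeout' : String(err.message);
    return { reachable: false, reason };
  } finally {
    clearTimeout(timer);
  }
}
```

A `timeout` result usually points to the network or firewall, while an immediate failure more often means the server is not running or the address is wrong. Under the same assumption, opening `http://YOUR_IP_HERE:8000/docs` in the phone's browser is the manual equivalent.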
### "Model not loaded"
- ✅ Ensure these files exist in `c:\Users\rdeva\Downloads\vqa_coes\`:
  - `vqa_checkpoint.pt`
  - `vqa_spatial_checkpoint.pt`
- ✅ Check backend terminal for error messages
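Since Node.js is already installed for the Expo app, the file check can be scripted. This is a minimal sketch; the function name `missingCheckpoints` is hypothetical, and the script only tests that the files exist, not that they are valid checkpoints.

```javascript
const fs = require('fs');
const path = require('path');

// Returns the names of checkpoint files that are missing from `projectDir`.
function missingCheckpoints(
  projectDir,
  names = ['vqa_checkpoint.pt', 'vqa_spatial_checkpoint.pt']
) {
  return names.filter((name) => !fs.existsSync(path.join(projectDir, name)));
}

// Usage: node check_checkpoints.js <project-folder> (defaults to the current folder)
const missing = missingCheckpoints(process.argv[2] || '.');
console.log(missing.length ? `Missing: ${missing.join(', ')}` : 'All checkpoints found');
```

The script name `check_checkpoints.js` in the usage comment is just an example; save the snippet under any name and pass the project folder as the first argument.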
### App won't load on phone
- ✅ Verify Expo Go is installed
- ✅ Both devices on same WiFi
- ✅ Try restarting Expo: Press `Ctrl+C`, then `npm start`
- ✅ Clear cache: `npm start -- --clear`
### Camera/Gallery not working
- ✅ Grant permissions when prompted
- ✅ Check phone Settings → App Permissions
## Project Structure
```
vqa_coes/
├── backend_api.py              # FastAPI backend server
├── ensemble_vqa_app.py         # Your existing ensemble system
├── model_spatial.py            # Spatial model
├── models/model.py             # Base model
├── vqa_checkpoint.pt           # Base model weights
├── vqa_spatial_checkpoint.pt   # Spatial model weights
├── requirements_api.txt        # Backend dependencies
├── QUICK_START.md              # This guide
└── ui/                         # Mobile app
    ├── App.js                  # Main app component
    ├── app.json                # Expo configuration
    ├── package.json            # Dependencies
    └── src/
        ├── config/
        │   ├── api.js          # ⚠️ UPDATE YOUR IP HERE
        │   └── google.js       # Google OAuth config
        ├── contexts/
        │   └── AuthContext.js  # Authentication
        ├── screens/
        │   ├── LoginScreen.js  # Login UI
        │   └── HomeScreen.js   # Main VQA UI
        ├── services/
        │   └── api.js          # API client
        └── styles/
            ├── theme.js        # Design system
            └── globalStyles.js
```
## Documentation
- **Quick Start**: `QUICK_START.md` (this file)
- **Full README**: `ui/README.md`
- **Implementation Details**: See walkthrough artifact
## 🎨 Customization
### Change Colors
Edit `ui/src/styles/theme.js`:
```javascript
colors: {
primary: '#6366F1', // Change to your color
secondary: '#EC4899', // Change to your color
// ...
}
```
### Change App Name
Edit `ui/app.json`:
```json
{
"expo": {
"name": "Your App Name",
"slug": "your-app-slug"
}
}
```
## Next Steps
Once everything works:
1. **Add Google OAuth** for production
2. **Create custom icons** (see `ui/assets/ICONS_README.md`)
3. **Build standalone app**:
```bash
npx eas-cli build --platform android
```
## 💡 Tips
- **Backend must run first** before starting the mobile app
- **Same WiFi network** is required for phone and computer
- **First load is slow** - subsequent loads are faster
- **Shake phone** to access Expo developer menu
- **Check logs** in both terminals for debugging
## Need Help?
1. Check the troubleshooting section above
2. Review backend terminal for errors
3. Check Expo console in terminal
4. Verify all configuration steps
---
**Ready to test?** Follow the 4 steps above and start asking questions about images!