# 🚀 How to Run the VQA Mobile App

## Quick Overview

You now have a complete React Native mobile app for Visual Question Answering! Here's what was created:

### ✅ What's Built

1. **Backend API** (`backend_api.py`)
   - FastAPI server wrapping your ensemble VQA models
   - Automatic routing between base and spatial models
   - Image upload and question answering endpoints

2. **Mobile App** (`ui/` folder)
   - Beautiful React Native app with Expo
   - Google OAuth authentication
   - Camera and gallery image picker
   - Question input and answer display
   - Model routing visualization

## 🎯 Running the App (3 Steps)

### Step 1: Start the Backend Server

```bash
# Open PowerShell/Terminal
cd c:\Users\rdeva\Downloads\vqa_coes

# Install API dependencies (FIRST TIME ONLY)
# If you get import errors, run this:
pip install fastapi uvicorn python-multipart

# Start the server
python start_backend.py
# Or: python backend_api.py
```

> **Note**: If you get "ModuleNotFoundError", see [IMPORT_ERRORS_FIX.md](IMPORT_ERRORS_FIX.md) for solutions.

✅ **Keep this window open!** The server must stay running.

You should see:
```
🚀 INITIALIZING ENSEMBLE VQA SYSTEM
✅ Ensemble ready!
```

### Step 2: Configure the Mobile App

1. **Find your local IP address:**
   ```bash
   ipconfig
   ```
   Look for "IPv4 Address" (e.g., `192.168.1.100`). On macOS or Linux, run `ifconfig` or `ip addr` instead of `ipconfig`.

2. **Update the API URL:**
   - Open: `ui\src\config\api.js`
   - Change line 8:
     ```javascript
     export const API_BASE_URL = 'http://YOUR_IP_HERE:8000';
     ```
   - Example:
     ```javascript
     export const API_BASE_URL = 'http://192.168.1.100:8000';
     ```
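
The IP lookup in step 1 can also be scripted. The sketch below is one hedged way to do it in Python: it opens a UDP socket toward a public address (no packet is actually sent) and reads back the local address the OS selected. The helper names `local_ip` and `api_base_url` are hypothetical and not part of the project; port 8000 is taken from the example above.

```python
import socket


def local_ip(probe_host: str = "8.8.8.8", probe_port: int = 80) -> str:
    """Return the LAN address the OS would use for outbound traffic.

    connect() on a UDP socket sends no data; it only selects a route.
    Falls back to 127.0.0.1 when no route is available (e.g. offline).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.connect((probe_host, probe_port))
            return sock.getsockname()[0]
        except OSError:
            return "127.0.0.1"


def api_base_url(ip: str, port: int = 8000) -> str:
    """Build the value expected by API_BASE_URL in ui/src/config/api.js."""
    return f"http://{ip}:{port}"


if __name__ == "__main__":
    print(api_base_url(local_ip()))
```

On machines with several network adapters (VPN, virtual switches), the result may differ from the Wi-Fi address reported by `ipconfig`, so treat it as a convenience check rather than the final word.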

### Step 3: Start the Mobile App

```bash
# Open a NEW PowerShell/Terminal window
cd c:\Users\rdeva\Downloads\vqa_coes\ui

# Start Expo
npm start
```

You'll see a QR code in the terminal.

### Step 4: Run on Your Phone

1. **Install Expo Go** on your smartphone:
   - [Android - Play Store](https://play.google.com/store/apps/details?id=host.exp.exponent)
   - [iOS - App Store](https://apps.apple.com/app/expo-go/id982107779)

2. **Scan the QR code:**
   - Android: Open Expo Go → Scan QR
   - iOS: Open Camera → Scan QR → Tap notification

3. **Wait for the app to load** (first time takes ~1-2 minutes)

## 📱 Using the App

### Option A: Test Without Google Login

For quick testing, you can bypass Google authentication:

1. Open `ui\App.js`
2. Find lines 23-27 and replace them with:
   ```javascript
   <Stack.Screen name="Home" component={HomeScreen} />
   ```
3. Save and reload the app (shake phone → Reload)

### Option B: Set Up Google Login

1. Go to [Google Cloud Console](https://console.cloud.google.com/)
2. Create a new project
3. Configure the OAuth consent screen
4. Create OAuth 2.0 credentials
5. Update `ui\src\config\google.js` with your client IDs

### Testing VQA Functionality

1. **Select an image:**
   - Tap "Camera" to take a photo
   - Tap "Gallery" to choose an existing image

2. **Ask a question:**
   - Type your question (e.g., "What color is the car?")
   - Tap "Ask Question"

3. **View the answer:**
   - See the AI-generated answer
   - Check which model was used:
     - 🔍 **Base Model** - General questions
     - 📍 **Spatial Model** - Spatial questions (left, right, above, etc.)

## 🧪 Example Questions to Try

### General Questions (Base Model 🔍)
- "What color is the car?"
- "How many people are in the image?"
- "What room is this?"
- "Is there a dog?"

### Spatial Questions (Spatial Model 📍)
- "What is to the right of the table?"
- "What is above the chair?"
- "What is next to the door?"
- "What is on the left side?"
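
To make the split between the two lists concrete, here is a minimal keyword-based router in Python. It only illustrates the idea: the actual routing logic in `backend_api.py` is not shown in this guide and may work differently, and both the function name `route_question` and the keyword list are assumptions.

```python
import re

# Hypothetical keyword list; the real backend may use different cues.
SPATIAL_PATTERNS = [
    r"\bleft\b", r"\bright\b", r"\babove\b", r"\bbelow\b",
    r"\bnext to\b", r"\bbehind\b", r"\bin front of\b", r"\bunder\b",
]
SPATIAL_RE = re.compile("|".join(SPATIAL_PATTERNS), re.IGNORECASE)


def route_question(question: str) -> str:
    """Return "spatial" if the question contains a spatial cue, else "base"."""
    return "spatial" if SPATIAL_RE.search(question) else "base"


if __name__ == "__main__":
    for q in ["What color is the car?", "What is to the right of the table?"]:
        print(f"{q!r} -> {route_question(q)}")
```

A plain keyword match like this misroutes questions where a cue word is used non-spatially (for example "Is this the right answer?"), which is one reason a real router might rely on something more robust.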

## 🔧 Troubleshooting

### "Cannot connect to server"
- ✅ Check backend is running (`python backend_api.py`)
- ✅ Verify IP address in `api.js` matches your computer's IP
- ✅ Ensure phone and computer are on the **same WiFi network**
- ✅ Check Windows Firewall isn't blocking port 8000
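
One way to work through these checks is to test whether anything is accepting TCP connections on the backend port. The following Python sketch does that with the standard library only; `port_is_open` is a hypothetical helper, and the port is the example value used earlier in this guide.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Replace with your computer's IP to test from another machine on the LAN.
    host, port = "127.0.0.1", 8000
    state = "reachable" if port_is_open(host, port) else "NOT reachable"
    print(f"{host}:{port} is {state}")
```

If the port is reachable from the computer itself but the phone still cannot connect, the firewall or a WiFi mismatch is the more likely cause.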

### "Model not loaded"
- ✅ Ensure these files exist in `c:\Users\rdeva\Downloads\vqa_coes\`:
  - `vqa_checkpoint.pt`
  - `vqa_spatial_checkpoint.pt`
- ✅ Check backend terminal for error messages
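
The file check above can be automated with a few lines of Python. This is a small sketch, not part of the project: `missing_checkpoints` is a hypothetical helper, and the two file names are the ones listed above.

```python
from pathlib import Path
from typing import List

# File names taken from the list above.
REQUIRED_CHECKPOINTS = ("vqa_checkpoint.pt", "vqa_spatial_checkpoint.pt")


def missing_checkpoints(project_dir: str) -> List[str]:
    """Return the required checkpoint files that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED_CHECKPOINTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints(".")  # run from the vqa_coes folder
    print("All checkpoints found." if not missing else f"Missing: {missing}")
```

Run it from the `vqa_coes` folder; an empty result means both weight files are in place, so any remaining "Model not loaded" error is more likely reported in the backend terminal.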

### App won't load on phone
- ✅ Verify Expo Go is installed
- ✅ Both devices on same WiFi
- ✅ Try restarting Expo: Press `Ctrl+C`, then `npm start`
- ✅ Clear cache: `npm start -- --clear`

### Camera/Gallery not working
- ✅ Grant permissions when prompted
- ✅ Check phone Settings → App Permissions

## 📁 Project Structure

```
vqa_coes/
├── backend_api.py              # FastAPI backend server
├── ensemble_vqa_app.py         # Your existing ensemble system
├── model_spatial.py            # Spatial model
├── models/model.py             # Base model
├── vqa_checkpoint.pt           # Base model weights
├── vqa_spatial_checkpoint.pt   # Spatial model weights
├── requirements_api.txt        # Backend dependencies
├── QUICK_START.md             # This guide
└── ui/                        # Mobile app
    ├── App.js                 # Main app component
    ├── app.json               # Expo configuration
    ├── package.json           # Dependencies
    └── src/
        ├── config/
        │   ├── api.js         # ⚠️ UPDATE YOUR IP HERE
        │   └── google.js      # Google OAuth config
        ├── contexts/
        │   └── AuthContext.js # Authentication
        ├── screens/
        │   ├── LoginScreen.js # Login UI
        │   └── HomeScreen.js  # Main VQA UI
        ├── services/
        │   └── api.js         # API client
        └── styles/
            ├── theme.js       # Design system
            └── globalStyles.js
```

## 📚 Documentation

- **Quick Start**: `QUICK_START.md` (this file)
- **Full README**: `ui/README.md`
- **Implementation Details**: See walkthrough artifact

## 🎨 Customization

### Change Colors
Edit `ui/src/styles/theme.js`:
```javascript
colors: {
  primary: '#6366F1',    // Change to your color
  secondary: '#EC4899',  // Change to your color
  // ...
}
```

### Change App Name
Edit `ui/app.json`:
```json
{
  "expo": {
    "name": "Your App Name",
    "slug": "your-app-slug"
  }
}
```

## 🚒 Next Steps

Once everything works:

1. **Add Google OAuth** for production
2. **Create custom icons** (see `ui/assets/ICONS_README.md`)
3. **Build standalone app**:
   ```bash
   npx eas-cli build --platform android
   ```

## 💡 Tips

- **Backend must run first** before starting the mobile app
- **Same WiFi network** is required for phone and computer
- **First load is slow** - subsequent loads are faster
- **Shake phone** to access Expo developer menu
- **Check logs** in both terminals for debugging

## 🆘 Need Help?

1. Check the troubleshooting section above
2. Review backend terminal for errors
3. Check Expo console in terminal
4. Verify all configuration steps

---

**Ready to test?** Follow the 4 steps above and start asking questions about images! 🎉