# How to Run the VQA Mobile App

## Quick Overview

You now have a complete React Native mobile app for Visual Question Answering (VQA). It consists of the two components below.

### What's Built

1. **Backend API** (`backend_api.py`)
   - FastAPI server wrapping your ensemble VQA models
   - Automatic routing between base and spatial models
   - Image upload and question answering endpoints
2. **Mobile App** (`ui/` folder)
   - React Native app built with Expo
   - Google OAuth authentication
   - Camera and gallery image picker
   - Question input and answer display
   - Model routing visualization
## Running the App (4 Steps)

### Step 1: Start the Backend Server

```bash
# Open PowerShell/Terminal
cd c:\Users\rdeva\Downloads\vqa_coes

# Install API dependencies (first time only, or if you get import errors)
pip install fastapi uvicorn python-multipart

# Start the server
python start_backend.py
# Or: python backend_api.py
```

> **Note**: If you get "ModuleNotFoundError", see [IMPORT_ERRORS_FIX.md](file:///c:/Users/rdeva/Downloads/vqa_coes/IMPORT_ERRORS_FIX.md) for solutions.

**Keep this window open!** The server must stay running.

You should see output that includes:

```
INITIALIZING ENSEMBLE VQA SYSTEM
Ensemble ready!
```
### Step 2: Configure the Mobile App

1. **Find your local IP address:**
   ```bash
   ipconfig
   ```
   Look for "IPv4 Address" (e.g., `192.168.1.100`). On macOS or Linux, use `ifconfig` or `ip addr` instead.
2. **Update the API URL:**
   - Open: `ui\src\config\api.js`
   - Change line 8:
     ```javascript
     export const API_BASE_URL = 'http://YOUR_IP_HERE:8000';
     ```
   - Example:
     ```javascript
     export const API_BASE_URL = 'http://192.168.1.100:8000';
     ```
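
If picking the right adapter out of the `ipconfig` output is error-prone, the address can also be looked up programmatically. The Python sketch below is an optional helper and not part of the project: `local_ipv4` and `api_base_url` are hypothetical names, and port 8000 is taken from the examples above. It assumes the computer has a default network route; without one it falls back to `127.0.0.1`, which a phone cannot reach.

```python
import ipaddress
import socket


def local_ipv4(probe_host: str = "8.8.8.8", probe_port: int = 80) -> str:
    """Return the IPv4 address of the interface used for outbound traffic.

    Connecting a UDP socket sends no packets; it only asks the OS which
    local interface it would route through. Falls back to 127.0.0.1 when
    no route is available (for example, when offline).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        sock.close()


def api_base_url(ip: str, port: int = 8000) -> str:
    """Build the value for API_BASE_URL, validating the IP first."""
    ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    return f"http://{ip}:{port}"


if __name__ == "__main__":
    # Paste the printed URL into ui/src/config/api.js
    print(api_base_url(local_ipv4()))
```

On machines with several adapters (VPN, virtual machines), the printed address may not be the WiFi one, so compare it against the `ipconfig` output before using it.
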
### Step 3: Start the Mobile App

```bash
# Open a NEW PowerShell/Terminal window
cd c:\Users\rdeva\Downloads\vqa_coes\ui

# Install JavaScript dependencies (first time only, if node_modules is missing)
npm install

# Start Expo
npm start
```

You'll see a QR code in the terminal.

### Step 4: Run on Your Phone

1. **Install Expo Go** on your smartphone:
   - [Android - Play Store](https://play.google.com/store/apps/details?id=host.exp.exponent)
   - [iOS - App Store](https://apps.apple.com/app/expo-go/id982107779)
2. **Scan the QR code:**
   - Android: Open Expo Go → Scan QR
   - iOS: Open Camera → Scan QR → Tap notification
3. **Wait for the app to load** (the first load takes about 1-2 minutes)
## Using the App

### Option A: Test Without Google Login

For quick testing, you can bypass Google authentication:

1. Open `ui\App.js`
2. Find lines 23-27 and replace them with:
   ```javascript
   <Stack.Screen name="Home" component={HomeScreen} />
   ```
3. Save and reload the app (shake phone → Reload)
### Option B: Set Up Google Login

1. Go to [Google Cloud Console](https://console.cloud.google.com/)
2. Create a new project
3. Configure the OAuth consent screen
4. Create OAuth 2.0 credentials
5. Update `ui\src\config\google.js` with your client IDs
### Testing VQA Functionality

1. **Select an image:**
   - Tap "Camera" to take a photo
   - Tap "Gallery" to choose an existing image
2. **Ask a question:**
   - Type your question (e.g., "What color is the car?")
   - Tap "Ask Question"
3. **View the answer:**
   - See the AI-generated answer
   - Check which model was used:
     - **Base Model** - General questions
     - **Spatial Model** - Spatial questions (left, right, above, etc.)
## Example Questions to Try

### General Questions (Base Model)

- "What color is the car?"
- "How many people are in the image?"
- "What room is this?"
- "Is there a dog?"

### Spatial Questions (Spatial Model)

- "What is to the right of the table?"
- "What is above the chair?"
- "What is next to the door?"
- "What is on the left side?"
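
This guide does not describe how the backend decides between the two models. Purely as an illustration, a keyword-based router that agrees with the example questions above could look like the Python sketch below. The keyword list and the `route_question` name are assumptions made for this example; the actual logic in `ensemble_vqa_app.py` may work differently.

```python
import re

# Hypothetical list of spatial cues; the real router may use other signals.
SPATIAL_PATTERN = re.compile(
    r"\b(left|right|above|below|under|beneath|behind|"
    r"next to|in front of|on top of)\b",
    re.IGNORECASE,
)


def route_question(question: str) -> str:
    """Return "spatial" if the question contains a spatial cue, else "base"."""
    return "spatial" if SPATIAL_PATTERN.search(question) else "base"


if __name__ == "__main__":
    for q in ("What color is the car?", "What is above the chair?"):
        print(f"{q!r} -> {route_question(q)}")
```

A plain keyword match like this is easy to reason about but will misroute questions such as "Is this the right tool?", which is one reason a production router might rely on something more robust.
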
## Troubleshooting

### "Cannot connect to server"

- Check that the backend is running (`python backend_api.py`)
- Verify that the IP address in `api.js` matches your computer's IP
- Ensure the phone and computer are on the **same WiFi network**
- Check that Windows Firewall isn't blocking port 8000
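
To narrow down which of these checks is failing, it can help to test whether port 8000 accepts TCP connections at all. The following Python sketch is one way to do that; `port_is_open` is a hypothetical helper, not part of the project, and a successful connection only shows that something is listening on the port, not that the VQA models have loaded.

```python
import socket


def port_is_open(host: str, port: int = 8000, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


if __name__ == "__main__":
    # 127.0.0.1 tests the server locally; use the IP from api.js
    # when running this from another machine on the same network.
    host = "127.0.0.1"
    state = "reachable" if port_is_open(host) else "not reachable"
    print(f"{host}:8000 is {state}")
```

If the port is reachable from the computer itself but not from another device using the computer's LAN address, the firewall or the network is the more likely culprit.
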
### "Model not loaded"

- Ensure these files exist in `c:\Users\rdeva\Downloads\vqa_coes\`:
  - `vqa_checkpoint.pt`
  - `vqa_spatial_checkpoint.pt`
- Check the backend terminal for error messages
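
The file check above can be scripted. This Python sketch lists whichever of the two checkpoint files is absent from a folder; `missing_checkpoints` is a hypothetical helper written for this guide, and it only tests that the files exist, not that they are valid weights.

```python
from pathlib import Path

# Checkpoint filenames named in this guide
CHECKPOINTS = ("vqa_checkpoint.pt", "vqa_spatial_checkpoint.pt")


def missing_checkpoints(project_dir: str) -> list[str]:
    """Return the checkpoint filenames that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in CHECKPOINTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints(".")  # run from the project folder
    if missing:
        print(f"Missing: {', '.join(missing)}")
    else:
        print("All checkpoints found")
```
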
### App won't load on phone

- Verify that Expo Go is installed
- Make sure both devices are on the same WiFi network
- Try restarting Expo: press `Ctrl+C`, then run `npm start`
- Clear the cache: `npm start -- --clear`

### Camera/Gallery not working

- Grant permissions when prompted
- Check phone Settings → App Permissions
## Project Structure

```
vqa_coes/
├── backend_api.py              # FastAPI backend server
├── ensemble_vqa_app.py         # Your existing ensemble system
├── model_spatial.py            # Spatial model
├── models/model.py             # Base model
├── vqa_checkpoint.pt           # Base model weights
├── vqa_spatial_checkpoint.pt   # Spatial model weights
├── requirements_api.txt        # Backend dependencies
├── QUICK_START.md              # This guide
└── ui/                         # Mobile app
    ├── App.js                  # Main app component
    ├── app.json                # Expo configuration
    ├── package.json            # Dependencies
    └── src/
        ├── config/
        │   ├── api.js          # UPDATE YOUR IP HERE
        │   └── google.js       # Google OAuth config
        ├── contexts/
        │   └── AuthContext.js  # Authentication
        ├── screens/
        │   ├── LoginScreen.js  # Login UI
        │   └── HomeScreen.js   # Main VQA UI
        ├── services/
        │   └── api.js          # API client
        └── styles/
            ├── theme.js        # Design system
            └── globalStyles.js
```
## Documentation

- **Quick Start**: `QUICK_START.md` (this file)
- **Full README**: `ui/README.md`
- **Implementation Details**: See walkthrough artifact
## Customization

### Change Colors

Edit `ui/src/styles/theme.js`:

```javascript
colors: {
  primary: '#6366F1',    // Change to your color
  secondary: '#EC4899',  // Change to your color
  // ...
}
```

### Change App Name

Edit `ui/app.json`:

```json
{
  "expo": {
    "name": "Your App Name",
    "slug": "your-app-slug"
  }
}
```
## Next Steps

Once everything works:

1. **Add Google OAuth** for production
2. **Create custom icons** (see `ui/assets/ICONS_README.md`)
3. **Build a standalone app**:
   ```bash
   npx eas-cli build --platform android
   ```
## Tips

- **Backend must run first** before starting the mobile app
- **Same WiFi network** is required for phone and computer
- **First load is slow**; subsequent loads are faster
- **Shake phone** to access the Expo developer menu
- **Check logs** in both terminals for debugging

## Need Help?

1. Check the troubleshooting section above
2. Review the backend terminal for errors
3. Check the Expo console in the terminal
4. Verify all configuration steps

---

**Ready to test?** Follow the 4 steps above and start asking questions about images!