🚀 How to Run the VQA Mobile App

Quick Overview

You now have a complete React Native mobile app for Visual Question Answering! Here's what was created:

✅ What's Built

  1. Backend API (backend_api.py)

    • FastAPI server wrapping your ensemble VQA models
    • Automatic routing between base and spatial models
    • Image upload and question answering endpoints
  2. Mobile App (ui/ folder)

    • Beautiful React Native app with Expo
    • Google OAuth authentication
    • Camera and gallery image picker
    • Question input and answer display
    • Model routing visualization

🎯 Running the App (4 Steps)

Step 1: Start the Backend Server

# Open PowerShell/Terminal
cd c:\Users\rdeva\Downloads\vqa_coes

# Install API dependencies (first time only, or whenever you hit import errors)
pip install fastapi uvicorn python-multipart

# Start the server
python start_backend.py
# Or: python backend_api.py

Note: If you get "ModuleNotFoundError", see IMPORT_ERRORS_FIX.md for solutions.

✅ Keep this window open! The server must stay running.

You should see:

🚀 INITIALIZING ENSEMBLE VQA SYSTEM
✅ Ensemble ready!

Step 2: Configure the Mobile App

  1. Find your local IP address:

    ipconfig
    

    Look for "IPv4 Address" (e.g., 192.168.1.100). ipconfig is the Windows command; on macOS use ifconfig, and on Linux use ip addr.

  2. Update the API URL:

    • Open: ui\src\config\api.js
    • Change line 8:
      export const API_BASE_URL = 'http://YOUR_IP_HERE:8000';
      
    • Example:
      export const API_BASE_URL = 'http://192.168.1.100:8000';
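
If you would rather not read the ipconfig output by hand, the lookup above can be sketched in Python. This is only an illustration: the helper names detect_local_ip and format_api_url are made up for this example and are not part of the project, and port 8000 is taken from the example URL above.

```python
import ipaddress
import socket


def detect_local_ip() -> str:
    """Return the IPv4 address this machine would use for outbound traffic.

    Connecting a UDP socket sends no packets; it only asks the OS which
    local interface it would route through. Falls back to loopback when
    no route is available.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect(("192.0.2.1", 80))  # documentation-only address, never contacted
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        sock.close()


def format_api_url(ip: str, port: int = 8000) -> str:
    """Build the value expected by API_BASE_URL in ui/src/config/api.js."""
    ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    return f"http://{ip}:{port}"


if __name__ == "__main__":
    url = format_api_url(detect_local_ip())
    print(f"export const API_BASE_URL = '{url}';")
```

If a VPN is active, the detected address may belong to the VPN interface rather than your WiFi adapter, so compare the result against the ipconfig output before pasting it into api.js.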
      

Step 3: Start the Mobile App

# Open a NEW PowerShell/Terminal window
cd c:\Users\rdeva\Downloads\vqa_coes\ui

# Start Expo
npm start

You'll see a QR code in the terminal.

Step 4: Run on Your Phone

  1. Install Expo Go on your smartphone from the Google Play Store (Android) or the App Store (iOS).

  2. Scan the QR code:

    • Android: Open Expo Go → Scan QR
    • iOS: Open Camera → Scan QR → Tap notification
  3. Wait for the app to load (first time takes ~1-2 minutes)

📱 Using the App

Option A: Test Without Google Login

For quick testing, you can bypass Google authentication:

  1. Open ui\App.js
  2. Find lines 23-27 and replace them with:
    <Stack.Screen name="Home" component={HomeScreen} />
    
  3. Save and reload the app (shake phone → Reload)

Option B: Set Up Google Login

  1. Go to Google Cloud Console
  2. Create a new project
  3. Enable Google+ API
  4. Create OAuth 2.0 credentials
  5. Update ui\src\config\google.js with your client IDs

Testing VQA Functionality

  1. Select an image:

    • Tap "Camera" to take a photo
    • Tap "Gallery" to choose existing image
  2. Ask a question:

    • Type your question (e.g., "What color is the car?")
    • Tap "Ask Question"
  3. View the answer:

    • See the AI-generated answer
    • Check which model was used:
      • 🔍 Base Model - General questions
      • 📍 Spatial Model - Spatial questions (left, right, above, etc.)
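
The same select-image, ask, view-answer flow can be exercised from a desktop script, which helps separate backend problems from app problems. The sketch below uses only the Python standard library. The endpoint path /api/answer and the form field names question and image are assumptions made for illustration; this guide does not list the real ones, so check backend_api.py before using it.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path

API_BASE_URL = "http://192.168.1.100:8000"  # same value as in ui/src/config/api.js
ANSWER_PATH = "/api/answer"  # hypothetical path; confirm in backend_api.py


def build_multipart(question: str, image_name: str, image_bytes: bytes):
    """Encode one text field and one file field as multipart/form-data.

    The field names "question" and "image" are assumptions.
    Returns (body, content_type_header).
    """
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(image_name)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="question"\r\n'
        "\r\n"
        f"{question}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="image"; filename="{image_name}"\r\n'
        f"Content-Type: {mime}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + image_bytes + tail, f"multipart/form-data; boundary={boundary}"


def ask(image_path: str, question: str, timeout: float = 60.0) -> str:
    """POST an image and a question to the backend and return the raw response text."""
    path = Path(image_path)
    body, content_type = build_multipart(question, path.name, path.read_bytes())
    request = urllib.request.Request(
        API_BASE_URL + ANSWER_PATH,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the backend running, a call such as ask("car.jpg", "What color is the car?") would return whatever the server sends back; the response format depends on the backend and is not assumed here.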

🧪 Example Questions to Try

General Questions (Base Model 🔍)

  • "What color is the car?"
  • "How many people are in the image?"
  • "What room is this?"
  • "Is there a dog?"

Spatial Questions (Spatial Model 📍)

  • "What is to the right of the table?"
  • "What is above the chair?"
  • "What is next to the door?"
  • "What is on the left side?"
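
To make the split between the two lists concrete, here is a minimal sketch of how a router could tell them apart, assuming simple keyword matching. This guide does not describe the backend's actual routing logic, so the keyword list and the function name route_question are hypothetical.

```python
import re

# Hypothetical keyword list, based on the spatial examples in this guide.
SPATIAL_PATTERNS = [
    r"\bleft\b",
    r"\bright\b",
    r"\babove\b",
    r"\bbelow\b",
    r"\bunder\b",
    r"\bbehind\b",
    r"\bnext to\b",
    r"\bin front of\b",
]
_SPATIAL_RE = re.compile("|".join(SPATIAL_PATTERNS), re.IGNORECASE)


def route_question(question: str) -> str:
    """Return "spatial" if the question mentions a spatial relation, else "base"."""
    return "spatial" if _SPATIAL_RE.search(question) else "base"


if __name__ == "__main__":
    for q in ("What color is the car?", "What is above the chair?"):
        print(f"{q!r} -> {route_question(q)}")
```

Keyword matching is cheap but crude: a question like "Is that the right answer?" would be sent to the spatial model even though it has nothing to do with position.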

🔧 Troubleshooting

"Cannot connect to server"

  • ✅ Check backend is running (python backend_api.py)
  • ✅ Verify IP address in api.js matches your computer's IP
  • ✅ Ensure phone and computer are on the same WiFi network
  • ✅ Check Windows Firewall isn't blocking port 8000
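
Most of these checks come down to one question: can a TCP connection be opened to port 8000 at the address in api.js? A small probe, run from another machine on the same network, answers it directly. The function name is_port_open is illustrative only.

```python
import socket


def is_port_open(host: str, port: int = 8000, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and unreachable hosts
        return False


if __name__ == "__main__":
    host = "192.168.1.100"  # replace with the IP from ui/src/config/api.js
    state = "reachable" if is_port_open(host) else "NOT reachable"
    print(f"{host}:8000 is {state}")
```

If the probe succeeds on the computer itself (host 127.0.0.1) but fails from another device, the firewall or the network is the likely culprit rather than the backend.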

"Model not loaded"

  • ✅ Ensure these files exist in c:\Users\rdeva\Downloads\vqa_coes\:
    • vqa_checkpoint.pt
    • vqa_spatial_checkpoint.pt
  • ✅ Check backend terminal for error messages
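
The file check can also be scripted so it runs before the server starts. This is a minimal sketch: the two file names come from the list above, while the helper name missing_checkpoints is made up for this example.

```python
from pathlib import Path

# Checkpoint files this guide says the backend needs.
REQUIRED_CHECKPOINTS = ("vqa_checkpoint.pt", "vqa_spatial_checkpoint.pt")


def missing_checkpoints(project_dir: str) -> list[str]:
    """Return the names of required checkpoint files not found in project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED_CHECKPOINTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints(".")  # run from the vqa_coes folder
    if missing:
        print("Missing checkpoint files:", ", ".join(missing))
    else:
        print("All checkpoint files found.")
```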

App won't load on phone

  • ✅ Verify Expo Go is installed
  • ✅ Confirm both devices are on the same WiFi network
  • ✅ Try restarting Expo: press Ctrl+C, then run npm start
  • ✅ Clear the cache: npm start -- --clear

Camera/Gallery not working

  • ✅ Grant permissions when prompted
  • ✅ Check phone Settings → App Permissions

📁 Project Structure

vqa_coes/
├── backend_api.py              # FastAPI backend server
├── ensemble_vqa_app.py         # Your existing ensemble system
├── model_spatial.py            # Spatial model
├── models/model.py             # Base model
├── vqa_checkpoint.pt           # Base model weights
├── vqa_spatial_checkpoint.pt   # Spatial model weights
├── requirements_api.txt        # Backend dependencies
├── QUICK_START.md              # This guide
└── ui/                         # Mobile app
    ├── App.js                  # Main app component
    ├── app.json                # Expo configuration
    ├── package.json            # Dependencies
    └── src/
        ├── config/
        │   ├── api.js          # ⚠️ UPDATE YOUR IP HERE
        │   └── google.js       # Google OAuth config
        ├── contexts/
        │   └── AuthContext.js  # Authentication
        ├── screens/
        │   ├── LoginScreen.js  # Login UI
        │   └── HomeScreen.js   # Main VQA UI
        ├── services/
        │   └── api.js          # API client
        └── styles/
            ├── theme.js        # Design system
            └── globalStyles.js

📚 Documentation

  • Quick Start: QUICK_START.md (this file)
  • Full README: ui/README.md
  • Implementation Details: See walkthrough artifact

🎨 Customization

Change Colors

Edit ui/src/styles/theme.js:

colors: {
  primary: '#6366F1',    // Change to your color
  secondary: '#EC4899',  // Change to your color
  // ...
}

Change App Name

Edit ui/app.json:

{
  "expo": {
    "name": "Your App Name",
    "slug": "your-app-slug"
  }
}

🚒 Next Steps

Once everything works:

  1. Add Google OAuth for production
  2. Create custom icons (see ui/assets/ICONS_README.md)
  3. Build standalone app:
    npx eas-cli build --platform android
    

💡 Tips

  • Start the backend before launching the mobile app
  • Same WiFi network is required for phone and computer
  • First load is slow - subsequent loads are faster
  • Shake phone to access Expo developer menu
  • Check logs in both terminals for debugging

🆘 Need Help?

  1. Check the troubleshooting section above
  2. Review backend terminal for errors
  3. Check Expo console in terminal
  4. Verify all configuration steps

Ready to test? Follow the 4 steps above and start asking questions about images! 🎉